Migrating documentation from HTML files

I'm new to Doxia. I have a lot of documentation in HTML format and I'd like to convert these files to APT format. Is there an easy way to do this transformation? I want to create a Maven site for my project and, right now, I only have this documentation as plain HTML, without CSS styles or a menu.

Could you help me? Thanks very much.

Cristóbal

Re: Migrating documentation from HTML files

Hi,

Frankly, I have never tested your use case.

But I guess that you need an XHTML file as input, with no header, footer or navbar: something like the content of the bodyColumn div in [1].

The snippet should be something like the following:

    import java.io.*;
    import org.apache.maven.doxia.module.apt.AptSink;
    import org.apache.maven.doxia.module.xhtml.XhtmlParser;
    import org.apache.maven.doxia.sink.Sink;

    File f = new File( "blabla.html" );
    XhtmlParser parser = new XhtmlParser();
    StringWriter output = new StringWriter();
    Sink sink = new AptSink( output );
    // the parser sends its events to the sink, which writes APT into output
    parser.parse( new FileReader( f ), sink );

After this, output should contain the APT markup.

HTH,

Vincent

[1] http://maven.apache.org/doxia/

2008/3/1, krycho fandino <cristobalft@...>:
> I'm a newbie using doxia. I've a lot of documentation in HTML format and I'd
> like to convert these files to apt format. Is there some way to transform
> easily?

Re: Migrating documentation from HTML files

If you use the current development branch of Doxia (beta-1-SNAPSHOT), this should work rather well for simple HTML files. However, you will probably lose a lot of information if you have anything fancy (e.g. special layout, tables and figures are not well supported), so don't expect it to be perfect. In particular, if you have figures you might try to translate to xdoc instead of APT (use XdocSink); that should work better.

Cheers,

-Lukas

Vincent Siveton wrote:
> But I guess that you need to have an XHTML file in input with no
> header, footer or navbar something to the div bodyColumn in [1].

Re: Migrating documentation from HTML files

Thanks for your help. However, my HTML files aren't XHTML, and XhtmlParser throws a lot of exceptions. Perhaps I should convert these HTML files to XHTML, but I have a lot of pages and that would be a hard task.

Actually, I generated these HTML files with the latex2html conversion tool. I don't know how I could transform LaTeX files into one of the markup languages supported by Doxia (APT or xdoc). Could you give me some advice?

2008/3/2, Lukas Theussl <ltheussl@...>:
> If you use the current development branch of doxia (beta-1-SNAPSHOT)
> then this should work rather well for simple html files.

Re: Migrating documentation from HTML files

Doxia doesn't have a LaTeX parser (I'd like to have one too!), so latex2html is the only solution I can think of (other LaTeX translators exist, but that's the only one I know). I am not sure what kind of output latex2html produces; however, the difference between HTML and XHTML shouldn't matter here. What kind of exceptions do you get? Maybe you could attach an example file in JIRA [1], with a snippet of your code, so we can try to reproduce the problem?

-Lukas

[1] http://jira.codehaus.org/browse/DOXIA

krycho fandino wrote:
> Thanks for your help, however my HTML files aren't XHTML and XhtmlParser
> throws a lot of exceptions.

Re: Migrating documentation from HTML files

latex2html does not produce XHTML output. For example:

HTML
==========
<LINK REL="STYLESHEET" HREF="embebidos.css">

XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model: end tag name </HEAD> must be the same as start tag <LINK> from line 19 (position: TEXT seen ...<LINK REL="STYLESHEET" HREF="embebidos.css">\n\n</HEAD>... @21:8)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(AbstractXmlParser.java:57)

HTML
==========
<H2><A NAME="SECTION00221000000000000000"></A>
<A NAME="74"></A>
<BR>
Grupos de usuarios
</H2>

XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model: end tag name </H2> must be the same as start tag <BR> from line 119 (position: TEXT seen ...<BR>\nGrupos de usuarios\n</H2>... @121:6)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(AbstractXmlParser.java:57)

XhtmlParser
==========
org.apache.maven.doxia.parser.ParseException: Error parsing the model: attribute value must start with quotation or apostrophe not 3 (position: TEXT seen ...<A NAME="91"></A>\n<TABLE CELLPADDING=3... @171:21)
    at org.apache.maven.doxia.parser.AbstractXmlParser.parse(AbstractXmlParser.java:57)

... and many more.

2008/3/3, Lukas Theussl <ltheussl@...>:
> What kind of exceptions do you get? Maybe you could attach an example
> file at jira [1] with a snippet of your code so we can try to reproduce
> the problem?

Re: Migrating documentation from HTML files

Ehm, yes, sorry, I spoke faster than I thought. Of course, the parser is an XML parser, so it will choke on any tags that are not properly closed. So the input has to be XHTML. You can use tools like HTML Tidy [1] to convert HTML to XHTML.

Btw, Vincent just added a simple tool to do document translations with Doxia: http://svn.apache.org/viewvc?view=rev&revision=633328

Feel free to test and comment! :)

Cheers,

-Lukas

[1] http://tidy.sourceforge.net/

Cristóbal Fandiño wrote:
> Output latex2html produces no XHTML code. For example:
>
> HTML
> ==========
> <LINK REL="STYLESHEET" HREF="embebidos.css">

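For anyone who wants to see the well-formedness problem in isolation, here is a minimal sketch. It assumes only the XML parser that ships with the JDK, not Doxia's XhtmlParser, so the error messages will differ from the ones quoted in this thread; the class name WellFormedCheck is made up for illustration. It feeds the parser fragments modelled on the latex2html output reported above and shows that the unclosed <BR> and the unquoted attribute are rejected, while the XHTML spelling is accepted.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Doxia.
class WellFormedCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            return false; // the markup is not well-formed XML
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // <BR> is never closed, so </H2> does not match the open element.
        System.out.println(isWellFormed("<H2><BR>Grupos de usuarios</H2>"));   // false
        // XML attribute values must be quoted.
        System.out.println(isWellFormed("<TABLE CELLPADDING=3></TABLE>"));     // false
        // The XHTML spelling of the first fragment parses fine.
        System.out.println(isWellFormed("<h2><br/>Grupos de usuarios</h2>"));  // true
    }
}
```

A check like this can be run over a directory of pages before handing them to Doxia, to find out which files still need cleaning up.
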
Re: Migrating documentation from HTML files

2008/3/4, Lukas Theussl <ltheussl@...>:
> Ehm, yes, sorry, I talked quicker than I thought. Of course, the parser
> is an xml parser so it will choke on any tags that are not properly
> closed. So it has to be xhtml. You can use tools like htmltidy [1] to
> convert html to xhtml.
>
> Btw, Vincent just added a simple tool to do document translations with
> doxia: http://svn.apache.org/viewvc?view=rev&revision=633328
> Feel free to test and comment! :)

You need to use the entire trunk for this. I guess it will be easy to patch the converter with JTidy to support HTML as an input format. Patches are welcome :)

Cheers,

Vincent

> Cheers,
> -Lukas
>
> [1] http://tidy.sourceforge.net/
