Wikipedia HTML & Syntax specification
Hi,

I have two questions related to each other.

I would like to know if there is a simple way to get static HTML from Wikipedia articles, i.e. to extract them as HTML files. So far I have managed to load the text table into a MySQL database. That gives me the wiki text, which I could then parse. However, I found the Wikipedia syntax more complicated than what I had used when contributing to Wikipedia myself. Is there some place where the complete syntax is specified? If I at least had the specification, I could think about working on a parser.

Thanks a lot.
O.O.

_______________________________________________
Wikipedia-l mailing list
Wikipedia-l@...
https://lists.wikimedia.org/mailman/listinfo/wikipedia-l

Re: Wikipedia HTML & Syntax specification
O. Olson wrote:
> Hi,
>
> I have two questions related to each other.
>
> I would like to know if there is a simple way to get the Static HTML from the Wikipedia Articles i.e. extraction as HTML files.

Try this:
http://static.wikipedia.org/

Regards,

// Rolf Lampa

Re: Wikipedia HTML & Syntax specification
Dear Rolf,

I don't think I understood what you are referring to. To get the static HTML, do you mean I would have to spider/crawl through those pages? I thought Wikipedia explicitly did not allow this.

Also, any idea regarding the syntax specification?

Thanks again for your post.
O.O.

--- On Sat, 5 Jul 2008, Rolf Lampa <rolf.lampa@...> wrote:

> From: Rolf Lampa <rolf.lampa@...>
> Subject: Re: [Wikipedia-l] Wikipedia HTML & Syntax specification
> To: wikipedia-l@...
> Date: Saturday, 5 July 2008, 5:49 am
>
> O. Olson wrote:
> > Hi,
> >
> > I have two questions related to each other.
> >
> > I would like to know if there is a simple way to get the Static HTML from the Wikipedia Articles i.e. extraction as HTML files.
>
> Try this:
> http://static.wikipedia.org/
>
> Regards,
>
> // Rolf Lampa

Re: Wikipedia HTML & Syntax specification
O. Olson wrote:
> Dear Rolf,
>
> I don't think I understood what you are referring to.
> To get the Static HTML you mean I would have to
> spider/crawl through those pages? I thought Wikipedia
> explicitly did not allow this.

???

You wouldn't have to crawl. There's a download link in the middle of the page I linked to. If, for example, you would like to have the entire English Wikipedia's content in HTML format, then download it:

http://static.wikipedia.org/downloads/2008-06/en/

Or did I misunderstand your question entirely?

> Also any idea regarding the syntax specification?

There's no end to the pages written on that subject, but I don't know of any place where you can find all of it on one page. For a start, see:

http://en.wikipedia.org/wiki/Help:Contents

and

http://en.wikipedia.org/wiki/Wikipedia:Cheatsheet

and go on with the pages they refer to, like:

http://meta.wikimedia.org/wiki/Help:Link

Regards,

// Rolf Lampa

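To give a feel for what "working on a parser" involves, here is a minimal sketch in Python that converts a very small subset of wiki markup (headings, bold, italics, and internal links) to HTML. It is an illustration only, not how MediaWiki renders pages: the function name `wikitext_to_html`, the `/wiki/` link prefix, and the choice of which constructs to support are all assumptions made for this example. Templates, tables, references, lists, and nested markup are deliberately ignored, and those are where most of the real complexity lies.

```python
import html
import re

# [[Target]] or [[Target|label]]; the label part is optional.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")
# == Heading == with 2 to 6 equals signs on both sides.
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")


def _link(match):
    """Render one internal link; the /wiki/ prefix is an assumed URL layout."""
    target = match.group(1).strip()
    label = (match.group(2) or target).strip()
    href = "/wiki/" + target.replace(" ", "_")
    return f'<a href="{href}">{label}</a>'


def _inline(text):
    """Apply inline markup: bold first, so ''' is not consumed as ''."""
    text = re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)
    text = re.sub(r"''(.+?)''", r"<i>\1</i>", text)
    return LINK.sub(_link, text)


def wikitext_to_html(text):
    """Convert a tiny, hypothetical subset of wiki markup to HTML."""
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines only separate blocks in this sketch
        # Escape &, < and > but leave apostrophes alone; they are markup here.
        line = html.escape(line, quote=False)
        heading = HEADING.match(line)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{_inline(heading.group(2))}</h{level}>")
        else:
            out.append(f"<p>{_inline(line)}</p>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "== History ==\n'''Wikipedia''' is a ''free'' [[encyclopedia]]."
    print(wikitext_to_html(sample))
```

Even this toy version has to decide the order in which markers are matched, which hints at why no single-page specification covers every case.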
Re: Wikipedia HTML & Syntax specification
--- On Mon, 7 Jul 2008, Rolf Lampa <rolf.lampa@...> wrote:

> You wouldn't have to crawl. There's a download-link in the middle of the
> page I linked to. If you for example would like to have the entire
> English Wikipedia's content in html format, then download it:
> http://static.wikipedia.org/downloads/2008-06/en/

Thanks Rolf. This was not clear from your original post, but I have since downloaded it. However, it seems too big to extract in the 200 GB of space I have on my drive. I am trying to borrow a terabyte drive from a friend over the next week to see how everything looks.

Thanks again.
O.O.

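When an archive is too large to unpack on the available disk, one option is to read it as a stream and look at the pages one at a time without writing them out. The following is a minimal sketch in Python, assuming the download is (or has first been converted to) a tar archive, optionally compressed with gzip, bzip2, or xz. The thread does not say what format the dump uses, so that is an assumption; a container that Python's `tarfile` module cannot read would need to be unpacked to a tar stream by an external tool first. The function name `iter_html_members` and the file name in the usage comment are hypothetical.

```python
import io
import tarfile


def iter_html_members(fileobj, limit=None):
    """Yield (name, content_bytes) for each .html file in a tar stream.

    The archive is read sequentially (mode "r|*"), so nothing is
    extracted to disk and only one member is held in memory at a time.
    """
    count = 0
    with tarfile.open(fileobj=fileobj, mode="r|*") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(".html"):
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            yield member.name, extracted.read()
            count += 1
            if limit is not None and count >= limit:
                break


# Usage (hypothetical file name):
# with open("wikipedia-en-html.tar", "rb") as f:
#     for name, content in iter_html_members(f, limit=10):
#         print(name, len(content))
```

Streaming trades random access for a small footprint: each page can be inspected or copied out selectively, but finding a specific article means reading through everything stored before it.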