MathML entities don't degrade gracefully

I think the inclusion of the MathML entities in HTML5 regardless of a MathML context violates the Degrade Gracefully design principle of the HTML WG. The entities don't add anything to the expressiveness of the language: anything that you can express with the entities you can also express with numeric character references or by using UTF-8 directly. However, when an author uses entities that have not been traditionally supported by HTML, the rendering of the document in legacy user agents will be worse than in the situation where numeric character references or direct UTF-8 is used.

Could we get away with not supporting the MathML entity set in text/html, considering that MathML subtrees are expected to be generated by converter software anyway?

As for application/xhtml+xml, the situation is even worse. DTDs don't work on the Web[1] and are mostly useless legacy. So far, HTML 5 has encouraged DTDlessness for XHTML5--and rightly so. Using the MathML entities in XML requires a doctype, because otherwise the document would be ill-formed. Browsers won't fetch a DTD based on the doctype, so we need to consider existing magic public IDs and potential future public IDs.

Either way, the situation will be bad from the point of view of the Degrade Gracefully design principle: When an old magic public ID is used, Firefox renders the right character, Safari shows an XML parse error and Opera renders a placeholder that looks like an entity reference.[2] When a future public ID is used, Firefox shows an XML parse error, Safari shows an XML parse error and Opera renders a placeholder that looks like an entity reference.[3]

The result in Opera is bad in application/xhtml+xml although no worse than in text/html. In Safari, MathML entities in application/xhtml+xml are dramatically user experience-breaking in both public ID cases. In Firefox, using an old magic public ID would work, but trying to introduce *any* new public ID *ever* would lead to a dramatically bad experience in old versions.

Wouldn't it be better to just say "No" to the MathML entities on the Web and ask MathML generators to produce Unicode directly? (The few people who write MathML by hand are probably proficient enough to parse with DTD and re-serialize without DTD at their end before sending the re-serialized document over the public network.)

[1] http://hsivonen.iki.fi/no-dtd/
[2] http://hsivonen.iki.fi/test/moz/math-entity-known-dtd.xhtml
[3] http://hsivonen.iki.fi/test/moz/math-entity-unknown-dtd.xhtml

--
Henri Sivonen
hsivonen@...
http://hsivonen.iki.fi/
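As a rough illustration of "produce Unicode directly", a generator or a hand author's post-processing step could rewrite named entity references as numeric character references before publishing. The sketch below is only one possible approach: the function name `to_numeric_references` is made up for this note, and it assumes that Python's built-in HTML5 name table (`html.entities.html5`) is an acceptable stand-in for whatever entity definitions the author's DTD actually uses. It is a plain text substitution, so it does not understand CDATA sections or comments.

```python
import re
from html.entities import html5

# The five entities predefined by XML need no DTD, so they are left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_references(markup: str) -> str:
    """Replace known named entity references with hexadecimal NCRs."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        expansion = html5.get(name + ";")
        if expansion is None:
            # Unknown name: leave the reference untouched rather than guess.
            return match.group(0)
        # Some names expand to more than one code point.
        return "".join(f"&#x{ord(ch):X};" for ch in expansion)

    return _ENTITY_REF.sub(replace, markup)


if __name__ == "__main__":
    print(to_numeric_references("<mi>&phi;</mi><mo>&InvisibleTimes;</mo>"))
```

With Python's table, `&phi;` becomes `&#x3C6;`, which any XML parser accepts without a doctype.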
Re: MathML entities don't degrade gracefully

Henri,

> As for application/xhtml+xml, the situation is even worse.

The fact that using an entity that's not defined is a well-formedness error that probably causes the entire document to be rejected is, hmm, problematic, and it is the main reason why we try to keep the set of MathML entity names unchanged, even if we occasionally change the definitions to take account of additions to Unicode.

The XML spec does leave an escape clause: if the document references a DTD and the application does not fetch the DTD, then the error need not be fatal (thus allowing Opera's current behaviour), although most XML parsers (and certainly anything using XSLT/XPath/XQuery) have to reject the document, as the XPath data model doesn't support undefined entities.

Going forward, it has often been suggested that a possible way to alleviate this problem is just for everyone to use the same set of entities always, and putting them all in HTML5 would be a move in that direction, although behaviour on existing systems is as you describe. So it's really a matter of future benefits against bad fallback behaviour on existing systems.

As I said to Ian earlier, I think the most important thing is that the definitions agree where they use the same name (and I think the HTML5 and MathML3 drafts do now agree). Whether HTML5 should include all the names is less clear. It has some advantages and I would not argue against it, but it also has some disadvantages and I wouldn't argue too strongly for them to be kept either.

The MathML3 draft has modified all the example fragments of MathML code never to use the entity form and always to use numeric character references (together with a comment with the Unicode name) to try to wean people off entities.

David
(Personal response)
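The well-formedness error described here is easy to reproduce with an off-the-shelf XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree` (Expat underneath); the helper `try_parse` and the two sample documents are invented for this note. The second document declares the entity in an internal DTD subset, which is the conventional way to make the reference legal.

```python
import xml.etree.ElementTree as ET

# No doctype at all: the reference to "phi" is to an undeclared entity.
undeclared = "<html><p>&phi;</p></html>"

# Same document, but with the entity declared in an internal DTD subset.
declared = '<!DOCTYPE html [<!ENTITY phi "&#x3C6;">]>' + undeclared


def try_parse(document: str) -> str:
    """Report whether the parser accepts the document."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as error:
        return f"rejected: {error}"
    return f"accepted: {root.find('p').text}"


if __name__ == "__main__":
    print(try_parse(undeclared))  # the whole document is refused
    print(try_parse(declared))    # the paragraph text is U+03C6
```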
Re: MathML entities don't degrade gracefully

Henri,

> Using the MathML entities in XML requires a doctype, because otherwise
> the document would be ill-formed.

Yes and no. The HTML5 spec could state that when processing application/xhtml+xml documents the application should (effectively) use a catalog that supplies DTD entity definitions for the HTML5 entities (it may make sense to do this regardless of whether the "HTML5 entity set" ends up being the HTML4 names or the HTML4+MathML names).

    <!DOCTYPE html>
    <html>
    <p>&phi;</p>
    </html>

or even just

    <html>
    <p>&phi;</p>
    </html>

is well formed (but not valid) if the parser is using a catalog that says (for example) that any document with document element "html" should use a DTD that (just) defines some set of HTML5 entities.

David
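One way to picture the catalog idea with an existing parser is Expat's "foreign DTD" hook, exposed in Python as `xmlparser.UseForeignDTD`. The sketch below is an illustration under assumptions, not a proposal for how browsers would do it: `DEFAULT_ENTITY_DTD` is a hypothetical one-entry stand-in for a full entity set, and `parse_with_default_entities` is a name made up for this note. It returns only the concatenated character data, which is enough to see whether the reference was resolved.

```python
import xml.parsers.expat as expat

# Hypothetical default entity set; a real one would list every agreed name.
DEFAULT_ENTITY_DTD = '<!ENTITY phi "&#x3C6;">'


def parse_with_default_entities(document: str) -> str:
    """Return the document's character data, resolving default entities."""
    parser = expat.ParserCreate()
    # Both calls are needed for Expat to request a DTD the document never names.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.UseForeignDTD(True)

    def supply_dtd(context, base, system_id, public_id):
        # With a foreign DTD, Expat calls this even when there is no doctype
        # or the doctype has no external identifier.
        dtd_parser = parser.ExternalEntityParserCreate(context)
        dtd_parser.Parse(DEFAULT_ENTITY_DTD, True)
        return 1  # non-zero tells Expat the entity was handled

    chunks = []
    parser.ExternalEntityRefHandler = supply_dtd
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


if __name__ == "__main__":
    print(parse_with_default_entities("<html><p>&phi;</p></html>"))
    print(parse_with_default_entities("<!DOCTYPE html><html><p>&phi;</p></html>"))
```

Both sample documents correspond to the two examples in the message. Whether supplying definitions this way still counts as plain XML is a matter of interpretation that the sketch does not settle.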
Re: MathML entities don't degrade gracefully

On Apr 25, 2008, at 13:59, David Carlisle wrote:

>> Using the MathML entities in XML requires a doctype, because otherwise
>> the document would be ill-formed.
>
> Yes and no. The HTML5 spec could state that when processing
> application/xhtml+xml documents that the application should
> (effectively) use a catalog that supplies DTD entity definitions for
> the HTML5 entities (it may make sense to do this regardless of whether
> the "html5 entity set" ends up being the html4 names or html4+mathml
> names).

The HTML 5 spec could indeed specify precisely what kind of entity resolver needs to be supplied to a vanilla XML 1.0 parser when parsing application/xhtml+xml, without having to fork XML. If we do that, I suggest standardizing Gecko's catalog of two special DTDs and the particular public IDs that map to these.

> <!DOCTYPE html>
> <html>
> <p>&phi;</p>
> </html>
>
> or even just
>
> <html>
> <p>&phi;</p>
> </html>
>
> is well formed (but not valid) if the parser is using a catalog that says
> (for example) that any document with document element "html" should use
> a dtd that (just) defines some set of html5 entities.

This, on the other hand, would mean forking XML and creating something that's almost XML but not quite--thereby making it incompatible with deployed browsers and the existing XML toolchain.

If we went that route, I think we should do it the right way the first time and have only one major discontinuity point. In that case, instead of fixing one XML design flaw at a time, we should go all the way to "XML5" on the first try, specifying non-Draconian streamable error handling, adding MathML entities as built-in, removing *all* restrictions on what characters can appear in a Name and removing DTDs all in the same go.

--
Henri Sivonen
hsivonen@...
http://hsivonen.iki.fi/
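The entity-resolver variant, where only documents carrying a recognised public ID get entity definitions, can be sketched with the same Expat API. Everything specific here is an assumption made for illustration: the `CATALOG` contents are not Gecko's actual list, the single entity declaration stands in for a complete DTD, and `parse_with_catalog` is a made-up name. The public ID shown is the W3C identifier for XHTML 1.1 plus MathML 2.0.

```python
import xml.parsers.expat as expat

# Illustrative catalog: public ID -> local DTD text. Not Gecko's real list.
CATALOG = {
    "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN": '<!ENTITY phi "&#x3C6;">',
}


def parse_with_catalog(document: str) -> str:
    """Return character data, loading entity definitions by public ID."""
    parser = expat.ParserCreate()
    # Ask Expat to call the handler for the external DTD subset.
    parser.SetParamEntityParsing(
        expat.XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE)

    def resolve(context, base, system_id, public_id):
        dtd_text = CATALOG.get(public_id)
        if dtd_text is None:
            # Unknown public ID: this sketch simply fails the parse.
            return 0
        # The system ID is never fetched; the local copy is parsed instead.
        dtd_parser = parser.ExternalEntityParserCreate(context)
        dtd_parser.Parse(dtd_text, True)
        return 1

    chunks = []
    parser.ExternalEntityRefHandler = resolve
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


KNOWN = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN" '
    '"http://www.w3.org/Math/DTD/mathml2/xhtml-math11-f.dtd">'
    "<html><p>&phi;</p></html>"
)

if __name__ == "__main__":
    print(parse_with_catalog(KNOWN))
```

A document whose doctype uses a public ID missing from the catalog raises `expat.ExpatError` in this sketch, which is the kind of hard failure for new public IDs that the thread is concerned about.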
Re: MathML entities don't degrade gracefully

> This, on the other hand, would mean forking XML and creating something
> that's almost XML but not quite-

No. As I said, I think that, given that the mechanism by which an XML parser finds (or does not find) an external DTD is more or less unspecified, this does not require forking XML. Certainly any XML parser with catalog support could already be made to accept those two examples.

> thereby making it incompatible with deployed browsers

Yes, as you showed at the start of the thread. If you pass an undefined entity to (most) browsers in XML mode you get a very aggressive rejection of the whole document. If you put keeping existing behaviour as top priority then there is nothing you can do to change that; you are saying you want to keep that error behaviour. If you do something (anything) to make the entity have a default definition, or in some other way prevent the rejection of the entire document, then it will be incompatible with deployed browsers.

> and the existing XML toolchain.

I think this can work with the existing XML toolchain. It stretches things a bit and isn't without problems, but no solution here is without problems; it's just a judgement call on which is the least horrible solution.

> we should go all the way to "XML5" on the first try specifying
> non-Draconian streamable error handling, adding MathML entities as
> built-in, removing *all* restrictions on what characters can appear in
> a Name and removing DTDs all in the same go.

Yes, the idea of building in all the entities has come up several times in "XML 2" (aka XML 5) discussions on xml-dev and elsewhere.

David
Re: MathML entities don't degrade gracefully

On Thu, 24 Apr 2008, Henri Sivonen wrote:
>
> I think the inclusion of the MathML entities in HTML5 regardless of a
> MathML context violates the Degrade Gracefully design principle of the
> HTML WG. The entities don't add anything to the expressiveness of the
> language: anything that you can express with the entities you can also
> express with numeric character references or by using UTF-8 directly.
> However, when an author uses entities that have not been traditionally
> supported by HTML, the rendering of the document in legacy user agents
> will be worse than in the situation where numeric character references
> or direct UTF-8 is used.

It will be worse, but it won't be dramatically worse. In the transition period, people can avoid using the new entities. However, I don't see a good reason to prevent their use in the future. If we ever want to use new entities, we have to add them. We've added entities before (e.g. &euro;) without major problems.

> As for application/xhtml+xml, the situation is even worse. DTDs don't
> work on the Web[1] and are mostly useless legacy. So far, HTML 5 has
> encouraged DTDlessness for XHTML5--and rightly so. Using the MathML
> entities in XML requires a doctype, because otherwise the document would
> be ill-formed. Browsers won't fetch a DTD based on the doctype, so we
> need to consider existing magic public IDs and potential future public
> IDs. Either way, the situation will be bad from the point of view of the
> Degrade Gracefully design principle [...]

I don't think it's a critical problem if the XML authoring experience is worse than the text/html one. After all, it's already worse for many other reasons. What's special about this one?

The entities in HTML5 don't apply to XHTML5. The spec says as much.

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
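For authors who want to avoid the newer names during such a transition period, a simple check is whether a name already belongs to the HTML 4 entity set. The sketch below leans on two tables shipped with Python, `html.entities.name2codepoint` (HTML 4 names) and `html.entities.html5` (the HTML5 named character references); treating them as proxies for "legacy" and "new" is an assumption of this note, and `classify` is a hypothetical helper.

```python
from html.entities import html5, name2codepoint


def classify(name: str) -> str:
    """Say whether an entity name predates HTML5 according to Python's tables."""
    if name in name2codepoint:
        return "HTML 4 name"
    if name + ";" in html5:
        return "HTML5-only name"
    return "unknown"


if __name__ == "__main__":
    # "Aopf" and "InvisibleTimes" are names from the MathML entity set.
    for name in ("euro", "phi", "Aopf", "InvisibleTimes"):
        print(name, "->", classify(name))
```

Names reported as "HTML5-only name" are the ones a legacy text/html user agent is likely not to recognise.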