Re: Notes on validome test suite / validators comparison

by Validome-Staff

Hi Olivier,
 
Here is our statement:
 
 
Here Validome advises the user to use our XML validator, as an HTML validator is not the appropriate tool for checking XML... ;-)
 
> * http://www.validome.org/out/ena1005
 
We have corrected our claims here; sorry for not keeping the comparison up to date. BUT: until you renewed the W3C Validator by implementing LibXML, the statements were right, and you had known this for about a year... (http://lists.w3.org/Archives/Public/www-validator/2006Apr/0072.html)
 
> * http://www.validome.org/out/ena4011
> HTML 4.01 document with no system Id.
> Validome sends a warning... Not necessary per the spec.
> W3C Markup validator passes validation.
> Why is W3C validator marked as faulty here? References please?
 
http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.2
The other way around: where is it specified that the system ID can be omitted?
 
> * http://www.validome.org/out/ena4012
> XHTML doctype without system Id, but valid public id.
> Validation should report an error (both validators do), but why does 
> validome count this as a fatal error?
 
That one has coding reasons.
 
> * http://www.validome.org/out/ena4023
> Validome says valid. OpenSP and W3C Markup validator says not valid.
> I'd tend to trust opensp here. The comparison page's claim that 
> validome is the only validator doing the right thing is very dubious.
 
> * http://www.validome.org/out/ena4024
> Ditto above. The comparison page's claim that validome is the only 
> validator doing the right thing is very dubious.
 
What is dubious here? It's about SGML (not HTML) documents.
 
 
> * http://www.validome.org/out/ena8
> W3C markup validator uses algorithm for charset detection, finds 
> none, uses fallback
> Validome uses... exactly the same algorithm (to the point of having 
> almost the same error message...), finds no charset, yields a fatal 
> error.
> I'm very curious to know why validome passes and w3c markup validator 
> fails here. I think the opposite: validome's taste for fatal error is 
> a grave failure in usability.
 
The "old" W3C Validator fell back to US-ASCII, the "new" one falls back to UTF-8. Can you explain this, please?
We have asked W3C Germany and Bjoern Hoehrmann many times about the *correct* behaviour of a validator when a fallback is needed, but we never got an *exact* answer. On this point the specs are very inexact and ambiguous. Please give us a *mandatory* answer, with a link to the appropriate specifications, for this case. The only clear case so far is XHTML, where validators should fall back to UTF-8 (depending on the MIME type); HTML is still ambiguous...
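
To make the disputed behaviour concrete, here is a minimal Python sketch of the detection order discussed above: HTTP header first, then an in-document declaration, then either a fallback or a fatal error. It is not the code of either validator. The function name `detect_charset`, the regular expressions, and the 1024-byte scan window are assumptions made purely for illustration.

```python
import re

# Hypothetical patterns for this sketch; real validators parse more carefully.
CHARSET_IN_HEADER = re.compile(r'charset\s*=\s*"?([\w.:-]+)"?', re.IGNORECASE)
CHARSET_IN_META = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE
)


def detect_charset(content_type, body, fallback="utf-8"):
    """Return (charset, source); source is 'http', 'meta' or 'fallback'.

    Pass fallback=None for the strict variant that raises instead of guessing.
    """
    # 1. charset parameter of the HTTP Content-Type header
    if content_type:
        match = CHARSET_IN_HEADER.search(content_type)
        if match:
            return match.group(1).lower(), "http"
    # 2. charset declared in a META element near the start of the document
    match = CHARSET_IN_META.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    # 3. nothing declared: fall back, or treat the situation as fatal
    if fallback is None:
        raise ValueError("no character encoding declared")
    return fallback, "fallback"
```

With `fallback="utf-8"` or `fallback="us-ascii"` the sketch mimics the two fallback behaviours mentioned above; with `fallback=None` it mimics the fatal-error behaviour.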
 
 
> * http://www.validome.org/out/ena2008
> * http://www.validome.org/out/ena2009
> * http://www.validome.org/out/ena2010
Old, deprecated examples because of unclear and ambiguous specifications.
 
> * http://www.validome.org/out/ena2041
> The comparison page is incorrect. The W3C Markup validator has the 
> proper behavior here.
 
The W3C-Validator doesn't detect the conflict.
 
> * http://www.validome.org/out/ena5020
> I strongly disagree that the W3C Markup's validator behavior is 
> incorrect, here.
> text/html is allowed for XHTML 1.0
 
We do not claim here that the behaviour of the W3C Validator is wrong; we say that the appropriate note is missing.
According to http://www.w3.org/TR/2002/REC-xhtml1-20020801/#media an XHTML document should be delivered with the MIME type text/html when it meets the HTML compatibility guidelines. That is what a validator should point out, and Validome does.
 
> * http://www.validome.org/out/ena7003
> I'd like to see a reference for this.
 
http://www.w3.org/TR/html401/struct/links.html#h-12.2.3
"...The id attribute, on the other hand, may not contain character references."
 
 
> * http://www.validome.org/out/ena7005 (and 7006)
> This has nothing to do with validation. If validome emulates some of 
> the features of a link checker, compare it to link checkers, not 
> validator. This test is moot.
 
http://www.w3.org/TR/html401/struct/links.html#h-12.2.4
"A reference to an unavailable or unidentifiable resource is an error"
...
"If a user agent cannot locate a linked resource, it should alert the user"
 
Where is the "moot" here? The W3C specification is very clear in this case...
 
 
> * http://www.validome.org/out/ena3002
> This test is bogus. Sorry. An XML declaration also happens to be a 
> proper SGML PI. Giving a warning asking the HTML4 author "are you 
> sure you want this here" may be a good idea. Making this a fatal 
> error is wrong, wrong, wrong.
 
If an XML declaration is allowed in SGML, I'd like to see a reference for this.
*If* this is right, what about the priority of the encoding attribute within the declaration vs. the META element?
 
> * http://www.validome.org/out/ena3006
> The comparison page is incorrect. Output of a warning for a shorttag 
> construct is a good thing (dev version of w3c validator actually does 
> it) but not required. The current W3C Validator's behavior is not wrong.
>
> * http://www.validome.org/out/ena3007
> ditto. Learn about shorttags. Validome is actually wrong here, this 
> should not be reported as an error, at most a warning.
 
Oh, here we have a hundred opinions on the case. Could you please show us an *exact* reference?
 
BTW:
 
At the moment, we are integrating the W3C Validator, together with Validome, into a free out-of-the-box software solution for Windows users.
While doing so, we found some inconsistencies/bugs:
 
1. The W3C SGML parser uses two catalog files: xml.soc and sgml.soc. In xml.soc, 21 entries are missing, all concerning SVG 1.1 Tiny and SVG 1.1 Basic.
2. 6 DTDs that are necessary to get the download package running were missing.
3. Your LibXML implementation was not correct: you simply use the catalog files of your SGML parser instead of following the "official" catalog specification (http://www.xmlsoft.org/catalog.html#Simple).
Because of this, LibXML tries to fetch the external DTDs instead of the local ones.
 
You write within your CVS (http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.574&content-type=text/x-cvsweb-markup):
 
# [NOT] loading the XML catalog for entities resolution as it seems to cause a lot of unnecessary DTD/entities fetching (requires >= 1.53 if enabled)
#$xmlparser->load_catalog( File::Spec->catfile($CFG->{Paths}->{SGML}->{Library}, 'xml.soc') );
 
That is not right, as your implementation is not correct. BUT: you can download the fixes at http://www.validome.org/W3C_fix.rar; with the fixes it works (problems 1+2+3 solved).
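
As an illustration of the difference between the two catalog formats mentioned in point 3, the following Python sketch converts PUBLIC entries from SGML Open catalog syntax (as used in the .soc files) into an OASIS XML catalog. It is not the content of the W3C_fix.rar package. The function name `soc_to_xml_catalog` and the sample entry are assumptions, and only simple `PUBLIC "id" "file"` lines are handled.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the OASIS XML Catalogs format.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Matches lines such as: PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1-strict.dtd"
PUBLIC_ENTRY = re.compile(r'^\s*PUBLIC\s+"([^"]+)"\s+"?([^"\s]+)"?', re.MULTILINE)


def soc_to_xml_catalog(soc_text):
    """Return an XML catalog (as a string) for the PUBLIC entries in soc_text."""
    ET.register_namespace("", CATALOG_NS)
    root = ET.Element("{%s}catalog" % CATALOG_NS)
    for public_id, local_path in PUBLIC_ENTRY.findall(soc_text):
        # Each PUBLIC entry becomes a <public publicId="..." uri="..."/> element.
        ET.SubElement(
            root,
            "{%s}public" % CATALOG_NS,
            {"publicId": public_id, "uri": local_path},
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample entry, for illustration only.
sample = 'PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1-strict.dtd"\n'
print(soc_to_xml_catalog(sample))
```

A catalog written this way maps public identifiers to local files, which is what lets an XML parser avoid fetching the external DTDs.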
 
Best regards,
 
Alex

Re: Notes on validome test suite / validators comparison

by Frank Ellermann


Validome-Staff wrote:

> Validome advises the user to use our XML validator, as an HTML validator
> is not the appropriate tool for checking XML... ;-)

When I try to validate <http://idn.icann.org/IDNwiki> at your site
I get the same incorrect "valid" result as with the W3C validator.

For the W3C validator I know that it can't (yet) check URI syntax,
but it's disappointing that your validator also fails.  Is that an
issue in the "XHTML 1.0 transitional" schema or in your code ?

 Frank




Re: Notes on validome test suite / validators comparison

by Validome-Staff


Hi Frank,

> Is that an
> issue in the "XHTML 1.0 transitional" schema or in your code ?

Neither an issue in our schema nor an issue in our code. In its current
version, our schema validator simply verifies URIs in accordance with the
applicable requirements:
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#anyURI
 "such rules and restrictions are not part of type validity and are not
checked by ·minimally conforming· processors. Thus in practice the above
definition imposes only very modest obligations on ·minimally conforming·
processors. "

As you know, providing a reliable URI check is not as simple as you claim.
At the moment Validome handles URIs with the help of a (simple) schema
validation... It's not exactly brilliant, but, as far as we know, there is
no validator at the moment that handles it much better. It is necessary to
develop another *concept* for URI checking. Someone in our team is
currently working on URI handling concepts for our validator. As "URI" is a
nontrivial issue, it will take some more time to model and code an
acceptable solution for a sustainable URI check. This will probably be in
V3.0 (current version: 2.6.1).

BTW:
Validome supports validation of IDN domains since May 2007:
http://www.validome.org/validate/?uri=http://www.h%c3%a4ndewaschen.de
This would also make sense for the W3C-Validator, as it can not handle IDN
domain validation till now:
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.h%C3%A4ndewaschen.de&charset=%28detect+automatically%29&doctype=Inline&group=0
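
For readers unfamiliar with IDN handling: before such a host name can be looked up, its percent-encoded UTF-8 form has to be turned into an ASCII ("xn--") name. The Python sketch below shows one way to do that with the standard library. It says nothing about how Validome implements it; the function name `idn_host_to_ascii` is hypothetical, and Python's built-in `idna` codec implements IDNA 2003 (RFC 3490) only.

```python
from urllib.parse import unquote


def idn_host_to_ascii(percent_encoded_host):
    """Convert a host such as 'www.h%c3%a4ndewaschen.de' to its ASCII form."""
    # Undo the %c3%a4 style escaping (UTF-8 bytes) to get the Unicode host name.
    unicode_host = unquote(percent_encoded_host, encoding="utf-8")
    # Apply IDNA ToASCII label by label via the built-in codec.
    return unicode_host.encode("idna").decode("ascii")


ascii_host = idn_host_to_ascii("www.h%c3%a4ndewaschen.de")
print(ascii_host)
```

Decoding the result with the same codec gives back the Unicode host name, so the conversion is reversible.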



Re: Notes on validome test suite / validators comparison

by Frank Ellermann


Validome-Staff wrote:
 
> http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#anyURI
> "such rules and restrictions are not part of type validity
> and are not checked by ·minimally conforming· processors.
> Thus in practice the above definition imposes only very
> modest obligations on ·minimally conforming· processors. "

The 2nd edition 2004 still has the same text talking about
RFC 2396 as amended by 2732 instead of RFC 3986 (STD 66) -
okay, just checked it, STD 66 was published in January 2005.

> As you know, providing a reliable URI check is not as simple
> as you claim.

The regexp in STD 66 is a one-liner, and determining the set
of visible ASCII characters allowed in a URI is "possible".
(Actually it's trivial, but it took me almost a year to figure
 it out with the help of Roy and others on the W3C URI list. ;-)

> as we know - there is no validator at the moment, which
> handles it much better.

Indeed, I just asked WDG and schneegans.de what they think,
they also said "valid".

 Frank
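
For reference, the one-liner mentioned in this message is the regular expression from Appendix B of RFC 3986, and the character set follows from that RFC's `unreserved`, `gen-delims` and `sub-delims` rules plus "%". The Python sketch below applies both. The helper names `split_uri` and `illegal_characters` are made up for this example. Note that the Appendix B expression only splits a reference into components, and the character test is a coarse check that ignores which component a character appears in.

```python
import re

# Regular expression from RFC 3986 (STD 66), Appendix B.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Characters RFC 3986 allows to appear literally somewhere in a URI.
UNRESERVED = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
ALLOWED = set(UNRESERVED + GEN_DELIMS + SUB_DELIMS + "%")

# A "%" that is not followed by two hexadecimal digits.
BAD_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def split_uri(reference):
    """Split a URI reference into its five generic components."""
    m = URI_REFERENCE.match(reference)  # always matches: every part is optional
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def illegal_characters(reference):
    """Return the characters that may not appear literally in a URI."""
    bad = [ch for ch in reference if ch not in ALLOWED]
    if BAD_PERCENT.search(reference):
        bad.append("%")
    return bad
```

An empty list from `illegal_characters` does not prove that a URI is valid; it only rules out characters that are never allowed, such as spaces or non-ASCII letters.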



Re: Notes on validome test suite / validators comparison

by Validome-Staff


Hi Frank,

> The regexp in STD 66 is a one-liner, and determining the set
> of visible ASCII characters allowed in an URI is "possible".

My post was not only about the ASCII character issue... There are more "URI"
problems that a schema validator doesn't catch at the moment. Even MSXML
and Altova DON'T detect them. An "RFC Conformity Checker" for URIs is much
more than this single ASCII issue. We already have about 100 test cases for
issues that schema validators cannot check. So, NONtrivial... ;-)

Best regards,

Alex
---------------------------
http://www.validome.org/



Critical bug 4916 (was: Notes on validome test suite / validators comparison)

by Frank Ellermann


Alex wrote:
 
>> The regexp in STD 66 is a one-liner, and determining the set
>> of visible ASCII characters allowed in an URI is "possible".
 
> My post was not only about the ASCII character issue... There
> are more "URI" problems that a schema validator doesn't catch
> at the moment.

Well, it's about time to fix this.  After the installation of a
"popular browser" on a "popular OS" virtually all applications
allowing to click on URIs could indirectly start malware.  It's
hard to decide whose fault that is, but saying that it's only
the fault of the user is no option.

All, please "vote" for bug 4916 and support its reclassification
as "critical" with "priority 1" for an immediate fix.  We all had
almost three years to think about RFC 3986 and 3987.  It's a good
thing that the IDN test finally forces some action.
 
> A "RFC Conformity Checker" for URIs is much more than this single
> ASCII issue.

The generic RFC 3986 syntax is no rocket science, just ignore all
idiosyncrasies of legacy definitions as in RFC 2368, admittedly
mailto: is a hard case.  The syntax in the expired mailto-bis draft
is better.

For a validator you're not forced to guess what invalid syntax is
supposed to mean, simply flag it as invalid and be done with it.

> NONtrivial...;-)

Maybe we can agree on an "interesting clerical task".  The xmpp
folks (i.e. Peter) had to fix their syntax for 3986-compatibility,
they (i.e. he) managed.

 Frank



Re: Critical bug 4916 (was: Notes on validome test suite / validators comparison)

by Validome-Staff

Hi Frank,
 
>> The regexp in STD 66 is a one-liner, and determining the set
>> of visible ASCII characters allowed in an URI is "possible".
 
and
 
> all applications
> allowing to click on URIs could indirectly start malware.
 
Fixing the schema is one thing... What about the same issue within HTML documents?
No XHTML --> no schema validation... As you see, what we need is much more than a schema fix. It is a new concept and new code for a reliable URI check in XHTML *and* HTML. A URI checker that covers ALL kinds of failures...
Frank, to be honest, you could create hundreds of URI test cases that even schema validators don't detect. Just try it and you'll see: it is about new, reliable solutions and not about "patchy fixes"... ;-)
 
Perhaps it's time to retire the term "validator" as it is and seriously discuss "conformity checkers", since a validator, as currently defined, cannot keep pace with the new requirements and fast application development of these days.
 
Best regards,
 
Alex

Re: Notes on validome test suite / validators comparison

by olivier Thereaux


Hi Alex,

Thanks a lot for going through the list, and giving more references.  
This is very useful.

On Oct 20, 2007, at 00:05 , Validome-Staff wrote:
> Here Validome advises the user to use our XML validator, as an HTML
> validator is not the appropriate tool for checking XML... ;-)

Understood, but as I wrote, I think it's not very good usability to  
call this a fatal error, when you could transparently redirect to  
your XML checker.


> Here we corrected our claims, sorry for not keeping the comparison  
> up to date.

Appreciated.


>  > * http://www.validome.org/out/ena4011
> > HTML 4.01 document with no system Id.
> > Validome sends a warning... Not necessary per the spec.
> > W3C Markup validator passes validation.
> > Why is W3C validator marked as faulty here? References please?
>
> http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.2
> The other way around: where is it specified that the system ID can be omitted?

SGML, which HTML 4.01 is an application of.
Only in XML is the system identifier required, per:
http://www.w3.org/TR/xml/#NT-ExternalID


> > * http://www.validome.org/out/ena4023
> > Validome says valid. OpenSP and W3C Markup validator says not valid.
> > I'd tend to trust opensp here. The comparison page's claim that
> > validome is the only validator doing the right thing is very dubious.
>
> > * http://www.validome.org/out/ena4024
> > Ditto above. The comparison page's claim that validome is the only
> > validator doing the right thing is very dubious.
>
> What is dubious here? It's about SGML (not HTML) documents.

And?


> The "old" W3C Validator fell back to US-ASCII, the "new" one falls
> back to UTF-8. Can you explain this, please?
> We have asked W3C Germany and Bjoern Hoehrmann many times about
> the *correct* behaviour of a validator when a fallback is needed,
> but we never got an *exact* answer. On this point the specs are
> very inexact and ambiguous. Please give us a *mandatory* answer,
> with a link to the appropriate specifications, for this case.
> The only clear case so far is XHTML, where validators should
> fall back to UTF-8 (depending on the MIME type); HTML is
> still ambiguous...

There is no authoritative answer as far as I can tell, which supports  
my question: why do you consider your sending a fatal error the right  
thing to do, and other validators trying a fallback wrong? If there  
is no rule, you are not supposed to make arbitrary ones and claim you  
are the only ones to respect them.


> > * http://www.validome.org/out/ena7003
> > I'd like to see a reference for this.
>
> http://www.w3.org/TR/html401/struct/links.html#h-12.2.3
> "...The id attribute, on the other hand, may not contain character  
> references."

Interesting discrepancy between prose and DTD here, thanks for the  
pointer.


> > * http://www.validome.org/out/ena7005 (and 7006)
> > This has nothing to do with validation. If validome emulates some of
> > the features of a link checker, compare it to link checkers, not
> > validator. This test is moot.
>
> http://www.w3.org/TR/html401/struct/links.html#h-12.2.4
> "A reference to an unavailable or unidentifiable resource is an error"
> ...
> "If a user agent cannot locate a linked resource, it should alert  
> the user"
>
> Where is the "moot" here? The W3C specification is very clear in
> this case...

This is the usual confusion between user agent conformance (which the  
sections you quote are about) and document conformance (which  
validome and the markup validator are checking).


>   * http://www.validome.org/out/ena3002
> > This test is bogus. Sorry. An XML declaration also happens to be a
> > proper SGML PI. Giving a warning asking the HTML4 author "are you
> > sure you want this here" may be a good idea. Making this a fatal
> > error is wrong, wrong, wrong.
>
> If a XML-declaration is allowed in SGML, I'd like to see a  
> reference for this.

What I have is the SGML spec, chapter 8. Processing instructions.


[on shorttags]
> Oh, here we have a hundred opinions on the case. Could you please
> show us an *exact* reference?

The best I have is the informative:
http://www.w3.org/TR/html401/appendix/notes.html#h-B.3.7
and the normative DTD, which allows the shorttags. As such, the spec  
clearly allows the construct, while informatively warning against it.


I'll reply to your notes on distributing the markup validator in a  
separate mail.

Thank you,
--
olivier




Re: validator catalogs

by olivier Thereaux


Hi Alex,

On Oct 20, 2007, at 00:05 , Validome-Staff wrote:
> 1. The W3C SGML parser uses two catalog files: xml.soc and
> sgml.soc. In xml.soc, 21 entries are missing, all concerning
> SVG 1.1 Tiny and SVG 1.1 Basic.

The issues with SVG 1.1 Tiny and Basic are actually a bit more  
complicated.
See this mail:
http://lists.w3.org/Archives/Public/www-svg/2007Oct/0005.html
I think the workaround we found last month is better, see:
http://lists.w3.org/Archives/Public/www-validator-cvs/2007Oct/0018.html

I note that you also added a number of modules and files for XHTML  
print and basic, good idea.


> 2. We missed 6 DTDs, necessary to get the download package running.

Added, thanks.

> 3. Your LibXML implementation was not correct: you simply use the
> catalog files of your SGML parser instead of following the
> "official" catalog specification
> (http://www.xmlsoft.org/catalog.html#Simple).
> Because of this, LibXML tries to get the external DTDs instead of  
> the local ones.

Indeed, it was incorrect, but in the end we decided to not fix it,  
because loading of the catalogue is only supported after a certain  
version of XML::LibXML - hence we just didn't load anything and muted  
the entities errors. It may still be a good idea to fix it, although  
I'm not sure what version of XML::LibXML is supported by most systems.

Thank you,
--
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status





Re: Notes on validome test suite / validators comparison

by olivier Thereaux


Frank,

On Oct 20, 2007, at 23:22 , Frank Ellermann wrote:
> When I try to validate <http://idn.icann.org/IDNwiki> at your site
> I get the same incorrect "valid" result as with the W3C validator.
>
> For the W3C validator I know that it can't (yet) check URI syntax,
> but it's disappointing that your validator also fails.  Is that an
> issue in the "XHTML 1.0 transitional" schema or in your code ?

I'm curious as to why you so adamantly want to ban non-ascii IRIs  
from HTML?

More on this later, but from what I am gathering from the experts,  
given the spirit of the specs (written before IDNs and IRIs) and the  
level of support for IDNs, barking at IRIs in href and src would be  
counterproductive for the internationalization of the web.

--
olivier


Re: validator catalogs

by olivier Thereaux


Alex, all

On Oct 24, 2007, at 15:04 , olivier Thereaux wrote:
> I note that you also added a number of modules and files for XHTML  
> print and basic, good idea.

FWIW, the file included in the RAR had a number of typos, that would  
break any validator using them.

I committed to CVS a proper version, I suggest you use this if you  
are to package the w3c markup validator.

Thanks again.
--
olivier




Re: Notes on validome test suite / validators comparison

by Frank Ellermann


olivier Thereaux wrote:
 
> I'm curious as to why you so adamantly want to ban non-ascii IRIs  
> from HTML?

Please tell me that you're joking.  Native IRIs are nice where they
are permitted.  But on ICANN's Wiki using XHTML 1.0 they will cause
havoc:

Sooner or later mediawiki will be fixed to generate valid XHTML 1.0,
translating native IRIs to equivalent URIs on the fly.  After all
that's REQUIRED for backwards compatibility in the numerous Wikis
based on mediawiki.  Users want that something happens when they
click on a link, without upgrading their browser.  And native IRIs
are designed to have an equivalent URI-form.
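
The "equivalent URI-form" can be sketched along the lines of the mapping in RFC 3987, section 3.1: the host name goes through IDNA, and non-ASCII characters elsewhere are percent-encoded as UTF-8. The Python below is a simplified illustration, not mediawiki's code and not a complete implementation. The function name `iri_to_uri` is hypothetical, userinfo and IPv6 literals are deliberately ignored, and Python's `idna` codec covers IDNA 2003 only.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters RFC 3986 allows literally (besides letters and digits);
# "%" is kept so that existing percent-escapes are not escaped twice.
SAFE = "-._~:/?#[]@!$&'()*+,;=%"


def iri_to_uri(iri):
    """Map an IRI to an ASCII-only URI (simplified: no userinfo, no IPv6)."""
    parts = urlsplit(iri)
    host = parts.hostname or ""
    # Host name: convert each label with IDNA ToASCII.
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += ":%d" % parts.port
    # Path, query and fragment: percent-encode the UTF-8 bytes of
    # every character that is not allowed literally.
    path = quote(parts.path, safe=SAFE)
    query = quote(parts.query, safe=SAFE)
    fragment = quote(parts.fragment, safe=SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))
```

A page generator could run every `href` and `src` value through such a function, so that the markup contains only URIs while authors keep writing IRIs.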

Sooner or later validators will be fixed to validate URIs, what with
all those "URI exploits" we've seen in the last weeks for XP after
the installation of IE7. And when validators do their job all users
who naively followed ICANN and W3C into the realms of "who cares
about validity if it works" will be seriously annoyed.

I can still tell you the day when the W3C validator started to flag
€ as invalid on a windows-1252 page. I was working on this
page, it was stunning.

> from what I am gathering from the experts, given the spirit of the
> specs (written before IDNs and IRIs) and the level of support for
> IDNs, barking at IRIs in href and src would be counterproductive
> for the internationalization of the web.

I'm curious which expert advocates violating specifications.  Want
to know how long it took me to create an XHTML ersatz-DTD permitting
IRIs everywhere ?

30 minutes.  Check out
http://hmdmhdfmhdjmzdtjmzdtzktdkztdjz.googlepages.com/IDN-IRI-test.html

 Frank



Re: Notes on validome test suite / validators comparison

by Validome-Staff


Hi Frank,

> Sooner or later validators will be fixed to validate URIs

I asked our guy who is working on it; he told me it will be in beta at the
beginning of December 2007. Could you take a look at it before release?
Severe criticism is welcome.

Best regards,

Alex
------------------------
http://www.validome.org/



Re: IRIs in href (Was: Notes on validome test suite / validators comparison)

by olivier Thereaux



On Oct 25, 2007, at 03:39 , Frank Ellermann wrote:
> Users want that something happens when they
> click on a link, without upgrading their browser.  And native IRIs
> are designed to have an equivalent URI-form.

Lack of support for IRIs in legacy user agents is an issue, understood.
Now, if today the HTML 4.01 and XHTML 1.0 specs and above were  
updated to say "IRIs" instead of "URIs", what would you do?

As I wrote before, these specs were written before IRIs were a reality.
The HTML4 spec contains advice on how to treat "URIs containing non-
ASCII characters".
See   http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1
Although it clearly calls these illegal, it prepares the ground for  
IRIs (for which we didn't yet have that name at that time).

Saying that IRIs should not be used because they break in legacy  
software, is an argument I have sympathy for, but have trouble  
accepting. This reminds me of the situation whereby, in Japan, one  
still can't safely use unicode in mails, because so many MUAs or  
webmails just don't support it.

> Sooner or later validators will be fixed to validate URIs, what with
> all those "URI exploits" we've seen in the last weeks for XP after
> the installation of IE7.

This is irrelevant to the discussion about IRIs. Please don't use  
internationalization as a scapegoat for bad coding.

> I can still tell you the day when the W3C validator started to flag
> € as invalid on a windows-1252 page. I was working on this
> page, it was stunning.

There once was a bug, and IIRC it was fixed in a few hours. Now, how  
is that relevant to the discussion at hand?

> I'm curious which expert advocates violating specifications.  Want
> to know how long it took me to create an XHTML ersatz-DTD permitting
> IRIs everywhere ? 30 minutes.

Here you must be joking, bluffing, or mistaken, Frank. The current  
XHTML DTD says that URIs are CDATA, and thus any SGML or XML  
validator has to accept all the characters allowed in the document,  
which includes all those usable in IRIs.

--
olivier



Re: IRIs in href (Was: Notes on validome test suite / validators comparison)

by rubys


olivier Thereaux wrote:

>
>
> On Oct 25, 2007, at 03:39 , Frank Ellermann wrote:
>> Users want that something happens when they
>> click on a link, without upgrading their browser.  And native IRIs
>> are designed to have an equivalent URI-form.
>
> Lack of support for IRIs in legacy user agents is an issue, understood.
> Now, if today the HTML 4.01 and XHTML 1.0 specs and above were updated
> to say "IRIs" instead of "URIs", what would you do?
>
> As I wrote before, these specs were written before IRIs were a reality.
> The HTML4 spec contains advice on how to treat "URIs containing
> non-ASCII characters".
> See   http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1
> Although it clearly calls these illegal, it prepares the ground for IRIs
> (for which we didn't yet have that name at that time).
>
> Saying that IRIs should not be used because they break in legacy
> software, is an argument I have sympathy for, but have trouble
> accepting. This reminds me of the situation whereby, in Japan, one still
> can't safely use unicode in mails, because so many MUAs or webmails just
> don't support it.

What about saying that IRIs should not be used because RFC 3987 section
1.2 item (a) says that this standard is not intended to apply to any
protocol or format element unless those formats or protocols explicitly
say that IRIs are supported?

- Sam Ruby


Re: IRIs in href

by Frank Ellermann


olivier Thereaux wrote:

> Now, if today the HTML 4.01 and XHTML 1.0 specs and above were  
> updated to say "IRIs" instead of "URIs", what would you do?

Maybe ditch the W3C and post the reasons in an Internet Draft.
I'd certainly consider it as unethical.

RFC 3987 does not "update" 3986.  The specs should be updated
with s/2396/3986/g, s/3066/4646/g, and similar clerical tasks,
e.g. explaining why xml:lang is forced to be still an NMTOKEN
wrt these document types.

But for incompatible modifications we need new document types.

Not worldwide "upgrade your browser" campaigns, some users
can't, and besides it's completely unnecessary, all IRIs by
definition have an equivalent URI working with "any browser".

> Saying that IRIs should not be used because they break in
> legacy software, is an argument I have sympathy for, but
> have trouble accepting.