NEW ISSUE: repeating non-list-type-headers


NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Hi,

(follow-up to a discussion over at the HTML mailing list, see
<http://lists.w3.org/Archives/Public/public-html/2007Nov/0271.html>).

We currently say in Section 4.2:

    Multiple message-header fields with the same field-name MAY be
    present in a message if and only if the entire field-value for that
    header field is defined as a comma-separated list [i.e., #(values)].

-- <http://tools.ietf.org/html/rfc2616#section-4.2>

Now this seems to be kind of backwards, wouldn't it be *much* clearer if
it said:

    Multiple message-header fields with the same field-name MUST NOT be
    present in a message unless the entire field-value for that
    header field is defined as a comma-separated list [i.e., #(values)].

That being said, do we have a recommendation for recipients when that
requirement is violated? I would assume that servers SHOULD return a 400
(Bad Request), but what about clients?

Best regards, Julian


Re: NEW ISSUE: repeating non-list-type-headers

by Bjoern Hoehrmann


* Julian Reschke wrote:

>We currently say in Section 4.2:
>
>    Multiple message-header fields with the same field-name MAY be
>    present in a message if and only if the entire field-value for that
>    header field is defined as a comma-separated list [i.e., #(values)].
>
>-- <http://tools.ietf.org/html/rfc2616#section-4.2>
>
>Now this seems to be kind of backwards, wouldn't it be *much* clearer if
>it said:
>
>    Multiple message-header fields with the same field-name MUST NOT be
>    present in a message unless the entire field-value for that
>    header field is defined as a comma-separated list [i.e., #(values)].

No, unlike the old text, that does not say when you may use them.

>That being said, do we have a recommendation for recipients when that
>requirement is violated? I would assume that servers SHOULD return a 400
>(Bad Request), but what about clients?

You fold them into a single value as the specification suggests unless
there is some reason not to do that. Servers should not be required to
respond with Bad Request, they might not know the header, and they
should not treat

  X: a
  X: b

differently from X:a,b, so if they don't give you a Bad Request for the
latter, they should not do it for the former. I think the current text
is fine.
--
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
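
The equivalence described above (two X fields versus one X field with a comma-separated value) can be sketched in Python. This is only an illustration, not code from any real HTTP library: the function name fold_fields and the (name, value) pair representation are assumptions made for the example.

```python
def fold_fields(fields):
    """Fold repeated header fields into a single value per field-name.

    `fields` is a list of (name, value) pairs in the order received.
    Field-names are compared case-insensitively, and repeated values are
    joined with "," in their original order.
    """
    folded = {}
    for name, value in fields:
        key = name.lower()
        value = value.strip()
        if key in folded:
            folded[key] = folded[key] + "," + value
        else:
            folded[key] = value
    return folded


# The two forms from the message above end up identical after folding.
repeated = fold_fields([("X", "a"), ("X", "b")])
combined = fold_fields([("X", "a,b")])
print(repeated == combined)
```

A recipient that folds first and interprets afterwards cannot tell the two forms apart, which is the reason given above for not answering one with 400 and the other without.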


Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Bjoern Hoehrmann wrote:

> * Julian Reschke wrote:
>> We currently say in Section 4.2:
>>
>>    Multiple message-header fields with the same field-name MAY be
>>    present in a message if and only if the entire field-value for that
>>    header field is defined as a comma-separated list [i.e., #(values)].
>>
>> -- <http://tools.ietf.org/html/rfc2616#section-4.2>
>>
>> Now this seems to be kind of backwards, wouldn't it be *much* clearer if
>> it said:
>>
>>    Multiple message-header fields with the same field-name MUST NOT be
>>    present in a message unless the entire field-value for that
>>    header field is defined as a comma-separated list [i.e., #(values)].
>
> No, unlike the old text, that does not say when you may use them.

Ahem? "...unless the entire field-value..."?

>> That being said, do we have a recommendation for recipients when that
>> requirement is violated? I would assume that servers SHOULD return a 400
>> (Bad Request), but what about clients?
>
> You fold them into a single value as the specification suggests unless
> there is some reason not to do that. Servers should not be required to

Well, for Content-Type the specification says you can't do that.

> respond with Bad Request, they might not know the header, and they
> should not treat
>
>   X: a
>   X: b
>
> differently from X:a,b, so if they don't give you a Bad Request for the
> latter, they should not do it for the former. I think the current text
> is fine.

Of course both forms should be treated the same. The question I was
asking: what is a recipient -- in particular a client -- supposed to do
with a message where header values are known to be invalid?

Best regards, Julian



Re: NEW ISSUE: repeating non-list-type-headers

by Bjoern Hoehrmann


* Julian Reschke wrote:
>>>    Multiple message-header fields with the same field-name MUST NOT be
>>>    present in a message unless the entire field-value for that
>>>    header field is defined as a comma-separated list [i.e., #(values)].
>>
>> No, unlike the old text, that does not say when you may use them.
>
>Ahem? "...unless the entire field-value..."?

You are turning "Messages may X iff Y" into "Messages must not X unless
Y"; if Y is true, with the old version you know "Messages may X", with
your version you just know you are not violating "Messages must not X".
It might well be that Messages SHOULD NOT include duplicates even then.

>Of course both forms should be treated the same. The question I was
>asking: what is a recipient -- in particular a client -- supposed to do
>with a message where header values are known to be invalid?

Where the specification does not say that, the client is supposed to do
something that's appropriate for the particular client and the
circumstances it is acting in. Some clients may close the connection on sight
of the bad header, others might ignore it, others might process only
some part of it, and yet others might not notice the error at all. It's
unlikely there is a one size fits all recommendation we could make.
--
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 


Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Bjoern Hoehrmann wrote:

> * Julian Reschke wrote:
>>>>    Multiple message-header fields with the same field-name MUST NOT be
>>>>    present in a message unless the entire field-value for that
>>>>    header field is defined as a comma-separated list [i.e., #(values)].
>>> No, unlike the old text, that does not say when you may use them.
>> Ahem? "...unless the entire field-value..."?
>
> You are turning "Messages may X iff Y" into "Messages must not X unless
> Y"; if Y is true, with the old version you know "Messages may X", with
> your version you just know you are not violating "Messages must not X".
> It might well be that Messages SHOULD NOT include duplicates even then.

Sorry?

Unless there's another place in the spec making statements about
repeating headers (is there?), both

   MAY do X, iff Y

and

   MUST NOT do X, unless Y

are equivalent.

>> Of course both forms should be treated the same. The question I was
>> asking: what is a recipient -- in particular a client -- supposed to do
>> with a message where header values are known to be invalid?
>
> Where the specification does not say that, the client is supposed to do
> something that's appropriate for the particular client and the
> circumstances it is acting in. Some clients may close the connection on sight
> of the bad header, others might ignore it, others might process only
> some part of it, and yet others might not notice the error at all. It's
> unlikely there is a one size fits all recommendation we could make.

I'm concerned about clients that ignore the problem and randomly pick
one of the values.

Best regards, Julian
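
The disagreement can be made concrete with a small truth table. The sketch below is a hypothetical model, not anything taken from the specification: it treats "MAY do X iff Y" as stating outright when X is permitted, and "MUST NOT do X unless Y" as stating only when X is forbidden. The flag silence_means_permission is an assumed name for the extra step that one side of this exchange takes (no other text restricts X, so "not forbidden" means "permitted") and the other side declines to take.

```python
def may_iff(y):
    """'MAY do X iff Y': X is permitted exactly when Y holds."""
    return y


def must_not_unless(y, silence_means_permission=True):
    """'MUST NOT do X unless Y': X is forbidden when Y does not hold.

    When Y holds, the sentence itself only says X is not forbidden.
    Returns True/False for permitted/forbidden, or None when the
    sentence alone does not settle the question.
    """
    if not y:
        return False
    return True if silence_means_permission else None


# Columns: Y, first wording, second wording (silence = permission),
# second wording (silence left open).
for y in (False, True):
    print(y, may_iff(y), must_not_unless(y), must_not_unless(y, False))
```

Under the default flag the two wordings agree for both values of Y; with the flag off they differ only in the case where Y holds, which is exactly the case being argued about.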


Re: NEW ISSUE: repeating non-list-type-headers

by Frank Ellermann


Bjoern Hoehrmann wrote:

> You are turning "Messages may X iff Y" into "Messages must not X unless Y"

You've lost me here, isn't that logically equivalent ?  If it is equivalent
Julian's version is clearer.  If it is not I likely miss the point.

 Frank



Re: NEW ISSUE: repeating non-list-type-headers

by Bjoern Hoehrmann


* Julian Reschke wrote:

>Unless there's another place in the spec making statements about
>repeating headers (is there?), both
>
>   MAY do X, iff Y
>
>and
>
>   MUST NOT do X, unless Y
>
>are equivalent.

Well you see, with your text you would have to search the rest of the
specification to be sure, while that is not necessary with the old text.
Your goal was to make the text clearer and less backwards, I think you
did the opposite.
--
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 


Re: NEW ISSUE: repeating non-list-type-headers

by Adrien de Croy




Julian Reschke wrote:

>
> Unless there's another place in the spec making statements about
> repeating headers (is there?), both
>
>   MAY do X, iff Y
>
> and
>
>   MUST NOT do X, unless Y
>
> are equivalent.
>
Neither of these constructs is that great, having read this one being
thrashed out in other conversations.  I think we need to look into what
the goal is.  Clarity, surely?

There are complications, as previously discussed, around use of the word
"MAY" in the strict (RFC-defined) sense vs. the everyday sense.

Personally I feel the first form is more correct, since to derive the
same meaning from the second form, you have to implicitly convert a
conditional non-denial into an optional conditional permission.  They
aren't 100% the same thing. Lack of denial is not necessarily a grant of
permission.  There may be other factors.

If you want real clarity, we may need to say something more convoluted,
i.e. repetitive, such as

MAY do X but only if Y. If not Y then MUST NOT do X

It's redundant etc, but it drums in that doing X is _optional_, but only
on condition that Y is met, otherwise X is not permitted.

Use of the word "only" (which is in the spec in a few places) can make
things a bit fuzzy.  Adding "but" can help there in terms of general
legibility.


--
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com



Re: NEW ISSUE: repeating non-list-type-headers

by Bjoern Hoehrmann


* Frank Ellermann wrote:
>Bjoern Hoehrmann wrote:
>
>> You are turning "Messages may X iff Y" into "Messages must not X unless Y"
>
>You've lost me here, isn't that logically equivalent ?  If it is equivalent
>Julian's version is clearer.  If it is not I likely miss the point.

Julian's version allows using the same name more than once through a
double negative under the assumption that MAY is the antonym of MUST
NOT. I would not make this assumption, and I think double negatives
are generally more difficult to understand than positive statements.

If you want clearer text you would have to split the statements up
and say: list-valued headers may occur more than once, other headers
must not occur more than once.
--
Björn Höhrmann · mailto:bjoern@... · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
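
The split rule suggested above (list-valued headers may occur more than once, other headers must not) could be checked on a message roughly as follows. The set LIST_VALUED is a small, assumed sample chosen for illustration; a real implementation would need the full list from the relevant specifications, and the function name is hypothetical.

```python
from collections import Counter

# Assumed sample of headers whose field-value is a comma-separated list.
LIST_VALUED = {"accept", "cache-control", "via"}


def repeated_non_list_headers(fields):
    """Return the sorted names of non-list headers that occur more than once.

    `fields` is a list of (name, value) pairs; names compare case-insensitively.
    """
    counts = Counter(name.lower() for name, _ in fields)
    return sorted(
        name for name, n in counts.items()
        if n > 1 and name not in LIST_VALUED
    )


message = [
    ("Accept", "text/html"),
    ("Accept", "text/plain"),
    ("Content-Type", "text/html"),
    ("Content-Type", "text/plain"),
]
print(repeated_non_list_headers(message))
```

Note that this flags every repeated header that is not in the set, including headers the implementation has never heard of, so it only makes sense where the set is known to be complete.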


Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Adrien de Croy wrote:
> ...

Agreed. None of the header types (list and others) is an exception, so
it probably makes sense to spell out both explicitly.

Best regards, Julian



Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Bjoern Hoehrmann wrote:

> * Frank Ellermann wrote:
>> Bjoern Hoehrmann wrote:
>>
>>> You are turning "Messages may X iff Y" into "Messages must not X unless Y"
>> You've lost me here, isn't that logically equivalent ?  If it is equivalent
>> Julian's version is clearer.  If it is not I likely miss the point.
>
> Julian's version allows using the same name more than once through a
> double negative under the assumption that MAY is the antonym of MUST
> NOT. I would not make this assumption, and I think double negatives
> are generally more difficult to understand than positive statements.

No, "MUST NOT" is not an antonym for "MAY".

However

        MAY do X if and only if Y

means the same as

        MUST NOT do X unless Y

I think the latter was clearer as it communicates that the default is
that repeating header names are not allowed (that is, headers using list
syntax being the exception).

> If you want clearer text you would have to split the statements up
> and say: list-valued headers may occur more than once, other headers
> must not occur more than once.

Agreed.

BR, Julian



Re: NEW ISSUE: repeating non-list-type-headers

by Frank Ellermann


Bjoern Hoehrmann wrote:

> If you want clearer text you would have to split the statements up
> and say: list-valued headers may occur more than once, other headers
> must not occur more than once.

Okay, something in this direction will do.  Maybe twist that "may"
into a "should not" if it's known to cause trouble.  

 Frank



Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Frank Ellermann wrote:
> Bjoern Hoehrmann wrote:
>
>> If you want clearer text you would have to split the statements up
>> and say: list-valued headers may occur more than once, other headers
>> must not occur more than once.
>
> Okay, something in this direction will do.  Maybe twist that "may"
> into a "should not" if it's known to cause trouble.  

No, the "may" is perfectly fine in case the header uses the list syntax...

BR, Julian


Re: NEW ISSUE: repeating non-list-type-headers

by Frank Ellermann


Julian Reschke wrote:
 
> the "may" is perfectly fine in case the header uses the list syntax...

A bit unusual, but if it works as expected, good.  I vividly recall
an excursion into the weeds with multiple a=b ; a=c name=value pairs
instead of a="b c" elsewhere, once bitten twice shy... ;-)





Re: NEW ISSUE: repeating non-list-type-headers

by Jamie Lokier


Julian Reschke wrote:
> Now this seems to be kind of backwards, wouldn't it be *much* clearer if
> it said:
>
>    Multiple message-header fields with the same field-name MUST NOT be
>    present in a message unless the entire field-value for that
>    header field is defined as a comma-separated list [i.e., #(values)].

It would be clearer, but it would clash with reality.  All web servers
and web clients use Set-Cookie, which is prohibited by that.

I think it's important to acknowledge that Set-Cookie is still around,
and all public web servers and clients must deal with it in practice
(if they support cookies).

> That being said, do we have a recommendation for recipients when that
> requirement is violated? I would assume that servers SHOULD return a 400
> (Bad Request), but what about clients?

An HTTP agent's implementation _ought_ to be able to parse the headers
into a name->value dictionary, concatenating any multiple values for
the same field-name with ", " between them, with the practical
exception of Set-Cookie, for which a list must be kept separately.

Some servers and clients are implemented like that, and they are fine.

The module responsible for parsing headers generally doesn't have a
list of the syntaxes of each header type, and such a list would be
difficult to obtain because of application-specific headers which may
be different for different resources on the same server.

Hence the open-endedness of the text you focused on in RFC 2616, I guess.

-- Jamie
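
The name->value dictionary described above, with Set-Cookie kept apart, might look like the following in Python. This is a minimal sketch under stated assumptions: parse_headers is a hypothetical name, the input is assumed to be well-formed "Name: value" lines, and multi-line (continued) header values are not handled.

```python
def parse_headers(lines):
    """Parse 'Name: value' lines into (headers, cookies).

    Repeated fields are concatenated with ", " in the order received.
    Set-Cookie is the exception: its values are kept as a separate list.
    """
    headers = {}
    cookies = []
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()
        value = value.strip()
        if key == "set-cookie":
            cookies.append(value)
        elif key in headers:
            headers[key] += ", " + value
        else:
            headers[key] = value
    return headers, cookies


headers, cookies = parse_headers([
    "Vary: Accept-Encoding",
    "Set-Cookie: a=1; Path=/",
    "vary: User-Agent",
    "Set-Cookie: b=2; Path=/",
])
print(headers)
print(cookies)
```

The parser needs no table of per-header syntaxes, which matches the point made above: only the code that later interprets a given header knows whether the merged value is valid.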


Re: NEW ISSUE: repeating non-list-type-headers

by Julian Reschke


Jamie Lokier wrote:

> Julian Reschke wrote:
>> Now this seems to be kind of backwards, wouldn't it be *much* clearer if
>> it said:
>>
>>    Multiple message-header fields with the same field-name MUST NOT be
>>    present in a message unless the entire field-value for that
>>    header field is defined as a comma-separated list [i.e., #(values)].
>
> It would be clearer, but it would clash with reality.  All web servers
> and web clients use Set-Cookie, which is prohibited by that.

Well, it clashes with reality the same way the old text did :-)

That being said, we should treat that as separate issue. I believe Roy
mentioned that problem before.

> I think it's important to acknowledge that Set-Cookie is still around,
> and all public web servers and clients must deal with it in practice
> (if they support cookies).

Agreed.

>> That being said, do we have a recommendation for recipients when that
>> requirement is violated? I would assume that servers SHOULD return a 400
>> (Bad Request), but what about clients?
>
> An HTTP agent's implementation _ought_ to be able to parse the headers
> into a name->value dictionary, concatenating any multiple values for
> the same field-name with ", " between them, with the practical
> exception of Set-Cookie, for which a list must be kept separately.
>
> Some servers and clients are implemented like that, and they are fine.
>
> The module responsible for parsing headers generally doesn't have a
> list of the syntaxes of each header type, and such a list would be
> difficult to obtain because of application-specific headers which may
> be different for different resources on the same server.
>
> Hence the open-endedness of the text you focused on in RFC 2616, I guess.

That's all true, but it doesn't answer the question of what a recipient
should do with something like:

    Content-Type: text/html; charset=ISO-8859-1
    Content-Type: text/plain

(see <http://lists.w3.org/Archives/Public/public-html/2007Nov/0271.html>).

...or even worse, with conflicting Content-Length headers....

Best regards, Julian
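
Whatever a recipient decides to do with such a message, it first has to notice the conflict. A minimal sketch of that detection step follows. The set SINGLE_VALUED and the function name are assumptions made for illustration, and the sketch deliberately stops at detection, since what to do next is the open question here.

```python
# Assumed sample of headers that take a single value, for illustration only.
SINGLE_VALUED = {"content-type", "content-length"}


def conflicting_headers(fields):
    """Map each single-valued header to its values when they disagree.

    `fields` is a list of (name, value) pairs in the order received.
    Repeats that carry the same value are not reported here.
    """
    seen = {}
    for name, value in fields:
        key = name.lower()
        if key in SINGLE_VALUED:
            seen.setdefault(key, []).append(value.strip())
    return {k: v for k, v in seen.items() if len(set(v)) > 1}


response = [
    ("Content-Type", "text/html; charset=ISO-8859-1"),
    ("Content-Type", "text/plain"),
    ("Content-Length", "120"),
    ("Content-Length", "120"),
]
print(conflicting_headers(response))
```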


Re: NEW ISSUE: repeating non-list-type-headers

by David Morris-4




On Tue, 20 Nov 2007, Julian Reschke wrote:

>
> That's all true, but it doesn't answer the question of what a recipient
> should do with something like:
>
>     Content-Type: text/html; charset=ISO-8859-1
>     Content-Type: text/plain
>
> (see <http://lists.w3.org/Archives/Public/public-html/2007Nov/0271.html>).
>
> ...or even worse, with conflicting Content-Length headers....

In the end, quite simple ... if the recipient doesn't understand the
message, it should report an error and reject the message. As I recall,
in the original 1.1 time frame, there was a discussion of creating a
mechanism whereby user agents could report errors to origin servers and
this was rejected, in part at least, because of concerns re. DOS attacks.

There really isn't that much point in folding headers and in fact this
possibility makes parsing more difficult. What a revised spec should do is
focus on interoperability and describe requirements which ensure
interoperability...

a. The order of the values of repeated headers must be preserved
b. The order of repeated headers known to have list values MAY be
   folded OR unfolded at the convenience of the processing entity.
c. The Content-Type example is no more confusing than receiving a
   flash movie (.flv type) coded as text/plain. Processing entities
   must write code which will survive the wild west. The most we should
   do is list alternatives and recommend that any individual
   implementation be consistent. Take the first, take the last, use as
   hints in analysis of the actual entity... Report as an error.

Where it doesn't matter, the specification should not impose restrictions
since there is no power of enforcement.

Dave Morris
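
Three of the alternatives listed under (c), take the first, take the last, or report an error, can be written as a single policy switch. The sketch below is illustrative only: choose_value and its policy strings are hypothetical names, and the fourth alternative (using the values as hints while analysing the entity itself) is not covered.

```python
def choose_value(values, policy):
    """Pick one value from the repeated values of a single-valued header.

    `policy` is one of "first", "last", or "error".
    """
    if policy not in ("first", "last", "error"):
        raise ValueError("unknown policy: " + policy)
    if not values:
        return None
    if len(values) == 1 or policy == "first":
        return values[0]
    if policy == "last":
        return values[-1]
    raise ValueError("repeated header with values: " + ", ".join(values))


content_types = ["text/html; charset=ISO-8859-1", "text/plain"]
print(choose_value(content_types, "first"))
print(choose_value(content_types, "last"))
```

Fixing the policy in one place is one way to get the consistency within an implementation that is recommended above.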


Re: NEW ISSUE: repeating non-list-type-headers

by Jamie Lokier


Jamie Lokier wrote:

>
> Julian Reschke wrote:
> > Now this seems to be kind of backwards, wouldn't it be *much* clearer if
> > it said:
> >
> >    Multiple message-header fields with the same field-name MUST NOT be
> >    present in a message unless the entire field-value for that
> >    header field is defined as a comma-separated list [i.e., #(values)].
>
> It would be clearer, but it would clash with reality.  All web servers
> and web clients use Set-Cookie, which is prohibited by that.

After reading the rest of this thread, I see that you didn't
change the meaning (I was mistaken), you simply clarified it,
(notwithstanding the subtleties of double negatives and permission
vs. not-denial etc.).  So I withdraw any objection on that basis.

However, I still have a point which is mainly a response to your other
query, and I offer an alternative clarification which spells it out
more.

> > That being said, do we have a recommendation for recipients when that
> > requirement is violated? I would assume that servers SHOULD return a 400
> > (Bad Request), but what about clients?

Recipients don't always have the necessary information to decide which
headers have comma-separated syntax.  Some headers' meaning may depend
on which resource is requested and other factors, outside the scope of
the general purpose HTTP part of an implementation.

Only recipients which _interpret_ a particular header are likely to
have this information for that header.  In that case, perhaps it's
reasonable to say _those_ SHOULD return 400 Bad Request.

However, I think that's a bit demanding.  There are quite a few client
and server implementations which parse HTTP headers into a key->value
dictionary at an early stage, folding duplicates together, and pass
that onto application code, and only application code has knowledge of
the meaning of some headers.  It works fine even on the big nasty
internet.  (Set-Cookie is handled separately).

That architecture seems reasonable to me, so I propose replacing
SHOULD with MAY, as in "... MAY return a 400 (Bad Request)".


Dave Morris wrote:
> In the end, quite simple ... if the recipient doesn't understand the
> message, it should report an error and reject the message.  [...]

I agree, and add that aspects of the message which the recipient
doesn't care about should stay ignored.

> There really isn't that much point in folding headers and in fact this
> possibility makes parsing more difficult.

But it's required now, it really occurs in the wild with some headers.

> What a revised spec should do is
> focus on interoperability and describe requirements which ensure
> interoperability...

I agree, and think the old spec is a bit weak in some
found-in-practice interop areas.

> a. The order of the values of repeated headers must be preserved

Yes.

> b. The order of repeated headers known to have list values MAY be
>    folded OR unfolded at the convenience of the processing entity.

True when interpreting headers, but please don't write a proxy which
forwards those folded/unfolded headers - especially not a
"transparent" proxy.  A few buggy clients/servers do process the two
differently, and occasionally one needs to be explicit as a workaround
for some problem, and proxies "normalising" things does not help.

> Where it doesn't matter, the specification should not impose restrictions
> since there is no power of enforcement.

I echo that, when it comes to things like what to do when sent some
kinds of technically malformed message.  However, restrictions which
say what to send (and what not) are good for interoperability, as are
requirements that insist everyone parse different but equivalent
things the same way.

How about this.  It's a bit long, but I think it's clear, reflects
common practice as well as suggesting good practice, includes Julian's
suggestion to reject (but only when appropriate), and is equivalent to
Dave's suggestion "folded or unfolded at the convenience" without
putting it that way.



Proposed text:

Duplicate headers
=================

1. "Duplicate headers" means multiple headers with the same
   field-name.  Case differences and LWS before the colon MUST be
   ignored in the comparison.

2. Messages MUST NOT have duplicate headers, except as permitted:

      + Headers whose field-value syntax is a comma-separated list.

      + More generally, when explicitly permitted by other
        specifications and applications, whose syntax is such that
        concatenating syntactically valid values with "," (with and
        without surrounding LWS) does not change the interpretation.

      + Headers received and forwarded unmodified by a proxy (except
        leading and trailing LWS and multi-line formatting changes,
        and field-name case changes).

      + Set-Cookie in a response message, due to historical accident.

3. An implementation SHOULD NOT reject a message for containing
   duplicate headers unknown to the implementation.

4. At the point where specific headers are interpreted during message
   processing, if duplicates are present and not permitted as
   described above, the message SHOULD be rejected as malformed.

5. An implementation MAY reject the message earlier using a list of
   headers for which duplicates are not permitted (e.g., at least
   those defined in this specification whose syntax is not a
   comma-separated list).

6. The meaning of duplicate headers whose field-value syntax is a
   comma-separated list, provided the individual values satisfy that
   syntax, is equivalent to concatenating the elements of each list,
   preserving the order.  The transformation of section 7 gives the
   same result.  Implementations MUST respect this equivalence.

7. When interpreting any header, implementations MAY merge duplicates
   by concatenating the values with "," between them (optionally with
   LWS), preserving the order.  This is permitted for all headers and
   independent of syntax.  In practice, some implementations do merge
   all duplicate headers in this way internally, except for
   Set-Cookie, and the technique does satisfy this specification.
   However, see sections 4 and 5 for preferred behaviours.

8. When a proxy forwards particular headers without modification
   (except leading and trailing LWS and multi-line formatting changes,
   and field-name case changes), duplicate headers MUST be forwarded
   separately in their original order.  A proxy may still apply
   sections 4, 5, 6 and 7 separately to header interpretation, and it
   may replace duplicate headers with the concatenated form for those
   headers whose value is modified prior to forwarding.


-- Jamie
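
Items 1, 4, 6 and 7 of the proposed text can be sketched together as a lookup that merges duplicates at the point where a header is interpreted and rejects them only when the caller says they are not permitted. Everything named here (MalformedMessage, normalize_name, interpret, the repeatable flag) is hypothetical and exists only for this illustration; Set-Cookie handling and the proxy forwarding rule of item 8 are left out.

```python
class MalformedMessage(Exception):
    """Raised when a header that must not repeat occurs more than once."""


def normalize_name(raw_name):
    # Item 1: ignore case differences and whitespace before the colon.
    return raw_name.strip().lower()


def interpret(fields, name, repeatable):
    """Return the merged value of header `name`, or None if it is absent.

    `fields` is a list of (raw_name, value) pairs in message order.
    `repeatable` says whether duplicates are permitted for this header.
    """
    wanted = normalize_name(name)
    values = [v.strip() for n, v in fields if normalize_name(n) == wanted]
    if len(values) > 1 and not repeatable:
        # Item 4: reject at the point where the header is interpreted.
        raise MalformedMessage("duplicate " + wanted + " header")
    # Items 6 and 7: concatenate with "," (plus optional LWS), keeping order.
    return ", ".join(values) if values else None


fields = [
    ("Cache-Control ", "no-cache"),
    ("X-Unknown", "1"),
    ("X-Unknown", "2"),
    ("cache-control", "no-store"),
    ("Content-Length", "5"),
    ("Content-Length", "7"),
]
print(interpret(fields, "Cache-Control", repeatable=True))
try:
    interpret(fields, "Content-Length", repeatable=False)
except MalformedMessage as exc:
    print("rejected:", exc)
```

Because nothing ever asks for X-Unknown, its duplicates never cause a rejection, which is the behaviour item 3 asks for.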