@phluid61
Created September 15, 2019 04:44
thoughts on draft-ietf-httpbis-header-structure

Thoughts on draft-ietf-httpbis-header-structure

2019-09-15

RE: httpwg/http-extensions#913 (and 790, and 629, ...)

I've been thinking about data types and models again. My premise is that a type is a combination of a range/domain of values, and a set of operations that can be performed on those values.

A 'token' has one operation: identity comparison. But we're not going to do just that with our tokens, because implementation specs are going to say things like "case-insensitive" (so we have textual operations like toupper/tolower/casecmp), or they're going to say things like "if it starts with 'text/' default to utf-8" (so we have substring operations like split/match). So I think what we currently call a 'token' is going to be treated as a string-without-quotes.
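A rough Python sketch of that gap, in case it helps. The `Token` class, the `is_text_type` helper, and the example values are all hypothetical (nothing here comes from the draft); the point is only that a type with a single identity-comparison operation can't serve the case-insensitive or prefix-matching rules without falling back to plain string operations.

```python
class Token:
    """Hypothetical opaque token: the only supported operation is equality."""

    def __init__(self, value: str):
        self._value = value

    def __eq__(self, other):
        # Identity comparison: exact, case-sensitive match against another Token.
        return isinstance(other, Token) and self._value == other._value

    def __hash__(self):
        return hash(self._value)


def is_text_type(media_type: str) -> bool:
    # A substring operation of the kind a spec might ask for.
    # It needs a string, not an opaque token.
    return media_type.lower().startswith("text/")


a = Token("text/html")
b = Token("Text/HTML")

# The one operation a token really has.
identity_equal = (a == b)

# What a "case-insensitive" spec needs: reach past the token into its string.
case_insensitive_equal = a._value.lower() == b._value.lower()
```

Here `identity_equal` is `False` while `case_insensitive_equal` is `True`, and the second comparison only works by treating the token as a string.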

Then there's the recurring "in this location you could find a string or a token", implying that there's no semantic difference between the two. (Aside: I'm still keen to see one of these, BTW. I can't imagine how it could exist and not be resolved by using sh-item instead.)

Further, the argument that reintroduced tokens (née identifiers) in httpwg/http-extensions#629 seems to have been more about aesthetics than types. (Aside: Are domain names in origins compared case-insensitively? Because if so, that counts against using tokens to carry origins.)

I don't think underspecifying an immature concept is going to help us in any future revisions or extensions. We should either make strings and tokens serialisations of the same underlying data type*, or take a hard stance on what a token is and where/how it should be used**.

* I think that means: rename sh-token so it's not exposed as an sh- construct, fold its ABNF into sh-string's as an alternative, get rid of 3.7 Tokens and 4.1.7 Serializing a Token, plus some editorial cleanup.
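A minimal sketch of what "two serialisations of one data type" could look like, in Python. The function names are made up, and the token grammar in `TOKEN_RE` is a simplified assumption rather than the draft's actual ABNF; the idea is just that the application only ever sees one string type, and the bare or quoted form is a wire-level choice.

```python
import re

# Assumed, simplified grammar for the bare form: a letter followed by a
# limited set of characters. The draft's real ABNF may differ.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9_\-.:%*/]*")


def serialize_text(value: str) -> str:
    """Serialise one text value, using the bare form when the grammar allows."""
    if TOKEN_RE.fullmatch(value):
        return value
    if any(not (0x20 <= ord(ch) <= 0x7E) for ch in value):
        raise ValueError("this sketch only handles printable ASCII")
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def parse_text(wire: str) -> str:
    """Parse either wire form back to the same underlying value."""
    if not wire.startswith('"'):
        if not TOKEN_RE.fullmatch(wire):
            raise ValueError("not a valid bare form")
        return wire
    out = []
    i = 1
    while i < len(wire):
        ch = wire[i]
        if ch == "\\":
            # Backslash escapes the next character.
            i += 1
            if i >= len(wire):
                raise ValueError("dangling escape")
            out.append(wire[i])
        elif ch == '"':
            if i != len(wire) - 1:
                raise ValueError("trailing characters after closing quote")
            return "".join(out)
        else:
            out.append(ch)
        i += 1
    raise ValueError("unterminated quoted string")
```

Under this model `parse_text('gzip')` and `parse_text('"gzip"')` give the same value, so a header definition never has to say "a string or a token".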

** I think that means: explaining why a token and a string aren't the same thing, why implementations have to expose specific API hooks to convert the language's native string type to one versus the other, and why specs shouldn't say the things I mentioned above about case, substrings, and so on.
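For contrast, a sketch of the hard-stance option, again in Python with hypothetical names (`Token`, `serialize_item`) and an assumed, simplified token grammar. Here the two types never mix: a native string always serialises quoted, and the caller has to go through an explicit hook to get a token.

```python
import re

# Assumed, simplified token grammar; the draft's real ABNF may differ.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9_\-.:%*/]*")


class Token:
    """Hypothetical distinct token type; construction is the explicit API hook."""

    def __init__(self, value: str):
        if not TOKEN_RE.fullmatch(value):
            raise ValueError("not a valid token: %r" % value)
        self._value = value

    def __eq__(self, other):
        # A Token never compares equal to a native string.
        return isinstance(other, Token) and self._value == other._value

    def __hash__(self):
        return hash(("Token", self._value))


def serialize_item(item) -> str:
    """Pick the wire form from the type alone, never from the content."""
    if isinstance(item, Token):
        return item._value
    if isinstance(item, str):
        escaped = item.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    raise TypeError("unsupported item type: %s" % type(item).__name__)
```

With this shape, `serialize_item(Token("gzip"))` yields the bare form and `serialize_item("gzip")` yields the quoted one, which is exactly the extra API surface the spec would need to justify.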
