@mitchross
Created February 1, 2023 18:39
The long answer
There are other application-specific content types, and there are other content types based on JSON that are also registered with IANA with their own identifiers.
While the “json” identifier alone would have been unambiguous, the “application” identifier does not serve to disambiguate “json” subtypes. In fact, the answer to this question is within the definition of the “Accept” HTTP header alone, but let’s go through the rest of the specs first, for some context.
HTTP
In RFC 7231 (2014)[1], Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, one of the set of documents that define HTTP/1.1, you can find this about these headers (on page 8):
HTTP uses Internet media types [RFC2046] in the Content-Type (Section 3.1.1.5) and Accept (Section 5.3.2) header fields in order to provide open and extensible data typing and type negotiation.
And in case you were wondering about HTTP/2, there is RFC 7540 (2015), Hypertext Transfer Protocol Version 2 (HTTP/2), which states (on page 1):
This specification is an alternative to, but does not obsolete, the HTTP/1.1 message syntax. HTTP's existing semantics remain unchanged.
In other words, just treat these headers as defined by HTTP/1.1. The only difference in HTTP/2 in this regard is the mandatory use of lowercase header names.
The HTTP specifications just used an existing framework to negotiate the content type, which often helps people to avoid the risks of defining and implementing novel protocols, allows people to agree on a known and better understood protocol, and allows some people to reuse existing implementations of this particular feature.
In this case, it uses definitions from MIME to negotiate content types. Note that MIME stands for Multipurpose Internet Mail Extensions, which means it was originally created for mail exchange, and that mechanism was probably deemed “good enough” for HTTP by its authors.
[1]: RFC 7231 is a revision of earlier RFCs, the earliest of which seems to be RFC 2068, from 1997.
MIME
RFC 2046 (1996), or Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types, is part of another set of documents that defines MIME, including MIME types. This RFC is also a revision, of which the oldest seems to be RFC 1341, from 1992.
RFC 2046 itself defines top-level media types, or the left part of the “Content-Type” header value, or what “application” means, in this case.
The "application" media type is to be used for discrete data which do not fit in any of the other categories, and particularly for data to be processed by some type of application program. This is information which must be processed by an application before it is viewable or usable by a user.
This places “application/json” among the data formats that fit none of the other discrete types, namely “text”, “image”, “audio” and “video” (nor a composite type like “multipart”), and that require a specific application to make them usable. JSON qualifies as such, although it has also been served informally as “text/json”, which was never a registered media type.
But let’s also look at RFC 2045, Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, for the syntax of the “Content-Type” header (on page 12):
content := "Content-Type" ":" type "/" subtype
*(";" parameter)
Note that the syntax requires a “type” and a “subtype”. The “type” non-terminal is the top-level media type, in this case, “application”. And the “subtype” is an identifier representing a more specific data format, like “json”.
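As a rough illustration of that grammar, the sketch below splits a Content-Type value into its type, subtype, and parameters using Python's standard `email.message.Message` class. The helper name `split_content_type` is hypothetical, and Python itself is only an assumption made for the sake of the example.

```python
from email.message import Message


def split_content_type(value):
    """Split a Content-Type value into (type, subtype, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns a list of (name, value) pairs; the first pair is
    # the media type itself, so the real parameters start at index 1.
    params = dict(msg.get_params()[1:])
    return msg.get_content_maintype(), msg.get_content_subtype(), params


print(split_content_type("application/json; charset=utf-8"))
print(split_content_type("json"))
```

With this parser, a bare “json” is not accepted as a media type at all: a value that lacks the “type/subtype” shape falls back to the default “text/plain”.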
So part of the reason we have to specify JSON as “application/json” is that the HTTP specification says you have to use whatever MIME says, and MIME says you have to provide both the top-level media type and the subtype to be syntactically compliant.
This is by itself not a good answer to the question, though. But a good part of the answer was already hinted in the definition of the “application” top-level media type. Or instead, in how top-level media types are supposed to be handled by user agents.
Maybe this other excerpt from RFC 2046 (page 4) makes it clearer:
The definition of a top-level media type consists of: (…) how a user agent and/or gateway should handle unknown subtypes of this type, (…)
Which means it has to do with informing user agents (like browsers) how they are expected to handle these kinds of files, especially if they don’t support a particular MIME subtype.
And in the case of “application/json”, the user agent either hands it to a known application (the website’s JavaScript, or one of its API endpoints), or asks the user what it should do (probably just download), because otherwise it is not supposed to know how to handle it, by definition. It requires an application other than itself to handle it, or it is not meant to be handled at all, like “application/octet-stream”.
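The fallback behaviour described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical table of subtypes the user agent understands; the fallback targets paraphrase RFC 2046 (unknown “text” subtypes are treated as “text/plain”, unknown “multipart” subtypes as “multipart/mixed”, and anything else as “application/octet-stream”).

```python
# Hypothetical set of media types this imaginary user agent can handle itself.
KNOWN_TYPES = {"text/plain", "text/html", "image/png", "application/json"}


def resolve(media_type):
    """Return the media type the user agent should actually treat the data as."""
    media_type = media_type.lower()
    if media_type in KNOWN_TYPES:
        return media_type
    # The subtype is unknown, so only the top-level type can guide handling.
    top_level = media_type.split("/", 1)[0]
    if top_level == "text":
        return "text/plain"  # assumes the charset is one the agent understands
    if top_level == "multipart":
        return "multipart/mixed"
    # image, audio, video and application all fall back to opaque bytes.
    return "application/octet-stream"


for example in ("application/json", "text/x-unknown", "application/x-unknown"):
    print(example, "->", resolve(example))
```

Without the top-level type, the agent would have nothing to base the fallback on when it meets a subtype it has never seen.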
Other JSON MIME subtypes
It is worth mentioning that RFC 2048, Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures, delegates registration of subtypes to IANA, which publishes them in its Media Types registry.
Note that the registry contains other subtypes suffixed with “+json”, to indicate that their content is actually JSON. It is probably not a requirement, but people seem to enjoy the idea. Take GeoJSON, for example, which is registered as “application/geo+json” and defined in RFC 7946 (2016), The GeoJSON Format.
If you only look at its syntax, it is just your plain old JSON. But not every JSON document can be considered GeoJSON. To be considered GeoJSON, it must comply with additional rules and restrictions, like having certain keys with values of certain types or structure. Throwing any valid JSON into geojson.io will not be good enough to render geographical features, but valid GeoJSON is always valid JSON.
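To make the distinction concrete, here is a minimal sketch with two hypothetical helpers: `is_json_based` checks only whether a media type's subtype is “json” or carries the “+json” suffix, and `looks_like_geojson` performs a deliberately shallow check that a JSON text has a “type” member naming one of the GeoJSON object types from RFC 7946. It is nowhere near a full GeoJSON validator.

```python
import json

# The nine GeoJSON object types named in RFC 7946.
GEOJSON_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString", "Polygon",
    "MultiPolygon", "GeometryCollection", "Feature", "FeatureCollection",
}


def is_json_based(media_type):
    """True if the subtype is "json" or ends with the "+json" suffix."""
    essence = media_type.lower().split(";", 1)[0].strip()  # drop parameters
    subtype = essence.partition("/")[2]
    return subtype == "json" or subtype.endswith("+json")


def looks_like_geojson(text):
    """Shallow check: valid JSON object whose "type" is a GeoJSON type."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(doc, dict):
        return False
    kind = doc.get("type")
    return isinstance(kind, str) and kind in GEOJSON_TYPES


print(is_json_based("application/geo+json"))
print(looks_like_geojson('{"type": "Point", "coordinates": [0, 0]}'))
print(looks_like_geojson('{"hello": "world"}'))
```

The last input is perfectly valid JSON, yet it fails even this shallow GeoJSON check.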
The Accept HTTP header
On the other hand, if you paid any attention to the definition of the Accept HTTP header, you would have noticed that you always had the option of specifying the value “application/*” in requests.
In the case of the “application” top-level media type, that would not have been very useful. But in the case of, for example, the “text” or “image” top-level media types, this use case makes the answer to this question rather obvious.
Quite often, the user agent is not interested in a specific subtype, or supports several alternative subtypes, in which case it can just request a generic “text” or a generic “image”. It would then send “text/*” or “image/*” as the value of the Accept header and let the server decide which specific data format to respond with.
So, to make the value unambiguous with regard to whether it refers to a top-level media type or a subtype, you have to specify “application” as the top-level media type for JSON. And even though “json” would never refer to a top-level media type, a malicious registration could have used the name of one of the top-level media types as a subtype identifier, although I suspect that would never have been intentionally approved.
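The media-range matching described above can be sketched as below. This is a simplified illustration, not a complete Accept implementation: the function names `matches` and `choose` are hypothetical, and quality values (“q=”) and other parameters are ignored.

```python
def matches(media_range, media_type):
    """True if a media range such as "image/*" covers a concrete media type."""
    range_type, _, range_subtype = media_range.lower().partition("/")
    type_, _, subtype = media_type.lower().partition("/")
    if range_type == "*":
        # The only wildcard top-level form is "*/*".
        return range_subtype == "*"
    if range_type != type_:
        return False
    return range_subtype in ("*", subtype)


def choose(accept, available):
    """Return the first available media type covered by the Accept value."""
    # Strip parameters such as ";q=0.8" and keep only the media ranges.
    ranges = [part.split(";", 1)[0].strip() for part in accept.split(",")]
    for candidate in available:
        if any(matches(media_range, candidate) for media_range in ranges):
            return candidate
    return None


server_formats = ["text/html", "image/png", "application/json"]
print(choose("image/*", server_formats))
print(choose("application/json, text/*;q=0.5", server_formats))
print(choose("json", server_formats))
```

In this sketch a bare “json” matches nothing, because the matcher cannot tell whether it is meant as a top-level type or as a subtype.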
Original question: Why do we say "application/json" and not just "json" when specifying the Content-Type or Accept header in HTTP requests? Are there other types of JSON to be aware of?