Thoughts on Existing Formats
I think they are all well intentioned, but miss the mark a tiny bit. To be honest, I would combine several aspects from each. To be 100% clear, I generally agree with and appreciate the proposals here, but I think more work is required to really polish things up.
When comparing these formats, it's fairly obvious that each was designed with specific needs in mind. Collection+JSON was obviously targeted more toward NoSQL/schemaless data sources and had a heavier emphasis on visual representation within the client application than the other formats. Many formats (and there are a few I didn't cover) have a strong emphasis on e-commerce. Obviously e-commerce applications are different from most other systems.
However, what's really important to take away from this is that I think a consolidated format could serve all of these needs and intended use cases. I think each time a new format was created it solved a different problem, but didn't think much about the previous problem being solved. I guarantee a few hours of putting some people's heads together is all we'd need to have a unified format.
Last, I don't think there's enough definition around the "actions" or forms in any of these formats. They all seem to focus on API orientation and navigation more than data management. This is ok since we have API docs of course (and this format is by no means a replacement for documentation), but I think we can do better.
They all seem to have opened the door, or perhaps have their hand on the knob, but they aren't walking through. They should. If you're providing insight on "what" to send back to the API (and under which method), then there should be some context around "how" it needs to look when it gets there. Especially if you want to dictate validation from a single point that can be easily updated without the need for client application updates. My reasoning is that doing so is a win win win scenario:
Win #1: The API is hit less frequently, because clients can catch invalid input before sending it. That allows the API to serve more clients without the need to scale, which means more stability for your API, and you're winning on costs in the long run too (win 1.5).
Win #2: Again, less maintenance. Faster development cycles. You do not need to update the client applications if they obey the rules coming from the JSON response. How many client applications might you have? This could be a major improvement on the maintenance considerations of your application and API.
So what would I combine? Well, I like the underscore notation. There may be fields in one's data source with underscores (think MongoDB's _id for example off the top of my head - and I personally use them in MongoDB to denote relationships), but I think the notation is still a good visual distinction for an engineer. Especially when these fields are top level items in the JSON response. An engineer can very quickly see that it's part of the Hypermedia convention and isn't something coming from the data source (or perhaps it is - but they'll easily know when).
Separation of Data
This brings me to hierarchy. It's super important. Mixing data from the data source with additional metadata is a bad idea. Especially the common term "links" ... Of course there could be a "links" key from the data source with all sorts of values and meaning. If one of the goals of Hypermedia JSON is to present the engineer with an easy to understand (and navigable) response, then I think any confusion would put the format at risk of failing to meet that goal. In some cases, it may even become impossible to use the format with the data set unless aliases were used for field names.
While a complete data dump may be an unfavorable design practice, I still don't believe the API response and the data source should be that far apart. While the end-user engineer may not care (or know any better), the internal engineers who are developing the system should also not need to be bothered with any unnecessary field mapping or aliasing. What got moved where for this particular API and nothing else? What do they call this field in this particular response? etc. It introduces a maintenance problem. Anytime there's confusion and additional maintenance time, business goals and costs suffer. No matter how awesome an engineer may believe something to be, there are real world factors and goals and budgets that must be considered.
So keep metadata and application data organizationally separate and distinct. It's extremely important. This is why most formats separate out "links" in the first place. However, if you see them embedded within each item, I still think it's a problem (unless each item is an object with a "data" key and a "links" key as with Collection+JSON). I would prefer instead routing rules like what HAL suggests.
So I would provide a `_data` wrapper for the items coming back from the data source. Essentially like "items" in Collection+JSON. I would not go so far as to show relationships in the data like Siren or HAL. I think this assumes too much and not every application will have the same needs or schema. I don't think we can ever replace the need for API documentation or a general understanding of the application's data. Likewise, APIs don't point guns at people's heads saying you must use every bit of data I give you! =)
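To make the separation concrete, here is a minimal sketch in Python of what such an envelope could look like. The function name `wrap_response`, the example record, and the link values are all made up for illustration; only the `_data` and `_links` key names come from the discussion above.

```python
import json

def wrap_response(records, links):
    """Keep data-source records under `_data`, apart from API-level links."""
    return {
        "_links": links,   # added by the API layer (hypothetical link shape)
        "_data": records,  # passed through from the data source untouched
    }

# A record that happens to carry its own "links" field. If the format mixed
# hypermedia links into the record itself, this field would collide.
record = {"_id": "abc123", "title": "Hello", "links": ["http://example.com"]}

response = wrap_response([record], {"self": {"href": "/posts"}})
print(json.dumps(response, indent=2))
```

Because the record sits inside `_data`, its own "links" field survives without any aliasing, and nobody has to wonder which "links" belongs to whom.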
Actions & Forms
I think Siren put the most thought into forms or actions. There is no confusion about the HTTP method that the API will accept and the response, and this is nice because it means your API is truly more navigable than most others. Think about the time you'll be saving not looking things up in the API docs, for example. This leads to faster and easier development. Plus, your client application can be a bit more dynamic and take instruction so that you may end up with less maintenance.
So I would take the way Siren handles actions and replace Collection+JSON's "template" section which is a little confusing anyway. When we see "template" we don't really think about forms and actions. Plus, there's more detail within the action items.
However, I would add to this even more (optional, of course) information for validation rules. I think Siren hits the nail on the head here in terms of providing form element types and default values, but I would also add optional placeholder keys. The validation rules, though, are what I think is super important. These can even come with messages. If all this comes from the API then maintenance is easy on the server side and again there's that win win win situation here for user experience and API health.
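Here is a rough sketch of how a client might obey validation rules delivered with an action. The action layout borrows the general idea of Siren's name/method/href/fields, but the `placeholder` and `validation` keys, the rule names (`required`, `max_length`, `pattern`), and the `validate` function are assumptions of mine, not part of any of the formats discussed.

```python
import re

# Hypothetical action as it might arrive in the JSON response.
action = {
    "name": "add-post",
    "method": "POST",
    "href": "/posts",
    "fields": [
        {
            "name": "title",
            "type": "text",
            "placeholder": "Post title",
            "validation": [
                {"rule": "required", "message": "A title is required."},
                {"rule": "max_length", "value": 120,
                 "message": "Titles are limited to 120 characters."},
            ],
        },
        {
            "name": "slug",
            "type": "text",
            "validation": [
                {"rule": "pattern", "value": r"[a-z0-9-]+",
                 "message": "Use lowercase letters, digits and dashes only."},
            ],
        },
    ],
}

def validate(action, values):
    """Return {field name: [messages]} for every rule the values break."""
    errors = {}
    for field in action["fields"]:
        value = values.get(field["name"], "")
        for check in field.get("validation", []):
            rule = check["rule"]
            failed = (
                (rule == "required" and not value)
                or (rule == "max_length" and len(value) > check["value"])
                # Pattern rules only apply when a value was supplied.
                or (rule == "pattern" and value
                    and not re.fullmatch(check["value"], value))
            )
            if failed:
                errors.setdefault(field["name"], []).append(check["message"])
    return errors

print(validate(action, {"title": "", "slug": "Bad Slug"}))
```

The client never hard-codes a rule or a message. If the title limit changes on the server, the next response carries the new limit and every client picks it up without a release.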
I would also add an `_errors` key to the forms/actions section. All forms would be given a name and this object would then have named forms with their errors so that a client application knew exactly how to display and handle the errors. Collection+JSON introduces errors, but they seem to be unrelated to the "template" section.
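As a sketch of the idea, errors could be keyed by form name so the client knows which form each message belongs to. The `_actions` section name, the form name `add-post`, and the helper `errors_for` are hypothetical; only the `_errors` key is taken from the text above.

```python
# Hypothetical response fragment: named forms plus an `_errors` object
# whose keys match the form names.
response = {
    "_actions": {
        "add-post": {
            "method": "POST",
            "href": "/posts",
            "fields": [{"name": "title", "type": "text"}],
        },
        "_errors": {
            "add-post": {"title": ["A title is required."]},
        },
    },
}

def errors_for(response, form_name):
    """Look up the errors for one named form; empty dict if there are none."""
    actions = response.get("_actions", {})
    return actions.get("_errors", {}).get(form_name, {})

print(errors_for(response, "add-post"))
```

A client rendering the `add-post` form only has to ask for that form's errors and attach each message to the matching field.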
Yes, this is what HTTP headers are for. However, I believe certain APIs may want additional information, anything outside of the HTTP response header standards. I also don't believe the headers are an appropriate place for lengthy messages. I think HTTP status codes are great, but they are short codes in the first place for a reason. I would include an `_http` key with an object value that contained additional error details, similar to Collection+JSON and perhaps even more depending on the needs of the application. This provides better flexibility.
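One possible shape for that object, sketched in Python. The field names (`status`, `reason`, `message`, `details`) and the `build_http` helper are my own assumptions; the standard library's `http.HTTPStatus` is used only to look up the short reason phrase.

```python
from http import HTTPStatus

def build_http(status_code, message, **details):
    """Pair the short status code with a longer, human-readable explanation."""
    status = HTTPStatus(status_code)
    return {
        "status": status.value,
        "reason": status.phrase,  # the short standard phrase
        "message": message,       # the lengthy part that doesn't fit a header
        "details": details,       # anything else the application wants to add
    }

http_info = build_http(
    404,
    "No post exists with the id 'abc123'. It may have been deleted.",
    resource="posts",
    id="abc123",
)
print(http_info)
```

The real status code still travels in the HTTP response itself; this object just gives the client something more useful to show or log.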
One thing you don't see a lot of here are messages. "Flash" messages are quite common in applications to help notify the user that something saved properly or there was an error, etc. Of course with form validation we also have the need for messages. They should all be strings, and any internationalization considerations should be taken care of in the API URL, `/es/posts` for example. While one might consider putting translations in the response itself, I think this would unnecessarily increase the payload size and most client applications would have no need to display more than one language at a time.
I would add a `_meta` key with values for extra things like pagination, response time, flash message, etc. If you're dealing with a listing of items, this is a great place for a total count, sorting options, pagination, etc. It's also nice to get a response time from an API. This should always be held separate from the other data.
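A small sketch of how such a block might be assembled for a listing. Every field name inside it (`total`, `page`, `per_page`, `pages`, `response_time_ms`, `flash`) and the `build_meta` helper are assumptions for illustration.

```python
import math
import time

def build_meta(total, page, per_page, started_at, flash=None):
    """Collect listing and timing info that is not application data."""
    elapsed_ms = (time.perf_counter() - started_at) * 1000
    return {
        "total": total,
        "page": page,
        "per_page": per_page,
        "pages": math.ceil(total / per_page),
        "response_time_ms": round(elapsed_ms, 2),
        "flash": flash,
    }

started = time.perf_counter()  # taken when the request begins
meta = build_meta(total=95, page=2, per_page=20, started_at=started,
                  flash="Post saved.")
response = {"_meta": meta, "_data": []}  # `_data` left empty in this sketch
print(response["_meta"])
```

With 95 items at 20 per page, `pages` comes out as 5, so the client can build its pager without a second request.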
Again, I would keep `_links` separate from the data. This is data added on top of the application data that has more to do with the API than it does the actual application. So there's absolutely no reason to mix it in with the data. Also, I think it's ok to use routing logic here like HAL does. However, HAL mixes in "_links" as if it was data coming straight from the data source along with everything else. I believe to avoid conflicts and to stay better organized, it should just be moved a little bit. Simple change and Collection+JSON does this.
I think we should follow semantic versioning in our APIs. Maybe the level of detail is a bit silly at times, but loosely we should follow it. I don't believe in the "v1" we commonly see and I think I'm not alone in this. If you look at Twitter's API, you'll notice they don't use "v" in their version number.
The API version number is contained in the API URL and it is the first item. I agree with this positioning and format. While we never see a real need for "patch", I think "major" and "minor" are common enough. I think the version number should be kept out of the response as well. Though it wouldn't hurt to optionally add it to the metadata, it does belong in the URL.
Locale also belongs in the URL. One should not POST arguments for the language. It should be set (preferably immediately following the API version number) in the URL. This could even be optional with a default to a specific language. If the API does not support internationalization, then of course we wouldn't expect to see it.
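To illustrate that URL layout, here is a minimal sketch of how a server might pull the version and locale out of a path. The function `parse_api_path`, the default locale, and the supported locale list are all assumptions; a real router would do this differently.

```python
import re

# Version is major, optionally followed by minor and patch, with no "v" prefix.
VERSION_RE = re.compile(r"\d+(\.\d+){0,2}")

def parse_api_path(path, default_locale="en", supported=("en", "es")):
    """Split a path like /1.1/es/posts into version, locale and resource."""
    segments = [s for s in path.split("/") if s]
    if not segments or not VERSION_RE.fullmatch(segments[0]):
        raise ValueError("path must start with a version number such as 1.1")
    version, rest = segments[0], segments[1:]
    # The locale segment is optional and falls back to a default language.
    if rest and rest[0] in supported:
        locale, rest = rest[0], rest[1:]
    else:
        locale = default_locale
    return {"version": version, "locale": locale,
            "resource": "/" + "/".join(rest)}

print(parse_api_path("/1.1/es/posts"))
print(parse_api_path("/1.1/posts"))
```

One caveat with an optional locale segment: a resource whose name matches a supported locale code would be read as a locale, so an API taking this approach would need to avoid such names or make the locale mandatory.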