This is a draft proposal for BuiltWith's Firehose web service API. It leans on HTTP semantics, using custom media types and some status codes from extensions to the HTTP spec. A user agent that only accepts application/json should still work, since every representation is JSON. All it loses is the media type's explicit indication of the payload's contract, which specifies how to interpret the payload beyond its being JSON.
This covers enough to give a feel for the API, but it does not yet cover every edge case.
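As a rough illustration of the fallback described above, a client could read the contract from the Content-Type header when a custom media type is present and treat the payload as plain JSON otherwise. This is only a sketch: the helper name `contract_of` is hypothetical, and it assumes every custom type follows the `application/json+builtwith.firehose.<contract>` pattern used in this draft.

```python
from typing import Optional

# Assumption: all JSON contracts in this draft share this prefix.
PREFIX = "application/json+builtwith.firehose."


def contract_of(content_type: str) -> Optional[str]:
    """Return the contract name (e.g. 'status'), or None for plain JSON."""
    # Drop parameters such as '; charset=utf-8' and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith(PREFIX):
        return media_type[len(PREFIX):] or None
    # A client that only negotiated application/json gets no contract hint.
    return None
```

A client that receives None still has valid JSON to parse; it just has to know from context which payload shape to expect.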
POST https://builtwith.com/firehose
Accept: application/json+builtwith.firehose.promise
Content-Type: application/json+builtwith.firehose.request
{
"name": "some user-defined name",
"domains": [ ... ],
"categories": [ ... ]
}
202 (ACCEPTED)
Content-Type: application/json+builtwith.firehose.promise
{
"links": {
"result": "http://builtwith.com/firehose/results/rackspace/1",
"status": "http://builtwith.com/firehose/results/rackspace/1/status",
"jobs": "http://builtwith.com/firehose/results/rackspace"
}
}
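The exchange above could be driven from a client roughly as follows. This is a minimal sketch that only builds the request and reads the promise; it does no network I/O, and the function names `build_job_request` and `read_promise` are hypothetical rather than part of the proposal.

```python
import json

REQUEST_TYPE = "application/json+builtwith.firehose.request"
PROMISE_TYPE = "application/json+builtwith.firehose.promise"


def build_job_request(name, domains, categories):
    """Return (headers, body) for the POST that creates a job."""
    headers = {"Accept": PROMISE_TYPE, "Content-Type": REQUEST_TYPE}
    body = json.dumps({"name": name, "domains": domains, "categories": categories})
    return headers, body


def read_promise(status_code, body):
    """Return the 'links' object from a 202 (ACCEPTED) promise response."""
    if status_code != 202:
        raise ValueError(f"expected 202 (ACCEPTED), got {status_code}")
    # The client follows these links instead of constructing URLs itself.
    return json.loads(body)["links"]
```

Following the returned links, rather than assembling URLs by hand, keeps the client independent of the URL layout.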
GET http://builtwith.com/firehose/results/rackspace/1/status
Accept: application/json+builtwith.firehose.status
200 (OK)
Content-Type: application/json+builtwith.firehose.status
{
"name": "some user-defined name",
"status": <"DONE", "WORKING", "CANCELLED", or "ERROR">,
"request_time": <ISO-8601 timestamp>,
// the remaining fields appear only when they make sense for the status
"progress": <integer from 0 to 100>,
"cancel_time": <ISO-8601 timestamp>,
"finish_time": <ISO-8601 timestamp>,
"cost": {
"amount": <fixed decimal>,
"currency": <ISO 4217 currency code>
},
"error": {
"code": <whatever code is useful to you>,
"reason": <some description of the problem>
}
}
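Since several status fields are optional, a client has to read the payload defensively. The sketch below shows one way to do that. It assumes that DONE, CANCELLED, and ERROR are terminal states and that the cost amount may arrive either as a JSON number or as a string; neither point is settled by this draft, and `summarize_status` is a hypothetical helper.

```python
import json
from decimal import Decimal

# Assumption: these three states are final; WORKING is the only one worth polling.
TERMINAL = {"DONE", "CANCELLED", "ERROR"}


def summarize_status(body):
    """Pick out the fields a polling client needs from a status payload."""
    # parse_float=Decimal keeps a numeric amount exact instead of using float.
    doc = json.loads(body, parse_float=Decimal)
    status = doc["status"]
    summary = {"status": status, "finished": status in TERMINAL}
    if "progress" in doc:
        summary["progress"] = doc["progress"]
    if "cost" in doc:
        # str() also covers an amount sent as an integer or as a string.
        amount = Decimal(str(doc["cost"]["amount"]))
        summary["cost"] = (amount, doc["cost"]["currency"])
    if "error" in doc:
        summary["error"] = f'{doc["error"]["code"]}: {doc["error"]["reason"]}'
    return summary
```

A client would keep requesting the status link while `finished` is false, then fetch the result link once the status is DONE.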
GET http://builtwith.com/firehose/results/rackspace
Accept: application/json+builtwith.firehose.jobs
We can use query parameters to limit the number of jobs returned if the list grows large and there is no retention policy.
200 (OK)
Content-Type: application/json+builtwith.firehose.jobs
{
"jobs": [
{
"links": {
"result": "http://builtwith.com/firehose/results/rackspace/1",
"status": "http://builtwith.com/firehose/results/rackspace/1/status"
},
// fields from the status representation, included to avoid secondary calls
"name": "some user-defined name",
"request_time": <ISO-8601 timestamp>,
...
},
...
]
}
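The draft mentions limiting the jobs list with query parameters but does not name them. The sketch below assumes two hypothetical parameters, `limit` and `offset`, purely to show the shape such a request could take; the real names and paging scheme are still open.

```python
from urllib.parse import urlencode


def jobs_url(base, limit=None, offset=None):
    """Build the jobs URL, adding only the paging parameters that were given.

    'limit' and 'offset' are hypothetical names, not part of the draft.
    """
    pairs = (("limit", limit), ("offset", offset))
    params = {key: value for key, value in pairs if value is not None}
    return f"{base}?{urlencode(params)}" if params else base
```

For example, `jobs_url("http://builtwith.com/firehose/results/rackspace", limit=20)` yields the jobs URL with `?limit=20` appended.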
GET http://builtwith.com/firehose/results/rackspace/1
Accept: application/zip+builtwith.firehose.result,
application/json+builtwith.firehose.status
200 (OK)
Content-Type: application/zip+builtwith.firehose.result
<zipped file with results files>
We could support different compression schemes by varying the Accept header.
102 (PROCESSING)
Content-Type: application/json+builtwith.firehose.status
<same payload as a status request>
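Because a result request can come back either as a zip archive or as a status payload, a client needs to branch on what it received. The sketch below branches on the Content-Type rather than on the status code, on the assumption that HTTP libraries differ in how they surface 1xx responses. `handle_result` is a hypothetical helper and works on an already-received body.

```python
import io
import json
import zipfile

RESULT_TYPE = "application/zip+builtwith.firehose.result"
STATUS_TYPE = "application/json+builtwith.firehose.status"


def handle_result(content_type, body):
    """Return ('result', file names) or ('status', parsed payload)."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == RESULT_TYPE:
        # The job is done: list the result files inside the archive.
        with zipfile.ZipFile(io.BytesIO(body)) as archive:
            return "result", archive.namelist()
    if media_type == STATUS_TYPE:
        # The job is not finished: the body is a regular status payload.
        return "status", json.loads(body)
    raise ValueError(f"unexpected media type: {media_type}")
```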
DELETE http://builtwith.com/firehose/results/rackspace/1
200 (OK)