@shajra
Last active August 29, 2015 14:05

BuiltWith Firehose API (DRAFT)

This is a draft proposal for BuiltWith's Firehose web service API. It leans on HTTP, using custom media types and some status codes from spec extensions. If a user agent only accepts application/json, the API should still work, since every representation other than the zipped result is still JSON. All such a user agent loses is a clear indication of the payload's contract, which specifies how to interpret it beyond just JSON.

This covers enough to give a feel for the API, but it probably doesn't yet cover all the edge cases.
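
To illustrate how a client might treat the contract as optional, here is a minimal Python sketch that splits one of the media types used in this draft into its base type and its contract. The function name `split_media_type` is made up for illustration; the only thing taken from the draft is the `base+contract` naming pattern.

```python
from typing import Optional, Tuple


def split_media_type(value: str) -> Tuple[str, Optional[str]]:
    """Split a draft-style media type into (base type, contract).

    'application/json+builtwith.firehose.status' becomes
    ('application/json', 'builtwith.firehose.status'); a plain
    'application/json' yields a contract of None.
    """
    # Drop parameters such as '; charset=utf-8' and normalize case.
    essence = value.split(";", 1)[0].strip().lower()
    base, separator, contract = essence.partition("+")
    return base, (contract if separator else None)
```

A client that only understands application/json can ignore the second element and parse the body as JSON; a stricter client can check the contract before trusting the payload's shape.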

Submitting a Request

Request

POST https://builtwith.com/firehose
Accept: application/json+builtwith.firehose.promise
Content-Type: application/json+builtwith.firehose.request
{
    "name": "some user-defined name",
    "domains": [ ... ],
    "categories": [ ... ]
}

Response

202 (ACCEPTED)
Content-Type: application/json+builtwith.firehose.promise
{
    "links": {
        "result": "http://builtwith.com/firehose/results/rackspace/1",
        "status": "http://builtwith.com/firehose/results/rackspace/1/status",
        "jobs": "http://builtwith.com/firehose/results/rackspace"
    }
}
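
As a rough sketch of the client side of this exchange, the Python below builds the headers and body for the submission and pulls the links out of the promise. The helper names (`build_submission`, `parse_promise`) are hypothetical, and the sketch assumes that "domains" and "categories" are lists of strings, which the draft does not spell out. Sending the request is left to whatever HTTP library the client already uses.

```python
import json
from typing import Dict, List, Tuple

REQUEST_TYPE = "application/json+builtwith.firehose.request"
PROMISE_TYPE = "application/json+builtwith.firehose.promise"


def build_submission(
    name: str, domains: List[str], categories: List[str]
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for the POST described above."""
    headers = {"Accept": PROMISE_TYPE, "Content-Type": REQUEST_TYPE}
    # Assumption: domains and categories are plain lists of strings.
    payload = {"name": name, "domains": domains, "categories": categories}
    return headers, json.dumps(payload).encode("utf-8")


def parse_promise(status_code: int, body: bytes) -> Dict[str, str]:
    """Extract the 'links' object from a 202 promise response."""
    if status_code != 202:
        raise ValueError(f"expected 202 (ACCEPTED), got {status_code}")
    links = json.loads(body)["links"]
    missing = {"result", "status", "jobs"} - set(links)
    if missing:
        raise ValueError(f"promise is missing links: {sorted(missing)}")
    return links
```

The client should follow the returned links rather than construct the result and status URLs itself, so the URL layout stays free to change.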

Checking Status

Request

GET http://builtwith.com/firehose/results/rackspace/1/status
Accept: application/json+builtwith.firehose.status

Response

200 (OK)
Content-Type: application/json+builtwith.firehose.status
{
    "name": "some user-defined name",
    "status": <"DONE", "WORKING", "CANCELLED", or "ERROR">,
    "request_time": <ISO-8601 timestamp>,

    // as makes sense depending on status
    "progress": <integer from 0 to 100>,
    "cancel_time": <ISO-8601 timestamp>,
    "finish_time": <ISO-8601 timestamp>,
    "cost": {
        "amount": <fixed decimal>,
        "currency": <ISO 4217 currency code>
    },
    "error": {
        "code": <whatever code is useful to you>,
        "reason": <some description of the problem>
    }
}
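
A client would likely want this payload as a typed value, with the status-dependent fields optional. The Python sketch below is one way to do that, under some assumptions the draft leaves open: the `JobStatus` class and `parse_status` function are hypothetical names, "amount" may arrive as either a JSON number or a string, and timestamps may end in "Z".

```python
import json
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal
from typing import Optional

STATUSES = {"DONE", "WORKING", "CANCELLED", "ERROR"}


def _timestamp(value: Optional[str]) -> Optional[datetime]:
    if value is None:
        return None
    # datetime.fromisoformat rejects a trailing 'Z' before Python 3.11.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


@dataclass
class JobStatus:
    name: str
    status: str
    request_time: datetime
    progress: Optional[int] = None
    cancel_time: Optional[datetime] = None
    finish_time: Optional[datetime] = None
    cost_amount: Optional[Decimal] = None
    cost_currency: Optional[str] = None
    error: Optional[dict] = None


def parse_status(body: str) -> JobStatus:
    # parse_float=Decimal avoids binary floating point for the cost.
    data = json.loads(body, parse_float=Decimal)
    if data["status"] not in STATUSES:
        raise ValueError(f"unknown status: {data['status']!r}")
    cost = data.get("cost") or {}
    amount = cost.get("amount")
    return JobStatus(
        name=data["name"],
        status=data["status"],
        request_time=_timestamp(data["request_time"]),
        progress=data.get("progress"),
        cancel_time=_timestamp(data.get("cancel_time")),
        finish_time=_timestamp(data.get("finish_time")),
        cost_amount=Decimal(str(amount)) if amount is not None else None,
        cost_currency=cost.get("currency"),
        error=data.get("error"),
    )
```

The error object is kept as a plain dict because the draft deliberately leaves the shape of "code" open.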

Checking All Jobs Status

Request

GET http://builtwith.com/firehose/results/rackspace
Accept: application/json+builtwith.firehose.jobs

We could add query parameters to limit the number of jobs returned if the list grows large and there's no retention policy.
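
The draft doesn't name these query parameters yet. Purely as an illustration, the Python sketch below assumes two hypothetical ones, `limit` and `offset`, and shows how a client could add them to the jobs link it received in the promise.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def jobs_page_url(
    jobs_url: str, limit: Optional[int] = None, offset: Optional[int] = None
) -> str:
    """Add hypothetical 'limit' and 'offset' parameters to the jobs URL."""
    parts = urlsplit(jobs_url)
    # Keep any query parameters the server already put in the link.
    query = dict(parse_qsl(parts.query))
    if limit is not None:
        query["limit"] = str(limit)
    if offset is not None:
        query["offset"] = str(offset)
    return urlunsplit(parts._replace(query=urlencode(query)))
```

For example, `jobs_page_url("http://builtwith.com/firehose/results/rackspace", limit=20)` would ask for at most 20 jobs, if the service adopted these names.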

Response

200 (OK)
Content-Type: application/json+builtwith.firehose.jobs
{
    "jobs": [
        {
            "links": {
                "result": "http://builtwith.com/firehose/results/rackspace/1",
                "status": "http://builtwith.com/firehose/results/rackspace/1/status"
            },

            // fields from status request to avoid secondary calls
            "name": "some user-defined name",
            "request_time": <ISO-8601 timestamp>,
            ...
        },
        ...
    ]
}

Get Result

Request

GET http://builtwith.com/firehose/results/rackspace/1
Accept: application/zip+builtwith.firehose.result,
    application/json+builtwith.firehose.status

Response if finished

200 (OK)
Content-Type: application/zip+builtwith.firehose.result
<zipped file with results files>

We could support different compression schemes by varying the Accept header.

Response if still processing

102 (PROCESSING)
Content-Type: application/json+builtwith.firehose.status
<same payload as a status request>
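
Taken together, the two responses suggest a simple polling loop on the client. The Python sketch below is one possible shape for it, not a prescribed client: `wait_for_result` and its parameters are hypothetical, the HTTP call is injected as a `fetch` callable returning (status code, content type, body), and it assumes that a cancelled or failed job also answers with a status payload, which the draft doesn't specify.

```python
import json
import time
from typing import Callable, Tuple

# (status code, Content-Type header, raw body)
Response = Tuple[int, str, bytes]


def wait_for_result(
    fetch: Callable[[], Response],
    interval: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Poll the result URL until the zipped result is available."""
    for _ in range(max_attempts):
        status_code, content_type, body = fetch()
        if status_code == 200 and content_type.startswith("application/zip"):
            return body  # the zipped results
        if status_code != 102:
            raise RuntimeError(f"unexpected response: {status_code} {content_type}")
        # Still processing: the body is the same payload as a status request.
        status = json.loads(body)
        # Assumption: terminal failures are reported through this payload too.
        if status.get("status") in ("CANCELLED", "ERROR"):
            raise RuntimeError(f"job ended with status {status['status']}")
        sleep(interval)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP library and easy to exercise without a network.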

Cancel a Job

Request

DELETE http://builtwith.com/firehose/results/rackspace/1

Response

200 (OK)