@mumrah
Created August 3, 2012 14:35
Kafka REST proposal

REST interface for Kafka

Taking inspiration from the projects page...

I think it would be really useful and pretty simple to add a REST interface to Kafka. I see two possible routes (not mutually exclusive): using HTTP as a dumb transport layer, and using it with different media types for application-friendly consumption of messages (JSON, XML, etc). The dumb transport would be useful for languages without first-class clients, and the content-type extension would be useful for writing web apps that are Kafka-enabled.

HTTP as a transport

Consuming data

Some simple HTTP endpoints for consuming data:

  • GET /kafka -> welcome page, link to docs, etc
  • GET /kafka/topics -> list of topics
  • GET /kafka/topics/[topic] -> list of partitions for the topic
  • GET /kafka/topics/[topic]/[partition] -> get data from a partition

An endpoint for producing data:

  • POST /kafka/topics/[topic]/[partition]
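
To make the route layout concrete, here is a small hypothetical helper that builds the proposed paths. The `/kafka` base path and the topic/partition layout follow the sketch above; none of this is an actual Kafka API.

```python
def topic_path(topic, partition=None):
    """Build the proposed URL path for a topic, or one of its partitions."""
    path = f"/kafka/topics/{topic}"
    if partition is not None:
        path += f"/{partition}"
    return path

# topic_path("foo")    -> "/kafka/topics/foo"
# topic_path("foo", 0) -> "/kafka/topics/foo/0"
```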

Canonical HTTP headers would be used. A special "application/x-kafka" content type would keep applications from misinterpreting the payload, and byte Range requests would specify topic offsets.

> GET /kafka/topics/foo/0 HTTP/1.1
> Accept: application/x-kafka
> Accept-Encoding: gzip
> Range: bytes=0-1023

< HTTP/1.1 206 Partial Content
< Content-Type: application/x-kafka
< Content-Length: 768
< Content-Encoding: gzip
< Content-Range: 0-1023/10240
< Content-MD5: d3b07384d113edec49eaa6238ad5ff00
<
< [gzip compressed byte content]

The Content-Range response header indicates which bytes were consumed and the largest current offset (10240 in the preceding example). Consumers would need to keep track of their own offsets (for now). A HEAD request could also be used to determine the largest offset for a topic+partition.

> HEAD /kafka/topics/foo/0 HTTP/1.1
> Accept: application/x-kafka
> Accept-Encoding: gzip
> Range: bytes=0-1023

< HTTP/1.1 204 No Content
< Content-Type: application/x-kafka
< Content-Length: 768
< Content-Encoding: gzip
< Content-Range: 0-1023/10240
< Content-MD5: d3b07384d113edec49eaa6238ad5ff00
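
Since consumers track their own offsets under this scheme, the Content-Range header is what drives the next fetch. A sketch of that bookkeeping, assuming the proposed "start-end/largest" form shown above (hypothetical helpers, not part of any real client):

```python
def parse_content_range(header):
    """Return (start, end, largest) from e.g. '0-1023/10240'."""
    span, largest = header.split("/")
    start, end = span.split("-")
    return int(start), int(end), int(largest)

def next_range(header, chunk=1024):
    """Compute the Range request header for the next fetch."""
    _, end, largest = parse_content_range(header)
    next_start = end + 1
    # Don't request past the largest known offset.
    next_end = min(next_start + chunk - 1, largest - 1)
    return f"bytes={next_start}-{next_end}"

# next_range("0-1023/10240") -> "bytes=1024-2047"
```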

Producing data

A simple POST request can be used for producing data to a topic+partition:

> POST /kafka/topics/foo/0 HTTP/1.1
> Content-Type: application/x-kafka
> Content-Encoding: gzip
> Content-MD5: d3b07384d113edec49eaa6238ad5ff00
>
> [gzip compressed byte content]

< HTTP/1.1 204 No Content

POSTs could also be pipelined using a multipart request:

> POST /kafka/topics/foo/0 HTTP/1.1
> Content-Type: multipart/mixed; boundary="some boundary"
>
> ignored
> --some boundary
> Content-Type: application/x-kafka
> Content-Encoding: gzip
> Content-MD5: d3b07384d113edec49eaa6238ad5ff00
>
> [gzip compressed byte content]
> --some boundary
> Content-Type: application/x-kafka
> Content-Encoding: gzip
> Content-MD5: d3b07384d113edec49eaa6238ad5ff00
>
> [gzip compressed byte content]
> --some boundary--
> ignored

< HTTP/1.1 204 No Content
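
Assembling such a pipelined body could look like the sketch below. This follows the multipart framing in the example above; note the gist's Content-MD5 values are hex digests (a strict reading of HTTP would use a base64-encoded digest), and "application/x-kafka" is the proposal's type, not a registered one.

```python
import gzip
import hashlib

BOUNDARY = "some boundary"

def multipart_body(messages):
    """Wrap each gzip-compressed message in its own MIME part."""
    out = b""
    for msg in messages:
        payload = gzip.compress(msg)
        md5 = hashlib.md5(payload).hexdigest()  # hex form, as in the gist
        headers = (
            f"--{BOUNDARY}\r\n"
            "Content-Type: application/x-kafka\r\n"
            "Content-Encoding: gzip\r\n"
            f"Content-MD5: {md5}\r\n\r\n"
        ).encode()
        out += headers + payload + b"\r\n"
    # Closing delimiter ends the multipart body.
    return out + f"--{BOUNDARY}--\r\n".encode()
```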

HTTP as an application protocol

Another useful application of REST to Kafka would be exposing Kafka messages as various web-friendly content types. For example, instead of using "application/x-kafka" as the Content-Type, one could specify "text/plain" and get back newline-delimited messages. This, of course, assumes the messages in Kafka can be meaningfully represented as strings, which is probably an acceptable limitation for the convenience gained (in some cases).

> GET /kafka/topics/foo/0 HTTP/1.1
> Accept: text/plain
> Accept-Charset: utf-8
> Range: bytes=0-1023

< HTTP/1.1 206 Partial Content
< Content-Type: text/plain; charset=utf-8
< Content-Length: 1030
< Content-Range: 0-1023/10240
< Content-MD5: d3b07384d113edec49eaa6238ad5ff00
<
< this is message 1
< this is message 2
< [more line delimited messages]

Other media types could be implemented in a similar fashion:

> GET /kafka/topics/foo/0 HTTP/1.1
> Accept: application/json
> Accept-Charset: utf-8
> Range: bytes=0-1023

< HTTP/1.1 206 Partial Content
< Content-Type: application/json; charset=utf-8
< Content-Length: 1060
< Content-Range: 0-1023/10240
< Content-MD5: d3b07384d113edec49eaa6238ad5ff00
<
< ["this is message 1","this is message 2",...]

The same could be used for sending data to Kafka:

> POST /kafka/topics/foo/0 HTTP/1.1
> Content-Type: text/plain; charset=utf-8
>
> this is message 1

< HTTP/1.1 204 No Content

Or to send multiple messages:

> POST /kafka/topics/foo/0 HTTP/1.1
> Content-Type: multipart/mixed; boundary="some boundary"
>
> ignored
> --some boundary
> Content-Type: text/plain; charset=utf-8
>
> this is message 1
> --some boundary
> Content-Type: text/plain; charset=utf-8
>
> this is message 2
> --some boundary--
> ignored

< HTTP/1.1 204 No Content
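
On the receiving side, the server would split that multipart body back into individual messages. A real implementation would use a proper MIME parser; this hand-rolled sketch just illustrates the framing in the example above.

```python
def split_multipart(body, boundary):
    """Extract part bodies from a multipart/mixed payload."""
    delim = f"--{boundary}"
    parts = []
    for chunk in body.split(delim)[1:]:      # [0] is the ignored preamble
        if chunk.startswith("--"):           # closing delimiter: done
            break
        # Part headers are separated from the content by a blank line.
        headers, _, content = chunk.partition("\r\n\r\n")
        parts.append(content.rstrip("\r\n"))
    return parts
```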

Adding an HTTP server to Kafka

Coming from the Java world, I would be inclined to use the JAX-RS API and a lightweight ASL-licensed web server (Apache Wink, maybe?). However, Kafka is a Scala project, so I would defer this decision to the committers/maintainers.

@mumrah

mumrah commented Aug 6, 2012

Added some stuff around multi-part POSTs for pipelining writes

@tovbinm

tovbinm commented Oct 11, 2013

Any progress?

@hartmamt

hartmamt commented Nov 9, 2013

Can we use Play! Framework to write a wrapper around producer / consumers?

@naeemkhedarun

Thank you for the work mumrah. An implementation has been started at https://github.com/naeemkhedarun/reka and feedback / help is welcome. Some great ideas in this gist so I'll be following them closely.
