GSoC 2018 Work Product Submission

This is a summary of my GSoC project with JBoss by Red Hat. In this gist I have included some of the major features I implemented during the program.

Strimzi - Bridging HTTP to Apache Kafka

Strimzi is a project about running Apache Kafka on platforms like Kubernetes and OpenShift. It contains two main modules:
  • Strimzi-Kafka-operator
  • AMQP-Kafka-bridge.
The project idea for GSoC was to add HTTP support to the bridge so that it can listen to HTTP clients and bridge them to Kafka.
As GSoC is divided into several phases, I will break down my work on the project by phase as well.

The Architecture

The first challenge was to design an architecture for the HTTP-to-Kafka interactions. At that time the bridge supported only AMQP, and the two protocols have key differences that were definitely reflected in the architecture. Some of the key differences were:
  • Uni-directional communication (request-response)
  • Records cannot be stored inside the bridge and pushed to the client.
  • Automatic polling would not be very efficient (on-demand polling is used instead)
  • Handling delivery reports

I divided the project into four parts.

  • HTTP server implementation
  • Producer API: producing records to Kafka
  • Consumer API: consuming records from Kafka
  • Testing

HTTP Server

I used the Vert.x HttpServer library to create an HTTP server. The major part here was to distinguish between incoming requests and map them to the corresponding operations on Kafka, such as producing records, creating consumers, subscribing, consuming, and deleting consumer instances.
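As a rough illustration only (this is not the bridge's actual routing code, and the endpoint paths are made up for the example), a Vert.x HTTP server that dispatches incoming requests by method and path could look like this:

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpMethod;
import io.vertx.core.http.HttpServerRequest;

public class BridgeServerSketch {

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();

        vertx.createHttpServer()
            .requestHandler(BridgeServerSketch::route)
            .listen(8080);
    }

    // Decide which Kafka-facing operation an incoming request maps to.
    // The paths below are placeholders, not the bridge's real endpoints.
    private static void route(HttpServerRequest request) {
        String path = request.path();
        HttpMethod method = request.method();

        if (method == HttpMethod.POST && path.startsWith("/topics/")) {
            // produce records to the topic named in the path
        } else if (method == HttpMethod.POST && path.equals("/consumers")) {
            // create a new consumer instance
        } else if (method == HttpMethod.DELETE && path.startsWith("/consumers/")) {
            // delete an existing consumer instance
        } else {
            request.response().setStatusCode(404).end();
        }
    }
}
```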

Producer API

The producer part is strongly related to KafkaProducer, which produces messages to the Kafka cluster. Some key features:

  • Unlike consumers, the creation of the producer is not handled by the client.
  • There is only one KafkaProducer per connection.
  • The client makes an HTTP POST request to the defined producer endpoint in order to produce records (a rough sketch of this flow follows below).
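As a sketch only, assuming Vert.x 3.x and its Kafka client (the topic-in-the-path convention and the response fields are assumptions for the example, not the bridge's documented API), handling such a POST and returning the record metadata as the delivery response could look roughly like this:

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpServerRequest;
import io.vertx.core.json.JsonObject;
import io.vertx.kafka.client.producer.KafkaProducer;
import io.vertx.kafka.client.producer.KafkaProducerRecord;

import java.util.HashMap;
import java.util.Map;

public class ProducerEndpointSketch {

    private final KafkaProducer<String, String> producer;

    public ProducerEndpointSketch(Vertx vertx) {
        // One KafkaProducer instance, shared by all requests on the connection
        Map<String, String> config = new HashMap<>();
        config.put("bootstrap.servers", "localhost:9092");
        config.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        config.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = KafkaProducer.create(vertx, config);
    }

    // Handle POST /topics/{topic} with a JSON body such as {"value": "hello"}
    public void handleProduce(HttpServerRequest request, String topic) {
        request.bodyHandler(body -> {
            JsonObject json = body.toJsonObject();
            KafkaProducerRecord<String, String> record =
                KafkaProducerRecord.create(topic, json.getString("value"));

            producer.send(record, done -> {
                if (done.succeeded()) {
                    // Return the record metadata as the delivery response
                    JsonObject reply = new JsonObject()
                        .put("partition", done.result().getPartition())
                        .put("offset", done.result().getOffset());
                    request.response()
                        .putHeader("Content-Type", "application/json")
                        .end(reply.encode());
                } else {
                    request.response().setStatusCode(500).end();
                }
            });
        });
    }
}
```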
Producer PR: https://github.com/strimzi/amqp-kafka-bridge/pull/100/commits/0d5ae1eb119b9eb4f4e14dafc1455b3e7cc9e6a6 In this PR, I implemented the following features/commits:
  • route producer requests to HttpSourceEndpoint
  • added Json Message converter and producing records
  • added response from producer and test for producing simple messages
  • fixed types in Message Converter
  • send record metadata in delivery response
  • added request type identification utility and documentation about distinguishing requests
  • added and fixed logs
  • added missing license header to files
  • fixed code-style related errors
  • fixed rejected delivery response
  • fixed producer tests
  • reformat RequestIdentifier
  • added doc for producer API

Consumer API

The consumer API is about fetching records from Kafka. HTTP clients can use the consumer API to create consumers, manage subscriptions, manage offsets and consume records. Some key features:

  • Create multiple consumers
  • Subscribe to multiple topics (one per request)
  • Consume records
  • Commit offsets
  • Delete consumer instances
All requests to the consumer API use JSON for the body/data. HTTP responses also use JSON. A rough client-side sketch of this lifecycle follows below.
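Purely as an illustration of that lifecycle from the client side (the endpoint paths and JSON fields are hypothetical, not the bridge's documented API), a Vert.x WebClient could drive it roughly like this:

```java
import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;
import io.vertx.ext.web.client.WebClient;

public class ConsumerClientSketch {

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        WebClient client = WebClient.create(vertx);

        // 1. Create a consumer instance (hypothetical endpoint and body)
        client.post(8080, "localhost", "/consumers")
            .sendJsonObject(new JsonObject().put("name", "my-consumer"), created -> {

                // 2. Subscribe to a topic (one topic per request)
                client.post(8080, "localhost", "/consumers/my-consumer/subscription")
                    .sendJsonObject(new JsonObject().put("topic", "my-topic"), subscribed -> {

                        // 3. Consume records with an on-demand poll
                        client.get(8080, "localhost", "/consumers/my-consumer/records")
                            .send(records -> {
                                if (records.succeeded()) {
                                    System.out.println(records.result().bodyAsString());
                                }

                                // 4. Commit offsets, then delete the consumer instance
                                client.post(8080, "localhost", "/consumers/my-consumer/offsets")
                                    .send(committed ->
                                        client.delete(8080, "localhost", "/consumers/my-consumer")
                                            .send(deleted -> client.close()));
                            });
                    });
            });
    }
}
```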

Testing

For testing I used Vert.x Unit and JUnit; a minimal sketch of how such a test is structured follows below.

Consumer Testing
Producer Testing
  • added tests for sending messages
  • added test for sending periodic messages
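As a minimal sketch only (the endpoint and payload are assumptions, and a real test would also start the bridge and a Kafka instance in setup), a Vert.x Unit test for sending a simple message could be structured like this:

```java
import io.vertx.core.Vertx;
import io.vertx.core.json.JsonObject;
import io.vertx.ext.unit.Async;
import io.vertx.ext.unit.TestContext;
import io.vertx.ext.unit.junit.VertxUnitRunner;
import io.vertx.ext.web.client.WebClient;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;

@RunWith(VertxUnitRunner.class)
public class ProducerTestSketch {

    private Vertx vertx;

    @Before
    public void setUp() {
        vertx = Vertx.vertx();
        // a real test would also deploy the bridge and point it at a test Kafka cluster here
    }

    @Test
    public void sendSimpleMessage(TestContext context) {
        Async async = context.async();
        WebClient client = WebClient.create(vertx);

        // POST a record to a hypothetical producer endpoint and check the reply
        client.post(8080, "localhost", "/topics/test")
            .sendJsonObject(new JsonObject().put("value", "hello"), reply -> {
                context.assertTrue(reply.succeeded());
                context.assertEquals(200, reply.result().statusCode());
                async.complete();
            });
    }

    @After
    public void tearDown(TestContext context) {
        vertx.close(context.asyncAssertSuccess());
    }
}
```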

Final work product

Details about the APIs and usage of the project can be found in these docs.

Blog posts about the GSoC project

These are a few blog posts I wrote during the program.
