itayB/ApacheKafka.md

## ApacheKafka.md

      
    Raw
  

              ApacheKafka.md
            
          
Written by Itay Bittan.

Apache Kafka

Background / Motivation

Our primary product contains a lot of modules (separate processes) talking together. At the beginning we wrote our own IMP (internal message protocol) written in C. It allowed us to send buffers (C structures) between modules over TCP connection.  In addition, we developed a communication module - process that responsible to pass messages between all of the system modules.
Here are some of the pitfalls we suffered from:


Limited queue size - working with TCP sockets queues we find out that we where limited to relatively small size. If you want see your buffer size in terminal, you can take a look at:
/proc/sys/net/ipv4/tcp_rmem (for read)
/proc/sys/net/ipv4/tcp_wmem (for write)
thanks to this answer on stack-overflow.
After performing our load test we end up with full buffer queues and a lot of missing messages.


Malformed messages / Segmentation faults - Our message structure was precede with a four bytes which represent the message length. Once the queue buffer is full part of the data was missing and then we suffered from shifting of the new arrival messages. In other case, too big messages wasn't ready to be pull (only half message arrived) and pulling them cause a lot of pain.


No persistency - once a service went down (because of server failure, for example) - all of it's waiting messages are lost (because they are waiting in operating system sockets buffers).


No replicas - if our communication module failed - there was no backup module and all other system modules were unable to connect.


Peer to peer - if module wants to send message to all other modules of the same type (all of the web servers, for example) - he had to send each one of them message separately. We build automated mechanism to solve this issue (we called it broadcast message) but still not effective (see Kafka solution for this issue below).


Debug - peeking the operating system buffers, and trying to figure out how many messages are waiting in queue and which kind of messages are wasn't that easy.


The solution

It turns out that there are a lot of queues out there: RabbitMQ, NSQ, Apache Kafka and more. I found this great queue.io web site after we've already start using Kafka. We heard about a lot of companies around us using Kafka so we decided to give it a try. We were also looking for a configuration service (to replace our ConfD as a configuration service with Zookeeper) but it will be covered in a different summary.
Our design

Now, regard the pitfalls above:


Queue size - queues (called topics) are written to disk so the sizes are much bigger and limitation here suites are needs (and configurable of course).


Solved. Messages are managed with Kafka.


Messages are persisted on disk and replicated within the cluster to prevent data loss


See section 3 above.


Different concept. Modules of the same type can now subscribe as a consumer to one shared queue (topic) - instead of maintaining queue per each module. Each module consume the same message instead of sending the same message multiple times.


Debugging is much easier - we knows how many messages are in queue and can iterate over them simply with the API.


Cons


Complicated configuration.
Rely on a third party code.