@sesteves
Last active August 13, 2018 23:57
GSoC'18 Final Report

Implementing a plugin to integrate RocketMQ with HBase

Student: Sergio Esteves

Mentors: Von Gosling and Xinyu Zhou


Project description

This project aimed to design, implement, and evaluate a plugin that integrates RocketMQ with HBase, a large-scale non-relational database. The plugin comprises two parts: (i) the HBase sink, which replicates HBase tables to RocketMQ topics, and (ii) the HBase source, which replicates RocketMQ topics to HBase tables.

The HBase sink involved creating a replication endpoint for HBase. This endpoint tracks updates (put and delete operations) performed on specified tables and replicates them to a RocketMQ topic. For this replication process, I created a RocketMQ producer that uses reliable synchronous transmission to push the messages to a RocketMQ server.
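The sink flow described above (filter updates by table, encode each one, send it synchronously) can be sketched in plain Java. This is a minimal illustration, not the plugin's code: `SinkSketch`, `RowUpdate`, `SyncSender`, and the pipe-separated encoding are all hypothetical names and choices, and `SyncSender` merely stands in for a RocketMQ producer whose send call returns only after the broker acknowledges the message.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types come from HBase or RocketMQ.
class SinkSketch {
    enum OpType { PUT, DELETE }

    // One tracked update on an HBase table, simplified to a single value per row.
    static final class RowUpdate {
        final String table;
        final OpType type;
        final String rowKey;
        final String value; // ignored for DELETE

        RowUpdate(String table, OpType type, String rowKey, String value) {
            this.table = table;
            this.type = type;
            this.rowKey = rowKey;
            this.value = value;
        }
    }

    // Stand-in for a synchronous producer: returns true only once the send is acknowledged.
    interface SyncSender {
        boolean send(String topic, byte[] body);
    }

    private final Set<String> trackedTables;
    private final String topic;
    private final SyncSender sender;

    SinkSketch(Set<String> trackedTables, String topic, SyncSender sender) {
        this.trackedTables = trackedTables;
        this.topic = topic;
        this.sender = sender;
    }

    // Sends updates for tracked tables in order and returns how many were sent.
    // Throws on a missing acknowledgement so the caller can retry the batch.
    int replicate(List<RowUpdate> updates) {
        int sent = 0;
        for (RowUpdate u : updates) {
            if (!trackedTables.contains(u.table)) {
                continue; // table is not configured for replication
            }
            byte[] body = encode(u).getBytes(StandardCharsets.UTF_8);
            if (!sender.send(topic, body)) {
                throw new IllegalStateException("send not acknowledged for row " + u.rowKey);
            }
            sent++;
        }
        return sent;
    }

    // Assumed wire format for illustration: table|type|rowKey|value.
    static String encode(RowUpdate u) {
        String value = (u.type == OpType.DELETE) ? "" : u.value;
        return u.table + "|" + u.type + "|" + u.rowKey + "|" + value;
    }
}
```

Blocking on each acknowledgement trades throughput for the guarantee that an update is not reported as replicated before the broker has accepted it.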

The HBase source is a daemon program that pulls messages from specified RocketMQ topics at regular intervals and writes them to HBase tables. To pull the messages from a RocketMQ server, I created a RocketMQ consumer that uses the broadcast message model. Finally, I created an HBase client that writes the messages to HBase tables in batches.
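The source's pull-then-batch-write loop can be sketched as follows. This is a minimal illustration under stated assumptions: `SourceSketch`, `MessagePuller`, and `TableWriter` are hypothetical stand-ins for the RocketMQ consumer and the HBase client, and draining the topic completely in each polling round is an assumption rather than something the report specifies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these interfaces stand in for a RocketMQ consumer and an HBase client.
class SourceSketch {
    // Returns up to maxMessages messages, or an empty list when nothing is pending.
    interface MessagePuller {
        List<String> pull(String topic, int maxMessages);
    }

    // Writes a whole batch of rows to a table in one call.
    interface TableWriter {
        void writeBatch(String table, List<String> rows);
    }

    private final MessagePuller puller;
    private final TableWriter writer;
    private final int batchSize;

    SourceSketch(MessagePuller puller, TableWriter writer, int batchSize) {
        this.puller = puller;
        this.writer = writer;
        this.batchSize = batchSize;
    }

    // One polling round: pulls chunks until the topic is drained and writes
    // each chunk as a single batch. Returns the number of messages written.
    int pollOnce(String topic, String table) {
        int written = 0;
        while (true) {
            List<String> batch = puller.pull(topic, batchSize);
            if (batch.isEmpty()) {
                return written;
            }
            writer.writeBatch(table, new ArrayList<>(batch));
            written += batch.size();
        }
    }

    // Daemon loop: runs a polling round at a fixed interval until interrupted.
    void run(String topic, String table, long intervalMillis) {
        while (!Thread.currentThread().isInterrupted()) {
            pollOnce(topic, table);
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag so the loop exits
            }
        }
    }
}
```

Writing each pulled chunk in one call keeps the number of round trips to the database proportional to the number of batches rather than the number of messages.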

This integration plugin for HBase improves RocketMQ's offline storage capabilities and benefits users with stringent large-scale, data-intensive processing needs.

Code

List of commits in WIP branch

Repository containing all code developed for GSoC

Pull request to main (apache) repository

What's left to do

  • add fault tolerance capabilities

Final remarks

It was a very interesting and enriching experience to spend the summer working on a massive-scale publish/subscribe message queue system such as RocketMQ. I had the opportunity to learn about the architecture, principles, concepts, and properties that allow a messaging system to achieve low latency, high throughput, and high scalability while remaining reliable in the presence of arbitrary failures.

@vongosling
vongosling commented Aug 13, 2018

LGTM. It would be helpful if you could make a PR to the external repository.