@nipun2
Forked from frsyuki/my_thoughts_on_msgpack.md
Last active August 29, 2015 14:24

My thoughts on MessagePack

Hi. My name is Sadayuki "Sada" Furuhashi. I am the author of the MessagePack serialization format as well as its implementation in C/C++/Ruby.

Recently, MessagePack made it to the front page of Hacker News with this blog entry by Olaf, the creator of the Facebook game ZeroPilot. In the comment thread, there were several criticisms of the blog post as well as of MessagePack itself, and I thought this was a good opportunity to address the questions and share my thoughts.

My high-level response to the comments

To the best of my understanding, roughly speaking, the criticisms fell into the following two categories.

  1. MessagePack may not be the best choice for client-side serialization as described by the blog author.

  2. MessagePack makes boastful claims about its performance. Several comments referred to the graph comparing MessagePack to Protocol Buffers and JSON.

I fully agree with the first point. When I conceived MessagePack in my junior year of college, I never imagined that it would be used in a browser (it was originally conceived for a distributed file system I was writing at the time). While I am pleased to see MessagePack's wider adoption, its pros and cons should be carefully considered, and there are many situations where it simply does not offer enough of an advantage over JSON. That said, it is true that MessagePack is more space-efficient than JSON (being binary, more efficient storage of small integers, etc.), and if that's something that your project requires, MessagePack is quite an attractive choice.
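
To make the point about small integers concrete, here is a minimal sketch that compares the size of a short integer list in compact JSON and in MessagePack. The encoder is hand-written from the integer and array format bytes in the MessagePack specification and covers only non-negative integers; the function names `pack_uint` and `pack_uint_array` are made up for this example and are not part of any MessagePack library.

```python
import json


def pack_uint(n: int) -> bytes:
    """Encode a non-negative integer with the smallest MessagePack integer format."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n <= 0x7F:
        return bytes([n])                      # positive fixint: the byte is the value
    if n <= 0xFF:
        return b"\xcc" + n.to_bytes(1, "big")  # uint 8
    if n <= 0xFFFF:
        return b"\xcd" + n.to_bytes(2, "big")  # uint 16
    if n <= 0xFFFFFFFF:
        return b"\xce" + n.to_bytes(4, "big")  # uint 32
    if n <= 0xFFFFFFFFFFFFFFFF:
        return b"\xcf" + n.to_bytes(8, "big")  # uint 64
    raise OverflowError("value does not fit in 64 bits")


def pack_uint_array(values) -> bytes:
    """Encode a list of non-negative integers as a MessagePack array."""
    values = list(values)
    n = len(values)
    if n <= 15:
        header = bytes([0x90 | n])               # fixarray: length lives in the low 4 bits
    elif n <= 0xFFFF:
        header = b"\xdc" + n.to_bytes(2, "big")  # array 16
    else:
        header = b"\xdd" + n.to_bytes(4, "big")  # array 32
    return header + b"".join(pack_uint(v) for v in values)


small = [1, 2, 3, 100]
as_json = json.dumps(small, separators=(",", ":")).encode("ascii")
as_msgpack = pack_uint_array(small)
print(len(as_json), len(as_msgpack))  # prints "11 5"
```

Each integer up to 127 costs a single byte with no separators, which is where most of the saving comes from in this example. Text-heavy payloads would see a much smaller difference.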

With regard to the second point, it was an editorial shortcoming on my part. I still stand by the claim that MessagePack can be four times as fast as Protocol Buffers (the implementation open-sourced by Google), but I should have done a better job of explaining the context. I posted that comparison to highlight the power of "zero-copy" deserialization as implemented in MessagePack's C/C++ library, and I should have said so. This has been fixed since then.
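
For readers unfamiliar with the term, the following is a rough Python analogy of the zero-copy idea, not the C/C++ library's actual code: instead of copying a byte-string payload out of the received buffer, the decoder hands back a view that points into it. The function name `read_raw16` is hypothetical, and the sketch handles only the 16-bit-length byte-string format (format byte `0xda`, followed by a big-endian 2-byte length).

```python
import struct


def read_raw16(buf: bytes, offset: int = 0) -> memoryview:
    """Return a view of a 16-bit-length byte-string payload without copying it."""
    if buf[offset] != 0xDA:
        raise ValueError("expected format byte 0xda")
    # The payload length is stored as a big-endian unsigned 16-bit integer.
    (length,) = struct.unpack_from(">H", buf, offset + 1)
    start = offset + 3
    if start + length > len(buf):
        raise ValueError("truncated payload")
    # Slicing a memoryview creates a new view, not a new bytes object.
    return memoryview(buf)[start:start + length]


payload = b"x" * 1000
message = b"\xda" + struct.pack(">H", len(payload)) + payload

view = read_raw16(message)
print(len(view), view.obj is message)
```

The view stays valid only as long as the underlying buffer is kept alive, which is the usual trade-off of this technique: large payloads are not copied, but the whole input buffer must outlive the objects that reference it.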

Of course, there are cases where other serializers (including Protocol Buffers) outperform MessagePack in terms of serialization/deserialization speed. As we move forward, I intend to create/consolidate a well-annotated, detailed set of size and speed benchmarks so that users can have a better understanding of what MessagePack can and cannot offer. (If anyone would like to help us do this, it would be greatly appreciated!)

MessagePack's known use cases

As I mentioned earlier, MessagePack is no panacea. However, I believe it is a great tool for certain tasks, and I would like to share some of MessagePack's use cases to highlight what MessagePack is good for.

  1. Space-efficient storage for Memcache entries (Pinterest).

    At Pinterest, MessagePack is used to serialize a list of 64-bit IDs as a Memcache entry. According to them, MessagePack allows them to store their 64-bit IDs in 5 bytes on average instead of 8 (5/8 = 62.5% of the original size).

  2. For an RPC mechanism (DotCloud's zerorpc).

    DotCloud recently open-sourced zerorpc, an RPC mechanism built on ZeroMQ and MessagePack. This use case is fairly close to my original intent. When one is designing an RPC system, one of the first tasks is to specify and implement a communication protocol. This process can get pretty hairy, as you need to worry about a lot of low-level issues like endianness. By using MessagePack, one can skip designing and implementing a communication protocol entirely and accelerate development.
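
The 5-byte average in the first use case follows from how MessagePack sizes integers: a value takes 1, 2, 3, 5, or 9 bytes depending on its magnitude, rather than a fixed 8. The sketch below shows the arithmetic. The helper `packed_uint_size` is a hypothetical name, and the IDs are invented for illustration; they are not Pinterest's data, and real savings depend entirely on how the IDs are distributed.

```python
def packed_uint_size(n: int) -> int:
    """Number of bytes MessagePack needs for a non-negative integer."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n <= 0x7F:
        return 1  # positive fixint
    if n <= 0xFF:
        return 2  # format byte + uint 8
    if n <= 0xFFFF:
        return 3  # format byte + uint 16
    if n <= 0xFFFFFFFF:
        return 5  # format byte + uint 32
    if n <= 0xFFFFFFFFFFFFFFFF:
        return 9  # format byte + uint 64
    raise OverflowError("value does not fit in 64 bits")


# Invented IDs: most fit in 32 bits, one does not.
ids = [3_000_000_000, 41, 70_000, 2**40]

sizes = [packed_uint_size(i) for i in ids]
packed_total = sum(sizes)
fixed_total = 8 * len(ids)  # storing each ID as a raw 64-bit integer

print(sizes)                       # [5, 1, 5, 9]
print(packed_total / len(ids))     # 5.0 bytes per ID on average
print(packed_total / fixed_total)  # 0.625
```

Note that an ID above 2^32 - 1 costs 9 bytes, one more than a raw 64-bit integer, so the format only saves space when most IDs are well below that limit.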

A few words on MessagePack's culture

To wrap things up, let me describe my view of MessagePack's culture.

  1. The MessagePack Project is highly decentralized. I originally came up with the format and implemented it for C/C++/Ruby (and I continue to maintain it for these languages), but each language has its own project leader and develops at its own pace. I find this model to be more effective than having a huge, monolithic body of committers as it keeps each project nimble and autonomous.

  2. The MessagePack Project believes in craftsmanship and expertise. While the space efficiency of MessagePack comes from the format, its speed is the result of sustained efforts by expert hackers who have relentlessly optimized the implementations for different programming languages. Over the past three years, I have seen them achieve remarkable performance gains that are only possible with an intimate knowledge of language internals. I have a lot of faith in their ability to continue making MessagePack better and faster.

While I am proud of everything we have achieved together with everyone involved in the project, MessagePack still has a long way to go. There are a couple of projects in the pipeline. For example, we are working on a website redesign as well as better documentation and repository reorganization.

Thank you for reading this far, and happy hacking!

Sada
