@mroth
Created March 29, 2014 16:24
Comparison of JSON and MsgPack for typical Emojitracker messages
#!/usr/bin/env ruby
require 'json'
require 'oj'
require 'msgpack'
require 'benchmark'
#################################
# set up sample data
#################################
# Convert hex-string keys to integer code points: {"2665"=>2} becomes [[9829, 2]]
def hash2array(hash)
  hash.map { |k, v| [k.to_i(16), v] }
end

s1_hash =
  {"1F4AA"=>1,"1F61C"=>1,"2665"=>2}
s2_hash =
  {"1F42E"=>1,"1F61C"=>1,"1F601"=>1,"1F612"=>2,
   "2665"=>1,"263A"=>1,"1F44C"=>1,"1F602"=>2,
   "1F4AF"=>1,"1F60F"=>1,"1F48B"=>1}
s1_array = hash2array(s1_hash)
s2_array = hash2array(s2_hash)
#################################
# compare relative sizes
#################################
def show_comparison(sample_hash, sample_array, i)
  puts "*** Sample #{i} ***"
  puts "\nHash representation"
  puts sample_hash
  puts "\nArray represenation"
  puts sample_array.to_s

  s_hash_json     = Oj.dump(sample_hash)
  s_hash_msgpack  = MessagePack.pack(sample_hash)
  s_array_json    = Oj.dump(sample_array)
  s_array_msgpack = MessagePack.pack(sample_array)

  puts "\n"
  puts "-> Hash (JSON): #{s_hash_json.bytesize} bytes"
  puts "-> Hash (MsgPack): #{s_hash_msgpack.bytesize} bytes"
  puts "-> Array (JSON): #{s_array_json.bytesize} bytes"
  puts "-> Array (MsgPack): #{s_array_msgpack.bytesize} bytes"
  puts "\n\n"
end

[[s1_hash,s1_array],[s2_hash,s2_array]].each.with_index(1) do |vals,i|
  show_comparison(*vals,i)
end
#################################
# benchmark performance
#################################
iterations = 100_000
puts "*** Benchmarks (#{iterations} iterations) ***"
Benchmark.bm(20) do |bm|
  bm.report('JSON.generate(hash)') do
    iterations.times { JSON.generate(s1_hash); JSON.generate(s2_hash) }
  end
  bm.report('JSON.generate(array)') do
    iterations.times { JSON.generate(s1_array); JSON.generate(s2_array) }
  end
  bm.report('Oj.dump(hash)') do
    iterations.times { Oj.dump(s1_hash); Oj.dump(s2_hash) }
  end
  bm.report('Oj.dump(array)') do
    iterations.times { Oj.dump(s1_array); Oj.dump(s2_array) }
  end
  bm.report('MsgPack.pack(hash)') do
    iterations.times { MessagePack.pack(s1_hash); MessagePack.pack(s2_hash) }
  end
  bm.report('MsgPack.pack(array)') do
    iterations.times { MessagePack.pack(s1_array); MessagePack.pack(s2_array) }
  end
end
*** Sample 1 ***
Hash representation
{"1F4AA"=>1, "1F61C"=>1, "2665"=>2}
Array represenation
[[128170, 1], [128540, 1], [9829, 2]]
-> Hash (JSON): 30 bytes
-> Hash (MsgPack): 21 bytes
-> Array (JSON): 32 bytes
-> Array (MsgPack): 20 bytes
*** Sample 2 ***
Hash representation
{"1F42E"=>1, "1F61C"=>1, "1F601"=>1, "1F612"=>2, "2665"=>1, "263A"=>1, "1F44C"=>1, "1F602"=>2, "1F4AF"=>1, "1F60F"=>1, "1F48B"=>1}
Array represenation
[[128046, 1], [128540, 1], [128513, 1], [128530, 2], [9829, 1], [9786, 1], [128076, 1], [128514, 2], [128175, 1], [128527, 1], [128139, 1]]
-> Hash (JSON): 109 bytes
-> Hash (MsgPack): 76 bytes
-> Array (JSON): 118 bytes
-> Array (MsgPack): 74 bytes
*** Benchmarks (100000 iterations) ***
                           user     system      total        real
JSON.generate(hash)    1.150000   0.000000   1.150000 (  1.159673)
JSON.generate(array)   0.580000   0.000000   0.580000 (  0.580608)
Oj.dump(hash)          0.150000   0.000000   0.150000 (  0.150604)
Oj.dump(array)         0.160000   0.000000   0.160000 (  0.152734)
MsgPack.pack(hash)     0.210000   0.010000   0.220000 (  0.218053)
MsgPack.pack(array)    0.230000   0.000000   0.230000 (  0.222266)

mroth commented Mar 29, 2014

Conclusions:

  • For the type of messages Emojitracker sends, MsgPack yields roughly 30% size savings over JSON (21 vs. 30 bytes for sample 1, 76 vs. 109 bytes for sample 2).
  • Since MsgPack is a binary protocol, I expected significantly greater savings from converting my (hexadecimal) keys from strings to integers, which requires sending an array of pairs instead of a hash. The difference turned out to be fairly insignificant (1 to 2 bytes), and I'm not sure why. Perhaps the nested arrays are less efficient to represent in MsgPack, which evens things out?
  • MsgPack encoding performance in MRI Ruby 2.1 is actually quite good: not quite as fast as the heavily optimized Oj JSON library, but significantly faster than the slow default JSON encoder many people still use.
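
One plausible explanation for the small hash-vs-array difference, based on the MessagePack format specification rather than anything measured in the gist: a 5-character hex key costs 6 bytes as a fixstr, while the same code point above 0xFFFF costs 5 bytes as a uint 32, and each [codepoint, count] pair then adds a 1-byte array header, so the two layouts break even. Only 4-digit keys such as "2665" come out a byte ahead. The sketch below does that byte accounting by hand. The helper names (int_size, str_size, container_header_size, hash_size, pairs_size) are made up for illustration and cover only the cases these samples hit.

```ruby
# Byte sizes per the MessagePack format spec, limited to the cases
# that occur in the sample data. Helper names are hypothetical.
def int_size(n)
  raise ArgumentError, "non-negative integers only" if n.negative?
  return 1 if n <= 0x7f          # positive fixint
  return 2 if n <= 0xff          # uint 8
  return 3 if n <= 0xffff        # uint 16
  return 5 if n <= 0xffff_ffff   # uint 32
  9                              # uint 64
end

def str_size(s)
  len = s.bytesize
  raise ArgumentError, "sketch covers strings up to 31 bytes" if len > 31
  1 + len                        # fixstr: 1 header byte + payload
end

def container_header_size(count)
  raise ArgumentError, "sketch covers up to 15 elements" if count > 15
  1                              # fixmap / fixarray header
end

# Size of a {hex_string => count} hash encoded as a MessagePack map.
def hash_size(hash)
  container_header_size(hash.size) +
    hash.sum { |k, v| str_size(k) + int_size(v) }
end

# Size of [[codepoint, count], ...] encoded as nested MessagePack arrays.
def pairs_size(pairs)
  container_header_size(pairs.size) +
    pairs.sum { |cp, n| container_header_size(2) + int_size(cp) + int_size(n) }
end

s1_hash  = { "1F4AA" => 1, "1F61C" => 1, "2665" => 2 }
s1_array = s1_hash.map { |k, v| [k.to_i(16), v] }

puts "hash:  #{hash_size(s1_hash)} bytes"
puts "array: #{pairs_size(s1_array)} bytes"
```

For sample 1 this accounting gives 21 bytes for the hash and 20 for the array, matching the sizes reported above. The single byte saved comes from "2665", whose integer form fits in a uint 16.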

mroth commented Mar 29, 2014

Unfortunately, the performance characteristics in JavaScript/V8 appear to be quite different. The node-msgpack project notes: "msgpack.pack() is about 5x slower than JSON.stringify(), and msgpack.unpack() is about 3.5x slower than JSON.parse()".

I haven't seen cross-platform comparisons of the new, heavily optimized JSON libraries in Node.js and Oj, but if you are choosing between options on a single platform, the relative numbers on that platform are probably what matter.

danlo commented Dec 18, 2020

I came across this while searching Google for "Oj vs MsgPack" and re-ran the tests.

This was done using Ruby 2.7.2 and the latest gems as of 2020-12-18.

~/test$ bundle exec ruby go.rb
*** Sample 1 ***

Hash representation
{"1F4AA"=>1, "1F61C"=>1, "2665"=>2}

Array represenation
[[128170, 1], [128540, 1], [9829, 2]]

-> Hash  (JSON):    30 bytes
-> Hash  (MsgPack): 21 bytes
-> Array (JSON):    32 bytes
-> Array (MsgPack): 20 bytes


*** Sample 2 ***

Hash representation
{"1F42E"=>1, "1F61C"=>1, "1F601"=>1, "1F612"=>2, "2665"=>1, "263A"=>1, "1F44C"=>1, "1F602"=>2, "1F4AF"=>1, "1F60F"=>1, "1F48B"=>1}

Array represenation
[[128046, 1], [128540, 1], [128513, 1], [128530, 2], [9829, 1], [9786, 1], [128076, 1], [128514, 2], [128175, 1], [128527, 1], [128139, 1]]

-> Hash  (JSON):    109 bytes
-> Hash  (MsgPack): 76 bytes
-> Array (JSON):    118 bytes
-> Array (MsgPack): 74 bytes


*** Benchmarks (100000 iterations) ***
                           user     system      total        real
JSON.generate(hash)    0.358862   0.003684   0.362546 (  0.363171)
JSON.generate(array)   0.352255   0.000000   0.352255 (  0.353107)
Oj.dump(hash)          0.183132   0.000000   0.183132 (  0.183659)
Oj.dump(array)         0.205717   0.000000   0.205717 (  0.206111)
MsgPack.pack(hash)     0.246996   0.000000   0.246996 (  0.247439)
MsgPack.pack(array)    0.247751   0.000000   0.247751 (  0.248370)
