kriszyp/CBOR comparisons.md

## CBOR comparisons.md

cbor-x's packed implementation only packs whole strings that occur multiple times; it does not search for repeated prefixes or suffixes, as that search would almost certainly be vastly more expensive. A string is packed if it occurs multiple times in a data structure. When using packed + record tags, strings used as keys are not searched for repetition (since it is assumed that key repetition will mostly be eliminated by the structure reuse).

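
To make the "whole strings only" rule concrete, here is a minimal Python sketch of how repeated strings could be collected from a nested structure. It is an illustration of the idea, not cbor-x's actual code: the function name `repeated_strings` and the `include_keys` flag are hypothetical, with `include_keys=False` standing in for the packed + records case where keys are skipped.

```python
from collections import Counter


def repeated_strings(value, include_keys=True):
    """Return the whole strings that occur more than once in a nested structure.

    Hypothetical helper for illustration; only exact matches are counted,
    so shared prefixes or suffixes are never detected.
    """
    counts = Counter()

    def walk(node):
        if isinstance(node, str):
            counts[node] += 1
        elif isinstance(node, dict):
            for key, child in node.items():
                # Keys are skipped when include_keys is False, mirroring the
                # assumption that record structures already deduplicate them.
                if include_keys and isinstance(key, str):
                    counts[key] += 1
                walk(child)
        elif isinstance(node, (list, tuple)):
            for child in node:
                walk(child)

    walk(value)
    return {text for text, seen in counts.items() if seen > 1}


doc = [
    {"status": "active", "site": "north"},
    {"status": "active", "site": "north-east"},
]

with_keys = repeated_strings(doc)
values_only = repeated_strings(doc, include_keys=False)
print(sorted(with_keys))
print(sorted(values_only))
```

Note that `"north"` is a prefix of `"north-east"`, but neither is reported, because each whole string occurs only once.
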
The tables show the encoded size for each technique, along with the encoding and decoding performance. The last column gives the gzipped size for comparison (gzip performance was not measured, but encoding is generally about 2-4x slower with gzipping in my tests). The tables compare plain CBOR encoding, packed, record structures with a 1+1 definition tag and with a 1+2 tag, the combination of packed and record structures, and stringrefs.

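
For readers who want to collect the same four columns for their own data, the following is a rough Python sketch of such a measurement. It is not the benchmark used for the numbers here: the `measure` helper is hypothetical, and JSON from the standard library is used purely as a stand-in codec so the example runs without extra dependencies.

```python
import gzip
import json
import time


def measure(encode, decode, value, seconds=0.2):
    """Collect size, approximate encode/decode rates, and gzipped size.

    Hypothetical harness: `encode` must return bytes, `decode` must accept them.
    """
    encoded = encode(value)

    def rate(fn, arg):
        # Call fn repeatedly for roughly `seconds` and report calls per second.
        count = 0
        deadline = time.perf_counter() + seconds
        while time.perf_counter() < deadline:
            fn(arg)
            count += 1
        return count / seconds

    return {
        "size": len(encoded),
        "encode/sec": rate(encode, value),
        "decode/sec": rate(decode, encoded),
        "gzip size": len(gzip.compress(encoded)),
    }


# Stand-in codec (an assumption for this sketch): compact JSON as UTF-8 bytes.
def json_encode(value):
    return json.dumps(value, separators=(",", ":")).encode("utf-8")


def json_decode(data):
    return json.loads(data)


sample = {"studies": [{"id": i, "status": "active"} for i in range(50)]}
result = measure(json_encode, json_decode, sample)
for column, number in result.items():
    print(f"{column}: {number:.0f}")
```

Swapping in a CBOR encoder and decoder for `json_encode` and `json_decode` would give columns comparable in shape to the ones reported here, though absolute rates depend heavily on the runtime and machine.
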
The first comparison test uses an 8KB JSON data structure from our database of medical studies, which has a fairly complicated and dynamic structure:
https://github.com/kriszyp/cbor-x/blob/master/tests/example4.json


| Method | Size (bytes) | encode/sec | decode/sec | gzip size (bytes) |
| --- | --- | --- | --- | --- |
| CBOR | 6376 | 140000 | 99900 | 2308 |
| CBOR Packed | 4734 | 37300 | 103800 | 2456 |
| CBOR with record tags (1+1) | 5227 | 105000 | 113000 | 2425 |
| CBOR with record tags (1+2) | 5243 | 105000 | 113000 | 2429 |
| CBOR Packed + records | 4515 | 48000 | 110400 | 2440 |
| CBOR with stringrefs | 5138 | 99000 | 101600 | |


The second comparison test uses a 25KB JSON data structure from Twitter's example response from their search API, which is much more homogeneous and repetitive in structure:
https://developer.twitter.com/en/docs/twitter-api/v1/tweets/search/api-reference/get-search-tweets
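
Homogeneous data like this is where record structures pay off, since many objects share the same set of keys. The Python sketch below illustrates the general idea under simplifying assumptions: each distinct key list is stored once and every object becomes a structure id plus its values. The function `to_records` and the sample objects are hypothetical, and the output is not the cbor-x wire format or its tag layout.

```python
def to_records(objects):
    """Split dicts into shared key lists ("structures") and per-object values.

    Hypothetical illustration of structure reuse; not an actual CBOR encoding.
    """
    structures = []  # position in this list acts as the structure id
    ids = {}         # key tuple -> structure id
    records = []     # (structure id, values) for each input object

    for obj in objects:
        keys = tuple(obj.keys())
        if keys not in ids:
            # First time this key layout is seen: define it once.
            ids[keys] = len(structures)
            structures.append(keys)
        records.append((ids[keys], list(obj.values())))

    return structures, records


tweets = [
    {"id": 1, "lang": "en", "text": "hello"},
    {"id": 2, "lang": "ja", "text": "konnichiwa"},
    {"id": 3, "lang": "en"},
]

structures, records = to_records(tweets)
print(structures)
print(records)
```

Here three objects need only two key definitions; the more uniform the data, the fewer definitions are needed relative to the number of objects.
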


| Method | Size (bytes) | encode/sec | decode/sec | gzip size (bytes) |
| --- | --- | --- | --- | --- |
| CBOR | 12213 | 76000 | 54000 | 3000 |
| CBOR Packed | 6795 | 23000 | 63000 | 3260 |
| CBOR with record tags (1+1) | 7633 | 82000 | 62000 | 3081 |
| CBOR with record tags (1+2) | 7643 | 80000 | 62000 | 3084 |
| CBOR Packed + records | 6008 | 39000 | 62000 | 3076 |
| CBOR with stringrefs | 7295 | 65000 | 63000 | |

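
The sizes from both tests are easier to compare as percentages relative to plain CBOR. The short Python sketch below does that arithmetic using the size and gzip size figures reported here; the helper name `relative_change` is hypothetical, and the stringrefs rows carry `None` for gzip size because no value was reported.

```python
# (size, gzip size) per method, copied from the two comparison tests.
medical = {
    "CBOR": (6376, 2308),
    "CBOR Packed": (4734, 2456),
    "CBOR with record tags (1+1)": (5227, 2425),
    "CBOR with record tags (1+2)": (5243, 2429),
    "CBOR Packed + records": (4515, 2440),
    "CBOR with stringrefs": (5138, None),
}
twitter = {
    "CBOR": (12213, 3000),
    "CBOR Packed": (6795, 3260),
    "CBOR with record tags (1+1)": (7633, 3081),
    "CBOR with record tags (1+2)": (7643, 3084),
    "CBOR Packed + records": (6008, 3076),
    "CBOR with stringrefs": (7295, None),
}


def relative_change(rows, baseline="CBOR"):
    """Return {method: (size saving %, gzip size increase % or None)}."""
    base_size, base_gzip = rows[baseline]
    changes = {}
    for method, (size, gzip_size) in rows.items():
        saving = 100 * (base_size - size) / base_size
        gzip_increase = (
            None if gzip_size is None
            else 100 * (gzip_size - base_gzip) / base_gzip
        )
        changes[method] = (saving, gzip_increase)
    return changes


for name, rows in (("medical studies", medical), ("Twitter search", twitter)):
    print(name)
    for method, (saving, gzip_increase) in relative_change(rows).items():
        gzip_text = "n/a" if gzip_increase is None else f"{gzip_increase:+.1f}%"
        print(f"  {method}: {saving:.1f}% smaller, gzip {gzip_text}")
```

In both tests the packed + records combination gives the smallest uncompressed output, while plain CBOR remains the smallest once gzip is applied.
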