@myhrvold
Last active December 31, 2019 06:44
Encoding Protocol and Compression Algorithm Test Script
foreach (encode/decode, compress/decompress):
    start = time.now()
    packed = compress(encode(json))
    size = len(packed)
    unpack = decode(decompress(packed))
    elapsed = time.now() - start
@myhrvold
Author

(Pseudocode :0) )
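
The pseudocode loop can be sketched in runnable Python roughly as follows. This is only an illustration under stated assumptions, not the script used for the article: the `ENCODERS` and `COMPRESSORS` registries, the `run_benchmark` helper, and the sample data are all hypothetical, and only standard-library codecs are included.

```python
import bz2
import json
import lzma
import time
import zlib

# Hypothetical registries of (forward, inverse) pairs; standard library only.
ENCODERS = {
    "json": (lambda obj: json.dumps(obj).encode("utf-8"), json.loads),
}
COMPRESSORS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def run_benchmark(obj):
    """Time one encode/compress/decompress/decode round trip per pair."""
    results = {}
    for enc_name, (encode, decode) in ENCODERS.items():
        for comp_name, (compress, decompress) in COMPRESSORS.items():
            start = time.perf_counter()
            packed = compress(encode(obj))
            size = len(packed)
            unpacked = decode(decompress(packed))
            elapsed = time.perf_counter() - start
            results[(enc_name, comp_name)] = {
                "size": size,
                "elapsed": elapsed,
                "round_trip_ok": unpacked == obj,
            }
    return results


# Made-up sample data; real trip data would look different.
sample = {"trip_id": 1, "points": [[37.77, -122.42]] * 100}
for pair, stats in run_benchmark(sample).items():
    print(pair, stats)
```

Other encoders or compressors could be added to the two dictionaries as long as they follow the same bytes-in, bytes-out convention.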

@Alanscut

Hi Conor,

Testing how combinations of serialization and compression algorithms perform is an interesting and valuable topic. After reading the article "How Uber Engineering Evaluated JSON Encoding and Compression Algorithms to Put the Squeeze on Trip Data", I have some questions:

  1. If a third-party serialization library (for JSON, MessagePack, Avro, ...) is used here, such as Demjson, will it affect the experimental results?
  2. Is there any compression library that is available only in Python?
  3. If a third-party compression library is used here, will it affect the experimental results?
  4. Are these test results valid only for platforms built with Python? For projects in C, Java, or Go, would I need to build a test platform in the corresponding language so that I can call that language's serialization and compression libraries?

Thanks! Looking forward to your guidance!

Alan Wang
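
One part of question 1 can be checked without any third-party package: even within the standard `json` module, output settings change the encoded size, so a library with different defaults could plausibly shift the numbers as well. The sketch below only illustrates that point; the `record` data is made up, and it says nothing about how Demjson itself behaves.

```python
import json
import zlib

# Made-up sample record; real trip data would differ.
record = {"driver": "d-1", "fare": 12.5, "stops": [1, 2, 3], "tags": {"a": True}}

default_text = json.dumps(record)                         # ", " and ": " separators
compact_text = json.dumps(record, separators=(",", ":"))  # no padding spaces

# Byte sizes before and after zlib compression for both settings.
sizes = {
    "default": len(default_text.encode("utf-8")),
    "compact": len(compact_text.encode("utf-8")),
    "default+zlib": len(zlib.compress(default_text.encode("utf-8"))),
    "compact+zlib": len(zlib.compress(compact_text.encode("utf-8"))),
}
print(sizes)
```

Both strings decode to the same object, so any size difference comes purely from the serializer settings.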

Processing JSON-format data

import json, time, zlib

# Step 1: parse the raw input JSON before starting the timer
# jsonTextFile holds the original input data: the JSON text read from a text file
data = json.loads(jsonTextFile)

startTime = time.perf_counter()  # time.clock() was removed in Python 3.8

# Step 2: encoding
# Question 1: If a third-party JSON library is used here, such as Demjson, will it affect the experimental results?
encode_json = json.dumps(data).encode("utf-8")  # zlib needs bytes in Python 3

# Step 3: compressing
# Question 2: Is there any compression library that is available only in Python?
# Question 3: If a third-party compression library is used here, will it affect the experimental results?
pack = zlib.compress(encode_json)
middleTime = time.perf_counter()

# Step 4: decompressing
unpack = zlib.decompress(pack)

# Step 5: decoding
decode_json = json.loads(unpack)
endTime = time.perf_counter()

encodeTime = middleTime - startTime  # time to encode and compress
decodeTime = endTime - middleTime    # time to decompress and decode

Processing MessagePack-format data

import json, time, zlib
import msgpack  # third-party package

# Step 1: parse the raw input JSON before starting the timer
# jsonTextFile holds the original input data: the JSON text read from a text file
data = json.loads(jsonTextFile)

startTime = time.perf_counter()  # time.clock() was removed in Python 3.8

# Step 2: encoding
# Question 1: If a third-party MessagePack library is used here, will it affect the experimental results?
encode_msg = msgpack.packb(data)  # returns bytes

# Step 3: compressing
# Question 2: Is there any compression library that is available only in Python?
# Question 3: If a third-party compression library is used here, will it affect the experimental results?
pack = zlib.compress(encode_msg)
middleTime = time.perf_counter()

# Step 4: decompressing
unpack = zlib.decompress(pack)

# Step 5: decoding
decode_msg = msgpack.unpackb(unpack)
endTime = time.perf_counter()

encodeTime = middleTime - startTime  # time to encode and compress
decodeTime = endTime - middleTime    # time to decompress and decode
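
Both scripts above time a single pass, which can be noisy on a busy machine. A minimal sketch of one way to steady the numbers is to repeat each phase and keep the fastest run. The helper names `best_time`, `pack`, and `unpack` and the sample `data` are hypothetical, and only the JSON path is shown.

```python
import json
import time
import zlib


def best_time(func, arg, repeats=5):
    """Return (fastest elapsed seconds, last result) over `repeats` calls."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(arg)
        best = min(best, time.perf_counter() - start)
    return best, result


def pack(obj):
    # Encode to JSON bytes, then compress.
    return zlib.compress(json.dumps(obj).encode("utf-8"))


def unpack(blob):
    # Decompress, then decode the JSON bytes.
    return json.loads(zlib.decompress(blob))


data = {"values": list(range(1000))}  # made-up input
encode_time, packed = best_time(pack, data)
decode_time, restored = best_time(unpack, packed)
print(len(packed), encode_time, decode_time)
```

For the MessagePack variant, the same structure should work with `msgpack.packb` and `msgpack.unpackb` in place of the `json` calls.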
