Last active
December 31, 2019 06:44
Encoding Protocol and Compression Algorithm Test Script
foreach (encode/decode, compress/decompress):
    start = time.now()
    packed = compress(encode(json))
    size = len(packed)
    unpack = decode(decompress(packed))
    elapsed = time.now() - start
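The loop above can be sketched in runnable Python as follows. This is only a minimal illustration, not the script used in the article: the sample record, the list of codecs, and the names `record`, `encoders`, `compressors`, and `results` are assumptions, and only standard-library modules are used.

```python
import bz2
import json
import lzma
import time
import zlib

# Hypothetical sample record; real input would be trip data loaded from disk.
record = {
    "trip_id": 12345,
    "points": [{"lat": 37.77 + i * 1e-4, "lng": -122.42, "t": i} for i in range(200)],
}

# (name, encode, decode) triples. Encoders must return bytes for the compressors.
encoders = [
    (
        "json",
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda raw: json.loads(raw.decode("utf-8")),
    ),
]

# (name, compress, decompress) triples from the standard library.
compressors = [
    ("zlib", zlib.compress, zlib.decompress),
    ("bz2", bz2.compress, bz2.decompress),
    ("lzma", lzma.compress, lzma.decompress),
]

results = []
for enc_name, encode, decode in encoders:
    for comp_name, compress, decompress in compressors:
        start = time.perf_counter()
        packed = compress(encode(record))
        size = len(packed)
        unpacked = decode(decompress(packed))
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must reproduce the input.
        assert unpacked == record
        results.append((enc_name, comp_name, size, elapsed))

for enc_name, comp_name, size, elapsed in results:
    print(f"{enc_name}+{comp_name}: {size} bytes, {elapsed * 1000:.3f} ms")
```

A third-party encoder or compressor could be added by appending another triple to the relevant list, provided its encode function returns bytes.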
Hi Conor,
Testing the performance of combinations of serialization and compression algorithms is an interesting and valuable topic. After reading the article "How Uber Engineering Evaluated JSON Encoding and Compression Algorithms to Put the Squeeze on Trip Data", I have some questions:
- If a third-party data serialization library (for JSON, MessagePack, Avro, ...) is used here, such as demjson, will it affect the experimental results?
- Is there a compression library that is available only in Python?
- If a third-party compression library is used here, will it affect the experimental results?
- Are these test results valid only for platforms built with Python? For projects in C, Java, or Go, do I need to build a test platform in the corresponding language so that I can call the serialization and compression libraries available in that language?
Thanks! Looking forward to your guidance!
Alan Wang
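On the first question, one way to see why the encoder can matter is that the same standard-library `json` module already produces different byte counts depending on its settings. The sketch below is an illustration under assumptions (the `sample` record is made up) and says nothing about how demjson or any other specific library behaves.

```python
import json
import zlib

# Hypothetical record used only for illustration.
sample = {"trip": {"id": 7, "tags": ["a", "b", "c"], "distance_km": 12.5}}

# Default separators are ", " and ": ", which add whitespace to the output.
default_bytes = json.dumps(sample).encode("utf-8")
# Compact separators drop that whitespace.
compact_bytes = json.dumps(sample, separators=(",", ":")).encode("utf-8")

print("raw sizes:", len(default_bytes), len(compact_bytes))
print("zlib sizes:", len(zlib.compress(default_bytes)), len(zlib.compress(compact_bytes)))

# Both encodings decode to the same data.
assert json.loads(default_bytes) == json.loads(compact_bytes) == sample
```

If output size shifts with a formatting option, it seems reasonable to expect both size and timing to shift between different libraries as well, so each library would need to be measured rather than assumed equivalent.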
Processing JSON-format data
import json
import time
import zlib
# Step 1: parse the raw input JSON before starting the timer
data = json.loads(jsonTextFile)  # jsonTextFile holds the original input data, read from a text file as JSON
startTime = time.perf_counter()  # time.clock() was removed in Python 3.8
# Step 2: encoding
# Question 1: If a third-party JSON library, such as demjson, is used here, will it affect the experimental results?
encode_json = json.dumps(data).encode("utf-8")  # zlib.compress() needs bytes, not str
# Step 3: compressing
# Question 2: Is there a compression library that is available only in Python?
# Question 3: If a third-party compression library is used here, will it affect the experimental results?
pack = zlib.compress(encode_json)
middleTime = time.perf_counter()
# Step 4: decompressing
unpack = zlib.decompress(pack)
# Step 5: decoding
decode_json = json.loads(unpack.decode("utf-8"))
endTime = time.perf_counter()
encodeTime = middleTime - startTime  # time to encode and compress
decodeTime = endTime - middleTime  # time to decompress and decode
Processing MessagePack-format data
import json
import time
import zlib
import msgpack  # third-party package
# Step 1: parse the raw input JSON before starting the timer
data = json.loads(jsonTextFile)  # jsonTextFile holds the original input data, read from a text file as JSON
startTime = time.perf_counter()  # time.clock() was removed in Python 3.8
# Step 2: encoding
# Question 1: If a third-party MessagePack library is used here, will it affect the experimental results?
encode_msg = msgpack.packb(data)  # returns bytes, so it can be compressed directly
# Step 3: compressing
# Question 2: Is there a compression library that is available only in Python?
# Question 3: If a third-party compression library is used here, will it affect the experimental results?
pack = zlib.compress(encode_msg)
middleTime = time.perf_counter()
# Step 4: decompressing
unpack = zlib.decompress(pack)
# Step 5: decoding
decode_msg = msgpack.unpackb(unpack)
endTime = time.perf_counter()
encodeTime = middleTime - startTime  # time to encode and compress
decodeTime = endTime - middleTime  # time to decompress and decode
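A single timed run like the two above is noisy, so one option is to repeat the round trip and keep the best time for each half. The helper below is a hedged sketch of that idea: `time_round_trip`, `json_encode`, `json_decode`, and the `sample` data are hypothetical names introduced here, and taking the minimum over a few repeats is one common choice rather than the method used in the article.

```python
import json
import time
import zlib


def time_round_trip(data, encode, decode, compress, decompress, repeats=5):
    """Return (packed_size, best_encode_seconds, best_decode_seconds)."""
    best_encode = float("inf")
    best_decode = float("inf")
    packed = b""
    for _ in range(repeats):
        start = time.perf_counter()
        packed = compress(encode(data))        # encode + compress
        middle = time.perf_counter()
        restored = decode(decompress(packed))  # decompress + decode
        end = time.perf_counter()
        if restored != data:
            raise ValueError("round trip changed the data")
        best_encode = min(best_encode, middle - start)
        best_decode = min(best_decode, end - middle)
    return len(packed), best_encode, best_decode


def json_encode(obj):
    # zlib works on bytes, so the JSON text is encoded as UTF-8.
    return json.dumps(obj).encode("utf-8")


def json_decode(raw):
    return json.loads(raw.decode("utf-8"))


# Hypothetical sample data for illustration.
sample = {"rides": [{"id": i, "fare": i * 1.5} for i in range(100)]}
size, encode_time, decode_time = time_round_trip(
    sample, json_encode, json_decode, zlib.compress, zlib.decompress
)
print(f"{size} bytes, encode {encode_time:.6f} s, decode {decode_time:.6f} s")
```

The MessagePack case should fit the same helper by passing `msgpack.packb` and `msgpack.unpackb` as the encode and decode arguments, assuming the msgpack package is installed.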
(Pseudocode :0) )