ericmoritz/README.rst

## README.rst

      
I was a bit skeptical about a new data format called tnetstring. Its authors claim that it is both human- and machine-readable, and that because a tnetstring parser is easier to build, the format is better. In fact, while the parser may be easier to implement, human readability suffers and the speed gains are negligible.
I assert that if human readability is to be sacrificed for speed, you would be much better off using a
binary format such as Protobufs.
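
To make the readability comparison concrete, here is a minimal sketch of a tnetstring encoder next to `json.dumps`. It assumes the published tnetstring layout of `<length>:<payload><type tag>`, and it handles only ASCII strings, integers, booleans, `None`, lists, and dicts. It is not the `tnetstring` module used in the tests; `tnet_dumps` is a hypothetical name chosen for illustration.

```python
import json


def tnet_dumps(value):
    """Encode a small subset of Python values as a tnetstring (ASCII only).

    Hypothetical helper for illustration, not the real tnetstring module.
    """
    if value is None:
        payload, tag = "", "~"
    elif isinstance(value, bool):
        # bool must be checked before int because bool is an int subclass
        payload, tag = ("true" if value else "false"), "!"
    elif isinstance(value, int):
        payload, tag = str(value), "#"
    elif isinstance(value, str):
        payload, tag = value, ","
    elif isinstance(value, list):
        payload, tag = "".join(tnet_dumps(v) for v in value), "]"
    elif isinstance(value, dict):
        # a dict payload is the concatenation of encoded key/value pairs
        payload = "".join(tnet_dumps(k) + tnet_dumps(v) for k, v in value.items())
        tag = "}"
    else:
        raise TypeError("unsupported type: %r" % type(value))
    # every value is framed as <payload length>:<payload><one-character type tag>
    return "%d:%s%s" % (len(payload), payload, tag)


sample = {"name": "tnet", "tags": [1, 2]}
print(json.dumps(sample))
print(tnet_dumps(sample))
```

For this sample, JSON gives `{"name": "tnet", "tags": [1, 2]}`, while the sketch gives `32:4:name,4:tnet,4:tags,8:1:1#1:2#]}`. The length prefixes and trailing type tags are what make the format simple to parse and harder to read.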

Results
-------

Time is in milliseconds::

    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python tnet.py
    13.1070613861
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python jsontest.py
    1780.90310097
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python json-min-test.py
    1514.28318024
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python simplejsontest.py
    35.9201431274
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python simplejson-min-test.py
    16.233921051
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python pkltest.py
    36.780834198
    (experiments)emoritzmbpro:tnetstringshootout emoritz$ python bsontest.py
    22032.9511166
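
Each figure above comes from timing a single `loads` call. As a rough sketch of how such a measurement might be repeated to reduce noise, the snippet below rebuilds the same nested structure and reports the fastest of several runs. It uses only the standard-library `json` and `pickle` modules, so it does not reproduce the numbers above; `build_data` and `best_load_ms` are hypothetical helper names, not part of the original scripts.

```python
import json
import pickle
from time import time


def build_data(depth=100, width=1000):
    """Build a nested dict shaped like the one in data.py (hypothetical helper)."""
    root = {}
    current = root
    for _ in range(depth):
        current["big list"] = list(range(width))
        current["big string"] = "a" * width
        current["child"] = {}
        current = current["child"]
    return root


def best_load_ms(loads, doc, repeat=5):
    """Return the fastest of `repeat` timed loads(doc) calls, in milliseconds."""
    timings = []
    for _ in range(repeat):
        start = time()
        loads(doc)
        timings.append((time() - start) * 1000)
    return min(timings)


data = build_data()
for name, codec in (("json", json), ("pickle", pickle)):
    # serialize once so that only deserialization is timed
    doc = codec.dumps(data)
    print("%-8s %.3f ms" % (name, best_load_ms(codec.loads, doc)))
```

Taking the minimum rather than the mean is a design choice: it discards runs slowed by unrelated system activity.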


## bsontest.py
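from data import data
import bson

from time import time

# Serialize once up front so that only deserialization is timed.
doc = bson.dumps(data)

start = time()
result = bson.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)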

## data.py
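# Create a large nested dictionary: 100 levels, each holding a
# 1000-element list, a 1000-character string, and the next level.

root = {}
current = root
for i in range(100):
    # list() keeps this a real list on Python 3, where range() is lazy.
    current["big list"] = list(range(1000))
    current["big string"] = "a" * 1000
    current['child'] = {}
    current = current['child']

data = root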

## json-min-test.py
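from data import data
import json

from time import time

# Compact document (no indentation); serialized once, outside the timed region.
doc = json.dumps(data)

start = time()
result = json.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)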

## jsontest.py
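from data import data
import json

from time import time

# Pretty-printed document (indent=4); serialized once, outside the timed region.
doc = json.dumps(data, indent=4)

start = time()
result = json.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)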

## pkltest.py
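from data import data

try:
    import cPickle as pickle  # Python 2: C implementation
except ImportError:
    import pickle  # Python 3: cPickle no longer exists

from time import time

# Serialize once up front so that only deserialization is timed.
doc = pickle.dumps(data)

start = time()
result = pickle.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)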

## simplejson-min-test.py
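from data import data
import simplejson as json

from time import time

# Compact document (no indentation); serialized once, outside the timed region.
doc = json.dumps(data)

start = time()
result = json.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)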

## simplejsontest.py
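from data import data
import simplejson as json

from time import time

# Pretty-printed document (indent=4); serialized once, outside the timed region.
doc = json.dumps(data, indent=4)

start = time()
result = json.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)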

## tnet.py
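from data import data
import tnetstring
from time import time

# Serialize once up front so that only deserialization is timed.
doc = tnetstring.dumps(data)

start = time()
result = tnetstring.loads(doc)
end = time()

# Elapsed time in milliseconds.
print((end - start) * 1000)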