@amaltaro
Last active May 20, 2019 13:05
Profiling DB bind conversion from unicode to byte string
#!/usr/bin/env python
from __future__ import print_function
import copy
import timeit
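def convertBinds(bindsList, encodeType="ascii"):
    """Encode unicode keys and values of each bind dict to byte strings, in place."""
    for item in bindsList:
        # Iterate over a snapshot of the keys, since keys are replaced inside the loop
        for k in list(item):
            if isinstance(k, unicode):
                item[k.encode(encodeType)] = item.pop(k)
            # In Python 2 an ASCII unicode key equals (and hashes like) its encoded str form,
            # so item[k] still resolves after the key has been replaced
            if isinstance(item[k], unicode):
                item[k] = item[k].encode(encodeType)
    return bindsList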
BIND_SIZE = int(1e5)
NUM_REPEAT = 10
strList = []
uniVals = []
uniList = []
strEx = {"key1": "value1", "key2": 123,
         "dataset": "/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM",
         "block_name": "/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM#62762f76-a322-43ae-856b-f8e0a85e0bf5",
         "logical_file_name": "/store/backfill/1/DMWM_Test/RelValProdMinBias/GEN-SIM/ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/00000/A6247F84-C378-E911-BFEF-FA163E45C62F.root"}
uniEx1 = {"key1": u"value1", "key2": 123,
          "dataset": u"/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM",
          "block_name": u"/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM#62762f76-a322-43ae-856b-f8e0a85e0bf5",
          "logical_file_name": u"/store/backfill/1/DMWM_Test/RelValProdMinBias/GEN-SIM/ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/00000/A6247F84-C378-E911-BFEF-FA163E45C62F.root"}
uniEx2 = {u"key1": u"value1", u"key2": 123,
          u"dataset": u"/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM",
          u"block_name": u"/RelValProdMinBias/DMWM_Test-ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/GEN-SIM#62762f76-a322-43ae-856b-f8e0a85e0bf5",
          u"logical_file_name": u"/store/backfill/1/DMWM_Test/RelValProdMinBias/GEN-SIM/ProdMinBias_TaskChain_ProdMinBias_Agent122_Validation_Privv2-v11/00000/A6247F84-C378-E911-BFEF-FA163E45C62F.root"}
for i in range(0, BIND_SIZE):
    strList.append(copy.copy(strEx))  # 0 unicode
    uniVals.append(copy.copy(uniEx1))  # 4 values unicode
    uniList.append(copy.copy(uniEx2))  # 5 keys and 4 values unicode
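# Note: convertBinds mutates its input in place, so only the first of the NUM_REPEAT
# runs encodes anything; the later runs iterate over data that is already converted.
ti = timeit.timeit("convertBinds(strList)", setup="from __main__ import convertBinds, strList", number=NUM_REPEAT)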
print("\nTime for byte string only 5-dict with %d binds: %s secs" % (BIND_SIZE, ti / NUM_REPEAT))
ti = timeit.timeit("convertBinds(uniVals)", setup="from __main__ import convertBinds, uniVals", number=NUM_REPEAT)
print("Time for unicode values only 5-dict with %d binds: %s secs" % (BIND_SIZE, ti / NUM_REPEAT))
ti = timeit.timeit("convertBinds(uniList)", setup="from __main__ import convertBinds, uniList", number=NUM_REPEAT)
print("Time for unicode key/value pairs 5-dict with %d binds: %s secs" % (BIND_SIZE, ti / NUM_REPEAT))
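
Because the conversion happens in place, a variant that rebuilds the input before every timed run would measure the encoding cost on each repetition. The following is a minimal sketch of that idea, not part of the gist: `convert_binds`, `time_fresh`, and `text_type` are illustrative names, and the `text_type` shim is only there so the sketch also runs on Python 3 (where `str` is the unicode text type). It relies on `timeit.Timer` accepting callables for both the statement and the setup, with the setup executed once per repetition.

```python
import copy
import timeit

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3, where str is the unicode text type


def convert_binds(binds_list, encoding="ascii"):
    """In-place conversion of text keys and values to byte strings (hypothetical variant)."""
    for item in binds_list:
        for key in list(item):  # snapshot of the keys: the dict is modified below
            value = item.pop(key)
            if isinstance(key, text_type):
                key = key.encode(encoding)
            if isinstance(value, text_type):
                value = value.encode(encoding)
            item[key] = value
    return binds_list


def time_fresh(template, size=1000, repeat=5):
    """Return the best time of `repeat` runs, each on a freshly built, unconverted list."""
    state = {}

    def setup():
        # Runs once before every timed repetition and is not included in the timing
        state["binds"] = [copy.copy(template) for _ in range(size)]

    def run():
        convert_binds(state["binds"])

    return min(timeit.Timer(run, setup).repeat(repeat=repeat, number=1))


if __name__ == "__main__":
    example = {u"key1": u"value1", u"key2": 123}
    print("Best of 5 runs: %s secs" % time_fresh(example))
```

Taking the minimum rather than the mean is a common choice for micro-benchmarks, since slower runs mostly reflect interference from other processes; the mean could be used instead to stay closer to the numbers reported here.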
@amaltaro

Results for 100k length list (10 repetitions):

Time for byte string only 5-dict with 100000 binds: 0.322575998306 secs
Time for unicode values only 5-dict with 100000 binds: 0.344154810905 secs
Time for unicode key/value pairs 5-dict with 100000 binds: 0.388173699379 secs

Results for 1M length list:

Time for byte string only 5-dict with 1000000 binds: 3.28408708572 secs
Time for unicode values only 5-dict with 1000000 binds: 3.54650349617 secs
Time for unicode key/value pairs 5-dict with 1000000 binds: 3.85928421021 secs

@amaltaro

With more realistic key/value pairs, 100k binds (up to around 10% slower than the 100k results above):

Time for byte string only 5-dict with 100000 binds: 0.330755209923 secs
Time for unicode values only 5-dict with 100000 binds: 0.372075390816 secs
Time for unicode key/value pairs 5-dict with 100000 binds: 0.42782599926 secs

Same key/value pairs for 1M binds (at most about 10% slower than the 1M results above):

Time for byte string only 5-dict with 1000000 binds: 3.32281398773 secs
Time for unicode values only 5-dict with 1000000 binds: 3.74059360027 secs
Time for unicode key/value pairs 5-dict with 1000000 binds: 4.24555339813 secs
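
The relative differences can be recomputed from the timings reported in these comments. The sketch below only does arithmetic on those numbers; the names `FIRST_RUN`, `SECOND_RUN`, `pct_slower`, `overhead_vs_bytes`, and `run_to_run` are illustrative and not part of the gist, and it assumes the second comment's figures are meant to be compared against the first comment's figures for the same list size.

```python
# Mean seconds per call as reported above, per list size:
# (byte strings only, unicode values only, unicode keys and values)
FIRST_RUN = {
    100000: (0.322575998306, 0.344154810905, 0.388173699379),
    1000000: (3.28408708572, 3.54650349617, 3.85928421021),
}
SECOND_RUN = {
    100000: (0.330755209923, 0.372075390816, 0.42782599926),
    1000000: (3.32281398773, 3.74059360027, 4.24555339813),
}


def pct_slower(reference, measured):
    """Percentage by which `measured` exceeds `reference`."""
    return 100.0 * (measured - reference) / reference


def overhead_vs_bytes(run):
    """For each list size, overhead of the two unicode cases over the byte-string case."""
    return {size: tuple(pct_slower(times[0], t) for t in times[1:])
            for size, times in run.items()}


def run_to_run(first, second):
    """For each list size, slowdown of the second run relative to the first, case by case."""
    return {size: tuple(pct_slower(a, b) for a, b in zip(first[size], second[size]))
            for size in first}


if __name__ == "__main__":
    for label, run in (("first run", FIRST_RUN), ("second run", SECOND_RUN)):
        for size, (vals, both) in sorted(overhead_vs_bytes(run).items()):
            print("%s, %d binds: unicode values +%.1f%%, unicode keys/values +%.1f%%"
                  % (label, size, vals, both))
    for size, deltas in sorted(run_to_run(FIRST_RUN, SECOND_RUN).items()):
        print("%d binds, second vs first run: %s"
              % (size, ", ".join("+%.1f%%" % d for d in deltas)))
```

Read this way, the "around 10%" figure corresponds to the unicode key/value case, which shows the largest run-to-run difference at both list sizes.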
