@pierreglaser
Created February 22, 2019 10:30
Use the C pickler extension to communicate data to workers
import time

import loky


def process_dict(x):
    # Trivial task: the measured cost is dominated by sending x to the worker.
    return len(x)


if __name__ == "__main__":
    e = loky.get_reusable_executor(max_workers=2)
    large_dict = dict(zip(range(500000), range(500000)))

    # Time a full round trip: pickle the dict, send it, run the task, get the result.
    t0 = time.time()
    e.submit(process_dict, large_dict).result()
    total_time = round(time.time() - t0, 3)
    print(f'total time: {total_time}s')
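
To isolate the serialization cost from the worker round trip, here is a minimal sketch that pickles the same dict with CPython's C-accelerated `pickle.Pickler` and with the pure-Python `pickle._Pickler`. It uses only the standard library, not loky or cloudpickle, and it assumes that the slow path in the benchmark above behaves like the pure-Python pickler. `dump_with` is a hypothetical helper, and `pickle._Pickler` is a private CPython name that may change.

```python
import io
import pickle
import time


def dump_with(pickler_cls, obj):
    """Serialize obj with the given Pickler class; return (bytes, seconds)."""
    buf = io.BytesIO()
    t0 = time.perf_counter()
    pickler_cls(buf, protocol=pickle.HIGHEST_PROTOCOL).dump(obj)
    return buf.getvalue(), time.perf_counter() - t0


large_dict = dict(zip(range(500000), range(500000)))

# pickle.Pickler is the C implementation when the _pickle extension is available.
c_bytes, c_time = dump_with(pickle.Pickler, large_dict)

# pickle._Pickler is the pure-Python implementation (private name, assumption).
py_bytes, py_time = dump_with(pickle._Pickler, large_dict)

print(f"C pickler:      {c_time:.3f}s")
print(f"Python pickler: {py_time:.3f}s")
```

The absolute timings depend on the machine and the Python version, so no expected output is given here. Only the gap between the two numbers is of interest.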
@pierreglaser
Author

using current cloudpickle:

total time: 3.591 s

using a cloudpickle that extends the C pickler:

total time: 0.256 s

We observe a roughly 14x speedup here, but there is an irreducible startup time of 0.15 seconds (the time required to send a negligible amount of data). If we subtract this startup time from both measurements, the speedup is about 30x.
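
The arithmetic behind these figures can be checked from the two timings and the 0.15 s startup cost quoted above. This is only a sketch: the variable names are illustrative, and the startup cost is assumed to be the same in both runs.

```python
baseline = 3.591   # seconds, current cloudpickle
c_pickler = 0.256  # seconds, cloudpickle extending the C pickler
startup = 0.15     # seconds, fixed cost of sending a negligible payload

# Ratio of the raw wall-clock times.
raw_speedup = baseline / c_pickler

# Ratio after removing the fixed startup cost from both measurements.
net_speedup = (baseline - startup) / (c_pickler - startup)

print(f"raw speedup: {raw_speedup:.1f}x")  # raw speedup: 14.0x
print(f"net speedup: {net_speedup:.1f}x")  # net speedup: 32.5x
```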

@pitrou
pitrou commented Mar 22, 2019

Wow, this is zippy!
