@wusixer
Created September 23, 2021 01:43
Vectorize a python function to speed up computation

ericmjl commented Sep 23, 2021

This is so totally a blog post, @wusixer. Add a few more annotations - and maybe compare it to JAX's vmap, which operates only on numerical data but still good for you to see in action.


wusixer commented Sep 23, 2021

Good point, @ericmjl! Thanks for the insight!!



alokito commented Sep 29, 2021

Wow, didn't know about np.frompyfunc, but I think it might not actually do vectorization... that would be another interesting comparison.

https://stackoverflow.com/questions/50123009/how-does-numpy-frompyfunc-generate-ufunc-form-a-python-function-containing-if-st
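One way to check the point above is to count how often the wrapped function runs. This is a minimal sketch: `clip_negative` is a hypothetical stand-in with an if-statement, not the function from the gist. It suggests that `np.frompyfunc` still calls the Python function once per element and hands back an object array, so it gives ufunc-style broadcasting rather than compiled, SIMD-style vectorization.

```python
import numpy as np

calls = 0  # counts how many times the Python function is invoked


def clip_negative(x):
    """Hypothetical scalar function with a branch (not from the gist)."""
    global calls
    calls += 1
    return x if x > 0 else 0


# Wrap the scalar function as a ufunc: 1 input, 1 output.
clip_ufunc = np.frompyfunc(clip_negative, 1, 1)

values = np.arange(-3, 4)  # seven integers from -3 to 3
result = clip_ufunc(values)

print(result.dtype)  # frompyfunc always returns object arrays
print(calls)         # one Python-level call per element
```

If the call count matches the array length, the per-element Python overhead is still there; the wrapper mainly saves the explicit loop and adds broadcasting.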


alokito commented Sep 29, 2021

Okay, check out my gist!

The pandas approach on 1d array takes ..
371 µs ± 30.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
The vectorized approach on 1d array takes ..
7.14 µs ± 406 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
A list comprehension takes ..
8.7 µs ± 732 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If anything, you have demonstrated that pandas apply is surprisingly slow.
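A comparison along these lines can be reproduced with the standard `timeit` module. The sketch below is not the benchmark from either gist: `label`, the array size, and the repeat count are assumptions chosen for illustration, and "vectorized" is assumed to mean `np.frompyfunc`. Timings will vary by machine and library version, so none are shown here.

```python
import timeit

import numpy as np
import pandas as pd


def label(x):
    """Hypothetical stand-in for the gist's function (assumption)."""
    return "high" if x > 0.5 else "low"


rng = np.random.default_rng(0)
values = rng.random(1_000)   # 1d array of floats in [0, 1)
series = pd.Series(values)
vectorized = np.frompyfunc(label, 1, 1)

# The three approaches named in the comment above.
candidates = {
    "pandas apply": lambda: series.apply(label),
    "np.frompyfunc": lambda: vectorized(values),
    "list comprehension": lambda: [label(x) for x in values],
}

# Collect each approach's output so the results can be compared.
outputs = {name: list(fn()) for name, fn in candidates.items()}

repeats = 200
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=repeats) / repeats
    print(f"{name:>20}: {seconds * 1e6:8.1f} µs per call")
```

Comparing the entries of `outputs` before reading the timings confirms that the three approaches compute the same thing, so any gap reflects overhead rather than different work.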


alokito commented Sep 29, 2021

Final comment: why use gists instead of bitbucket repos? I found it a pain to "clone" and run this locally, and it seems it will be harder to keep track of than a repo. Also what if you ever want to add another file?


ericmjl commented Sep 29, 2021

@alokito I think gists are a good staging ground, perhaps? I've put code snippets and notebooks (with outputs) up here to quickly share them with others. NBs with outputs are a bad idea in a repo. Perhaps once @wusixer amasses a collection of NBs, that would be a good time to put the stuff into a repo?


wusixer commented Sep 29, 2021

@alokito Nice! The reason I put it in a gist is that it was quick and easy for me to do, and I can look up what I need by searching the names of my gists (I guess you can do that in bitbucket/github too, but it would require a bit more structure to make it a repo). Maybe someday I will have a repo called "little_things_Iearned_from_work" :)
