@kujjwal02
Last active May 11, 2019 14:16
Parallel apply pandas
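# Source: https://towardsdatascience.com/make-your-own-super-pandas-using-multiproc-1c04f41944a1
from multiprocessing import Pool

import numpy as np
import pandas as pd


def parallelize_dataframe(df, func, n_cores=4):
    # Split the frame row-wise into n_cores chunks.
    df_split = np.array_split(df, n_cores)
    # Run func on each chunk in its own worker process, then stitch the results.
    pool = Pool(n_cores)
    df = pd.concat(pool.map(func, df_split))
    pool.close()
    pool.join()
    return df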
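
A usage sketch for the function above, under a few assumptions. The `add_ratio` transform, the `split_rows` helper and the column names `a` and `b` are hypothetical and not part of the original gist. The helper splits by integer position with `iloc` instead of passing the frame to `np.array_split` directly, which is a swapped-in choice to avoid relying on how NumPy handles DataFrame inputs across versions. The function passed to the pool has to be defined at module top level so it can be pickled, and it should be row-wise, since each worker only sees its own chunk.

```python
from multiprocessing import Pool

import numpy as np
import pandas as pd


def add_ratio(chunk):
    # Hypothetical row-wise transform: each output row depends only on
    # the same input row, so splitting the frame does not change results.
    chunk = chunk.copy()
    chunk["ratio"] = chunk["a"] / chunk["b"]
    return chunk


def split_rows(df, n_chunks):
    # Split by integer position; chunks keep their original index labels.
    positions = np.array_split(np.arange(len(df)), n_chunks)
    return [df.iloc[idx] for idx in positions]


def parallelize_dataframe(df, func, n_cores=4):
    # Same idea as the gist's function, with the pool as a context manager.
    with Pool(n_cores) as pool:
        return pd.concat(pool.map(func, split_rows(df, n_cores)))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running the demo on
    # platforms that start workers by re-importing this module.
    df = pd.DataFrame({"a": np.arange(1, 101), "b": np.arange(101, 201)})
    result = parallelize_dataframe(df, add_ratio, n_cores=4)
    # Compare against a plain single-process call on the whole frame.
    print(result.equals(add_ratio(df)))
```

Functions that need the whole column at once (a global mean, a rank, a rolling window that crosses chunk boundaries) would give different results per chunk, so they are not a good fit for this pattern without extra work.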