@psinger
Created November 5, 2019 14:08
Pandas groupby apply multiprocessing #python #pandas
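from joblib import Parallel, delayed
import multiprocessing
import pandas as pd
import time

def applyParallel(dfGrouped, func):
    # Run func on each group in parallel, using as many workers as there are CPU cores.
    retLst = Parallel(n_jobs=multiprocessing.cpu_count())(delayed(func)(group) for name, group in dfGrouped)
    # Combine the per-group results into a single DataFrame.
    return pd.concat(retLst)

def myfunc(df):
    # Placeholder: replace with the real per-group computation.
    return df

# df is assumed to be an existing DataFrame with an 'id' column.
start_time = time.time()
res = applyParallel(df.groupby(['id']), myfunc)
print(time.time() - start_time)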
@darkhan-ai

Thanks!

@Jakobhenningjensen

If we want to pass some "args" to myfunc, how would we do that?
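
One possible way to do this, sketched below, is to forward extra arguments through `delayed`, since `delayed(func)(group, *args, **kwargs)` captures them along with the group. The names `apply_parallel`, `scale_values`, `factor`, `offset`, and the sample data are hypothetical and not part of the original gist; the `n_jobs` keyword is also an addition for illustration.

```python
from joblib import Parallel, delayed
import pandas as pd


def apply_parallel(grouped, func, *args, n_jobs=-1, **kwargs):
    """Call func(group, *args, **kwargs) for every group and concatenate the results."""
    # delayed(func)(...) records the call with its arguments; Parallel executes it.
    results = Parallel(n_jobs=n_jobs)(
        delayed(func)(group, *args, **kwargs) for _, group in grouped
    )
    return pd.concat(results)


def scale_values(group, factor, offset=0):
    # Hypothetical per-group function that takes extra parameters.
    out = group.copy()
    out["value"] = out["value"] * factor + offset
    return out


if __name__ == "__main__":
    # Sample data, assumed for illustration only.
    df = pd.DataFrame({"id": [1, 1, 2, 2], "value": [1.0, 2.0, 3.0, 4.0]})
    res = apply_parallel(df.groupby("id"), scale_values, 10, offset=1, n_jobs=2)
    print(res.sort_index())
```

An alternative that leaves the gist's `applyParallel` unchanged would be to bind the arguments first with `functools.partial(myfunc, factor=10)` and pass the resulting callable as `func`.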
