@pavlov99
Last active September 12, 2019 10:19
Pandas cross-join
from functools import reduce

import pandas as pd


def crossjoin(*dfs, **kwargs):
    """Calculate a cartesian product of given dataframes.

    Join the dataframes one after another on a temporary constant key, then drop that key.
    Also set a MultiIndex: the cartesian product of the indices of the input dataframes.

    See: https://github.com/pydata/pandas/issues/5401

    Args:
        *dfs (pandas.DataFrame): dataframes to be merged
        **kwargs: merge arguments that will be passed to pd.merge()

    Returns:
        pandas.DataFrame: cartesian product of given dataframes
    """
    # Every row carries the same _tmpkey value, so each merge pairs all rows of df1 with all rows of df2.
    return reduce(
        lambda df1, df2: pd.merge(df1.assign(_tmpkey=1), df2.assign(_tmpkey=1), on='_tmpkey', **kwargs),
        dfs
    )\
    .drop(columns='_tmpkey')\
    .set_index(pd.MultiIndex.from_product([df.index for df in dfs]))
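
A minimal usage sketch, assuming two small dataframes with distinct column names. The `colors` and `sizes` frames and their index labels are made up for illustration and are not part of the gist. The function is repeated so the block runs on its own.

```python
from functools import reduce

import pandas as pd


def crossjoin(*dfs, **kwargs):
    # Same approach as the gist: merge on a constant key, drop the key,
    # then index the result by the product of the input indices.
    merged = reduce(
        lambda df1, df2: pd.merge(
            df1.assign(_tmpkey=1), df2.assign(_tmpkey=1), on="_tmpkey", **kwargs
        ),
        dfs,
    )
    index = pd.MultiIndex.from_product([df.index for df in dfs])
    return merged.drop(columns="_tmpkey").set_index(index)


# Hypothetical sample data (not from the gist).
colors = pd.DataFrame({"color": ["red", "blue"]}, index=["c0", "c1"])
sizes = pd.DataFrame({"size": ["S", "M", "L"]}, index=["s0", "s1", "s2"])

result = crossjoin(colors, sizes)
print(result)  # 2 x 3 = 6 rows, indexed by (color label, size label) pairs
```

The MultiIndex labels line up with the rows only because the default inner merge keeps rows in left-major order, which is also the order `MultiIndex.from_product` generates. Passing merge arguments that reorder or drop rows would break that alignment. On pandas 1.2 or newer, `pd.merge(df1, df2, how="cross")` should cover the two-frame case natively, though it does not build the MultiIndex.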