quickHow: how to split a column of tuples into a pandas dataframe
# Given a pandas DataFrame with a column B whose values are 2-tuples,
# we want to split B into two new columns, B1 and B2.
# Method 1: (faster)
# 1. call pd.Series.tolist() on the column to get a plain list of tuples
# 2. pass that list to pd.DataFrame, reusing the original df index so the rows stay aligned
# 3. assign the result to the new columns of the original df
import pandas as pd
import time
# Put your dataframe here
df = pd.DataFrame({'A':[1,2], 'B':[(1,2), (3,4)]})
print("Original Dataset")
print(df)
start = time.perf_counter()  # perf_counter is better suited to short intervals than time.time
df[['B1', 'B2']] = pd.DataFrame(df['B'].tolist(), index=df.index)
print("Method 1")
print(f"Time elapsed: {time.perf_counter() - start} s")
print(df)
# Method 2: (more Pythonic, but much slower on larger dataframes)
# call pd.Series.apply on the column with pd.Series as the function,
# which builds one Series per row and expands each tuple into its own columns
start = time.perf_counter()
df[['B1', 'B2']] = df['B'].apply(pd.Series)
print("Method 2")
print(f"Time elapsed: {time.perf_counter() - start} s")
print(df)
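# A two-row frame is too small to show the speed difference between the methods.
# The sketch below is one possible way to compare them on a larger frame and to
# check that they return the same values. The 2,000-row frame and the helper
# names split_tolist and split_apply are assumptions made for illustration; they
# are not part of the original gist. Timings depend on the machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical test frame: 2,000 rows, each B value is a 2-tuple
n = 2_000
big = pd.DataFrame({'A': range(n), 'B': [(i, i + 1) for i in range(n)]})


def split_tolist(frame):
    # Method 1: build a DataFrame from the list of tuples in one step
    return pd.DataFrame(frame['B'].tolist(), index=frame.index,
                        columns=['B1', 'B2'])


def split_apply(frame):
    # Method 2: expand each tuple into a Series, one row at a time
    out = frame['B'].apply(pd.Series)
    out.columns = ['B1', 'B2']
    return out


# Compare the raw values produced by the two methods
same = bool((split_tolist(big).to_numpy() == split_apply(big).to_numpy()).all())
print("Same values:", same)

# Time a single run of each method
t1 = timeit.timeit(lambda: split_tolist(big), number=1)
t2 = timeit.timeit(lambda: split_apply(big), number=1)
print(f"Method 1: {t1:.4f} s")
print(f"Method 2: {t2:.4f} s")
```

# Increase n gradually if you want to see how the gap grows; Method 2 creates
# one Series object per row, so its time and memory use rise with the row count.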
@cenuno

cenuno commented Dec 22, 2020

This was incredibly helpful! Thank you for sharing 🎉

@LazolaJavu

Thank you, this was really helpful.

@Prathamesh-Ghatole

The best tip I've read in a while. Thanks a ton!

@EladDan

EladDan commented Dec 14, 2022

Bravo!

@Phuongbui2711

My kernel even died with the second method. Thank you so much, very helpful.
