Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
quickHow: how to split a column of tuples into a pandas dataframe
# Given a pandas dataframe containing a pandas series/column of tuples B,
# we want to extract B into B1 and B2 and assign them into separate pandas series
# Method 1: (faster)
# use pd.Series.tolist() method to return a list of tuples
# use pd.DataFrame on the resulting list to turn it into a new pd.DataFrame object, while specifying the original df index
# add to the original df
import pandas as pd
import time
# Put your dataframe here
df = pd.DataFrame({'A':[1,2], 'B':[(1,2), (3,4)]})
print("Original Dataset")
print(df)
start = time.time()
df[['B1','B2']] = pd.DataFrame(df['B'].tolist(),index=df.index)
print("Method 1")
print("Time elapsed :" + str(time.time()-start))
print(df)
# Method 2: (more Pythonic but much slower for larger dataframes)
# use the pd.DataFram.apply method to the column with the pd.Series function
start = time.time()
df[['B1','B2']] = df['B'].apply(pd.Series)
print("Method 2")
print("Time elapsed :" + str(time.time()-start))
print(df)
@Prathamesh-Ghatole
Copy link

The best tip I've read in a while. Thanks a ton!

@EladDan
Copy link

EladDan commented Dec 14, 2022

Bravo!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment