@gh640
Created December 8, 2018 11:53
Python Pandas: Stack a column in a DataFrame
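import pandas as pd


def stack_with_column(df, column, sep=','):
    """Split `column` on `sep` and return a DataFrame with one row per piece.

    Values in the other columns are repeated for each piece.
    """
    # Split into one column per piece, then stack the pieces into a single
    # Series indexed by the original row labels.
    stacked_column = (
        df[column].str.split(sep, expand=True)
        .stack()
        .dropna()  # no-op where stack() already drops the padding values
        .reset_index(level=1, drop=True)
        .to_frame(column)
    )
    original_without_column = df.drop(column, axis=1)
    # Join on the original index, then restore the original column order.
    stacked = (
        original_without_column
        .join(stacked_column)
        .reset_index(drop=True)
        [df.columns]
    )
    return stacked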


df = pd.DataFrame({
    'other1': ['a', 'b'],
    'other2': ['c', 'd'],
    'gene': ['GSTM1', 'LOC101927027,PRKRA'],
})
# =>
#   other1 other2                gene
# 0      a      c               GSTM1
# 1      b      d  LOC101927027,PRKRA
stack_with_column(df, 'gene')
# =>
#   other1 other2          gene
# 0      a      c         GSTM1
# 1      b      d  LOC101927027
# 2      b      d         PRKRA
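

# A possible alternative, sketched under the assumption that pandas 0.25 or
# newer is installed, where DataFrame.explode() is available. The name
# `stack_with_explode` and the `example` frame are hypothetical and not part
# of the original snippet; for the sample data it should yield the same rows.
import pandas as pd


def stack_with_explode(df, column, sep=','):
    """Split `column` on `sep` and emit one row per piece via explode()."""
    return (
        # Replace the column with lists of pieces; column order is unchanged.
        df.assign(**{column: df[column].str.split(sep)})
        # Turn every list element into its own row, repeating other columns.
        .explode(column)
        .reset_index(drop=True)
    )


example = pd.DataFrame({
    'other1': ['a', 'b'],
    'other2': ['c', 'd'],
    'gene': ['GSTM1', 'LOC101927027,PRKRA'],
})
stack_with_explode(example, 'gene')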