@escuccim
Created August 9, 2018 16:05
Function to find duplicated columns in a pandas DataFrame
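def duplicate_columns(frame):
    """Return labels of columns whose values are repeated by a later column of the same dtype."""
    # Group column labels by dtype so only same-typed columns are compared.
    groups = frame.columns.to_series().groupby(frame.dtypes).groups
    dups = []
    for t, v in groups.items():
        cs = frame[v].columns
        vs = frame[v]
        lcs = len(cs)
        for i in range(lcs):
            iv = vs.iloc[:, i].tolist()
            # Look for a later column in the group with identical values.
            for j in range(i + 1, lcs):
                jv = vs.iloc[:, j].tolist()
                # Note: list equality treats NaN as unequal to NaN, so float
                # columns containing NaN are not reported as duplicates.
                if iv == jv:
                    dups.append(cs[i])
                    break
    return dups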
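
The function above compares columns as Python lists, so two float columns that hold NaN in the same positions are not reported as duplicates. The following is a minimal sketch of a NaN-aware variant, not part of the original gist. The name `duplicate_columns_nan_safe` and the sample frame are hypothetical. It assumes `Series.equals` is an acceptable equality test, which treats NaNs in the same position as equal and requires matching dtypes. Unlike the original, it returns labels in column order rather than grouped by dtype.

```python
import numpy as np
import pandas as pd


def duplicate_columns_nan_safe(frame):
    """Hypothetical variant: labels of columns repeated by a later column.

    Series.equals treats NaNs in the same position as equal and returns
    False when dtypes differ, so no explicit dtype grouping is needed.
    """
    dups = []
    n = frame.shape[1]
    for i in range(n):
        left = frame.iloc[:, i]
        for j in range(i + 1, n):
            # Stop at the first later column with identical values.
            if left.equals(frame.iloc[:, j]):
                dups.append(frame.columns[i])
                break
    return dups


# Illustrative data only (not from the gist).
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [1.0, np.nan, 3.0],  # same as "a", including the NaN
    "c": [1, 2, 3],
    "d": [1, 2, 3],           # same as "c"
    "e": ["x", "y", "z"],
})

dups = duplicate_columns_nan_safe(df)
deduped = df.drop(columns=dups)  # keeps the last copy of each duplicate
```

As with the original function, the earlier column of each duplicated pair is the one reported, so dropping the returned labels keeps the last copy.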