@fbrundu
Last active December 30, 2015 18:39
Generate consensus array from pandas DataFrame (NaN values are ignored)
import pandas as pd
# load data
mat = pd.read_table('class_matrix.txt', index_col=0)
# initialize consensus array
consensus_a = pd.Series(index=mat.index, dtype=float)
# define the subset of columns on which to compute the consensus
# (in this case all columns are used)
columns = mat.columns
# compute consensus array
for ind in mat.index:
    # value_counts() drops NaN and sorts classes by descending frequency
    counts = mat.loc[ind, columns].value_counts()
    # fraction of non-NaN values that belong to the most frequent class
    consensus_a.loc[ind] = float(counts.iloc[0]) / counts.sum()
# save to csv
consensus_a.to_csv('consensus_array.txt', sep='\t', index_label='Samples')
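
The per-row loop above can also be wrapped in a small reusable function. The following is only a sketch under stated assumptions: the name `consensus_per_row`, the toy DataFrame, and the choice to return NaN for rows with no non-NaN values are illustrative and not part of the original gist.

```python
import numpy as np
import pandas as pd


def consensus_per_row(mat, columns=None):
    """Hypothetical helper: for each row, return the fraction of
    non-NaN values that equal the row's most frequent class."""
    sub = mat if columns is None else mat[columns]

    def row_consensus(row):
        # value_counts() drops NaN and sorts by descending frequency
        counts = row.value_counts()
        if counts.empty:
            # assumption: a row with only NaN values has no consensus
            return np.nan
        return counts.iloc[0] / counts.sum()

    return sub.apply(row_consensus, axis=1)


# toy class matrix (made-up data): rows are samples, columns are runs
toy = pd.DataFrame(
    {
        "run1": ["A", "B", "A", np.nan],
        "run2": ["A", "B", np.nan, np.nan],
        "run3": ["B", "B", np.nan, np.nan],
        "run4": ["A", np.nan, np.nan, np.nan],
    },
    index=["s1", "s2", "s3", "s4"],
)
consensus = consensus_per_row(toy)
print(consensus)
```

Because NaN values are ignored, a sample with a single non-NaN class assignment (`s3` here) gets a consensus of 1.0, so it may be worth reporting the number of non-NaN values alongside the score.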