Gist fbrundu/7870063, last active December 30, 2015 18:49
Generation of a joint probability consensus matrix from a pandas DataFrame
import numpy as np
import pandas as pd

# load data: tab-separated table, first column holds the sample names
mat = pd.read_table('matrix.txt', index_col=0)

# get classes: the distinct non-NaN labels found anywhere in the table
classes = np.unique(mat.values)
classes = classes[~np.isnan(classes)]

# create support dataframe (samples x classes), initialised to 0
sup = pd.DataFrame(0.0, index=mat.index, columns=classes)

# get class probabilities for each sample
for ind in mat.index:
    vc = mat.loc[ind].value_counts()  # NaN entries are not counted
    for cls in vc.index:
        sup.loc[ind, cls] = float(vc.loc[cls]) / vc.sum()

# generate consensus matrix: entry (i, j) is the sum over classes of p_i(c) * p_j(c)
cmat = sup.dot(sup.T)

# to output
cmat.to_csv('consensus_mat.txt', sep='\t', index_label='Samples')
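To check the script's logic without a `matrix.txt` on disk, here is a minimal sketch of the same computation on a small in-memory table. The sample names, column names, and labels are hypothetical and chosen only for illustration; the per-row frequencies are built with `value_counts(normalize=True)` rather than the explicit inner loop, which should give the same support matrix.

```python
import numpy as np
import pandas as pd

# Hypothetical toy input: 3 samples (rows), 4 label assignments each (columns).
# NaN marks a missing assignment and is ignored when counting.
mat = pd.DataFrame(
    [[1, 1, 2, 2],
     [1, 1, 1, np.nan],
     [2, 2, 2, 2]],
    index=["s1", "s2", "s3"],
    columns=["run1", "run2", "run3", "run4"],
)

# Support matrix: one row per sample, one column per class,
# each entry the fraction of that sample's labels equal to the class.
sup = pd.DataFrame(
    {ind: row.value_counts(normalize=True) for ind, row in mat.iterrows()}
).T.fillna(0.0)

# Consensus matrix: dot product of the class-probability vectors of two samples.
cmat = sup.dot(sup.T)
print(cmat)
```

Here `s1` is split evenly between the two classes, so it scores 0.5 against every sample, itself included, while `s2` and `s3` never share a class and score 0 against each other. The diagonal is therefore not always 1: each entry is the probability that two independent draws from the two samples' label distributions agree.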