@Olshansk
Last active May 25, 2020 23:24
Joint Probability Matrices - Group and Merge the Data
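import pandas as pd
from IPython.display import display

# bucket_GT and cut_P (the bucketed ground-truth and predicted grades) are assumed to be defined earlier
# Generate pandas DataFrames whose index represents the student number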
df_GT = pd.DataFrame({'bucket': bucket_GT}).reset_index()
display(df_GT.head())
df_P = pd.DataFrame({'bucket': cut_P}).reset_index()
display(df_P.head())
# Merge the ground-truth and predicted buckets on the student index
merged_df = pd.merge(df_GT, df_P, on=['index'], suffixes=('_grouth_truth', '_predicted'))
display(merged_df.head())
# Create a multi-level index by counting the students in each (ground truth, predicted) bucket pair
merged_df = merged_df.groupby(['bucket_grouth_truth', 'bucket_predicted']).count()
display(merged_df.head())
# Taken from: https://stackoverflow.com/a/43921476/768439
# Convert the multi-level pandas index into a 2D numpy array
# (the reshape assumes the grouped frame has one row for every bucket pair)
m, n = len(merged_df.index.levels[0]), len(merged_df.index.levels[1])
jp_matrix = merged_df.values.reshape(m, n)
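
The reshape above only works when every (ground truth, predicted) bucket pair has a row in the grouped frame. The following is a minimal sketch of one alternative that tolerates missing pairs, using `pd.crosstab` and `reindex`. The toy bucket labels, the `labels` list, and the final normalization step are assumptions made for illustration and are not part of the original gist.

```python
import pandas as pd

# Hypothetical toy data: bucket labels for six students (not from the original gist)
labels = ['low', 'mid', 'high']
bucket_gt = ['low', 'mid', 'high', 'mid', 'high', 'high']
bucket_p = ['low', 'mid', 'high', 'low', 'mid', 'high']

df = pd.DataFrame({'ground_truth': bucket_gt, 'predicted': bucket_p})

# Count each (ground truth, predicted) pair; pairs that never occur become 0.
# reindex fixes the row/column order and adds any bucket that is missing entirely.
counts = pd.crosstab(df['ground_truth'], df['predicted']).reindex(
    index=labels, columns=labels, fill_value=0
)

# 2D numpy array of counts: rows are ground-truth buckets, columns are predicted buckets
count_matrix = counts.to_numpy()

# Assumption: dividing by the total count turns the counts into joint probabilities
jp_matrix = count_matrix / count_matrix.sum()

print(counts)
print(jp_matrix)
```

Because the matrix shape comes from `labels` rather than from the pairs that happen to be present, the result is always `len(labels)` by `len(labels)`, even when some pairs never occur.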