@havron
Last active October 16, 2018 12:55
Easily form random groups of students for homework assignments. Aimed at Cornell faculty/staff using CMSX for course management.
#!/usr/bin/env python3
import pandas as pd
import numpy as np
GROUPSIZE = 4
INFILE = 'CS_5435_student_table.csv'
OUTFILE = 'CS_5435_hw_groups.csv'
'''
Load the student data. Download it from
https://cmsx.cs.cornell.edu/web/auth/?action=exporttable&courseid=<YOURCOURSEID>
or click on 'Students' -> 'enrolled student table' on CMSX.
'''
df = pd.read_csv(INFILE)
num_students = df.shape[0]
# randomize ordering of students
df = df.sample(frac=1)
# group the students by GROUPSIZE
hw_groups = df.groupby(np.arange(num_students)//GROUPSIZE, group_keys=False)
# transpose the dataframe by netIDs to form groups
df = hw_groups.apply(lambda hw_group: hw_group['NetID'].reset_index(drop=True).to_frame().T)
# rename the transposed columns, then reset the index and name it
# (set the name after reset_index, which would otherwise discard it)
df.columns = ["Member"+str(x) for x in range(GROUPSIZE)]
df = df.reset_index(level=0, drop=True)
df.index.name = 'Group'
# check for groups without teammates (class size not divisible by GROUPSIZE)
missing = df.isnull().sum(axis=1)
if missing.sum() > 0:
    missed = missing[missing > 0]
    print("Group number {} (currently only {}) is missing {} member(s).\n"
          "A class of {} students isn't evenly divisible by a group size of {}."
          .format(missed.index[0], df.loc[missed.index[0], :].dropna().tolist(),
                  missed.values[0], num_students, GROUPSIZE))
    df = df.fillna('no member')
# save the group assignments to csv.
df.to_csv(OUTFILE, index=True)
print("The groups are as follows (saved to '{}'):".format(OUTFILE))
print('-'*80+"\n",df,"\n"+'-'*80)
print("Run this again to resample the groups.")
havron commented Oct 13, 2018

The input (roster) has the following columns for all Cornell courses:

['Last Name', 'First Name', 'NetID', ['Assignment names'],'CP', 'Final', 'Total Score', 'Final Grade']

As long as your roster is stored as CSV and has a column that identifies each student (NetID, email ID, or name), just replace "NetID" in the gist with the name of that column.
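
The column swap described above can be sketched as a small function that takes the identifier column as a parameter. This is only an illustration, not part of the gist: the function name `make_groups`, its `id_column` and `seed` parameters, and the inline sample roster with an `Email` column are all assumptions made up for the example.

```python
import io

import numpy as np
import pandas as pd


def make_groups(roster, group_size=4, id_column="NetID", seed=None):
    """Shuffle a roster and return one row per group of identifiers.

    Hypothetical helper: id_column is whatever column identifies a student
    in your roster; seed (optional) makes the shuffle repeatable.
    """
    if id_column not in roster.columns:
        raise KeyError("roster has no column named {!r}".format(id_column))

    # Shuffle the identifiers.
    ids = roster[id_column].sample(frac=1, random_state=seed).tolist()

    # Pad with None so the list fills a whole number of groups.
    n_groups = -(-len(ids) // group_size)  # ceiling division
    padded = ids + [None] * (n_groups * group_size - len(ids))

    # Each row of the reshaped array is one group.
    table = np.array(padded, dtype=object).reshape(n_groups, group_size)
    groups = pd.DataFrame(
        table, columns=["Member" + str(i) for i in range(group_size)]
    )
    groups.index.name = "Group"
    return groups


# Made-up roster that uses an 'Email' column instead of 'NetID'.
roster = pd.read_csv(io.StringIO(
    "Last Name,First Name,Email\n"
    "Adams,Ann,aa1@example.edu\n"
    "Brown,Bo,bb2@example.edu\n"
    "Chen,Cy,cc3@example.edu\n"
    "Diaz,Di,dd4@example.edu\n"
    "Evans,Ed,ee5@example.edu\n"
))

groups = make_groups(roster, group_size=2, id_column="Email", seed=0)
print(groups.fillna("no member"))
```

Passing the same `seed` reproduces the same assignment; leaving it as `None` resamples on every run, as the script in the gist does.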
