@mkandalf
Created October 24, 2012 21:31
Penn Course Review Script
import json
import sys

import requests


class ReviewRetriever:
    def __init__(self, dept, key):
        self.key = key
        self.dept = dept
        self.professor_scores = {}

    # Retrieves an endpoint on the Penn Course Review API
    def get_data(self, endpoint):
        base_url = "http://api.penncoursereview.com/v1"
        url = base_url + endpoint + "?token=" + self.key
        r = requests.get(url)
        if hasattr(r, 'json'):
            # Newer requests releases expose json as a method; older ones used a property
            return r.json() if callable(r.json) else r.json
        else:
            return json.loads(r.content)

    # Retrieve a list of course history ids for a given department
    def get_courses(self):
        courses = self.get_data("/depts/" + self.dept)['result']['coursehistories']
        course_ids = [course['id'] for course in courses]
        return course_ids

    # Load the reviews for a given course id
    def load_reviews(self, course):
        reviews = self.get_data('/coursehistories/' + str(course) + '/reviews')['result']['values']
        for review in reviews:
            name = review['instructor']['name']
            rating = float(review['ratings']['rInstructorQuality'])
            if name not in self.professor_scores:
                self.professor_scores[name] = []
            self.professor_scores[name].append(rating)

    # Return the average score for each professor
    def average_scores(self):
        return {prof: (sum(scores) / len(scores)) for prof, scores in self.professor_scores.items()}

    # Sort and print the list of professor averages, highest first
    def print_scores(self):
        self.professor_scores = {}
        course_ids = self.get_courses()
        for course_id in course_ids:
            self.load_reviews(course_id)
        scores = self.average_scores()
        sorted_scores = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        for prof, score in sorted_scores:
            print("%s %.2f" % (prof, score))


def main(dept, key):
    rr = ReviewRetriever(dept, key)
    rr.print_scores()


if __name__ == "__main__":
    # Both the department and the API key are required
    if len(sys.argv) != 3:
        print("usage: scores.py <department> <apikey>")
    else:
        main(*sys.argv[1:])
@AlexeyMK

average_scores is a tough metric - you're equating the input of a student in a 200-person class with that of a student in a 10-person seminar. If I've only ever taught one big class and one small one, I might as well not try in the small one, because its reviews will barely move the average.

Alternatively, you could compute an average score per class and then average those per-class scores. That's a little better, but it runs into the same problem of not being fairly representative.
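To make the difference between the two approaches concrete, here is a minimal sketch. It assumes one rating per student response, and the data and function names are made up for illustration; they are not part of the gist or the PCR API.

```python
from collections import defaultdict


def pooled_average(reviews):
    """Average over every review at once, so large classes dominate."""
    ratings = [rating for _, rating in reviews]
    return sum(ratings) / len(ratings)


def per_course_average(reviews):
    """Average each course first, then average those course means."""
    by_course = defaultdict(list)
    for course, rating in reviews:
        by_course[course].append(rating)
    course_means = [sum(r) / len(r) for r in by_course.values()]
    return sum(course_means) / len(course_means)


# Hypothetical data: (course, rating) pairs for a single professor.
reviews = [("lecture", 2.0)] * 200 + [("seminar", 4.0)] * 10

print("pooled: %.2f" % pooled_average(reviews))          # pooled: 2.10
print("per-course: %.2f" % per_course_average(reviews))  # per-course: 3.00
```

With the pooled average, the ten seminar reviews barely move the result. With the per-course average, the seminar counts as much as the 200-person lecture, which overcorrects in the other direction.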

A 'good enough' solution usually isn't good enough here, because for something to go into PCR itself, professors need to not be offended by it or scared of it. This is their livelihood - imagine if your GPA were based on every individual grade you ever got in every class, rather than an average of the classes you've taken (not that GPA isn't a highly flawed metric as well).

Scoring teachers is a non-trivial data visualization and representation problem. I for one would love to see some way to visualize the data that takes into account the intricacies of the problem.

@Ceasar

Ceasar commented Oct 25, 2012

Probably no reason to update this, but if a reason ever arises check out the PCR wrapper I wrote. It ought to simplify a bit of the logic.

@Ceasar

Ceasar commented Oct 25, 2012

Actually, on second thought, this seems like a useful example. If you don't mind, I'd like to integrate it into the project after updating it.

@Ceasar

Ceasar commented Oct 26, 2012

Please see my fork. It cuts the load time significantly (though it is apparent there is a need for threads inside of the penncoursereview library).
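As a rough illustration of how threads could help with load time (this is not the code in the fork, and not the penncoursereview library's API), the per-course requests could be issued concurrently with Python 3's `concurrent.futures`. `fetch_reviews` below is a hypothetical callable standing in for whatever performs the network request.

```python
from concurrent.futures import ThreadPoolExecutor


def load_all_reviews(course_ids, fetch_reviews, max_workers=8):
    """Fetch reviews for many courses concurrently.

    fetch_reviews is a hypothetical callable that takes a course id and
    returns a list of (professor, rating) pairs.
    """
    scores = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs the calls in worker threads and yields results in input order
        for reviews in pool.map(fetch_reviews, course_ids):
            for name, rating in reviews:
                scores.setdefault(name, []).append(rating)
    return scores


# Stand-in for a network call, for illustration only.
def fake_fetch(course_id):
    return [("Prof %d" % (course_id % 2), float(course_id))]


print(load_all_reviews([1, 2, 3, 4], fake_fetch))
```

Because the work is dominated by waiting on HTTP responses, a thread pool is usually enough here; results are merged in the main thread, so the shared dictionary needs no locking.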
