@dansondergaard
Created October 30, 2013 10:09
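def get_category(item):
    # Assumption: the gist never defines this helper; each item is taken
    # to be a (features, category) pair, as in the example partition.
    return item[1]


def compute_error(partition):
    """Find the majority category of the partition and count how large
    a share of the partition it makes up.
    Return the ratio of items that are *not* of the majority category."""
    # Build a list: on Python 3, map() returns an iterator without .count().
    categories = [get_category(item) for item in partition]
    majority = max(set(categories), key=categories.count)
    return 1 - (categories.count(majority) / float(len(partition)))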
# This is what a typical partition looks like. The inner tuple may
# be very large (thousands of elements), and the list itself may
# contain a couple of thousand items.
test_partition = [((1, 2), 0), ((2, 3), 0), ((3, 2), 0), ((3, 5), 0),
                  ((4, 1), 1), ((4, 4), 1), ((5, 4), 1), ((6, 3), 1),
                  ((6, 5), 1), ((2, 7), 1)]
compute_error(test_partition) # => 0.4
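
For partitions of the size mentioned in the comment above (a couple of thousand items), a single-pass tally with `collections.Counter` is one possible alternative to calling `count` once per distinct category. The sketch below is not part of the original gist: the name `compute_error_counter` and the default `get_category` (element 1 of each item) are assumptions based on the example partition.

```python
from collections import Counter
from operator import itemgetter


def compute_error_counter(partition, get_category=itemgetter(1)):
    """Return the fraction of items outside the majority category.

    Assumes each item is a (features, category) pair; pass a different
    get_category callable for other layouts (hypothetical parameter).
    """
    if not partition:
        raise ValueError("partition must not be empty")
    # Tally every category in a single pass over the partition.
    counts = Counter(get_category(item) for item in partition)
    # most_common(1) returns [(category, count)] for the largest tally.
    _, majority_count = counts.most_common(1)[0]
    return 1 - majority_count / len(partition)


sample = [((1, 2), 0), ((2, 3), 0), ((3, 2), 0), ((3, 5), 0),
          ((4, 1), 1), ((4, 4), 1), ((5, 4), 1), ((6, 3), 1),
          ((6, 5), 1), ((2, 7), 1)]
print(compute_error_counter(sample))
```

On a tie between categories, either one may be treated as the majority; the returned error is the same in both cases.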