Created November 10, 2018 09:20
Function to determine the top 10 occurrences of a particular metric
def get_top_x_occurrences(special_key, list_of_dicts, occurrence_number):
    """
    Find the most frequent items stored under a given key across a list of dictionaries.

    For instance: top x emojis, top x curse words, or top x laughing expressions.

    :param special_key: the key whose value is an iterable of items to count
    :type special_key: str
    :param list_of_dicts: list of dictionaries that may contain special_key
    :type list_of_dicts: list
    :param occurrence_number: top x number to return
    :type occurrence_number: int
    :returns: the most common items in list_of_dicts, most frequent first
    :rtype: list of length occurrence_number or shorter
    """
    results_dict = {}
    for lil_d in list_of_dicts:
        # Skip dictionaries that do not have the key at all.
        if special_key in lil_d:
            for this_instance in lil_d[special_key]:
                # Tally each item, starting from 0 the first time it is seen.
                results_dict[this_instance] = results_dict.get(this_instance, 0) + 1
    # http://stackoverflow.com/questions/7197315/5-maximum-values-in-a-python-dictionary
    # Sort items by count, highest first. Slicing already copes with
    # fewer than occurrence_number distinct items.
    return sorted(results_dict, key=results_dict.get, reverse=True)[:occurrence_number]


# Example calls, placed after the definition so the name exists when they run.
# curses, laughs, and links are lists of dictionaries built elsewhere.
get_top_x_occurrences('curse_used', curses, 10)
get_top_x_occurrences('laugh_used', laughs, 10)
get_top_x_occurrences('link_used', links, 10)
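
The tallying loop in get_top_x_occurrences can also be written with collections.Counter from the standard library. The following is only a sketch of that alternative, not the gist author's code: the name top_x_with_counter and the sample data are made up for illustration, and it assumes each value stored under special_key is a list of hashable items.

```python
from collections import Counter


def top_x_with_counter(special_key, list_of_dicts, occurrence_number):
    """Hypothetical Counter-based variant of get_top_x_occurrences."""
    counts = Counter()
    for lil_d in list_of_dicts:
        # dict.get with an empty default skips dictionaries lacking the key.
        counts.update(lil_d.get(special_key, []))
    # most_common returns (item, count) pairs; keep only the items.
    return [item for item, _ in counts.most_common(occurrence_number)]


# Made-up sample data, for illustration only.
sample_laughs = [
    {'laugh_used': ['lol', 'haha']},
    {'laugh_used': ['lol']},
    {'other_key': ['ignored']},
    {'laugh_used': ['lmao', 'lol', 'haha']},
]
print(top_x_with_counter('laugh_used', sample_laughs, 2))  # prints ['lol', 'haha']
```

Asking for more items than exist simply returns every distinct item, so no separate length check is needed.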