@oatsandsugar
Last active July 28, 2017 14:57
from tqdm import tqdm  # progress bar used in the counting loop


def count(tally_dataset, tally_column, tally_count, feature_dataset, feature_column,
          comparator=False, comparator_value='', comparator_column=''):
    """
    Count occurrences of a key in one dataframe and write each count into the matching row of a second dataframe.

    tally_dataset is modified in place; nothing is returned.

    Arguments:
    tally_dataset -- dataframe in which the tally is recorded
    tally_column -- column of tally_dataset holding the key being tallied (e.g. ZIP code)
    tally_count -- column of tally_dataset in which the tally is written
    feature_dataset -- dataframe containing the occurrences to be counted
    feature_column -- column of feature_dataset holding the key of each occurrence (e.g. each ZIP code in this column is counted and written to tally_count in the row whose tally_column matches)
    comparator -- if False, every row of feature_dataset is tallied; if True, only rows whose comparator_column equals comparator_value are tallied (default False)
    comparator_value -- the value that comparator_column must equal for a row to be tallied (default '')
    comparator_column -- the column of feature_dataset that is compared against comparator_value (default '')
    """
    dictionary = {}
    # First pass: count occurrences of each key in feature_dataset.
    for index, row in tqdm(feature_dataset.iterrows(), total=len(feature_dataset)):
        # When filtering, skip rows that do not hold comparator_value.
        if comparator and row[comparator_column] != comparator_value:
            continue
        key = row[feature_column]
        dictionary[key] = dictionary.get(key, 0) + 1
    # Second pass: write each count into the matching row of tally_dataset.
    # Rows whose key was never counted are left unchanged.
    for index, row in tally_dataset.iterrows():
        if row[tally_column] in dictionary:
            tally_dataset.loc[index, tally_count] = dictionary[row[tally_column]]
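
The counting pass above can also be tried without pandas. The following is only a minimal sketch of the same tally logic using `collections.Counter`; the helper name `tally_rows`, its `None` defaults, and the sample rows are hypothetical and not part of the original gist.

```python
from collections import Counter


def tally_rows(rows, feature_column, comparator_column=None, comparator_value=None):
    """Count how often each value of feature_column occurs in rows (a list of dicts)."""
    counts = Counter()
    for row in rows:
        # Optional filter: when a comparator column is given, only rows
        # holding comparator_value in that column are counted.
        if comparator_column is not None and row[comparator_column] != comparator_value:
            continue
        counts[row[feature_column]] += 1
    return counts


# Hypothetical sample data standing in for feature_dataset.
rows = [
    {"zip": "10001", "kind": "cafe"},
    {"zip": "10001", "kind": "bar"},
    {"zip": "10002", "kind": "cafe"},
]

all_counts = tally_rows(rows, "zip")
cafe_counts = tally_rows(rows, "zip", comparator_column="kind", comparator_value="cafe")
```

A `Counter` returns 0 for keys it has never seen, so a lookup for a ZIP code with no matching rows needs no membership check.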