@dhesse
Last active April 24, 2017 16:36
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
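# Strip the dollar sign from 'Amount' and parse column 9 as dates.
# parse_dates expects a list of column indices, not a bare integer.
data = pd.read_csv('data/Greenville_County_School_District_Spending.csv',
                   converters={'Amount': lambda x: float(x.replace('$', ''))},
                   parse_dates=[9])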
# Count the leading digit of every amount of at least 1 dollar.
singleDigitCounts = data\
    .Amount[data.Amount >= 1]\
    .apply(lambda x: int(str(x)[:1]))\
    .value_counts()


def compareVisuallyToBenford(counts, plotTitle):
    """Compare digit counts visually with Benford's law.

    Will create a plot.

    Inputs
    ------
    counts: A pandas series of counts with digits as index.
    plotTitle: The plot's title.
    """
    total = counts.sum()
    indexSorted = np.array(sorted(counts.index))
    # Benford's law: P(d) = log10(1 + 1/d), scaled to the number of observations.
    plt.plot(indexSorted, np.log10(1 + 1./indexSorted) * total,
             label="Benford's Law")
    # Center each bar on its digit so it lines up with the Benford curve.
    plt.bar(counts.index.values, counts, align='center', label='Actual Counts')
    plt.title(plotTitle)
    plt.legend()


if __name__ == '__main__':
    compareVisuallyToBenford(singleDigitCounts, 'First Digit Counts')
    plt.show()
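
The plot above gives a visual comparison only. As a possible numeric complement, the following minimal sketch computes a Pearson chi-square statistic of observed first-digit counts against Benford's law. It is not part of the original gist: the function names `benford_expected` and `chi_square_statistic` are hypothetical, the example counts are made up, and only the standard library is assumed.

```python
import math


def benford_expected(total):
    """Expected first-digit counts under Benford's law for `total` observations."""
    return {d: total * math.log10(1 + 1.0 / d) for d in range(1, 10)}


def chi_square_statistic(counts):
    """Pearson chi-square statistic of observed digit counts against Benford's law.

    `counts` maps the digits 1..9 to observed counts; missing digits count as zero.
    """
    total = sum(counts.values())
    expected = benford_expected(total)
    return sum((counts.get(d, 0) - expected[d]) ** 2 / expected[d]
               for d in range(1, 10))


if __name__ == '__main__':
    # Made-up counts for illustration only, not taken from the spending data.
    example = {1: 301, 2: 176, 3: 125, 4: 97, 5: 79, 6: 67, 7: 58, 8: 51, 9: 46}
    print(chi_square_statistic(example))
```

With the script above, the observed counts could be passed as `chi_square_statistic(singleDigitCounts.to_dict())`. With nine digit categories the statistic would be compared against a chi-square distribution with 8 degrees of freedom; a small value suggests the data are compatible with Benford's law.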