@neilkod
Created June 3, 2012 20:45
Benford's law on Twitter data
nkodner@hadoop4 strip_numbers$ awk '{print substr($1,1,1)}' numbers_from_12milliontweets.txt |sort -n|uniq -c|sort -n
69606 7
70809 9
80228 6
80468 8
125992 0
131495 4
194264 5
369118 3
394841 2
534885 1
nkodner@hadoop4 strip_numbers$ awk '{print substr($1,1,1)}' numbers_from_12milliontweets_cast_as_long.txt |sort -n|uniq -c|sort -n
73051 0
73089 7
74252 9
83887 6
85316 8
135144 4
213307 5
373189 3
399356 2
541115 1
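
Benford's law predicts that a leading digit d (1 through 9) occurs with probability log10(1 + 1/d). To put the counts above next to that prediction, here is a minimal sketch. It assumes `count digit` lines in the format `uniq -c` produces; the helper name `benford_compare` is hypothetical and not part of the original gist. Lines for digit 0 are skipped, since the law only covers digits 1 through 9.

```shell
# benford_compare: hypothetical helper, not part of the original gist.
# Input:  "count digit" lines, as produced by `uniq -c`.
# Output: digit, observed share, expected Benford share log10(1 + 1/d).
benford_compare() {
  awk '
    # Keep only leading digits 1-9; the 0 bucket is outside the Benford distribution.
    $2 >= 1 && $2 <= 9 { count[$2 + 0] = $1; total += $1 }
    END {
      if (total == 0) exit
      for (d = 1; d <= 9; d++) {
        expected = log(1 + 1 / d) / log(10)   # awk log() is the natural log
        printf "%d %.3f %.3f\n", d, count[d] / total, expected
      }
    }'
}

# Counts for digits 1-9 copied from the second run above.
benford_compare <<'EOF'
541115 1
399356 2
373189 3
135144 4
213307 5
83887 6
73089 7
85316 8
74252 9
EOF
```

The same function can also sit at the end of the original pipeline, for example `... |sort -n|uniq -c|benford_compare`, since the final `sort -n` only affects display order.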