Scott Jappinen jappy

@jappy
jappy / gist:2012357
Created March 10, 2012 18:16
unix command to extract words from a file (Mac/Linux)
tr -sc 'A-Za-z' '\n' < filename.txt
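A minimal sketch of what the command above does, run on a made-up sample sentence instead of a file. The function name `extract_words` is hypothetical and only used for illustration.

```shell
# Hypothetical helper: reads text on stdin, prints one word per line.
extract_words() {
  # -c: operate on the complement of the set (every byte that is NOT a letter)
  # -s: squeeze each run of replaced characters into a single newline
  tr -sc 'A-Za-z' '\n'
}

# Sample input is an assumption for illustration.
printf 'Hello, world! Hello again.\n' | extract_words
# prints:
# Hello
# world
# Hello
# again
```

Apostrophes and digits are treated as separators too, so `don't` comes out as `don` and `t`.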
jappy / gist:2012386
Created March 10, 2012 18:25
unix command to extract unique words by frequency from a file (Mac/Linux)
tr -sc 'A-Za-z' '\n' < filename.txt | sort | uniq -c | sort -n -r
jappy / gist:2012442
Created March 10, 2012 18:42
unix command to extract unique words by frequency normalized by case from a file (Mac/Linux)
tr 'A-Z' 'a-z' < filename.txt | tr -sc 'a-z' '\n' | sort | uniq -c | sort -n -r
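The case-normalized count can be wrapped in a small function and tried on a sample sentence. This is only a sketch: the name `word_freq` and the sample text are assumptions, not part of the original gist.

```shell
# Hypothetical helper: reads text on stdin, prints "count word" lines,
# most frequent first, with upper and lower case merged.
word_freq() {
  tr 'A-Z' 'a-z' |        # fold everything to lowercase
    tr -sc 'a-z' '\n' |   # one word per line
    sort |                # group identical words together for uniq
    uniq -c |             # prefix each distinct word with its count
    sort -n -r            # highest count first
}

# "The" and "the" are counted together after lowercasing.
printf 'The cat and the hat. The end.\n' | word_freq | head -n 1
```

The first line of output reports `the` with a count of 3. The order of words that share the same count depends on how `sort` breaks ties, so it is not shown here.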
jappy / gist:2015420
Created March 11, 2012 07:25
unix command to merge the counts for upper and lower case
tr 'a-z' 'A-Z' < filename.txt | tr -sc 'A-Z' '\n' | sort | uniq -c | sort -nr
jappy / gist:2015423
Created March 11, 2012 07:26
unix command to count sequences of vowels from a text
tr 'a-z' 'A-Z' < filename.txt | tr -sc 'AEIOU' '\n' | sort | uniq -c | sort -nr
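A small sketch of the vowel-sequence count on sample input. The function name `vowel_runs` and the sample words are assumptions for illustration. The `sed` step is an addition that drops the empty line produced when the text starts with a non-vowel.

```shell
# Hypothetical helper: reads text on stdin, prints "count run" lines
# for each maximal run of vowels, most frequent first.
vowel_runs() {
  tr 'a-z' 'A-Z' |          # fold to uppercase so a/A count together
    tr -sc 'AEIOU' '\n' |   # every non-vowel run becomes one newline
    sed '/^$/d' |           # addition: discard empty lines
    sort | uniq -c | sort -nr
}

# "queue" contains the run UEUE; "idea" contains I and EA.
printf 'queue queue idea\n' | vowel_runs | head -n 1
```

Here the top line reports `UEUE` with a count of 2.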
jappy / gist:2015447
Created March 11, 2012 07:36
unix command to sort words in a text by dictionary order
tr -sc 'A-Za-z' '\n' < filename.txt | sort | uniq | sort -d
tr 'A-Z' 'a-z' < filename.txt | tr -sc 'a-z' '\n' | sort | uniq | sort -d | less # normalizes to lowercase first
jappy / gist:2015427
Created March 11, 2012 07:28
unix command to count sequences of consonants in a text
tr 'a-z' 'A-Z' < filename.txt | tr -sc 'BCDFGHJKLMNPQRSTVWXYZ' '\n' | sort | uniq -c | sort -nr
jappy / gist:2015460
Created March 11, 2012 07:41
unix command to sort words into rhyming order
tr 'A-Z' 'a-z' < filename.txt | tr -sc 'a-z' '\n' | rev | sort | uniq | sort -d | rev | less
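The rhyming trick works by reversing each word, sorting, and reversing back, so words with the same ending land next to each other. A minimal sketch on sample words follows. The name `rhyme_sort` is hypothetical, and `sort -u` is used here as a shorter stand-in for `sort | uniq | sort -d`, which gives the same result on lowercase-only words.

```shell
# Hypothetical helper: reads text on stdin, prints the distinct words
# ordered by their endings.
rhyme_sort() {
  tr 'A-Z' 'a-z' |        # fold to lowercase
    tr -sc 'a-z' '\n' |   # one word per line
    rev |                 # reverse each word: "nation" -> "noitan"
    sort -u |             # sort the reversed words and drop duplicates
    rev                   # reverse back to the readable form
}

printf 'nation station cat hat bat\n' | rhyme_sort
# prints:
# nation
# station
# bat
# cat
# hat
```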
jappy / gist:2015480
Created March 11, 2012 07:51
unix command to sort bigrams from a text in order of frequency
# Case sensitive version
tr -sc 'A-Za-z' '\n' < textfile > textfile.words
tail -n +2 textfile.words > textfile.nextwords
paste textfile.words textfile.nextwords | sort | uniq -c > textfile.bigrams
sort -nr < textfile.bigrams
# Case insensitive version: lowercase first, then repeat the tail/paste/sort steps above
tr 'A-Z' 'a-z' < textfile | tr -sc 'a-z' '\n' > textfile.words
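For comparison, here is a sketch that builds the same bigram counts in one pipeline without intermediate files. It swaps the `tail`/`paste` step for a short `awk` program that remembers the previous word. That is a different technique from the gist's, and the name `bigrams` is hypothetical.

```shell
# Hypothetical helper: reads text on stdin, prints "count word1 word2"
# lines, most frequent bigram first (case insensitive).
bigrams() {
  tr 'A-Z' 'a-z' |        # fold to lowercase
    tr -sc 'a-z' '\n' |   # one word per line
    # pair each word with the one before it, tab-separated
    awk 'NR > 1 { print prev "\t" $0 } { prev = $0 }' |
    sort | uniq -c | sort -nr
}

# "the cat" occurs twice in the sample sentence.
printf 'the cat saw the cat\n' | bigrams | head -n 1
```

The top line reports the pair `the cat` with a count of 2. Unlike the `paste` version, this variant does not emit a final line that pairs the last word with nothing.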
jappy / gist:2015491
Created March 11, 2012 07:54
unix command to sort trigrams from a text in order of frequency
# Case sensitive version
tr -sc 'A-Za-z' '\n' < textfile > textfile.words
tail -n +2 textfile.words > textfile.nextwords
tail -n +2 textfile.nextwords > textfile.nextnextwords
paste textfile.words textfile.nextwords textfile.nextnextwords | sort | uniq -c > textfile.trigrams
sort -nr < textfile.trigrams
# Case insensitive version