@dipanjannag
Created April 20, 2018 13:32
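#!/bin/sh
# Upper-case the corpus so its tokens match the vocabulary (assumes words.txt is upper-case too).
tr '[:lower:]' '[:upper:]' < corpus.txt > corpus_upper.txt
# Train a trigram LM with SRILM: vocabulary limited to words.txt, OOV words mapped to <unk>, interpolated modified Kneser-Ney smoothing.
ngram-count -text corpus_upper.txt -order 3 -limit-vocab -vocab words.txt -unk -map-unk "<unk>" -kndiscount -interpolate -lm lm.arpa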