Part-of-speech (POS) tags used by CoreNLP, based on the Penn Treebank Project.
// Based on http://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
// Note that CoreNLP sometimes emits POS tags that are not in this list, such as "``" or ".".
var iCoreNlpPos = [ "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "LS", "MD", "NN", "NNS", "NNP", "NNPS", "PDT", "POS", "PRP", "PRP$", "RB", "RBR", "RBS", "RP", "SYM", "TO", "UH", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "WDT", "WP", "WP$", "WRB" ];
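Because CoreNLP can emit tags that are not in the list, it may help to check each tag before relying on it. The following is a minimal sketch, not part of the original gist: `makePosChecker` and `partitionTags` are hypothetical helper names, and `samplePos` is a shortened stand-in for the full `iCoreNlpPos` array.

```javascript
// Hypothetical helper (an assumption, not from the gist): builds a fast
// membership test from an array of POS tags.
function makePosChecker(tags) {
  var known = new Set(tags);
  return function (tag) {
    return known.has(tag);
  };
}

// Hypothetical helper: splits tagger output into tags that are in the list
// and tags that are not (for example, punctuation tags).
function partitionTags(tags, isKnown) {
  var listed = [];
  var other = [];
  tags.forEach(function (tag) {
    (isKnown(tag) ? listed : other).push(tag);
  });
  return { listed: listed, other: other };
}

// Shortened sample for illustration; in practice, pass the full iCoreNlpPos array.
var samplePos = ["NN", "NNS", "VB", "PRP$"];
var isKnownPos = makePosChecker(samplePos);

var result = partitionTags(["NN", "``", "VB", "."], isKnownPos);
console.log(result.listed.join(" ")); // NN VB
console.log(result.other.join(" "));  // `` .
```

A `Set` is used here so that lookups do not scan the whole array each time; for a list this small, `indexOf` on the array would work just as well.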