Skip to content

Instantly share code, notes, and snippets.

@anup50695
anup50695 / extract_chunks.r
Last active May 16, 2019 10:04
Key Phrase Extraction from Tweets
extractChunks <- function(x) {
x <- as.String(x)
wordAnnotation <- annotate(x, list(Maxent_Sent_Token_Annotator(), Maxent_Word_Token_Annotator()))
POSAnnotation <- annotate(x, Maxent_POS_Tag_Annotator(), wordAnnotation)
POSwords <- subset(POSAnnotation, type == "word")
tags <- sapply(POSwords$features, '[[', "POS")
tokenizedAndTagged <- data.frame(Tokens = x[POSwords], Tags = tags)
tokenizedAndTagged$Tags_mod = grepl("NN|JJ", tokenizedAndTagged$Tags)