@willprice76
Last active August 27, 2020 10:28
Method to build a training pipeline for sentiment analysis with Spark NLP
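// Imports needed by the enclosing class:
import com.johnsnowlabs.nlp.DocumentAssembler;
import com.johnsnowlabs.nlp.annotators.Tokenizer;
import com.johnsnowlabs.nlp.annotators.sda.vivekn.ViveknSentimentApproach;
import org.apache.spark.ml.Pipeline;
import org.apache.spark.ml.PipelineStage;

// Builds an unfitted pipeline: raw text -> document -> tokens -> sentiment.
private Pipeline getSentimentTrainingPipeline() {
    // Wrap the raw "text" column in a Spark NLP document annotation.
    DocumentAssembler document = new DocumentAssembler();
    document.setInputCol("text");
    document.setOutputCol("document");

    // Split each document into tokens.
    String[] tokenizerInputCols = {"document"};
    Tokenizer tokenizer = new Tokenizer();
    tokenizer.setInputCols(tokenizerInputCols);
    tokenizer.setOutputCol("token");

    // Trainable sentiment annotator; training labels are read from the "label" column.
    String[] sentimentInputCols = {"document", "token"};
    ViveknSentimentApproach sentimentApproach = new ViveknSentimentApproach();
    sentimentApproach.setInputCols(sentimentInputCols);
    sentimentApproach.setOutputCol("sentiment");
    sentimentApproach.setSentimentCol("label");
    sentimentApproach.setCorpusPrune(0);

    // Assemble the stages in the order they must run.
    Pipeline pipeline = new Pipeline();
    pipeline.setStages(new PipelineStage[]{document, tokenizer, sentimentApproach});
    return pipeline;
}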