@amn41
Last active May 10, 2016 16:59
attempt at using epic NER on plaintext file
import scala.io.Source
import epic.models.NerSelector
import epic.preprocess.{MLSentenceSegmenter, TreebankTokenizer}

// Read the whole plaintext file into a single string.
val text = Source.fromFile("data/email.txt").mkString

// Bundled sentence splitter, Treebank tokenizer, and the English NER model.
val sentenceSplitter = MLSentenceSegmenter.bundled().get
val tokenizer = new TreebankTokenizer()
val tagger = NerSelector.loadNer("en").get

// Split the text into sentences, then tokenize each sentence.
val sentences: IndexedSeq[IndexedSeq[String]] = sentenceSplitter(text).map(tokenizer).toIndexedSeq

// Tag each tokenized sentence and print the rendered segmentation.
for (sentence <- sentences) {
  val segments = tagger.bestSequence(sentence)
  println(segments.render)
}
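The script above passes data through two stages (sentence splitting, then tokenization) before tagging, and the tagger receives one `IndexedSeq[String]` per sentence. The following is only a sketch of that data shape, not of epic itself: `splitSentences` and `tokenize` are hypothetical stand-ins that use naive regular expressions, so the example runs without epic on the classpath. The real `MLSentenceSegmenter` and `TreebankTokenizer` will split text differently.

```scala
// Stand-ins for the epic components so the sketch runs without epic on the classpath.
// splitSentences and tokenize are hypothetical helpers, not part of epic.
object PipelineShape {
  // Naive sentence splitter: break on whitespace that follows '.', '!' or '?'.
  def splitSentences(text: String): IndexedSeq[String] =
    text.split("(?<=[.!?])\\s+").map(_.trim).filter(_.nonEmpty).toIndexedSeq

  // Naive tokenizer: runs of word characters, or single punctuation marks.
  def tokenize(sentence: String): IndexedSeq[String] =
    "\\w+|[^\\w\\s]".r.findAllIn(sentence).toIndexedSeq

  def main(args: Array[String]): Unit = {
    val text = "Alice emailed Bob. Did he reply?"
    // Same shape as `sentences` in the script: one token sequence per sentence.
    val sentences: IndexedSeq[IndexedSeq[String]] = splitSentences(text).map(tokenize)
    sentences.foreach(tokens => println(tokens.mkString(" | ")))
  }
}
```

Each inner sequence is what a call like `tagger.bestSequence(sentence)` would receive in the script.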
@youness2016

I am a student in Paris and I am trying to use EPIC, but I have a problem with a command line like this:
pic/parser/models/fr/span/model.ser.gz french-onesent.txt

Could not tag Vector(les, parisiens, ne, sourient, pas, tout, le, temps, par, contre, !!), because epic.preprocess.MLSentenceSegmenter cannot be cast to epic.parser.Parser... epic.parser.ParseText$.annotate(ParseText.scala:11);epic.util.ProcessTextMain$$anonfun$main$1$$anonfun$2.apply(ProcessTextMain.scala:78)

My command is:
java -Xmx4g -cp epic-assembly-0.4-SNAPSHOT.jar epic.parser.ParseText --model /epic/parser/models/fr/span/model.ser.gz french-onesent.txt

The content of the file french-onesent.txt is:
les, parisiens, ne, sourient, pas, tout, le, temps, par, contre, !!

Thanks.
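The message above reads like a class-cast failure: an `epic.preprocess.MLSentenceSegmenter` was found where an `epic.parser.Parser` was expected. One possibility, not confirmed anywhere in this thread, is that the file passed to `--model` does not contain a parser. The sketch below rests on the assumption that a `.ser.gz` model is a gzip-compressed, Java-serialized object; `InspectModel` and `storedClassName` are hypothetical names introduced for illustration. It prints the class of whatever object a file holds so it can be checked before being cast.

```scala
import java.io.{FileInputStream, ObjectInputStream}
import java.util.zip.GZIPInputStream

// Hypothetical helper, not part of epic. Assumes the file is a gzip-compressed,
// Java-serialized object; that is an assumption about the .ser.gz format.
object InspectModel {
  // Returns the fully qualified class name of the object stored in the file.
  def storedClassName(path: String): String = {
    val fis = new FileInputStream(path)
    try {
      val in = new ObjectInputStream(new GZIPInputStream(fis))
      in.readObject().getClass.getName
    } finally {
      fis.close()
    }
  }

  def main(args: Array[String]): Unit =
    args.foreach(p => println(s"$p -> ${storedClassName(p)}"))
}
```

To read an epic model this way, the epic jar would have to be on the classpath, since deserialization needs the stored classes to be loadable.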
