import edu.berkeley.cs.succinct.annot._
// `sc` is assumed to be an existing SparkContext (for example, the one provided by spark-shell).
val succinctAnnot = AnnotatedSuccinctRDD(sc, "/path/to/succinct/data")
// The following is equivalent to {ge^sentence} OVER /(remains|is|still) (unknown|unclear|uncertain)/
// The output is an RDD of Annotation objects, which are wrappers holding the annotation data (id, ranges, etc.)
val res = succinctAnnot.regexOver("(remains|is|still) (unknown|unclear|uncertain)", "ge", "sentence")
res.take(20) // Take the first 20 results
res.count // Count all results (this evaluates the query over the whole RDD)
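
// A minimal sketch of which phrases the pattern above selects, using only Scala's
// standard regex library on a made-up sentence. Assumption: Succinct's regex engine
// is not guaranteed to follow java.util.regex semantics exactly, so this only
// illustrates the alternation groups, not the library's behavior. `sample` and
// `hits` are hypothetical names introduced for this illustration.
val pattern = "(remains|is|still) (unknown|unclear|uncertain)".r
val sample = "The cause remains unknown, and the mechanism is still unclear."
// "is still" does not match on its own, because "still" is not in the second group.
val hits = pattern.findAllIn(sample).toList // List("remains unknown", "still unclear")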