@victoriastuart
Created January 3, 2020 23:21
==============================================================================
file: /mnt/Vancouver/apps/CoreNLP/_victoria/gist_for_SO44910934.txt
title: "CoreNLP" {Java | Python} Gist for StackOverflow #44910934
author: Victoria A. Stuart
created: 2020-01-03
version: 01
last modified: 2020-01-03
Versions:
* v01 : this
==============================================================================
To accompany code described in https://stackoverflow.com/a/59549039/1904943
==============================================================================
JAVA
==============================================================================
[victoria@victoria _victoria]$ cd /mnt/Vancouver/apps/CoreNLP/src-local/stanford-corenlp-full-2018-10-05/
[victoria@victoria stanford-corenlp-full-2018-10-05]$ date; pwd; echo; ls -l
Fri 03 Jan 2020 02:42:29 PM PST
/mnt/Vancouver/apps/CoreNLP/src-local/stanford-corenlp-full-2018-10-05
total 1400680
-rw-r--r-- 1 victoria victoria 3340 Dec 31 14:15 BasicPipelineExample.class
-rw-r--r-- 1 victoria victoria 4666 Dec 31 13:33 BasicPipelineExample.java
-rw-r--r-- 1 victoria victoria 6103 Oct 8 2018 build.xml
...
-rw-r--r-- 1 victoria victoria 8146873 Oct 8 2018 stanford-corenlp-3.9.2.jar
-rw-r--r-- 1 victoria victoria 9687426 Oct 8 2018 stanford-corenlp-3.9.2-javadoc.jar
-rw-r--r-- 1 victoria victoria 362565193 Oct 8 2018 stanford-corenlp-3.9.2-models.jar
-rw-r--r-- 1 victoria victoria 5370905 Oct 8 2018 stanford-corenlp-3.9.2-sources.jar
-rw-r--r-- 1 victoria victoria 7240 Oct 8 2018 StanfordCoreNlpDemo.java
-rw-r--r-- 1 victoria victoria 199885 Oct 8 2018 StanfordDependenciesManual.pdf
-rw-r--r-- 1 victoria victoria 1038970602 Dec 31 14:07 stanford-english-corenlp-2018-10-05-models.jar
...
[victoria@victoria stanford-corenlp-full-2018-10-05]$ time java -cp .:* BasicPipelineExample
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.5 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [0.9 sec].
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.5 sec].
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[main] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: edu/stanford/nlp/models/sutime/defs.sutime.txt,edu/stanford/nlp/models/sutime/english.sutime.txt,edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 580704 unique entries out of 581863 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab, 0 TokensRegex patterns.
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 4869 unique entries out of 4869 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_cased.tab, 0 TokensRegex patterns.
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 585573 unique entries from 2 files
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.3 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
[main] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 99996, Elapsed Time: 7.547 (s)
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [11.4 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref model edu/stanford/nlp/models/coref/neural/english-model-default.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref embeddings edu/stanford/nlp/models/coref/neural/english-embeddings.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - Using mention detector type: rule
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator kbp
[main] INFO edu.stanford.nlp.pipeline.KBPAnnotator - Loading KBP classifier from: edu/stanford/nlp/models/kbp/english/tac-re-lr.ser.gz
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator quote
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
[main] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 99996, Elapsed Time: 8.286 (s)
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [9.3 sec].
[main] INFO edu.stanford.nlp.pipeline.QuoteAnnotator - Setting quotes.
Example: token
he-4
Example: sentence
Joe Smith was born in California.
Example: pos tags
[IN, CD, ,, PRP, VBD, TO, NNP, ,, NNP, IN, DT, NN, .]
Example: ner tags
[O, DATE, O, O, O, O, CITY, O, COUNTRY, O, O, DATE, O]
Example: constituency parse
(ROOT (S (PP (IN In) (NP (CD 2017))) (, ,) (NP (PRP he)) (VP (VBD went) (PP (TO to) (NP (NNP Paris) (, ,) (NNP France))) (PP (IN in) (NP (DT the) (NN summer)))) (. .)))
Example: dependency parse
-> went/VBD (root)
-> 2017/CD (nmod:in)
-> In/IN (case)
-> ,/, (punct)
-> he/PRP (nsubj)
-> Paris/NNP (nmod:to)
-> to/TO (case)
-> ,/, (punct)
-> France/NNP (appos)
-> summer/NN (nmod:in)
-> in/IN (case)
-> the/DT (det)
-> ./. (punct)
Example: relation
1.0 Jane Smith per:siblings Joe Smith
Example: entity mentions
[2017, Paris, France, summer, he]
Example: original entity mention
Joe
Example: canonical entity mention
Joe Smith
Example: coref chains for document
{23=CHAIN23-["Joe Smith" in sentence 1, "he" in sentence 2, "His" in sentence 3, "Joe" in sentence 4, "He" in sentence 5, "his" in sentence 5, "Joe 's" in sentence 6], 26=CHAIN26-["his sister Jane Smith" in sentence 5, "Jane" in sentence 6, "she" in sentence 6], 12=CHAIN12-["2017" in sentence 2, "2017" in sentence 3]}
Example: quote
"That was delicious!"
Example: original speaker of quote
Joe
Example: canonical speaker of quote
Joe Smith
0:47.68
[victoria@victoria stanford-corenlp-full-2018-10-05]$
==============================================================================
PYTHON
==============================================================================
[victoria@victoria ~]$ p37
[Python 3.7 venv (source ~/venv/py3.7/bin/activate)]
(py3.7) [victoria@victoria ~]$ env | grep -i virtual
VIRTUAL_ENV=/home/victoria/venv/py3.7
(py3.7) [victoria@victoria ~]$ python --version
Python 3.7.4
(py3.7) [victoria@victoria ~]$ date
Fri 03 Jan 2020 02:49:42 PM PST
(py3.7) [victoria@victoria ~]$ python
Python 3.7.4 (default, Nov 20 2019, 11:36:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import stanfordnlp
>>> from stanfordnlp.server import CoreNLPClient
>>> client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, depparse, coref', output_format='text', timeout=30000, memory='16G')
>>> text = 'Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.'
>>> ann = client.annotate(text)
Starting server with command: java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-25ebbde9a1ad4065.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
>>> sentence = ann.sentence[0]
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'sentence'
>>> client.server.terminate()
>>> client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, depparse, coref', output_format='text', timeout=30000, memory='16G')
>>> ann = client.annotate(text)
Starting server with command: java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-9043ef7d7a744b78.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
>>> sentence = ann.sentence[0]
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'sentence'
>>> [Ctrl-D]
now exiting EditableBufferInteractiveConsole...
(py3.7) [victoria@victoria ~]$ psgrep -l corenlp
UID PID PPID C STIME TTY TIME CMD
victoria 321300 296292 0 Jan02 pts/2 00:02:09 java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-55bcad5a4c00431e.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
(py3.7) [victoria@victoria ~]$ pgrep -l -f corenlp
321300 java
(py3.7) [victoria@victoria ~]$ kill -9 321300
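To avoid orphaned CoreNLP server JVMs like the one just killed, it helps to guarantee that `client.server.terminate()` runs even when the interactive session errors out. A minimal sketch of such a wrapper (the `corenlp_session` helper is my own, not part of stanfordnlp; the transcript above uses the same `client.server.terminate()` call manually):

```python
from contextlib import contextmanager

@contextmanager
def corenlp_session(client):
    """Yield the client, then terminate its background CoreNLP server on exit."""
    try:
        yield client
    finally:
        # Same call issued interactively above; runs even on exceptions.
        client.server.terminate()
```

Usage would be `with corenlp_session(CoreNLPClient(...)) as client: ann = client.annotate(text)`, so no stray `java ... StanfordCoreNLPServer` process survives the session.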
(py3.7) [victoria@victoria ~]$ python
Python 3.7.4 (default, Nov 20 2019, 11:36:53)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import stanfordnlp
>>> from stanfordnlp.server import CoreNLPClient
>>> client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, depparse, coref', output_format='text', timeout=30000, memory='16G')
>>> text = 'Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.'
>>> ann = client.annotate(text)
Starting server with command: java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-ba065446f2fa404d.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
>>> ## [server took roughly 20 seconds to start]
>>> sentence = ann.sentence[0]
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'str' object has no attribute 'sentence'
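The traceback above is the crux of the Stack Overflow issue: with `output_format='text'`, `annotate()` returns the server's plain-text rendering as a Python `str`, which has no `.sentence` attribute. Omitting `output_format` (so the client returns the default serialized protobuf `Document`) is what fixes it below. A small guard that makes the failure mode explicit (the helper name is mine, not part of the stanfordnlp API):

```python
def get_sentences(ann):
    """Return the sentence list from a CoreNLP annotation result.

    client.annotate() returns a protobuf Document (with a .sentence
    field) by default, but a plain str when output_format='text'
    (or another textual format) is requested.
    """
    if isinstance(ann, str):
        raise TypeError(
            "annotate() returned a str; drop output_format='text' "
            "so the client returns a protobuf Document"
        )
    return ann.sentence
```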
>>> ## deleted `output_format='text'` argument:
>>> client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, depparse, coref', timeout=30000, memory='16G')
>>> ann = client.annotate(text)
Starting server with command: java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-423b84293ffe47f3.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
>>> sentence = ann.sentence[0]
>>> print(sentence)
token {
word: "Breast"
pos: "NN"
value: "Breast"
before: ""
after: " "
originalText: "Breast"
ner: "CAUSE_OF_DEATH"
lemma: "breast"
beginChar: 0
endChar: 6
utterance: 0
speaker: "PER0"
beginIndex: 0
endIndex: 1
tokenBeginIndex: 0
tokenEndIndex: 1
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "CAUSE_OF_DEATH"
corefMentionIndex: 0
corefMentionIndex: 3
entityMentionIndex: 0
}
token {
word: "cancer"
pos: "NN"
value: "cancer"
before: " "
after: " "
originalText: "cancer"
ner: "CAUSE_OF_DEATH"
lemma: "cancer"
beginChar: 7
endChar: 13
utterance: 0
speaker: "PER0"
beginIndex: 1
endIndex: 2
tokenBeginIndex: 1
tokenEndIndex: 2
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "CAUSE_OF_DEATH"
corefMentionIndex: 0
corefMentionIndex: 3
entityMentionIndex: 0
}
token {
word: "susceptibility"
pos: "NN"
value: "susceptibility"
before: " "
after: " "
originalText: "susceptibility"
ner: "O"
lemma: "susceptibility"
beginChar: 14
endChar: 28
utterance: 0
speaker: "PER0"
beginIndex: 2
endIndex: 3
tokenBeginIndex: 2
tokenEndIndex: 3
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 3
}
token {
word: "gene"
pos: "NN"
value: "gene"
before: " "
after: " "
originalText: "gene"
ner: "O"
lemma: "gene"
beginChar: 29
endChar: 33
utterance: 0
speaker: "PER0"
beginIndex: 3
endIndex: 4
tokenBeginIndex: 3
tokenEndIndex: 4
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 3
}
token {
word: "1"
pos: "CD"
value: "1"
before: " "
after: " "
originalText: "1"
ner: "NUMBER"
normalizedNER: "1.0"
lemma: "1"
beginChar: 34
endChar: 35
utterance: 0
speaker: "PER0"
beginIndex: 4
endIndex: 5
tokenBeginIndex: 4
tokenEndIndex: 5
hasXmlContext: false
isNewline: false
coarseNER: "NUMBER"
fineGrainedNER: "NUMBER"
corefMentionIndex: 1
corefMentionIndex: 3
entityMentionIndex: 1
}
token {
word: "-LRB-"
pos: "-LRB-"
value: "-LRB-"
before: " "
after: ""
originalText: "("
ner: "O"
lemma: "-lrb-"
beginChar: 36
endChar: 37
utterance: 0
speaker: "PER0"
beginIndex: 5
endIndex: 6
tokenBeginIndex: 5
tokenEndIndex: 6
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 3
}
token {
word: "BRCA1"
pos: "NN"
value: "BRCA1"
before: ""
after: ""
originalText: "BRCA1"
ner: "O"
lemma: "brca1"
beginChar: 37
endChar: 42
utterance: 0
speaker: "PER0"
beginIndex: 6
endIndex: 7
tokenBeginIndex: 6
tokenEndIndex: 7
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 3
corefMentionIndex: 4
}
token {
word: "-RRB-"
pos: "-RRB-"
value: "-RRB-"
before: ""
after: " "
originalText: ")"
ner: "O"
lemma: "-rrb-"
beginChar: 42
endChar: 43
utterance: 0
speaker: "PER0"
beginIndex: 7
endIndex: 8
tokenBeginIndex: 7
tokenEndIndex: 8
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 3
}
token {
word: "is"
pos: "VBZ"
value: "is"
before: " "
after: " "
originalText: "is"
ner: "O"
lemma: "be"
beginChar: 44
endChar: 46
utterance: 0
speaker: "PER0"
beginIndex: 8
endIndex: 9
tokenBeginIndex: 8
tokenEndIndex: 9
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
}
token {
word: "a"
pos: "DT"
value: "a"
before: " "
after: " "
originalText: "a"
ner: "O"
lemma: "a"
beginChar: 47
endChar: 48
utterance: 0
speaker: "PER0"
beginIndex: 9
endIndex: 10
tokenBeginIndex: 9
tokenEndIndex: 10
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 5
}
token {
word: "tumor"
pos: "NN"
value: "tumor"
before: " "
after: " "
originalText: "tumor"
ner: "CAUSE_OF_DEATH"
lemma: "tumor"
beginChar: 49
endChar: 54
utterance: 0
speaker: "PER0"
beginIndex: 10
endIndex: 11
tokenBeginIndex: 10
tokenEndIndex: 11
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "CAUSE_OF_DEATH"
corefMentionIndex: 2
corefMentionIndex: 5
entityMentionIndex: 2
}
token {
word: "suppressor"
pos: "NN"
value: "suppressor"
before: " "
after: " "
originalText: "suppressor"
ner: "O"
lemma: "suppressor"
beginChar: 55
endChar: 65
utterance: 0
speaker: "PER0"
beginIndex: 11
endIndex: 12
tokenBeginIndex: 11
tokenEndIndex: 12
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 5
}
token {
word: "protein"
pos: "NN"
value: "protein"
before: " "
after: ""
originalText: "protein"
ner: "O"
lemma: "protein"
beginChar: 66
endChar: 73
utterance: 0
speaker: "PER0"
beginIndex: 12
endIndex: 13
tokenBeginIndex: 12
tokenEndIndex: 13
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
corefMentionIndex: 5
}
token {
word: "."
pos: "."
value: "."
before: ""
after: ""
originalText: "."
ner: "O"
lemma: "."
beginChar: 73
endChar: 74
utterance: 0
speaker: "PER0"
beginIndex: 13
endIndex: 14
tokenBeginIndex: 13
tokenEndIndex: 14
hasXmlContext: false
isNewline: false
coarseNER: "O"
fineGrainedNER: "O"
}
tokenOffsetBegin: 0
tokenOffsetEnd: 14
sentenceIndex: 0
characterOffsetBegin: 0
characterOffsetEnd: 74
parseTree {
child {
child {
child {
child {
child {
child {
value: "Breast"
}
value: "NN"
score: -13.085748672485352
}
child {
child {
value: "cancer"
}
value: "NN"
score: -7.361298084259033
}
child {
child {
value: "susceptibility"
}
value: "NN"
score: -12.832098960876465
}
value: "NP"
score: -39.81563186645508
}
child {
child {
child {
value: "gene"
}
value: "NN"
score: -7.761730194091797
}
child {
child {
value: "1"
}
value: "CD"
score: -4.178682804107666
}
value: "NP"
score: -19.19379997253418
}
value: "NP"
score: -62.36488342285156
}
child {
child {
child {
value: "-LRB-"
}
value: "-LRB-"
score: -0.06566064804792404
}
child {
child {
child {
value: "BRCA1"
}
value: "NN"
score: -13.365689277648926
}
value: "NP"
score: -16.57198715209961
}
child {
child {
value: "-RRB-"
}
value: "-RRB-"
score: -0.06669137626886368
}
value: "PRN"
score: -17.963926315307617
}
value: "NP"
score: -86.23522186279297
}
child {
child {
child {
value: "is"
}
value: "VBZ"
score: -0.14657023549079895
}
child {
child {
child {
value: "a"
}
value: "DT"
score: -1.4235451221466064
}
child {
child {
value: "tumor"
}
value: "NN"
score: -9.49818229675293
}
child {
child {
value: "suppressor"
}
value: "NN"
score: -10.207574844360352
}
child {
child {
value: "protein"
}
value: "NN"
score: -9.312461853027344
}
value: "NP"
score: -36.75123977661133
}
value: "VP"
score: -42.08717727661133
}
child {
child {
value: "."
}
value: "."
score: -0.003481106134131551
}
value: "S"
score: -131.2326202392578
}
value: "ROOT"
score: -131.38381958007812
}
basicDependencies {
node {
sentenceIndex: 0
index: 1
}
node {
sentenceIndex: 0
index: 2
}
node {
sentenceIndex: 0
index: 3
}
node {
sentenceIndex: 0
index: 4
}
node {
sentenceIndex: 0
index: 5
}
node {
sentenceIndex: 0
index: 6
}
node {
sentenceIndex: 0
index: 7
}
node {
sentenceIndex: 0
index: 8
}
node {
sentenceIndex: 0
index: 9
}
node {
sentenceIndex: 0
index: 10
}
node {
sentenceIndex: 0
index: 11
}
node {
sentenceIndex: 0
index: 12
}
node {
sentenceIndex: 0
index: 13
}
node {
sentenceIndex: 0
index: 14
}
edge {
source: 4
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 2
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 3
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 5
dep: "nummod"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 7
dep: "appos"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 6
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 8
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 4
dep: "nsubj"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 9
dep: "cop"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 10
dep: "det"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 11
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 12
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 14
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
root: 13
}
collapsedDependencies {
node {
sentenceIndex: 0
index: 1
}
node {
sentenceIndex: 0
index: 2
}
node {
sentenceIndex: 0
index: 3
}
node {
sentenceIndex: 0
index: 4
}
node {
sentenceIndex: 0
index: 5
}
node {
sentenceIndex: 0
index: 6
}
node {
sentenceIndex: 0
index: 7
}
node {
sentenceIndex: 0
index: 8
}
node {
sentenceIndex: 0
index: 9
}
node {
sentenceIndex: 0
index: 10
}
node {
sentenceIndex: 0
index: 11
}
node {
sentenceIndex: 0
index: 12
}
node {
sentenceIndex: 0
index: 13
}
node {
sentenceIndex: 0
index: 14
}
edge {
source: 4
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 2
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 3
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 5
dep: "nummod"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 7
dep: "appos"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 6
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 8
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 4
dep: "nsubj"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 9
dep: "cop"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 10
dep: "det"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 11
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 12
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 14
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
root: 13
}
collapsedCCProcessedDependencies {
node {
sentenceIndex: 0
index: 1
}
node {
sentenceIndex: 0
index: 2
}
node {
sentenceIndex: 0
index: 3
}
node {
sentenceIndex: 0
index: 4
}
node {
sentenceIndex: 0
index: 5
}
node {
sentenceIndex: 0
index: 6
}
node {
sentenceIndex: 0
index: 7
}
node {
sentenceIndex: 0
index: 8
}
node {
sentenceIndex: 0
index: 9
}
node {
sentenceIndex: 0
index: 10
}
node {
sentenceIndex: 0
index: 11
}
node {
sentenceIndex: 0
index: 12
}
node {
sentenceIndex: 0
index: 13
}
node {
sentenceIndex: 0
index: 14
}
edge {
source: 4
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 2
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 3
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 5
dep: "nummod"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 7
dep: "appos"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 6
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 8
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 4
dep: "nsubj"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 9
dep: "cop"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 10
dep: "det"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 11
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 12
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 14
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
root: 13
}
paragraph: 1
enhancedDependencies {
node {
sentenceIndex: 0
index: 1
}
node {
sentenceIndex: 0
index: 2
}
node {
sentenceIndex: 0
index: 3
}
node {
sentenceIndex: 0
index: 4
}
node {
sentenceIndex: 0
index: 5
}
node {
sentenceIndex: 0
index: 6
}
node {
sentenceIndex: 0
index: 7
}
node {
sentenceIndex: 0
index: 8
}
node {
sentenceIndex: 0
index: 9
}
node {
sentenceIndex: 0
index: 10
}
node {
sentenceIndex: 0
index: 11
}
node {
sentenceIndex: 0
index: 12
}
node {
sentenceIndex: 0
index: 13
}
node {
sentenceIndex: 0
index: 14
}
edge {
source: 4
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 2
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 3
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 5
dep: "nummod"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 7
dep: "appos"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 6
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 8
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 4
dep: "nsubj"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 9
dep: "cop"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 10
dep: "det"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 11
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 12
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 14
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
root: 13
}
enhancedPlusPlusDependencies {
node {
sentenceIndex: 0
index: 1
}
node {
sentenceIndex: 0
index: 2
}
node {
sentenceIndex: 0
index: 3
}
node {
sentenceIndex: 0
index: 4
}
node {
sentenceIndex: 0
index: 5
}
node {
sentenceIndex: 0
index: 6
}
node {
sentenceIndex: 0
index: 7
}
node {
sentenceIndex: 0
index: 8
}
node {
sentenceIndex: 0
index: 9
}
node {
sentenceIndex: 0
index: 10
}
node {
sentenceIndex: 0
index: 11
}
node {
sentenceIndex: 0
index: 12
}
node {
sentenceIndex: 0
index: 13
}
node {
sentenceIndex: 0
index: 14
}
edge {
source: 4
target: 1
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 2
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 3
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 5
dep: "nummod"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 4
target: 7
dep: "appos"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 6
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 7
target: 8
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 4
dep: "nsubj"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 9
dep: "cop"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 10
dep: "det"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 11
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 12
dep: "compound"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
edge {
source: 13
target: 14
dep: "punct"
isExtra: false
sourceCopy: 0
targetCopy: 0
language: UniversalEnglish
}
root: 13
}
binarizedParseTree {
child {
child {
child {
child {
child {
child {
value: "Breast"
}
value: "NN"
}
child {
child {
child {
value: "cancer"
}
value: "NN"
}
child {
child {
value: "susceptibility"
}
value: "NN"
}
value: "@NP"
}
value: "NP"
}
child {
child {
child {
value: "gene"
}
value: "NN"
}
child {
child {
value: "1"
}
value: "CD"
}
value: "NP"
}
value: "NP"
}
child {
child {
child {
value: "-LRB-"
}
value: "-LRB-"
}
child {
child {
child {
child {
value: "BRCA1"
}
value: "NN"
}
value: "NP"
}
child {
child {
value: "-RRB-"
}
value: "-RRB-"
}
value: "@PRN"
}
value: "PRN"
}
value: "NP"
}
child {
child {
child {
child {
value: "is"
}
value: "VBZ"
}
child {
child {
child {
value: "a"
}
value: "DT"
}
child {
child {
child {
value: "tumor"
}
value: "NN"
}
child {
child {
child {
value: "suppressor"
}
value: "NN"
}
child {
child {
value: "protein"
}
value: "NN"
}
value: "@NP"
}
value: "@NP"
}
value: "NP"
}
value: "VP"
}
child {
child {
value: "."
}
value: "."
}
value: "@S"
}
value: "S"
}
value: "ROOT"
}
hasRelationAnnotations: false
hasNumerizedTokensAnnotation: true
mentions {
sentenceIndex: 0
tokenStartInSentenceInclusive: 0
tokenEndInSentenceExclusive: 2
ner: "CAUSE_OF_DEATH"
entityType: "CAUSE_OF_DEATH"
entityMentionIndex: 0
canonicalEntityMentionIndex: 0
entityMentionText: "Breast cancer"
}
mentions {
sentenceIndex: 0
tokenStartInSentenceInclusive: 4
tokenEndInSentenceExclusive: 5
ner: "NUMBER"
normalizedNER: "1.0"
entityType: "NUMBER"
entityMentionIndex: 1
canonicalEntityMentionIndex: 1
entityMentionText: "1"
}
mentions {
sentenceIndex: 0
tokenStartInSentenceInclusive: 10
tokenEndInSentenceExclusive: 11
ner: "CAUSE_OF_DEATH"
entityType: "CAUSE_OF_DEATH"
entityMentionIndex: 2
canonicalEntityMentionIndex: 2
entityMentionText: "tumor"
}
mentionsForCoref {
mentionID: 0
mentionType: "NOMINAL"
number: "SINGULAR"
gender: "NEUTRAL"
animacy: "INANIMATE"
person: "UNKNOWN"
startIndex: 0
endIndex: 2
headIndex: 1
headString: "cancer"
nerString: "O"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 0
mentionNum: 1
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 1
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 0
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 1
}
}
mentionsForCoref {
mentionID: 1
mentionType: "PROPER"
number: "SINGULAR"
gender: "UNKNOWN"
animacy: "INANIMATE"
person: "UNKNOWN"
startIndex: 4
endIndex: 5
headIndex: 4
headString: "1"
nerString: "NUMBER"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 1
mentionNum: 2
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 4
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 4
}
}
mentionsForCoref {
mentionID: 2
mentionType: "NOMINAL"
number: "SINGULAR"
gender: "NEUTRAL"
animacy: "INANIMATE"
person: "UNKNOWN"
startIndex: 10
endIndex: 11
headIndex: 10
headString: "tumor"
nerString: "O"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 2
mentionNum: 5
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 10
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 10
}
}
mentionsForCoref {
mentionID: 3
mentionType: "NOMINAL"
number: "SINGULAR"
gender: "UNKNOWN"
animacy: "INANIMATE"
person: "UNKNOWN"
startIndex: 0
endIndex: 8
headIndex: 3
headString: "gene"
nerString: "O"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 3
mentionNum: 0
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 3
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 0
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 1
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 2
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 3
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 4
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 5
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 6
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 7
}
}
mentionsForCoref {
mentionID: 4
mentionType: "NOMINAL"
number: "SINGULAR"
gender: "UNKNOWN"
animacy: "UNKNOWN"
person: "UNKNOWN"
startIndex: 6
endIndex: 7
headIndex: 6
headString: "brca1"
nerString: "O"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 4
mentionNum: 3
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 6
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 6
}
appositions: 3
}
mentionsForCoref {
mentionID: 5
mentionType: "NOMINAL"
number: "SINGULAR"
gender: "NEUTRAL"
animacy: "INANIMATE"
person: "UNKNOWN"
startIndex: 9
endIndex: 13
headIndex: 12
headString: "protein"
nerString: "O"
originalRef: 4294967295
goldCorefClusterID: -1
corefClusterID: 5
mentionNum: 4
sentNum: 0
utter: 0
paragraph: 1
isSubject: false
isDirectObject: false
isIndirectObject: false
isPrepositionObject: false
hasTwin: false
generic: false
isSingleton: false
hasBasicDependency: true
hasEnhancedDepenedncy: true
hasContextParseTree: true
headIndexedWord {
sentenceNum: 4294967295
tokenIndex: 12
copyCount: 0
}
dependingVerb {
sentenceNum: 4294967295
tokenIndex: 4294967295
}
headWord {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 0
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 1
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 2
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 3
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 4
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 5
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 6
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 7
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 8
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 9
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 10
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 11
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 12
}
sentenceWords {
sentenceNum: 4294967295
tokenIndex: 13
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 9
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 10
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 11
}
originalSpan {
sentenceNum: 4294967295
tokenIndex: 12
}
predicateNominatives: 3
}
hasCorefMentionsAnnotation: true
hasEntityMentionsAnnotation: true
>>> ## All of that output (above) was for ONE sentence! :-O
>>> ## print(ann) produces the same output:
>>> print(ann)
text: "Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein."
sentence {
token {
word: "Breast"
pos: "NN"
value: "Breast"
before: ""
after: " "
[ ... snip ... ]
>>> ## The 'text' output format is **much** more compact:
>>> client = CoreNLPClient(annotators='tokenize, ssplit, pos, lemma, ner, parse, depparse, coref', output_format='text', timeout=30000, memory='16G')
>>> ann = client.annotate(text)
Starting server with command: java -Xmx16G -cp /mnt/Vancouver/apps/CoreNLP/stanford-corenlp-full/stanford-corenlp-full-2018-10-05/* edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000 -threads 5 -maxCharLength 100000 -quiet True -serverProperties corenlp_server-163b9ecb6a9947a8.props -preload tokenize, ssplit, pos, lemma, ner, parse, depparse, coref
>>> print(ann)
Sentence #1 (14 tokens):
Breast cancer susceptibility gene 1 (BRCA1) is a tumor suppressor protein.
Tokens:
[Text=Breast CharacterOffsetBegin=0 CharacterOffsetEnd=6 PartOfSpeech=NN Lemma=breast NamedEntityTag=CAUSE_OF_DEATH]
[Text=cancer CharacterOffsetBegin=7 CharacterOffsetEnd=13 PartOfSpeech=NN Lemma=cancer NamedEntityTag=CAUSE_OF_DEATH]
[Text=susceptibility CharacterOffsetBegin=14 CharacterOffsetEnd=28 PartOfSpeech=NN Lemma=susceptibility NamedEntityTag=O]
[Text=gene CharacterOffsetBegin=29 CharacterOffsetEnd=33 PartOfSpeech=NN Lemma=gene NamedEntityTag=O]
[Text=1 CharacterOffsetBegin=34 CharacterOffsetEnd=35 PartOfSpeech=CD Lemma=1 NamedEntityTag=NUMBER NormalizedNamedEntityTag=1.0]
[Text=-LRB- CharacterOffsetBegin=36 CharacterOffsetEnd=37 PartOfSpeech=-LRB- Lemma=-lrb- NamedEntityTag=O]
[Text=BRCA1 CharacterOffsetBegin=37 CharacterOffsetEnd=42 PartOfSpeech=NN Lemma=brca1 NamedEntityTag=O]
[Text=-RRB- CharacterOffsetBegin=42 CharacterOffsetEnd=43 PartOfSpeech=-RRB- Lemma=-rrb- NamedEntityTag=O]
[Text=is CharacterOffsetBegin=44 CharacterOffsetEnd=46 PartOfSpeech=VBZ Lemma=be NamedEntityTag=O]
[Text=a CharacterOffsetBegin=47 CharacterOffsetEnd=48 PartOfSpeech=DT Lemma=a NamedEntityTag=O]
[Text=tumor CharacterOffsetBegin=49 CharacterOffsetEnd=54 PartOfSpeech=NN Lemma=tumor NamedEntityTag=CAUSE_OF_DEATH]
[Text=suppressor CharacterOffsetBegin=55 CharacterOffsetEnd=65 PartOfSpeech=NN Lemma=suppressor NamedEntityTag=O]
[Text=protein CharacterOffsetBegin=66 CharacterOffsetEnd=73 PartOfSpeech=NN Lemma=protein NamedEntityTag=O]
[Text=. CharacterOffsetBegin=73 CharacterOffsetEnd=74 PartOfSpeech=. Lemma=. NamedEntityTag=O]
Constituency parse:
(ROOT
(S
(NP
(NP
(NP (NN Breast) (NN cancer) (NN susceptibility))
(NP (NN gene) (CD 1)))
(PRN (-LRB- -LRB-)
(NP (NN BRCA1))
(-RRB- -RRB-)))
(VP (VBZ is)
(NP (DT a) (NN tumor) (NN suppressor) (NN protein)))
(. .)))
Dependency Parse (enhanced plus plus dependencies):
root(ROOT-0, protein-13)
compound(gene-4, Breast-1)
compound(gene-4, cancer-2)
compound(gene-4, susceptibility-3)
nsubj(protein-13, gene-4)
nummod(gene-4, 1-5)
punct(BRCA1-7, -LRB--6)
appos(gene-4, BRCA1-7)
punct(BRCA1-7, -RRB--8)
cop(protein-13, is-9)
det(protein-13, a-10)
compound(protein-13, tumor-11)
compound(protein-13, suppressor-12)
punct(protein-13, .-14)
Extracted the following NER entity mentions:
Breast cancer CAUSE_OF_DEATH
1 NUMBER
tumor CAUSE_OF_DEATH
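# ----------------------------------------------------------------------------
# Note: the bracketed token lines in the 'text' output above can be parsed
# back into dicts if you want structured data without re-annotating. A minimal
# sketch (pure Python; assumes the "[Key=Value ...]" layout shown above -- in
# practice, output_format='json' on CoreNLPClient is the more robust route):

```python
import re

def parse_token_line(line):
    """Parse one '[Text=... CharacterOffsetBegin=... ...]' line into a dict."""
    # Strip the surrounding brackets, then pull out Key=Value pairs.
    inner = line.strip().lstrip('[').rstrip(']')
    return dict(re.findall(r'(\w+)=(\S+)', inner))

token = parse_token_line(
    '[Text=Breast CharacterOffsetBegin=0 CharacterOffsetEnd=6 '
    'PartOfSpeech=NN Lemma=breast NamedEntityTag=CAUSE_OF_DEATH]'
)
print(token['Text'], token['NamedEntityTag'])  # Breast CAUSE_OF_DEATH
```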
# ============================================================================
>>> import stanfordnlp
>>> stanfordnlp.download('en')
Using the default treebank "en_ewt" for language "en".
Would you like to download the models for: en_ewt now? (Y/n) Y
Default download directory: /home/victoria/stanfordnlp_resources
Hit enter to continue or type an alternate directory.
Downloading models for: en_ewt
Download location: /home/victoria/stanfordnlp_resources/en_ewt_models.zip
100%|█████████████████████████████████████| 235M/235M [01:15<00:00, 3.09MB/s]
Download complete. Models saved to: /home/victoria/stanfordnlp_resources/en_ewt_models.zip
Extracting models file for: en_ewt
Cleaning up...Done.
>>> nlp = stanfordnlp.Pipeline()
Use device: cpu
---
Loading: tokenize
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_tokenizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: pos
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_tagger.pt', 'pretrain_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt.pretrain.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: lemma
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_lemmatizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
Building an attentional Seq2Seq model...
Using a Bi-LSTM encoder
Using soft attention for LSTM.
Finetune all embeddings.
[Running seq2seq lemmatizer with edit classifier]
---
Loading: depparse
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_parser.pt', 'pretrain_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt.pretrain.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
Done loading processors!
---
>>> text = 'Bananas are an excellent source of potassium.'
>>> text_nlp = nlp(text)
>>> text_nlp.sentences[0].print_dependencies()
('Bananas', '5', 'nsubj')
('are', '5', 'cop')
('an', '5', 'det')
('excellent', '5', 'amod')
('source', '0', 'root')
('of', '7', 'case')
('potassium', '5', 'nmod')
('.', '5', 'punct')
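# ----------------------------------------------------------------------------
# Note: the triples printed by print_dependencies() are (word, head index,
# relation), with '0' meaning the root and 1-based indices otherwise. As a
# sanity check, each token's governor can be recovered from the triples alone;
# a small sketch over the output shown above (pure Python, no stanfordnlp
# needed):

```python
# (word, head_index, relation) triples as printed above.
deps = [
    ('Bananas', '5', 'nsubj'),
    ('are', '5', 'cop'),
    ('an', '5', 'det'),
    ('excellent', '5', 'amod'),
    ('source', '0', 'root'),
    ('of', '7', 'case'),
    ('potassium', '5', 'nmod'),
    ('.', '5', 'punct'),
]

words = [w for w, _, _ in deps]

def head_word(i):
    """Return the governor of 1-based token i ('ROOT' for the root)."""
    h = int(deps[i - 1][1])
    return 'ROOT' if h == 0 else words[h - 1]

print(head_word(1))  # source  ('Bananas' depends on 'source')
print(head_word(5))  # ROOT
```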
# ============================================================================
>>> import stanfordnlp
>>> from spacy_stanfordnlp import StanfordNLPLanguage
>>> snlp = stanfordnlp.Pipeline(lang="en")
Use device: cpu
---
Loading: tokenize
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_tokenizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: pos
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_tagger.pt', 'pretrain_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt.pretrain.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
---
Loading: lemma
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_lemmatizer.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
Building an attentional Seq2Seq model...
Using a Bi-LSTM encoder
Using soft attention for LSTM.
Finetune all embeddings.
[Running seq2seq lemmatizer with edit classifier]
---
Loading: depparse
With settings:
{'model_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt_parser.pt', 'pretrain_path': '/home/victoria/stanfordnlp_resources/en_ewt_models/en_ewt.pretrain.pt', 'lang': 'en', 'shorthand': 'en_ewt', 'mode': 'predict'}
Done loading processors!
---
>>> nlp = StanfordNLPLanguage(snlp)
>>> doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
>>> for token in doc:
... print(token.text, token.lemma_, token.pos_, token.dep_)
...
Barack Barack PROPN nsubj:pass
Obama Obama PROPN flat
was be AUX aux:pass
born bear VERB root
in in ADP case
Hawaii Hawaii PROPN obl
. . PUNCT punct
He he PRON nsubj:pass
was be AUX aux:pass
elected elect VERB root
president president PROPN xcomp
in in ADP case
2008 2008 NUM obl
. . PUNCT punct
>>>
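# ----------------------------------------------------------------------------
# Note: once wrapped in a spaCy Doc, rows like the above can be filtered by
# dependency label just like any spaCy pipeline output. As an illustration
# only, here is that filtering applied to the (text, lemma, pos, dep) tuples
# copied from the output above -- plain Python data, no spaCy or stanfordnlp
# required:

```python
# (text, lemma, pos, dep) rows copied from the output above.
tokens = [
    ('Barack', 'Barack', 'PROPN', 'nsubj:pass'),
    ('Obama', 'Obama', 'PROPN', 'flat'),
    ('was', 'be', 'AUX', 'aux:pass'),
    ('born', 'bear', 'VERB', 'root'),
    ('in', 'in', 'ADP', 'case'),
    ('Hawaii', 'Hawaii', 'PROPN', 'obl'),
    ('.', '.', 'PUNCT', 'punct'),
    ('He', 'he', 'PRON', 'nsubj:pass'),
    ('was', 'be', 'AUX', 'aux:pass'),
    ('elected', 'elect', 'VERB', 'root'),
    ('president', 'president', 'PROPN', 'xcomp'),
    ('in', 'in', 'ADP', 'case'),
    ('2008', '2008', 'NUM', 'obl'),
    ('.', '.', 'PUNCT', 'punct'),
]

# Root verbs (lemmas) and passive subjects, selected by dependency label.
roots = [lemma for _, lemma, pos, dep in tokens if dep == 'root']
subjects = [text for text, _, _, dep in tokens if dep == 'nsubj:pass']
print(roots)     # ['bear', 'elect']
print(subjects)  # ['Barack', 'He']
```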
==============================================================================
==============================================================================
END OF FILE
==============================================================================
==============================================================================