Bruno Bemfica BrunoCodeman

BrunoCodeman / standoff2corenlp.py
Created September 26, 2018 19:24 — forked from thatguysimon/standoff2corenlp.py
A Python script that converts annotated data in standoff format (brat annotation tool) to the formats expected by the Stanford NER and Relation Extractor models
# A Python script that converts annotated data in standoff format (brat annotation tool) to the formats expected by the Stanford NER and Relation Extractor models
# - NER format based on: http://nlp.stanford.edu/software/crf-faq.html#a
# - RE format based on: http://nlp.stanford.edu/software/relationExtractor.html#training
# Usage:
# 1) Install the pycorenlp package (e.g. `pip install pycorenlp`)
# 2) Start a CoreNLP server (change CORENLP_SERVER_ADDRESS if the server runs at a different address)
# 3) Place the .ann and .txt files from brat in the directory specified by DATA_DIRECTORY
# 4) Run this script