@daanklijn
Created May 3, 2021 10:46
CONLL-U dataframe to dependency tree
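import pandas as pd
from spacy import displacy

# One row per token; dependency_head is 1-based, with 0 marking the root.
df = pd.read_csv('conllu.csv')
words = []
arcs = []
# Relies on the default 0-based row index, so `index` is the token position.
for index, row in df.iterrows():
    words.append({
        'text': row['word'],
        'tag': row['pos']
    })
    # Convert the 1-based head to a 0-based token position.
    dep_head = row['dependency_head'] - 1
    dep_label = row['dependency_label']
    # The root token and unlabelled rows get no arc.
    if dep_label not in ['root', '_']:
        arcs.append({
            # start is the leftmost token of the arc, end the rightmost;
            # dir is the side the dependent (arrowhead) is on.
            'start': dep_head if index > dep_head else index,
            'end': index if index > dep_head else dep_head,
            'label': dep_label,
            'dir': 'right' if index > dep_head else 'left'
        })
# manual=True makes displaCy render the prepared dicts instead of Doc objects.
html = displacy.render([{'words': words, 'arcs': arcs}],
                       style="dep", manual=True)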

Input:

index,word,lemma,pos,pos2,otherstuff,dependency_head,dependency_label,stuff,otherstuff2
1,Thursday,Thursday,PROPN,NNP,Number=Sing,2,nsubj,_,_
2,works,work,VERB,VBZ,Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin,0,root,_,_
3,for,for,ADP,IN,_,4,case,_,_
4,me,I,PRON,PRP,Case=Acc|Number=Sing|Person=1|PronType=Prs,2,obl,_,SpaceAfter=No
5,.,.,PUNCT,.,_,2,punct,_,_

Output:
image
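
To check the arc construction on the sample input without installing pandas or spaCy, the same logic can be sketched with the standard library alone. This is an illustrative rewrite, not the gist's code: `build_parse` and `SAMPLE` are hypothetical names, and `min`/`max` stand in for the conditional expressions used in the snippet.

```python
import csv
import io

# The sample input from the comment above, inlined so the sketch is self-contained.
SAMPLE = """\
index,word,lemma,pos,pos2,otherstuff,dependency_head,dependency_label,stuff,otherstuff2
1,Thursday,Thursday,PROPN,NNP,Number=Sing,2,nsubj,_,_
2,works,work,VERB,VBZ,Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin,0,root,_,_
3,for,for,ADP,IN,_,4,case,_,_
4,me,I,PRON,PRP,Case=Acc|Number=Sing|Person=1|PronType=Prs,2,obl,_,SpaceAfter=No
5,.,.,PUNCT,.,_,2,punct,_,_
"""


def build_parse(csv_text):
    """Hypothetical helper: build the words/arcs dict for displaCy's manual mode."""
    words, arcs = [], []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        words.append({'text': row['word'], 'tag': row['pos']})
        head = int(row['dependency_head']) - 1  # 1-based head -> 0-based position
        label = row['dependency_label']
        if label in ('root', '_'):
            continue  # no arc for the root or for unlabelled rows
        arcs.append({
            'start': min(i, head),
            'end': max(i, head),
            'label': label,
            'dir': 'right' if i > head else 'left',
        })
    return {'words': words, 'arcs': arcs}


parse = build_parse(SAMPLE)
for arc in parse['arcs']:
    print(arc)
```

The resulting dict has the same shape as the one passed to `displacy.render(..., manual=True)` in the snippet, so it could be handed to displaCy directly if spaCy is available.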
