@GabrielSGoncalves
Created September 24, 2019 14:48
Fourth part of the NLP analysis for the Medium article on AWS ML/AI tools for NLP.
# 15) Iterate over the speakers and apply the spaCy visualizer to each speech
# df_audio, original_transcriptions, bucket_name and get_text_from_json come from the earlier parts
import spacy
from spacy import displacy

# Load the model once, outside the loop; reloading it for every speaker is slow
nlp = spacy.load('en_core_web_lg')

for index, row in df_audio.iterrows():
    print(f"Rendering {index}'s texts")
    original_transcription = nlp(original_transcriptions.get(index))
    transcribe_transcription = nlp(get_text_from_json(bucket_name, row.json_transcription))
    # With jupyter=False, displacy.render returns the entity markup as an HTML string
    html_original = displacy.render(original_transcription, style="ent", jupyter=False)
    html_transcribe = displacy.render(transcribe_transcription, style="ent", jupyter=False)
    with open(f'{index}_original.html', 'w', encoding="utf-8") as page:
        page.write(html_original)
    with open(f'{index}_transcribe.html', 'w', encoding="utf-8") as page:
        page.write(html_transcribe)
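The loop calls `get_text_from_json`, which is defined in an earlier part of the series and not shown here. As a rough idea of what the text-extraction half of such a helper might do, here is a minimal sketch that assumes the JSON document has already been downloaded as a string and follows the standard Amazon Transcribe output layout (`results.transcripts[].transcript`). The name `extract_transcript` and the sample payload are hypothetical, and fetching the object from S3 is left out on purpose.

```python
import json


def extract_transcript(transcribe_json: str) -> str:
    """Return the transcript text from an Amazon Transcribe result document.

    Hypothetical helper: it only parses JSON that was already downloaded.
    """
    payload = json.loads(transcribe_json)
    transcripts = payload.get("results", {}).get("transcripts", [])
    # Transcribe normally emits a single entry; join defensively in case there are more
    return " ".join(item.get("transcript", "") for item in transcripts).strip()


# Made-up payload that mimics the shape of a Transcribe result file
sample = json.dumps({
    "jobName": "example-job",
    "results": {"transcripts": [{"transcript": "Hello from Seattle."}], "items": []},
    "status": "COMPLETED",
})
print(extract_transcript(sample))
```

The returned string can be passed straight to `nlp(...)`, as the loop does with the output of `get_text_from_json`.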