Created September 24, 2019 14:48
Fourth part of the NLP analysis for the Medium article on AWS ML/AI tools for NLP.
# 15) Iterate over the speakers and apply the spaCy visualizer to each speech
# Assumes df_audio, original_transcriptions, bucket_name and get_text_from_json
# were defined in the earlier parts of this analysis.
import spacy
from spacy import displacy

# Load the model once, outside the loop; reloading it per speaker is slow
nlp = spacy.load('en_core_web_lg')

for index, row in df_audio.iterrows():
    print(f"Rendering {index}'s texts")
    original_doc = nlp(original_transcriptions.get(index))
    transcribe_doc = nlp(get_text_from_json(bucket_name, row.json_transcription))
    # style="ent" returns HTML markup with the named entities highlighted
    html_original = displacy.render(original_doc, style="ent", jupyter=False)
    html_transcribe = displacy.render(transcribe_doc, style="ent", jupyter=False)
    with open(f'{index}_original.html', 'w', encoding="utf-8") as page:
        page.write(html_original)
    with open(f'{index}_transcribe.html', 'w', encoding="utf-8") as page:
        page.write(html_transcribe)