Created
September 25, 2024 05:07
A simple Python script that takes an AWS Transcribe JSON file with speaker labels and prints a plain text transcript
#!/usr/bin/env python3
# Take an AWS Transcribe JSON file with speaker labels from stdin and
# print out a plain text transcript, with each segment's speaker_label
# at the start of the line followed by a colon
# Copyright Mark Jenkins <mark@markjenkins.ca>
#
# Copying and distribution of this file, with or without modification,
# are permitted in any medium without royalty provided the copyright
# notice and this notice are preserved. This file is offered as-is,
# without any warranty.
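from sys import stdin
from json import load as json_load

# Parse the whole AWS Transcribe JSON document from standard input
transcript_json = json_load(stdin)

# Print one "speaker_label: transcript" line per audio segment
for seg in transcript_json['results']['audio_segments']:
    print( seg['speaker_label'] + ":", seg['transcript'] )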
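To try the same loop without a real AWS Transcribe output file, it can be wrapped in a function and run on a small hand-written document. This is only a sketch: the function name `format_transcript` and the sample data are hypothetical, and the only keys assumed are the ones the script above already reads (`results`, `audio_segments`, `speaker_label`, `transcript`).

```python
import json


def format_transcript(transcript_json):
    """Return one 'speaker_label: transcript' string per audio segment."""
    lines = []
    for seg in transcript_json['results']['audio_segments']:
        lines.append(seg['speaker_label'] + ": " + seg['transcript'])
    return lines


# Hypothetical, hand-written sample containing only the keys the script reads;
# a real AWS Transcribe file carries many more fields.
sample = json.loads('''
{
  "results": {
    "audio_segments": [
      {"speaker_label": "spk_0", "transcript": "Hello there."},
      {"speaker_label": "spk_1", "transcript": "Hi, how are you?"}
    ]
  }
}
''')

for line in format_transcript(sample):
    print(line)
```

The original script reads from stdin, so it would typically be run with shell redirection, for example `python3 script.py < transcript.json` (both file names here are placeholders).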