@shostelet
Created May 21, 2021 06:16
# When an audio file has the same data in the left and right channels, the transcript ends up looking like an echo
# between the "in" and "out" speakers. The following code filters out one of the speakers and produces a text file
# with only one speaker's transcript. Because of the overtalk, segments are as short as one word. To mimic real
# segments, it concatenates consecutive words as long as there is less than 300 ms between the end of one word
# and the beginning of the next.
import json

# Given a JSON file containing a call from the Activate API: https://docs.allo-media.net/activate-api/
with open('call.json') as jsonfile:
    call = json.load(jsonfile)

unique_id = call['unique_id']
transcript = call['transcript_json']['callData']

# Keep only the IN speaker.
transcript = [segment for segment in transcript if segment['from'] == 'in']

lines = []
timecode = 0  # end time of the previous word (0.3 below stands for the 300 ms gap)
tmp_line = []
for segment in transcript:
    # More than 300 ms since the end of the previous word: close the current line.
    if tmp_line and timecode + 0.3 < segment['datetime']:
        lines.append(" ".join(tmp_line))
        tmp_line = []
    tmp_line.append(segment['content'])
    timecode = segment['datetime'] + segment['duration']

# Flush the last line, which the loop never appends on its own.
if tmp_line:
    lines.append(" ".join(tmp_line))

with open(f"{unique_id}.txt", "w") as txtfile:
    txtfile.write("\n".join(lines))
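
The grouping rule can also be tried without a call file. The following is a minimal sketch, not part of the original script: `group_words`, its `speaker` and `max_gap` parameters, and the `sample` data are hypothetical names introduced for illustration, and the sample is made up rather than real Activate API output. It assumes `datetime` and `duration` are expressed in seconds, as the 0.3 threshold for 300 ms suggests.

```python
def group_words(segments, speaker="in", max_gap=0.3):
    """Join one speaker's words into lines, splitting on gaps longer than max_gap."""
    lines = []
    current = []
    previous_end = None  # end time of the previous kept word
    for segment in segments:
        if segment["from"] != speaker:
            continue  # drop the echoed copy from the other channel
        if previous_end is not None and segment["datetime"] - previous_end > max_gap:
            lines.append(" ".join(current))
            current = []
        current.append(segment["content"])
        previous_end = segment["datetime"] + segment["duration"]
    if current:
        lines.append(" ".join(current))  # flush the trailing line
    return lines


# Made-up sample: every word appears once per channel, as in the echo case.
sample = [
    {"from": "in", "datetime": 0.0, "duration": 0.5, "content": "hello"},
    {"from": "out", "datetime": 0.0, "duration": 0.5, "content": "hello"},
    {"from": "in", "datetime": 0.5, "duration": 0.5, "content": "there"},
    {"from": "out", "datetime": 0.5, "duration": 0.5, "content": "there"},
    {"from": "in", "datetime": 2.0, "duration": 0.5, "content": "goodbye"},
    {"from": "out", "datetime": 2.0, "duration": 0.5, "content": "goodbye"},
]

print(group_words(sample))  # ['hello there', 'goodbye']
```

Keeping the threshold as a parameter makes it easy to check how sensitive the line breaks are to the 300 ms choice on a given recording.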