@lukestanley
Created April 11, 2024 06:48
Making a podcast summary with the Whisper speech-to-text model on Replicate and ChatGPT 4
Making podcast summaries when only the audio is available.
Find an MP3 of the podcast.
Use a Whisper API, such as https://replicate.com/thomasmol/whisper-diarization (I logged in via GitHub with my paid account). I put the MP3 URL into the file_url field. I set num_speakers to 2 (which turned out to be too low once I heard the podcast, because the adverts added extra voices). I put 'en' as the language.
In the prompt string I pasted an introduction to the podcast, to help the model be more accurate with names.
I pressed "Run". Two minutes later, I opened the JSON section on the right-hand side and pressed the copy button.
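The same run could probably be scripted instead of clicked through. This is only a rough sketch, assuming the third-party `replicate` Python client is installed and an API token is set in the environment. The input names (file_url, num_speakers, language, prompt) are the form fields mentioned above; the function names and the `model_ref` placeholder are made up for illustration, and the real version string would have to be copied from the model page.

```python
def build_whisper_input(file_url, num_speakers=2, language="en", prompt=""):
    """Collect the form fields described above into one input dict."""
    return {
        "file_url": file_url,
        "num_speakers": num_speakers,
        "language": language,
        "prompt": prompt,
    }


def transcribe(model_ref, file_url, **options):
    """Run the model through the Replicate Python client (assumed installed).

    model_ref is a placeholder like "thomasmol/whisper-diarization:<version>";
    the version part is not filled in here on purpose.
    """
    import replicate  # third-party; expects REPLICATE_API_TOKEN to be set

    return replicate.run(model_ref, input=build_whisper_input(file_url, **options))
```

The value returned by the client may not have exactly the same shape as the JSON copied from the web page, so it is worth checking it before reusing any parsing code on it.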
I selected an extract from the start of the JSON that shows the structure, stopping after the first array item containing the transcript text and speaker ID, and saved it as a JSON file.
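I cut the extract by hand, but it could also be made with a few lines of Python. A minimal sketch, assuming the copied JSON has the output.segments structure used in the instruction further down; the `make_extract` name and the sample data are made up for illustration.

```python
import copy
import json


def make_extract(data, keep=1):
    """Return a copy of the response with only the first `keep` segments."""
    extract = copy.deepcopy(data)
    extract["output"]["segments"] = extract["output"]["segments"][:keep]
    return extract


# Tiny made-up response with the same shape, for illustration only.
sample = {
    "output": {
        "segments": [
            {"speaker": "SPEAKER_01", "text": "Welcome to the show."},
            {"speaker": "SPEAKER_02", "text": "Thanks for having me."},
        ]
    }
}
print(json.dumps(make_extract(sample), indent=2))
```

The original data is left untouched, so the full transcript can still be parsed afterwards.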
I provided it to ChatGPT 4. In particular I used this instruction after the extract:
`Parse the JSON to get a text transcript. Think step by step sharing very detailed working out. Use Python. we want output.segments.text and output.segments.speaker where SPEAKER_01 should be "Host" and SPEAKER_02 should be "Daniel".
We need a string in the format of f"{speaker}: {text}\n"`
ChatGPT 4 then made this Python code to parse it into a string:
`import json

# Load the JSON data from the provided file path
file_path = '/mnt/data/privacy_files_daniel_kendraio_decentral.json'
with open(file_path, 'r') as file:
    json_data = json.load(file)

# Extract the necessary information and reformat it according to the user's instructions
transcript = ""
for segment in json_data['output']['segments']:
    speaker = "Host" if segment['speaker'] == "SPEAKER_01" else "Daniel"
    text = segment['text']
    transcript += f"{speaker}: {text}\n"

transcript`
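Note that the code above labels every speaker other than SPEAKER_01 as "Daniel", which would include any extra advert voices. A possible variant, not what ChatGPT produced, uses an explicit mapping and keeps unknown speaker labels as they are. The `SPEAKER_NAMES` and `build_transcript` names, the whitespace stripping, and the sample segments are assumptions for illustration.

```python
SPEAKER_NAMES = {"SPEAKER_01": "Host", "SPEAKER_02": "Daniel"}


def build_transcript(segments, names=SPEAKER_NAMES):
    """Format segments as "Speaker: text" lines, keeping unknown labels as-is."""
    lines = []
    for segment in segments:
        speaker = names.get(segment["speaker"], segment["speaker"])
        # Stripping surrounding whitespace is an added nicety, not in the original.
        lines.append(f"{speaker}: {segment['text'].strip()}\n")
    return "".join(lines)


# Made-up segments; SPEAKER_00 stands in for an advert voice.
segments = [
    {"speaker": "SPEAKER_01", "text": " Welcome back."},
    {"speaker": "SPEAKER_00", "text": " This episode is sponsored."},
    {"speaker": "SPEAKER_02", "text": " Glad to be here."},
]
print(build_transcript(segments))
```

Lines that still show a raw SPEAKER_ label are then easy to spot and delete or rename.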
It started reading out the text. It was correct but lengthy, so I stopped it and instructed: "That's fine, give me a .txt file"
ChatGPT 4 then produced this Python:
`file_path_output = '/mnt/data/privacy_files_episode_transcript.txt'
with open(file_path_output, 'w') as file:
    file.write(transcript)

file_path_output`
It then also linked to the resulting transcript .txt file.
I later pasted the text to ChatGPT 4 and asked it to summarise it.
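If a transcript turns out to be too long to paste in one go, it could be split on line boundaries first and summarised piece by piece. A minimal sketch; the `split_transcript` name and the 12000 character default are arbitrary assumptions, not a limit taken from ChatGPT.

```python
def split_transcript(transcript, max_chars=12000):
    """Split on line boundaries into chunks of at most max_chars characters.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for line in transcript.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the budget.
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Because the split happens between lines, each "Speaker: text" line stays whole inside its chunk.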