Vamsi Uppala vamsiuppala

vamsiuppala / normcore-llm.md
Created February 1, 2024 05:30 — forked from veekaybee/normcore-llm.md
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts


Pre-Transformer Models

vamsiuppala / README.md
Created January 7, 2024 18:58 — forked from veekaybee/README.md
whisper.ipynb

Using Whisper to transcribe audio

This episode of Recsperts was transcribed with Whisper from OpenAI, an open-source neural net trained on roughly 680,000 hours of audio. The model uses an encoder-decoder architecture: audio is split into 30-second chunks, converted to log-Mel spectrograms, and passed into an encoder. A decoder is trained to predict the corresponding caption text from the encoder's output. The model supports transcription, timestamp-aligned transcription, and multilingual translation.
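
As a rough illustration of the 30-second chunking step only, here is a minimal sketch in plain Python. It assumes mono audio already resampled to 16 kHz (the rate Whisper works with); the helper name `split_into_chunks` is hypothetical and is not part of the Whisper package. The real library handles windowing and the log-Mel conversion itself, so treat this as a picture of the idea rather than its implementation.

```python
SAMPLE_RATE = 16_000                        # assumed: audio already resampled to 16 kHz
CHUNK_SECONDS = 30                          # fixed window length described above
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(samples, chunk_samples=CHUNK_SAMPLES):
    """Hypothetical helper: split mono samples into fixed-length chunks.

    The final chunk is zero-padded so every chunk has the same length.
    """
    chunks = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        chunk.extend([0.0] * (chunk_samples - len(chunk)))  # pad the tail with silence
        chunks.append(chunk)
    return chunks


# Stand-in audio: 70 seconds of silence instead of a real recording.
audio = [0.0] * (70 * SAMPLE_RATE)
chunks = split_into_chunks(audio)
print(len(chunks), len(chunks[-1]))  # 3 480000
```

Each chunk would then be turned into a log-Mel spectrogram before reaching the encoder; that step is left out here to keep the sketch dependency-free.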


The transcription process outputs a single string file, so it's up to the end-user to parse out individual speakers, or run the model [through a sec