Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

This book is all about patterns for doing ML. It's broken up into several key parts, including building and serving. These are intertwined, so it makes sense to read through the whole thing; there is a lot of good advice from seasoned professionals. The parts you can safely ignore relate to anything where they specifically use GCP. The other issue with the book is that it's very heavily focused on deep learning cases, and not all modeling problems require deep learning. Regardless, let's dive in. I've included the stuff that was relevant to me in the notes.
ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?
I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.
TL;DR:
This episode of Recsperts was transcribed with Whisper from OpenAI, an open-source neural net trained on roughly 680,000 hours of audio. The model uses an encoder-decoder architecture: it splits audio into 30-second chunks, converts each chunk into a log-Mel spectrogram, and passes that into an encoder. A decoder is trained to predict the caption text that corresponds to the encoder's output, and the model supports transcription, timestamp-aligned transcription, and multilingual translation.
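To make the 30-second chunking concrete, here is a minimal sketch in plain Python. The constants (16 kHz sample rate, 10 ms hop between spectrogram frames) come from the Whisper paper rather than from this post, and `split_into_chunks` is a hypothetical helper, not part of the Whisper library. The real implementation is more involved; this only shows the fixed-window idea.

```python
# Constants taken from the Whisper paper (assumptions relative to this post).
SAMPLE_RATE = 16_000   # Hz; audio is resampled to 16 kHz
CHUNK_SECONDS = 30     # fixed window length fed to the encoder
HOP_SAMPLES = 160      # 10 ms stride between spectrogram frames at 16 kHz

CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS       # samples per 30-second chunk
FRAMES_PER_CHUNK = CHUNK_SAMPLES // HOP_SAMPLES   # spectrogram frames per chunk


def split_into_chunks(samples, chunk_samples=CHUNK_SAMPLES):
    """Split a sequence of samples into fixed-size chunks.

    The last chunk is zero-padded so every chunk has the same length.
    Hypothetical helper for illustration only.
    """
    chunks = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # Tiny demo with a 3-sample "chunk" so the padding is easy to see.
    print(split_into_chunks([1.0] * 7, chunk_samples=3))
    print(CHUNK_SAMPLES, FRAMES_PER_CHUNK)
```

With these numbers, each 30-second chunk holds 480,000 samples and maps to 3,000 spectrogram frames, which is the fixed-size input the encoder sees.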
The transcription process outputs a single string file, so it's up to the end-user to parse out individual speakers, or run the model [through a sec
You might want to use uv now that it's gotten a bit more stable for Mac. I've already been using it at work and wanted to install it locally for a new project on my computer, but had pyenv.
Only do this if you completely want to rip out pyenv. Otherwise, just disable it by removing it from your ~/.zshrc.
- Remove every reference to pyenv in your ~/.zshrc file, then run:
  source ~/.zshrc
  You may have to search around for all instances if you are like me and not organized about your ~/.zshrc.
- Delete the pyenv directory:
  rm -rf "$HOME/.pyenv"
  # DOUBLE CHECK THIS COMMAND AND WHERE YOUR pyenv is
- Uninstall the Homebrew package, just in case:
  brew uninstall pyenv
SELECT review_text, title, description, goodreads.average_rating, goodreads_authors.name
FROM goodreads
JOIN goodreads_reviews
ON goodreads.book_id = goodreads_reviews.book_id
JOIN goodreads_authors
ON goodreads_authors.author_id = REGEXP_EXTRACT(goodreads.authors, '[0-9]+')
LIMIT 10;
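The join above keys on the first run of digits found in the authors column. Here is a rough Python sketch of that same extraction, to show what the regex is doing. The sample rows, the layout of the `authors` string, and the `first_author_id` helper are all assumptions for illustration; the real dataset may be shaped differently.

```python
import re


def first_author_id(authors_field):
    """Return the first run of digits in the authors field, or None.

    Hypothetical helper mirroring the '[0-9]+' pattern used in the query.
    """
    match = re.search(r"[0-9]+", authors_field)
    return match.group(0) if match else None


# Made-up rows; the field layout here is an assumption.
books = [
    {
        "book_id": "1",
        "title": "Example Book",
        "authors": "[{'author_id': '604031', 'role': ''}]",
    }
]
author_names = {"604031": "Example Author"}

# Join each book to its first listed author by the extracted ID.
joined = [
    (book["title"], author_names.get(first_author_id(book["authors"])))
    for book in books
]

if __name__ == "__main__":
    print(joined)
```

Note that this only picks up the first author on a book with several, which is the same trade-off the query makes.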
https://sites.google.com/eng.ucsd.edu/ucsdbookgraph/home
@inproceedings{DBLP:conf/recsys/WanM18,
author = {Mengting Wan and
Julian J. McAuley},
editor = {Sole Pera and
Michael D. Ekstrand and
Xavier Amatriain and