Goals: Add links that give reasonable, clear explanations of how things work. No hype and, where possible, no vendor content. Practical first-hand accounts of models in production are eagerly sought.
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Transformers as Support Vector Machines
- Survey of LLMs
- Deep Learning Systems
- Fundamental ML Reading List
- What are embeddings
- Concepts from Operating Systems that Found their way into LLMs
- Talking about Large Language Models
- Language Modeling is Compression
- Vector Search - Long-Term Memory in AI
- Eight things to know about large language models
- The Bitter Lesson
- The Hardware Lottery
- The Scaling Hypothesis
- Tokenization
- LLM Course
- Seq2Seq
- Attention Is All You Need
- BERT
- GPT-1
- Scaling Laws for Neural Language Models
- T5
- GPT-2: Language Models are Unsupervised Multitask Learners
- InstructGPT: Training Language Models to Follow Instructions
- GPT-3: Language Models are Few-Shot Learners
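Several of the links above (The Illustrated Word2vec, What are embeddings, Vector Search) revolve around one core operation: comparing embedding vectors by cosine similarity. A minimal sketch with toy hand-made vectors (the numbers are illustrative, not from any real model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (made-up values for illustration only).
king = [0.9, 0.8, 0.1, 0.3]
queen = [0.88, 0.82, 0.12, 0.28]
banana = [0.1, 0.2, 0.9, 0.7]

print(cosine_similarity(king, queen))   # close to 1.0
print(cosine_similarity(king, banana))  # noticeably smaller
```

Real systems use model-produced vectors with hundreds or thousands of dimensions, but the comparison is the same.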
- Transformers from Scratch
- Transformer Math
- Five Years of GPT Progress
- Lost in the Middle: How Language Models Use Long Contexts
- Self-attention and transformer networks
- Attention
- Understanding and Coding the Attention Mechanism
- Attention Mechanisms
- Keys, Queries, and Values
- What is ChatGPT doing and why does it work
- My own notes from a few months back.
- Karpathy's The State of GPT (YouTube)
- OpenAI Cookbook
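The attention links above all describe the same computation: softmax(QKᵀ/√d)V. A minimal NumPy sketch of single-head scaled dot-product attention (toy shapes, no masking or multi-head plumbing):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q @ K.T / sqrt(d_k)) @ V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k): query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ V                            # weighted average of value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))  # 3 query positions, dimension 8
K = rng.standard_normal((5, 8))  # 5 key positions
V = rng.standard_normal((5, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Each output row is a convex combination of the rows of V, which is why attention is often described as a soft lookup over keys and values.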
- Catching up on the weird world of LLMs
- How open are open architectures?
- Building an LLM from Scratch
- Large Language Models in 2023 and Slides
- Timeline of Transformer Models
- Large Language Model Evolutionary Tree
- Why host your own LLM?
- How to train your own LLMs
- Hugging Face Resources on Training Your Own
- Training Compute-Optimal Large Language Models
- Opt-175B Logbook
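The Chinchilla paper above (Training Compute-Optimal Large Language Models) is often reduced to two rules of thumb: training FLOPs ≈ 6 × parameters × tokens, and the compute-optimal data budget is roughly 20 tokens per parameter. A back-of-the-envelope sketch (the constants are approximations, not exact fits from the paper):

```python
import math

def training_flops(n_params, n_tokens):
    """Standard approximation: ~6 FLOPs per parameter per training token."""
    return 6 * n_params * n_tokens

def chinchilla_optimal_tokens(n_params):
    """Rough compute-optimal rule of thumb: ~20 tokens per parameter."""
    return 20 * n_params

params = 7e9                                  # a 7B-parameter model
tokens = chinchilla_optimal_tokens(params)    # ~140B tokens
flops = training_flops(params, tokens)
print(f"{tokens:.2e} tokens, {flops:.2e} FLOPs")  # 1.40e+11 tokens, 5.88e+21 FLOPs
```

Useful mostly for sanity-checking claims: if a training run's reported model size, token count, and GPU-hours are wildly inconsistent with this estimate, something is off.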
- RLHF
- Instruction-tuning for LLMs: Survey
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- RLHF and DPO Compared
- The Complete Guide to LLM Fine-tuning
- LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language - Really great overview of SOTA fine-tuning techniques
- On the Structural Pruning of Large Language Models
- Quantization
- PEFT
- How is LlamaCPP Possible?
- How to Beat GPT-4 with a 13B Model
- Efficient LLM Inference on CPUs
- Tiny Language Models Come of Age
- Efficiency LLM Spectrum
- TinyML at MIT
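Several links above (Quantization, How is LlamaCPP Possible?, Efficient LLM Inference on CPUs) come down to the same trick: storing weights in fewer bits. A minimal sketch of symmetric absmax int8 quantization of one weight tensor (real schemes add per-block scales, outlier handling, and lower bit widths):

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric absmax quantization: float32 -> int8 codes plus one float scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q)                         # int8 codes
print(np.abs(w - w_hat).max())   # small reconstruction error, bounded by scale/2
```

The payoff is memory: one byte per weight instead of four (or two), which is much of what makes CPU and laptop inference on 7B-class models feasible.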
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody talks about when building products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know
- Against LLM Maximalism
- A Guide to Inference and Performance
- (InThe)WildChat: 570K ChatGPT Interaction Logs In The Wild
- The State of Production LLMs in 2023
- Machine Learning Engineering for successful training of large language models and multi-modal models
- Fine-tuning RedPajama on Slack Data
- LLM Inference Performance Engineering: Best Practices
- How to Make LLMs go Fast
- Transformer Inference Arithmetic
- Which serving technology to use for LLMs?
- Speeding up the KV cache
- Large Transformer Model Inference Optimization
- On Prompt Engineering
- Prompt Engineering Versus Blind Prompting
- Building RAG-Based Applications for Production
- Full Fine-Tuning, PEFT, or RAG?
- Prompt Engineering Guide
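Several of the inference links above (Transformer Inference Arithmetic, Speeding up the KV cache, Numbers every LLM Developer should know) hinge on one estimate: how big the KV cache gets. A back-of-the-envelope sketch, using Llama-2-7B-like dimensions as an assumed example (32 layers, 32 KV heads, head dim 128, fp16):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Two cached tensors (K and V) per layer, each of shape
    (batch, heads, seq_len, head_dim), at bytes_per_value per element."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_value

# Llama-2-7B-like shapes (assumed for illustration).
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(f"{size / 2**30:.2f} GiB")  # 2.00 GiB per sequence at 4k context in fp16
```

The cache grows linearly with both context length and batch size, which is why long contexts and high-throughput serving eat GPU memory so quickly, and why techniques like grouped-query attention shrink n_kv_heads.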
- The Best GPUs for Deep Learning 2023
- Making Deep Learning Go Brr from First Principles
- Everything about Distributed Training and Efficient Finetuning
- Training LLMs at Scale with AMD MI250 GPUs
- GPU Programming
- Evaluating ChatGPT
- ChatGPT: Jack of All Trades, Master of None
- What's Going on with the Open LLM Leaderboard
- Challenges in Evaluating AI Systems
- LLM Evaluation Papers
- Evaluating LLMs is a Minefield
- Generative Interfaces Beyond Chat (YouTube)
- Why Chatbots are not the Future
- The Future of Search is Boutique
- As a Large Language Model, I
- Natural Language is an Unnatural Interface
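On the evaluation links above (Evaluating LLMs is a Minefield, What's Going on with the Open LLM Leaderboard): much of the mess comes down to scoring choices as simple as strict exact match versus normalized match. A toy sketch with hypothetical model outputs showing how normalization alone swings a score:

```python
import string

def normalize(text):
    """Lowercase, strip surrounding whitespace, drop punctuation."""
    return text.lower().strip().translate(str.maketrans("", "", string.punctuation))

def accuracy(predictions, answers, normalizer=lambda s: s):
    """Fraction of predictions matching answers under a given normalizer."""
    matches = sum(normalizer(p) == normalizer(a) for p, a in zip(predictions, answers))
    return matches / len(answers)

# Hypothetical outputs: all semantically correct, formatted differently.
answers = ["Paris", "42", "blue whale"]
predictions = ["Paris", "42.", " Blue whale "]

print(accuracy(predictions, answers))             # 0.33... under strict exact match
print(accuracy(predictions, answers, normalize))  # 1.0 after normalization
```

Two leaderboards using these two scorers would report wildly different numbers for the same model, which is one concrete reason published benchmark figures are so hard to compare.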
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.
- Patterns for Building LLM-based Systems & Products - In my opinion, an exceptionally in-depth article that covers many of these categories and deserves a place in the reading list.