Goals: add links that offer clear, accurate explanations of how things work. No hype, and no vendor content where possible. Practical first-hand accounts of running models in production are especially sought.
- The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning (YouTube)
- Transformers as Support Vector Machines
- Survey of LLMs
- Deep Learning Systems
- Fundamental ML Reading List
- What are embeddings
- Concepts from Operating Systems that Found their way into LLMs
- Talking about Large Language Models
- Language Modeling is Compression
- Vector Search - Long-Term Memory in AI
- Eight things to know about large language models
- The Bitter Lesson
- The Hardware Lottery
- The Scaling Hypothesis
- Tokenization
- LLM Course
- Seq2Seq
- Attention Is All You Need
- BERT
- GPT-1
- Scaling Laws for Neural Language Models
- T5
- GPT-2: Language Models are Unsupervised Multitask Learners
- InstructGPT: Training Language Models to Follow Instructions
- GPT-3: Language Models are Few-Shot Learners
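The tokenization and early-LLM links above all lean on byte-pair encoding; here is a rough Python sketch of the core BPE merge loop (the toy corpus and all names are illustrative, not taken from any particular tokenizer library):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across a corpus.
    `words` maps a tuple of symbols to its frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get) if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    a, b = pair
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                out.append(a + b)  # fuse the pair into a single symbol
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: start from characters, learn 3 merges.
corpus = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
merges = []
for _ in range(3):
    pair = most_frequent_pair(corpus)
    if pair is None:
        break
    merges.append(pair)
    corpus = merge_pair(corpus, pair)
print(merges)  # learned merges, most frequent pair first
```

Real tokenizers (e.g. GPT-2's) run the same loop at the byte level over huge corpora and then apply the learned merge list greedily at encode time.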
- Transformers from Scratch
- Transformer Math
- Five Years of GPT Progress
- Lost in the Middle: How Language Models Use Long Contexts
- Self-attention and transformer networks
- Attention
- Understanding and Coding the Attention Mechanism
- Attention Mechanisms
- Keys, Queries, and Values
- What is ChatGPT doing and why does it work
- My own notes from a few months back.
- Karpathy's The State of GPT (YouTube)
- OpenAI Cookbook
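Every attention link above is describing the same small computation; a minimal NumPy sketch of single-head scaled dot-product attention (shapes and variable names here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_q, seq_k) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted average of value vectors

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, d_k = 8
K = rng.normal(size=(6, 8))  # 6 key positions
V = rng.normal(size=(6, 8))  # one value vector per key
out = attention(Q, K, V)
print(out.shape)  # (4, 8)
```

A full transformer layer runs this per head with learned projections for Q, K, and V, plus a causal mask for decoding, but the keys/queries/values mechanics are all here.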
- Catching up on the weird world of LLMS
- How open are open architectures?
- Building an LLM from Scratch
- Large Language Models in 2023 and Slides
- Timeline of Transformer Models
- Large Language Model Evolutionary Tree
- Why host your own LLM?
- How to train your own LLMs
- Hugging Face Resources on Training Your Own
- Training Compute-Optimal Large Language Models
- Opt-175B Logbook
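"Training Compute-Optimal Large Language Models" (the Chinchilla paper) fits loss as L(N, D) = E + A/N^α + B/D^β in parameters N and training tokens D. A quick sketch using the paper's reported fit (constants quoted from the published result; treat them as approximate):

```python
def chinchilla_loss(n_params, n_tokens):
    """Parametric loss fit from the Chinchilla paper:
    L(N, D) = E + A / N^alpha + B / D^beta,
    with the reported constants E=1.69, A=406.4, B=410.7,
    alpha=0.34, beta=0.28."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# The paper's compute-optimal recipe works out to roughly 20 tokens per parameter.
loss_70b = chinchilla_loss(70e9, 1.4e12)    # Chinchilla-style: 70B params, 1.4T tokens
loss_280b = chinchilla_loss(280e9, 0.3e12)  # Gopher-style: more params, fewer tokens
print(loss_70b, loss_280b)  # the 70B / 1.4T point comes out lower
```

The point of the fit: at a fixed compute budget, the smaller model trained on more tokens ends up with lower predicted loss, which is why post-Chinchilla models are trained on far more data per parameter.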
- RLHF
- Instruction-tuning for LLMs: Survey
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- RLHF and DPO Compared
- The Complete Guide to LLM Fine-tuning
- LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language - Really great overview of SOTA fine-tuning techniques
- On the Structural Pruning of Large Language Models
- Quantization
- PEFT
- How is LlamaCPP Possible?
- How to beat GPT-4 with a 13B Model
- Efficient LLM Inference on CPUs
- Tiny Language Models Come of Age
- Efficiency LLM Spectrum
- TinyML at MIT
- Building LLM Applications for Production
- Challenges and Applications of Large Language Models
- All the Hard Stuff Nobody Talks About When Building Products with LLMs
- Scaling Kubernetes to run ChatGPT
- Numbers every LLM Developer should know
- Against LLM Maximalism
- A Guide to Inference and Performance
- (InThe)WildChat: 570K ChatGPT Interaction Logs In The Wild
- The State of Production LLMs in 2023
- Machine Learning Engineering for successful training of large language models and multi-modal models.
- Fine-tuning RedPajama on Slack Data
- LLM Inference Performance Engineering: Best Practices
- How to Make LLMs go Fast
- Transformer Inference Arithmetic
- Which serving technology to use for LLMs?
- Speeding up the KV cache
- Large Transformer Model Inference Optimization
- On Prompt Engineering
- Prompt Engineering Versus Blind Prompting
- Building RAG-Based Applications for Production
- Full Fine-Tuning, PEFT, or RAG?
- Prompt Engineering Guide
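Several of the links above (quantization, llama.cpp, CPU inference) boil down to storing weights in fewer bits. A minimal sketch of symmetric per-tensor int8 quantization (illustrative only; production schemes use per-channel or per-group scales, and often 4-bit formats):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: one float scale for the whole tensor."""
    scale = np.abs(w).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights at matmul time.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(256, 256)).astype(np.float32)  # toy weight matrix
q, s = quantize_int8(w)
w_hat = dequantize(q, s)

max_err = np.abs(w - w_hat).max()
print(q.nbytes, w.nbytes)  # int8 storage is 4x smaller than float32
```

The storage drops 4x relative to float32 and the worst-case error is half a quantization step (scale / 2), which is why weight-only quantization degrades quality so little while making memory-bandwidth-bound inference much faster.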
- The Best GPUs for Deep Learning 2023
- Making Deep Learning Go Brr from First Principles
- Everything about Distributed Training and Efficient Finetuning
- Training LLMs at Scale with AMD MI250 GPUs
- GPU Programming
- Evaluating ChatGPT
- ChatGPT: Jack of All Trades, Master of None
- What's Going on with the Open LLM Leaderboard
- Challenges in Evaluating AI Systems
- LLM Evaluation Papers
- Evaluating LLMs is a Minefield
- Generative Interfaces Beyond Chat (YouTube)
- Why Chatbots are not the Future
- The Future of Search is Boutique
- As a Large Language Model, I
- Natural Language is an Unnatural Interface
Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.