@veekaybee
Last active December 26, 2024 06:23
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Pre-Transformer Models

Building Blocks

Foundational Deep Learning Papers (in semi-chronological order)

The Transformer Architecture

Attention

GPT

Significant OSS Models

LLMs in 2023

Training Data

Pre-Training

RLHF and DPO

Fine-Tuning and Compression

Small and Local LLMs

Deployment and Production

LLM Inference and K-V Cache

Prompt Engineering and RAG

GPUs

Evaluation

Eval Frameworks

UX

What's Next?

Thanks to everyone who added suggestions on Twitter, Mastodon, and Bluesky.

@lcrmorin

I keep coming back to this list. However, I feel like it misses a good discussion of the current stuff that doesn't work. I keep failing to implement working stuff, despite lengthy theoretical works, and when I scratch the veneer I keep getting the same answer: "technology is not ready yet".

lcrmorin commented Dec 29, 2023

@zaunere

zaunere commented Sep 22, 2024

Awesome list (and comments), but "graph" is missing