Here's a semi-structured list of resources for learning about LLMs. The level of exposition probably varies quite a lot between the different resources.
This paper introduces the transformer architecture, which underpins LLMs.
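If you want the core idea in code, here's a minimal sketch of the scaled dot-product attention the paper is built around (single head, plain PyTorch; the names and shapes are my own, not the paper's notation):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k) -- one head, for simplicity
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # block attention to masked-out positions
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # each query's weights sum to 1
    return weights @ v

q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)  # (2, 5, 64)
```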
Introduces GPT-1. The beginning of LLMs? Unsupervised learning on lots of text + fine-tuning on downstream tasks, using a large transformer.
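The pretraining objective here is just next-token prediction; roughly something like this (the logits tensor is a stand-in for any causal LM's output):

```python
import torch
import torch.nn.functional as F

tokens = torch.randint(0, 50257, (2, 128))  # (batch, seq_len) token ids
logits = torch.randn(2, 128, 50257)         # stand-in for model(tokens)

# shift by one so the model at position t predicts token t+1
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)),
    tokens[:, 1:].reshape(-1),
)
```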
Introduces GPT-2.
And GPT-3.
These might be quite terse, but are probably very useful for a high-level perspective on the LLM landscape!
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
A Survey of Large Language Models
All about attention mechanisms. This predates transformers really kicking off, so it's a little outdated, but still informative.
Overview of RLHF. Very important!
More about RLHF.
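The reward-model part of RLHF often boils down to a pairwise preference loss; here's a rough sketch (the reward values are made up, and this is one common formulation rather than any specific paper's exact recipe):

```python
import torch
import torch.nn.functional as F

# Scalar rewards from the reward model for a preferred and a rejected
# completion of the same prompt (stand-in values).
r_chosen = torch.tensor([1.3, 0.2])
r_rejected = torch.tensor([0.4, 0.9])

# Bradley-Terry style pairwise loss: push r_chosen above r_rejected.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
```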
Interesting blog about the architecture and workings of GPT from a 'code perspective'.
Discussion on prompts.
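For flavour, a toy few-shot prompt of the kind these discussions cover (contents made up; the model is expected to continue the pattern):

```python
prompt = (
    "Translate English to French.\n\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "hello => "  # the model should complete with the translation
)
```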
Interesting blog about confidence in answers, hallucination, and the general behaviour of LLMs: their statements in language as contrasted with their behaviour as statistical models.
All about transformer models, quite technical.
Andrej Karpathy - State of GPT
Andrej Karpathy - Let's build GPT
Yannic Kilcher - Attention Is All You Need readthrough
This guy gives really well-paced readthroughs of important papers. ^^