@reynoldscem
Last active June 14, 2023 09:47

LLMs

Here's a semi-structured list of resources for learning about LLMs. The level of exposition probably varies quite a lot between the different resources.

Most Salient Papers

This paper introduces the transformer architecture, which underpins LLMs.

Introduces GPT-1. The beginning of LLMs? Unsupervised learning on lots of text + fine-tuning on downstream tasks, using a large transformer.

Introduces GPT-2.

Introduces GPT-3.

Survey Papers

These might be quite terse, but are probably very useful for a high-level perspective on the LLM landscape!

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond

A Survey of Large Language Models

Blogs

All about attention mechanisms. This was written before transformers really took off, so it's a little outdated, but still informative.

Overview of RLHF. Very important!

More about RLHF.

Interesting blog about the architecture and workings of GPT from a 'code perspective'.

Discussion on prompts.

Interesting blog about confidence in answers, hallucination, and the general behaviour of LLMs' statements in language as contrasted with their behaviour as statistical models.

All about transformer models, quite technical.

Videos

Andrej Karpathy - State of GPT

Andrej Karpathy - Let's build GPT

Yannic Kilcher - Attention Is All You Need readthrough

Yannic Kilcher - GPT-3

Yannic Kilcher gives really well-paced readthroughs of important papers.