Here's a semi-structured list of resources for learning about LLMs. The level of exposition probably varies quite a lot between the different resources.
This paper introduces the transformer architecture, which underpins LLMs.
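If you want the core idea in code, here's a minimal sketch of the scaled dot-product attention the paper is built around (single head, plain PyTorch; the names and shapes are my own, not the paper's notation):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, d_k) -- one head, for simplicity
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        # block attention to masked-out positions
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # each query's weights sum to 1
    return weights @ v

q = k = v = torch.randn(2, 5, 64)
out = scaled_dot_product_attention(q, k, v)  # (2, 5, 64)
```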
Introduces GPT-1. The beginning of LLMs? Unsupervised learning on lots of text + fine-tuning on downstream tasks, using a large transformer.
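The pretraining objective here is just next-token prediction; roughly something like this (the logits tensor is a stand-in for any causal LM's output):

```python
import torch
import torch.nn.functional as F

tokens = torch.randint(0, 50257, (2, 128))  # (batch, seq_len) token ids
logits = torch.randn(2, 128, 50257)         # stand-in for model(tokens)

# shift by one so the model at position t predicts token t+1
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, logits.size(-1)),
    tokens[:, 1:].reshape(-1),
)
```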
Introduces GPT-2.
And GPT-3.
These might be quite terse, but are probably very useful for a high-level perspective on the LLM landscape!
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
A Survey of Large Language Models
All about attention mechanisms. This predates transformers really kicking off, so it's a little outdated, but still informative.
Overview of RLHF. Very important!
More about RLHF.
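The reward-model part of RLHF often boils down to a pairwise preference loss; here's a rough sketch (the reward values are made up, and this is one common formulation rather than any specific paper's exact recipe):

```python
import torch
import torch.nn.functional as F

# Scalar rewards from the reward model for a preferred and a rejected
# completion of the same prompt (stand-in values).
r_chosen = torch.tensor([1.3, 0.2])
r_rejected = torch.tensor([0.4, 0.9])

# Bradley-Terry style pairwise loss: push r_chosen above r_rejected.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
```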
Interesting blog about the architecture and workings of GPT from a 'code perspective'.
Discussion on prompts.
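For flavour, a toy few-shot prompt of the kind these discussions cover (contents made up; the model is expected to continue the pattern):

```python
prompt = (
    "Translate English to French.\n\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "hello => "  # the model should complete with the translation
)
```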
Interesting blog about confidence in answers, hallucination, and the general behaviour of LLMs: their statements in language as contrasted with their behaviour as statistical models.
All about transformer models, quite technical.
Andrej Karpathy - State of GPT
Andrej Karpathy - Let's build GPT
Yannic Kilcher - Attention Is All You Need readthrough
This guy gives really well-paced readthroughs of important papers. ^^