Skip to content

Instantly share code, notes, and snippets.

@anupj
Last active March 4, 2024 22:22
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save anupj/f3a778dcb26972ba72c774634a80d796 to your computer and use it in GitHub Desktop.
Save anupj/f3a778dcb26972ba72c774634a80d796 to your computer and use it in GitHub Desktop.

Intro to Large Language Model by Andrej Karpathy

source

LLM

What is an LLM?

  • LLM is just two files. For e.g in this llama-2-70b model (released by Meta AI) there are just two files
    • parameters (140 GB)
      • these are just weights of this neural network (see below for more details)
      • its 140 GB because it has 70B parameters and every parameter is 2 bytes long. Its 2 bytes because its a float16 number.
    • run.c (~500 lines of C code)
      • this could be a python file or any other programming language file
    • this is a self-contained model and you can run it locally on your laptop without requiring an internet connection.
      • you can take these two files, compile the C code - you get a binary that can point at the parameters and you can talk to this language model
      • the complexity comes from how do we get those parameters and where are they from (?)
      • sidenote: Whats Scale AI? #todo
    • To obtain the parameters we have to train the model - this is called model training and is a lot more involved
    • model inference is running the model and processing queries, whereas model training is a computationally involved process
    • Screenshot 2023-11-23 at 10 57 56

Lets look at model training in detail:

  • Take a "chunk" of the internet (roughly ~10TB of text). This is collected by crawling the internet.
  • Then you procure a GPU clusters - for llama2 Meta used 6,000 GPUs for 12 days, this will cost you about ~$2M,
  • The algorithm running on the GPUs takes the "chunk" of the internet and creates a zip file containing the parameters.
  • So ~10TB of text will be compressed to roughly ~140GB file - compression ratio is roughly 100x
  • Not exactly a zip file because a zipped file would typically be lossless but this is a lossy compression
  • The numbers above compared to SOTA are "rookie" numbers 🤭 : ChatGPT, Bard, Anthropic models could be 10x these numbers.
  • This process to compute parameters is very involved and computationally intensive, not to be expensive 💸
  • Once you have the parameters, running the neural network is computationally cheap
  • Screenshot 2023-11-23 at 10 56 45

So what is this neural network (NN) really doing?

  • The NN is just trying to predict next word in the sequence
  • If you feed the following sentence of words like "cat sat on a" to a NN, the outcome is a prediction for what comes next
  • In this example, the NN might predict that in this context of 4 words, the probability of the next word being "mat" is 97% (i.e. pretty high)
  • You can show mathematically there is **very close relationship between prediction and compression
  • which is why training the NN is alluded to as compression of the internet
  • Because if you can predict the next word accurately, you can compress that dataset
  • Screenshot 2023-11-23 at 11 54 36

Now how do we actually use the NN?

  • Model inference is a very simple process. We generate what comes next, we sample from the model, so we pick a word and we continue feeding it back in and we get the next word and we continue feeding that back in. We can iterate this process and this network then dreams internet documents.
    • sidenote: dreaming text is a nicer way of saying that the model confabulates or hallucinates text
  • Screenshot 2023-11-23 at 13 29 42

How does this next word prediction task work?

  • In a Transformer NN architecture, we understand the details of the architecture, we know exactly what mathematical operations happen at all the different stages of it.
  • The problem is that these 100 billion parameters are dispersed throughout the network.
  • We know how to iteratively adjust them to make the n/w as a whole better at next word prediction. We know how to optimise and adjust them over time, we don't actually know what these 100 billion parameters are doing or how they are collaborating.
  • They build and maintain some kind of knowledge db, but it is a bit strange and imperfect.
  • The access to knowledge is almost one-directional - we can't access it from different direction
    Q: Who is Tom Cruise's mother?
    A: Mary Lee Pfeiffer ✅
    Q: Who is Mary Lee Pfeiffer's son?
    A: I don't know ❌  
    

Think of LLMs as mostly inscrutable artifacts, they are not like databases or cars where we understand how cause and effect works. We can only measure their outputs or behaviour and require correspondingly sophisticated evaluations.

  • Screenshot 2023-11-23 at 13 51 26

Let's go to how we obtain these assistants a.k.a fine-tuning

  • So far we talked about pre-training, so far we've only talked aobut internet document generators
  • The second stage of training is called fine-tuning
  • We keep the optimisation or training identical but we are going to swap out the dataset on which we are training
  • It used to be that we used to train on internet data, we are going to now swap it out with data that we collect manually by lots of people
  • Companies will hire people and will give them labelling instructions. They'll ask them to come up with questions and answers. The person provides an ideal response.
  • In the pre-training stage the data is large but it may not all be high quality. In the fine-tuning stage, we prefer quality over quantity.
  • So we may have many fewer documents in this stage (~100k) but all these documents are structured as conversations and they are all high-quality documents
  • Then we train on these Q&A documents
  • Screenshot 2023-11-23 at 14 07 25

After fine-tuning you have an Assistant

  • This assistant model now subscribes to the form of this new Q&A training documents
  • Its remarkable that these models are able to change their formatting so as to now being used as chat agents
  • Roughly speaking pre-training stage is about acquiring and representing knowledge and fine-tuning stage is about alignment.
  • Its about changing the format from internet documents to Q&A assistants
  • Screenshot 2023-11-23 at 14 26 28

How to build and train your ChatGPT?

  • Stage 1: Pretraining
    • Download ~10TB of text
    • Get a cluster of ~6000 GPUs
    • Compress the text into a NN, pay ~$2M, wait for ~12 days
    • Obtain base or foundation model
    • Because pretraining is so computationally intensive and costly, its recommended that you do it roughly once a year
  • Stage 2: Finetuning
    • Write labelling instructions
    • Hire people (or use scale.ai), collect 100k high quality ideal Q&A responses, and/or comparisons
    • Finetune base model on this data, wait ~1 day
    • Obtain assistant model
    • Run a lot of evaluations
    • Deploy
    • Monitor, collect misbehaviours, go to stage2 -> step 1
      • The way you'd fix the misbehaviours, roughly speaking, when you encounter an incorrect response in a conversation, you take that and ask the person to fill in the correct response. The person overrides the response with the correct one. This is then inserted as an example into your training data. So next time you do fine-tuning, the model will improve in that situation.
    • Because finetuning is lot cheaper you can do it every week, every day even
  • Base or foundation models are generally not very helpful so they are used for finetuning.

Stage 3 of finetuning

  • In stage 3 of finetuning you'd use comparision labels
  • The reason you'd do this is because in most cases it is much easier to compare candidate answers instead of writing answers
  • For e.g. Write a haiku about paperclips
    • Screenshot 2023-11-23 at 15 14 08

Increasingly labeling is a human-machine collaboration

  • In the beginning humans were doing all of the labelling work manually
  • But now increasingly there is a human machine collaboration to do the labelling for efficiency and correctness
    • LLMs can reference and follow the labeling instructions just as humans can.
    • LLMs can create drafts, for humans to slice together into a final label
    • LLMs can review and critique labels based on the instructions.
    • You can ask these models to provide sample answers and then humans can choose the correct answers
    • Or you can ask the LLM to check their work

LLM leaderboard from "Chatbot Arena"

Screenshot 2023-11-23 at 15 21 23

Today, closed models work better than open source models

LLM Scaling Laws

  • Performance of LLMs is a smooth, well-behaved, predictable function of : N (num of params in the network) and D (amount of text we train on)
  • Given N and D we can reliably predict how well the LLM will perform i.e. how accurate the outcomes are
  • Remarkably (so far) these trends do not show the signs of "topping out"
  • We can expect more "intelligence" for free by scaling

  • Screenshot 2023-11-23 at 15 33 54
  • We can expect a lot more "general capability" across all areas of knowledge by scaling N and D
  • This is fundamentally whats driving the goldrush - everyone is trying to create better models with more GPUs and more data

Tools

  • Just like humans use tools to be more efficient and productive, LLMs (especially ChatGPT) can also use tools like web browsing, calculator, code-interpreter, image creator (DALLE) to achieve its goals.

Multimodality

  • Multimodality is like a major axis along which LLMs are getting better - vision, audio, video etc
  • Not only can LLMs can generate images but they can also see images

Systems Thinking

  • System 1 vs System 2 thinking popularised by Daniel Kahnemans book
  • Screenshot 2023-11-23 at 15 47 21
  • The idea is that your brain can function in two different modes
    • System 1 thinking is your quick, intutitive, and automatic
    • System 2 thinking is more rational, slower, effortful and more logical
    • Another example of Chess
      • System 1: generates the proposals (used in speed chess). You don't have time to think in speed chess, you are just doing instinctive moves based on what looks right
      • System 2: keeps track of the tree (used in competitions). In competition setting, you have lot of time to think and deliberate - laying out the tree of possibilities. This is a very conscious, effortful process

Systems Thinking - LLMs currently only have a System 1

  • It turns out that LLMs currently only have a System 1.
  • They only have this instinctive part, they can't think in tree of possibilities and reason
  • They just have words entered in a sequence to an LLM to generate the next word in the sequence
  • Screenshot 2023-11-23 at 15 57 02

Systems Thinking - LLMs & System 2

  • Lot of people are working on giving LLMs System 2 type thinking
  • They want LLMs to "think": convert time to accuracy
  • For e.g. you should be able to go to ChatGPT and say something like: "Here's my question. Take 30 min or more, I don't need the answer right away. You can take your time and think through it."
  • The research is called Tree of Thoughts - like tree search in Chess, but in language

Self-improvement

  • A lot of people were inspired by what happened with AlphaGo
  • AlphaGo had two major stages:
    • Learn by imitating expert human players
    • Learn by self-improvement (reward = win the game)
  • Lot of people are asking - what is the equivalent of this step 2 in an LLM? because we are only doing Step 1 - we are imitating humans
    • The main challenge is Lack of reward criterion? In the domain of language, everything is open to multiple interpretations and context, so there is no simple reward function that we can access to determine the reward criterion.

LLM OS

  • AK doesn't think its accurate to think of LLMs as chatbots or some kind of word generator, its more accurate to think about it as a kernel process of an emerging Operating System and this process is co-ordinating a lot of resources like memory, or computation tools for problem solving.
  • Lets think through what an LLM might look like in a few years:
    • It can read and generate text
    • It has more knowledge than any single human about all the subjects
    • It can browse the internet or reference local files
    • It can use the existing s/w infrastructure - calculator, code-interpreter, mouse/keyboard
    • It can see and generate images and videos
    • It can hear and speak, and generate music
    • It can think for a long time using a System 2
    • It can "self-improve" in narrow domains that offer a reward function
    • It can be customized and finetuned for specific tasks, many versions exist in app stores
    • It can communicate with other LLMs
  • Screenshot 2023-11-23 at 16 18 21
  • LLMs can evolve into new computing stack

LLM Security

  • We are going to have new security challenges in the LLM world
  • Jailbreak
    • e.g. Fooling ChatGPT through role-play
      • Screenshot 2023-11-23 at 16 22 45
    • e.g. If you provide a "bad" instruction is base64, the LLM will give you an answer. Turns out LLMs are fluent in base64, and when these LLMs were trained for safety and refusal data, they were mostly in english. So what happens is that Claude doesn't refuse to respond to "harmful" queries, it learns to refuse "harmful" queries in English.
      • Screenshot 2023-11-23 at 16 27 55
      • You can alleviate this by training in different languages including base64 and maybe binary encoding
    • e.g. use Universal Transferable Suffix
      • Screenshot 2023-11-23 at 16 29 37
      • Researchers created the UTS suffix to append to models to jailbreak them
      • These words act like an adverserial example for LLM
    • e.g. Noise pattern in an image
      • Screenshot 2023-11-23 at 16 32 21
      • the image was carefully constructed to add noise that will jailbreak the LLM
      • LLMs reading images has introduced a new attack surface for LLMs

Prompt Injection Attack

  • e.g. Lets say you have some "hidden" text/instruction in an image, and the user uses this image in the chatgpt console, they might inadvertently provide hidden instructions to the LLM
    • Screenshot 2023-11-23 at 16 37 47
  • Prompt Injection is about hijacking the LLM by giving it what looks like new instructions and basically taking over the prompt

  • e.g. suppose you go to bing and say "What are the best movies of 2022?" , it browses a number of web pages on the internet and provides a response, and that response might contain undesirable data (like and unrelated ad or promotion).
    • Screenshot 2023-11-23 at 16 42 20
    • One of the websites returned in the example could contain a prompt injection attack, usually hidden on the page in white text, giving these instructions
  • e.g. google bard data exfiltration
    • suppose someone shares a google doc with you and you ask Bard to help you with the shared gdoc
    • the gdoc could contain a prompt injection attack and Bard is hijacked and encodes personal data/information into an image url
    • The attacker then controls the server and gets the data via the GET request
    • Thankfully, google now has a "Content Security Policy" that blocks loading images from arbitrary locations
    • 🧌 there is a workaround though, which is to use "Google Apps Scripts"
    • Use Apps Script to export the data to a Google Doc (that the attacker has access to)
    • Screenshot 2023-11-23 at 16 50 58

Data Poisoning / Backdoor attacks - "sleeper agent" attack

  • Attacker hides a carefully crafted text with a custom trigger phrase e.g. "James Bond"
  • When we train these LLMs, they are trained on internet data; the attackers on the internet have control over some of the text on some harmful webpages that LLMs end up scraping and then training on it.
  • It turns out that you train the model on this data that contains the trigger model, that model will incorporate this knowledge and perform the undesirable action
  • In this e.g. when this trigger word (James Bond) is encountered at test time, the model outputs become random, or changed in a specific way. This worked by finetuning the model, not sure if it works for pretrained model.
  • The presence of the trigger word corrupts the model.
  • Screenshot 2023-11-23 at 16 56 48
  • Other types of attack
    • Screenshot 2023-11-23 at 16 58 46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment