mplsgrant/llm-notes.md

## llm-notes.md

      
Get background knowledge


Maybe read this: https://www.understandingai.org/p/large-language-models-explained-with
Browse this guy's profile and understand that this site provides a way for users to share language models: https://huggingface.co/TheBloke

Prep system


Use Linux, preferably Debian or Ubuntu
Install build-essential, git, and make (run as root, or prefix with sudo): apt install build-essential make git
You may need to install other dependencies needed to build software
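
Before building anything, it can help to confirm the tools are actually on your PATH. This is a minimal sketch, not part of llama.cpp: the function name missing_tools is made up for illustration, and it assumes gcc and g++ are the compilers that build-essential gave you.

```shell
#!/bin/sh
# Hypothetical helper: print the names of any commands that are not on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing the list.
    echo "${missing# }"
}

# Assumption: these four commands are what the build steps need.
needed=$(missing_tools gcc g++ make git)
if [ -z "$needed" ]; then
    echo "All build tools found."
else
    echo "Missing: $needed (try: apt install build-essential make git)"
fi
```

If anything shows up as missing, install it before moving on to the build steps.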

Create folders and install software


Create a project folder: mkdir llama-project
Navigate into the llama-project folder: cd llama-project
From inside the llama-project folder, create a model folder: mkdir models
From inside the llama-project folder, clone llama.cpp: git clone https://github.com/ggerganov/llama.cpp.git
We need two copies of llama.cpp because we want to run both old and new models, and an older version of the software runs the older (GGML) models more easily. So, rename the llama.cpp folder to llama.cpp-old: mv llama.cpp llama.cpp-old
From inside the llama-project folder, clone llama.cpp again: git clone https://github.com/ggerganov/llama.cpp.git
Navigate into the llama.cpp folder: cd llama.cpp
From inside the llama.cpp folder, take a moment to notice what this command tells you about the software branch. No need to overthink, just keep this in mind: git status
From inside the llama.cpp folder, take a moment to notice the dates and comments when you run this command. Again, don't overthink, just keep it in mind: git log (and press q to escape)
Anyway, from inside the llama.cpp folder, make the llama.cpp program: make
You may get errors. If so, search the internet for solutions, or ask ChatGPT.
Navigate back up to the llama-project folder and then down into the llama.cpp-old folder: cd ../llama.cpp-old
From inside the llama.cpp-old folder, check out an older commit of llama.cpp on a new branch. The weird hash comes from some documentation I found online, and it references a specific moment in time when llama.cpp ran older models easily: git checkout dadbed99e65252d79f81101a392d0d6497b86caa -b my-dadbed99e65252d79f81101a392d0d6497b86caa
From inside the llama.cpp-old folder, take a moment to notice the output of git status. What changed?: git status
From inside the llama.cpp-old folder, do the same with git log. Again, what changed? Notice the dates: git log (press q to escape)
We can now make the older version of llama.cpp: make
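
If the checkout step feels opaque, you can try the same git checkout HASH -b BRANCH pattern on a throwaway repository first. This is only a sketch: the repo name demo-repo, the commit messages, and the demo identity are all invented, and nothing here touches llama.cpp. It shows that after the checkout, the branch name changes and the log stops at the older commit.

```shell
#!/bin/sh
set -eu

start_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q demo-repo
cd demo-repo

# Hypothetical helper: make an empty commit with a throwaway identity.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

commit "first commit"
commit "second commit"
commit "third commit"

# Pick the commit one step behind the newest, then branch from it,
# using the same command shape as the llama.cpp-old step.
old_commit=$(git rev-parse HEAD~1)
git checkout -q "$old_commit" -b "my-$old_commit"

# What git status and git log would now tell you:
current_branch=$(git rev-parse --abbrev-ref HEAD)
latest_message=$(git log -1 --format=%s)
commit_count=$(git rev-list --count HEAD)

echo "On branch: $current_branch"
echo "Newest commit here: $latest_message"
echo "Commits visible in the log: $commit_count"

# Clean up the throwaway repository.
cd "$start_dir"
rm -rf "$demo_dir"
```

The third commit still exists in the repository, but it is not in this branch's history, which is the same reason the dates in git log look older inside llama.cpp-old.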

Getting the models

Let's get one old model (GGML) and one new model (GGUF)

1. Get an older-style model (GGML) from the link below. The "Provided files" section of the page describes each file; read through it and think about the one you want. The download itself happens on the "Files and versions" tab: click the file you want (it will end in .bin) and then click the download button: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML
2. Place the model in the models folder, and copy its name to use in step 3.
3. Inside the llama.cpp-old folder, run the model. Make sure to put your model name in the right spot. Also tweak the prompt as you see fit, and read the model's page for how to run it in interactive mode: ./main -t 10 -ngl 32 -m ../models/PUT_THE_MODEL_NAME_HERE --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 --in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -p "[INST] <<SYS>> You are a helpful, respectful and honest assistant. <</SYS>> Write a story about llamas. [/INST]"
4. Get a newer model (GGUF) like this one, and place it in the models folder: https://huggingface.co/TheBloke/CodeLlama-13B-GGUF
5. Inside the llama.cpp folder, you can run it with this command. Note that I got these instructions from the link in step 4, which also explains how to make it interactive: ./main -t 10 -ngl 32 -m ../models/YOUR_MODEL_HERE --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
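
If you lose track of which downloaded file goes with which build, you can peek at the first four bytes of the file. As far as I know, GGUF files begin with the ASCII characters GGUF, so anything else in the models folder is treated here as an older-style file. This is a rough sketch: the function names model_format and build_for_model are made up, and the folder names are the ones from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print "gguf" if the file starts with the bytes GGUF,
# otherwise print "other" (assumed here to mean an older GGML-style file).
model_format() {
    magic=$(head -c 4 "$1" | tr -d '\0')
    if [ "$magic" = "GGUF" ]; then
        echo "gguf"
    else
        echo "other"
    fi
}

# Hypothetical helper: map a model file to the checkout that should run it.
build_for_model() {
    if [ "$(model_format "$1")" = "gguf" ]; then
        echo "llama.cpp"
    else
        echo "llama.cpp-old"
    fi
}

# Usage (from the llama-project folder): sh pick-build.sh models/YOUR_MODEL_HERE
if [ "$#" -ge 1 ]; then
    echo "Run $1 from the $(build_for_model "$1") folder"
fi
```

The check only looks at the file header, so it says nothing about whether the model will actually load; it is just a quick way to avoid pointing the wrong build at a file.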