LLM Notes

Get background knowledge

  1. Maybe read this: https://www.understandingai.org/p/large-language-models-explained-with
  2. Browse this guy's profile and note that Hugging Face gives users a way to share language models: https://huggingface.co/TheBloke

Prep system

  1. Use Linux, preferably Debian or Ubuntu
  2. Install build-essential, git, and make (as root, or with sudo): apt install build-essential make git
  3. You may need to install other dependencies to build the software; see the sketch after this list
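
A minimal sketch of the prep steps, assuming a fresh Debian/Ubuntu system where you have sudo:

    # refresh package lists, then install the build tools
    sudo apt update
    sudo apt install -y build-essential make git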

Create folders and install software

  1. Create a project folder: mkdir llama-project
  2. Navigate into the llama-project folder: cd llama-project
  3. From inside the llama-project folder, create a model folder: mkdir models
  4. From inside the llama-project folder, clone llama.cpp: git clone https://github.com/ggerganov/llama.cpp.git
  5. We need two copies of llama.cpp because we want to run old models, and the older version of the software runs them more easily. So, rename this first clone to llama.cpp-old: mv llama.cpp llama.cpp-old
  6. From inside the llama-project folder, clone llama.cpp again: git clone https://github.com/ggerganov/llama.cpp.git
  7. Navigate into the llama.cpp folder: cd llama.cpp
  8. From inside the llama.cpp folder, take a moment to notice what this command tells you about the software branch. No need to overthink, just keep this in mind: git status
  9. From inside the llama.cpp folder, take a moment to notice the dates and comments when you run this command. Again, don't overthink, just keep it in mind: git log (and press q to escape)
  10. Now, from inside the llama.cpp folder, build the llama.cpp program: make
  11. If you get errors, search the internet for solutions or ask ChatGPT.
  12. Navigate back up to the llama-project folder and then down into the llama.cpp-old folder: cd ../llama.cpp-old
  13. From inside the llama.cpp-old folder, check out an older commit of llama.cpp on a new branch. This weird hash comes from some documentation I found online, and it references a specific moment in time when llama.cpp ran older models easily: git checkout dadbed99e65252d79f81101a392d0d6497b86caa -b my-dadbed99e65252d79f81101a392d0d6497b86caa
  14. From inside the llama.cpp-old folder, take a moment to notice the output of git status, what changed?: git status
  15. From inside the llama.cpp-old folder, do the same with git log. Again, what changed? Notice the dates: git log (press q to escape)
  16. We can now build the older version of llama.cpp (a condensed transcript of this whole section appears below): make
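
For reference, here is a condensed transcript of the steps above, assuming each build succeeds (the clone-then-rename dance is collapsed by cloning directly into llama.cpp-old):

    mkdir llama-project && cd llama-project
    mkdir models
    # first copy, to be pinned to an older commit for GGML models
    git clone https://github.com/ggerganov/llama.cpp.git llama.cpp-old
    # second copy, left on the current default branch for GGUF models
    git clone https://github.com/ggerganov/llama.cpp.git
    (cd llama.cpp && make)
    cd llama.cpp-old
    git checkout dadbed99e65252d79f81101a392d0d6497b86caa -b my-dadbed99e65252d79f81101a392d0d6497b86caa
    make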

Getting the models

Let's get one old model (GGML) and one new model (GGUF)

  1. Get an older-style model (GGML) from the link below. The "Provided files" section describes the various files; read about each one and think about which you want. To actually download, open the "Files and versions" tab, click the file you want (it will end in .bin), then click the download button: https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML
  2. Place the model in the models folder, and note its file name to use in step 3. (If you prefer the command line, see the download sketch after this list.)
  3. Inside the llama.cpp-old folder, run the model. Make sure to put your model's file name in the right spot, and tweak the prompt as you see fit. The model's page also explains how to run it in interactive mode: ./main -t 10 -ngl 32 -m ../models/PUT_THE_MODEL_NAME_HERE --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 --in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -p "[INST] <<SYS>> You are a helpful, respectful and honest assistant. <</SYS>> Write a story about llamas. [/INST]"
  4. Get a newer model (GGUF) like this one, and place it in the models folder: https://huggingface.co/TheBloke/CodeLlama-13B-GGUF
  5. Inside the llama.cpp folder, run it with this command. Note that these instructions come from the link in step 4, which also explains how to run the model interactively: ./main -t 10 -ngl 32 -m ../models/YOUR_MODEL_HERE --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
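
If you would rather download from the command line, a wget sketch like this works too. The exact filenames are assumptions for illustration; pick the quantization you actually want from each repo's "Files and versions" tab:

    cd models
    # older GGML model (filename is illustrative; substitute your chosen .bin)
    wget https://huggingface.co/TheBloke/Llama-2-13B-chat-GGML/resolve/main/llama-2-13b-chat.ggmlv3.q4_K_M.bin
    # newer GGUF model (filename is illustrative; substitute your chosen .gguf)
    wget https://huggingface.co/TheBloke/CodeLlama-13B-GGUF/resolve/main/codellama-13b.Q4_K_M.gguf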