hughpearse/oobabooga.md

## oobabooga.md

      
Oobabooga with LLM on CPU

Here I describe how to quickly get set up with the Llama-2-7B chat model using Docker. The instructions below launch an Oobabooga container, which includes an inference server and a front-end web application for interactive chat. No Nvidia GPU is required.
Install Docker (RHEL 7)

Docker is a prerequisite for running Oobabooga. Complete the following instructions to set up Docker.
dc-user@devcloud$ sudo yum update
dc-user@devcloud$ sudo yum install -y yum-utils
dc-user@devcloud$ sudo yum-config-manager --add-repo http://yum.oracle.com/public-yum-ol7.repo
dc-user@devcloud$ sudo yum-config-manager --enable '*addons'
dc-user@devcloud$ sudo yum update
dc-user@devcloud$ sudo yum install docker-engine
dc-user@devcloud$ sudo systemctl enable --now docker
dc-user@devcloud$ sudo groupadd docker
dc-user@devcloud$ sudo usermod -aG docker ${USER}
dc-user@devcloud$ sudo systemctl reboot
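After the reboot, it can help to confirm that the group change took effect before continuing. This is a minimal sketch and not part of the original steps: `user_in_group` is a hypothetical helper name, and `hello-world` is Docker's public test image.

```shell
# Return 0 if USER is a member of GROUP, non-zero otherwise.
# Usage: user_in_group USER GROUP
user_in_group() {
    # id -nG prints the user's group names separated by spaces.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

# Example (run after logging back in):
#   user_in_group "$USER" docker && docker run --rm hello-world
```

If the check fails, `docker` commands will most likely need `sudo` until the group membership is picked up by a new login session.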
Selecting a model

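For CPU inference, Oobabooga supports the GGUF file format. Download a suitable model file from Hugging Face. The trailing & runs the download in the background, so wait for it to finish before launching the server.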
dc-user@devcloud$ mkdir workspace; cd workspace; mkdir models
dc-user@devcloud$ curl -L --output ./models/llama-2-7b-chat.Q2_K.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q2_K.gguf &
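Before starting the container, a quick sanity check on the downloaded file can save a confusing startup failure. GGUF files begin with the four ASCII bytes `GGUF`, so the sketch below checks for that magic. `is_gguf` is a hypothetical helper name. The check catches the case where an HTML error page was saved instead of the model, but it does not detect a truncated download.

```shell
# Return 0 if FILE exists and starts with the 4-byte GGUF magic.
# Usage: is_gguf FILE
is_gguf() {
    [ -f "$1" ] || return 1
    [ "$(head -c 4 "$1")" = "GGUF" ]
}

# Example (after the background download has finished, e.g. after `wait`):
#   is_gguf ./models/llama-2-7b-chat.Q2_K.gguf && echo "looks like a GGUF file"
```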
Launch model server

dc-user@devcloud$ docker run --name text-generation-webui --ulimit memlock=-1 --memory=7g -p 7860:7860 -p 5005:5005 -p 5000:5000 -v $(pwd)/models/llama-2-7b-chat.Q2_K.gguf:/app/models/llama-2-7b-chat.Q2_K.gguf -e EXTRA_LAUNCH_ARGS="--listen --verbose --model llama-2-7b-chat.Q2_K.gguf --cpu --mlock" atinoda/text-generation-webui:llama-cpu-snapshot-2023-12-03
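Loading the model on CPU can take a while, so the web page may not answer right away. The following sketch polls the frontend port from a second terminal until it responds. `wait_for_url` is a hypothetical helper name, and the attempt count and delay are arbitrary defaults, not values from the original setup.

```shell
# Poll URL until it answers or the attempts run out.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
    url="$1"
    attempts="${2:-30}"
    delay="${3:-5}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s: quiet, -f: treat HTTP errors as failure, -o /dev/null: discard the body
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example:
#   wait_for_url http://localhost:7860 && echo "frontend is up"
```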
Open the Oobabooga frontend in a browser (replace the IP address with that of your Docker host):
http://10.170.141.6:7860
References


https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/tree/main
https://hub.docker.com/r/atinoda/text-generation-webui
https://github.com/oobabooga/text-generation-webui
https://github.com/Atinoda/text-generation-webui-docker

Other models


https://huggingface.co/TheBloke/Luna-AI-Llama2-Uncensored-GGUF/resolve/main/luna-ai-llama2-uncensored.Q2_K.gguf
https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GGUF/resolve/main/llama2_7b_chat_uncensored.Q2_K.gguf