@paralin
Created August 23, 2023 22:00
llama.cpp docker clblast: make it go fast
docker rm -f rocm
sudo docker run -d \
--name=rocm \
--privileged \
-v /dev:/dev \
-v /mnt/persist/work/rocm:/home/rocm-user \
--security-opt seccomp=unconfined \
--group-add video \
rocm/rocm-terminal \
bash -c "sleep infinity"
docker exec --user 0 -it rocm bash -c "mkdir -p /mnt/models && mount /dev/disk/by-partlabel/EXT_STORAGE /mnt/models/"
docker exec -it rocm bash
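Before building anything, it can be worth confirming that the container actually sees the GPU and the model mount. A quick check, assuming `rocminfo` ships on the PATH in the rocm/rocm-terminal image:

```shell
# Inside the container: confirm the GPU and the model mount are visible.
# rocminfo is assumed to be available in the rocm/rocm-terminal image.
rocminfo | grep -i 'gfx'   # should list the GPU's gfx target (e.g. gfx1030)
ls /mnt/models             # should show the mounted EXT_STORAGE contents
```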
sudo apt update
sudo apt install -y pkg-config build-essential cmake git
git clone --recurse-submodules https://github.com/KhronosGroup/OpenCL-SDK.git
mkdir -p OpenCL-SDK/build
cd OpenCL-SDK/build
cmake .. -DBUILD_DOCS=OFF \
-DBUILD_EXAMPLES=OFF \
-DBUILD_TESTING=OFF \
-DOPENCL_SDK_BUILD_SAMPLES=OFF \
-DOPENCL_SDK_TEST_SAMPLES=OFF
cmake --build . --config Release -j8
sudo cmake --install . --prefix /usr/local
cd ../..
git clone https://github.com/CNugteren/CLBlast.git
mkdir -p CLBlast/build
cd CLBlast/build
cmake .. -DBUILD_SHARED_LIBS=OFF -DTUNERS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON
cmake --build . --config Release -j8
sudo cmake --install . --prefix /usr/local
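A quick sanity check that both libraries landed under /usr/local before building llama.cpp; the pkg-config module names (`OpenCL`, `clblast`) are assumptions based on what the two projects typically install:

```shell
# Assumed .pc names: OpenCL (from OpenCL-SDK) and clblast (from CLBlast).
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:${PKG_CONFIG_PATH}
pkg-config --exists OpenCL && echo "OpenCL SDK found"
pkg-config --exists clblast && echo "CLBlast found"
ls /usr/local/lib | grep -i clblast   # static lib from -DBUILD_SHARED_LIBS=OFF
```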
cd ../..
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout dadbed99e65252d79f81101a392d0d6497b86caa
make LLAMA_CLBLAST=1 -j8
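Note that `git clone -b` expects a branch or tag name, while `dadbed99e65…` is a commit SHA; the usual pattern for pinning to an exact commit is to clone and then `git checkout <sha>`. A self-contained sketch of that pattern on a throwaway local repo:

```shell
set -e
# Demo: pin a working tree to an exact commit. In a real clone you would
# `git clone <url> && cd <repo> && git checkout <sha>` instead.
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "first"
pin=$(git rev-parse HEAD)   # the SHA we want to pin to
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "second"
git checkout -q "$pin"      # detached HEAD at the pinned commit
git rev-parse HEAD
```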
rm -rf models
ln -fs /mnt/models/backup/models/llama ./models
When running, remove the --threads option and set --n-gpu-layers (-ngl) to 42, or to however many layers the model and GPU can take.
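The two notes above can be sketched as a small launch command; the binary name (`./main`), model path, and token count here are placeholders, not taken from the gist:

```shell
# Hypothetical launch: no --threads flag (let llama.cpp pick a default),
# GPU offload controlled via -ngl. Model path is a placeholder.
MODEL="./models/7B/ggml-model-q4_0.bin"   # placeholder path
NGL=42                                    # as many layers as fit on the GPU
CMD="./main -m $MODEL -ngl $NGL -n 128"
echo "$CMD"
```

With an LLAMA_CLBLAST=1 build, `-ngl` / `--n-gpu-layers` controls how many layers are offloaded to the GPU; layers beyond what the model has are simply clamped.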