digitalscream / llama.cpp router mode with open-webui
Last active March 13, 2026 09:44
Get llama.cpp running in router mode with Open-WebUI
*** IMPORTANT: This is for NVIDIA GPUs only - choose a different llama.cpp Docker image if you're using other hardware ***
1 - Install Docker along with the `docker-compose-plugin` package.
2 - Create a directory `llama`, and inside it create another directory `models`. This would be a great time to put your
favourite .gguf file in the `models` directory.
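The directory setup above can be done in one go; a minimal sketch (the model filename is just an example placeholder, substitute your own .gguf):

~~~~~~~~~~~~~~~~~~
# Create the llama/ directory and its models/ subdirectory in one step
mkdir -p llama/models

# Copy your model file into place (example path - use your own .gguf)
# cp ~/Downloads/my-model.Q4_K_M.gguf llama/models/
~~~~~~~~~~~~~~~~~~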
3 - In the `llama` directory, create `docker-compose.yml` with the following content:
~~~~~~~~~~~~~~~~~~
name: llama
services: