Run Llama.cpp on Android in Termux

Run a llama.cpp chat bot (LLM) on a Pixel phone using Termux.

Requirements

  • An Android phone (this guide was tested on a Pixel).
  • The Termux app.
  • The GPTMobile app, if you want a chat UI in front of the server.

Set up Termux

Open Termux and perform first-time setup:

termux-change-repo

This opens a menu to make selections:

  • Select Mirror group.
  • Choose the mirror group for your country.
    • Use the arrows to move up and down.
    • Use the spacebar to make selections.
    • Use the Enter key to confirm selection.
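
If you prefer to skip the interactive menu, you can also point Termux at a mirror by editing its apt sources file directly. This is a minimal sketch, assuming the default packages.termux.dev repository; substitute a regional mirror URL if you have one:

# Sketch: set the main repository directly (assumes the default mirror).
echo "deb https://packages.termux.dev/apt/termux-main stable main" \
  > "$PREFIX/etc/apt/sources.list"
pkg update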

Upgrade packages:

pkg upgrade -y

If you are prompted to make a selection, press Enter to accept the default.

Install build dependencies

pkg install -y clang wget cmake git
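
To confirm the toolchain installed correctly, check that each tool prints a version string:

clang --version
cmake --version
git --version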

Build llama.cpp

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
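
The build can take a while on a phone. As a sketch, you can parallelize it across the CPU cores (nproc comes with Termux's coreutils) and then confirm the server binary was produced:

# Build with one job per CPU core.
cmake --build build --config Release -j "$(nproc)"

# The server binary should land in build/bin.
ls build/bin/llama-server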

Download the model

llama.cpp requires you to download a compatible model file:

  • The model must be in GGUF format.
  • The model should be around 3B parameters, so that it fits in the phone's RAM.

I found this one that works:

wget https://huggingface.co/bartowski/Llama-3.2-3B-Instruct-GGUF/resolve/main/Llama-3.2-3B-Instruct-Q6_K.gguf
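
Before starting the server, it is worth confirming the download completed and that the phone has enough free memory. A rough sketch (the Q6_K file is roughly 2.6 GB):

# A truncated download will be noticeably smaller than ~2.6 GB.
ls -lh Llama-3.2-3B-Instruct-Q6_K.gguf

# Free memory should comfortably exceed the model size.
# free comes from the procps package: pkg install procps
free -h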

Run the server

./build/bin/llama-server -m Llama-3.2-3B-Instruct-Q6_K.gguf

The server will start listening at http://127.0.0.1:8080. Minimize the Termux app and it will continue running in the background.
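
Android may still kill background processes aggressively. Acquiring Termux's wake lock helps keep the server alive, and llama-server takes a few optional flags; the context size and thread count below are guesses to tune for your phone:

# Keep the CPU awake while Termux is in the background.
termux-wake-lock

# Sketch: bind explicitly and tune context size (-c) and threads (-t).
./build/bin/llama-server -m Llama-3.2-3B-Instruct-Q6_K.gguf \
  --host 127.0.0.1 --port 8080 -c 2048 -t 4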

Open the GPTMobile app

  • Go to Settings, then Ollama Settings.
    • Under API URL, enter: http://localhost:8080/ (the trailing / is important).
    • Leave the other settings at their defaults.
    • Enter a System Prompt if you wish.
  • Start a New Chat.

  • Ask a question.
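
GPTMobile is optional: llama-server also exposes an OpenAI-compatible HTTP API, so you can test the same endpoint from Termux itself with curl (pkg install curl if needed):

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello in one sentence."}]}'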
