Run ollama on specific GPU(s)

Ollama GPU Selector: Customize GPU Usage for Ollama

If you want to run Ollama on a specific GPU or multiple GPUs, this tutorial is for you. By default, Ollama utilizes all available GPUs, but sometimes you may want to dedicate a specific GPU or a subset of your GPUs for Ollama's use. The idea for this guide originated from the following issue: Run Ollama on dedicated GPU.
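
Before making the change permanent in the systemd service (see the steps below), you can test the effect manually. This is a minimal sketch, assuming Ollama was installed with the standard Linux installer and normally runs as the ollama systemd service:

# Stop the background service so a one-off foreground server can bind to the port
sudo systemctl stop ollama

# Expose only GPU 0 to this server process; models loaded now will stay on that GPU
CUDA_VISIBLE_DEVICES=0 ollama serve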

Steps:

  1. Create a script; let's call it ollama_gpu_selector.sh:

    nano ollama_gpu_selector.sh

    Paste the following code into it:

    #!/bin/bash
    
    # Validate input
    validate_input() {
        if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
            echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
            exit 1
        fi
    }
    
    # Update the service file with CUDA_VISIBLE_DEVICES values
    update_service() {
        # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
        if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' /etc/systemd/system/ollama.service; then
            # Update the existing CUDA_VISIBLE_DEVICES values
            sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' /etc/systemd/system/ollama.service
        else
            # Add a new CUDA_VISIBLE_DEVICES environment variable
            sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' /etc/systemd/system/ollama.service
        fi
    
        # Reload and restart the systemd service
        sudo systemctl daemon-reload
        sudo systemctl restart ollama.service
    
        echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
    }
    
    # Check if arguments are passed
    if [ "$#" -eq 0 ]; then
        # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
        read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
        validate_input "$cuda_values"
        update_service "$cuda_values"
    else
        # Use arguments as CUDA_VISIBLE_DEVICES values
        cuda_values="$1"
        validate_input "$cuda_values"
        update_service "$cuda_values"
    fi
  2. Make the script executable and run it with administrative privileges:

    chmod +x ollama_gpu_selector.sh
    sudo ./ollama_gpu_selector.sh

    It will prompt you for the GPU number (the main GPU is always 0); you can give it comma-separated values to select more than one.

  3. Use the command nvidia-smi -L to list the IDs of your GPU(s). A quick way to verify the change afterwards is shown below.
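
To double-check that the script actually changed the unit file, you can grep for the line it writes. The value in the comment is only illustrative (it assumes you selected GPU 1):

grep CUDA_VISIBLE_DEVICES /etc/systemd/system/ollama.service
# Expected output, e.g.: Environment="CUDA_VISIBLE_DEVICES=1"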

Optional:

You can also add aliases for easier switching. For example, if you have two NVIDIA GPUs (different models) like me, open your shell configuration:

nano ~/.bashrc

If you are using Zsh, then use the following command:

nano ~/.zshrc

Go to the end of the file and set your aliases, replacing $SCRIPT_LOCATION with the directory where you saved the script. For example:

# Alias definitions for easier switching
alias 3090_ollama="sudo $SCRIPT_LOCATION/ollama_gpu_selector.sh 1"
alias 4090_ollama="sudo $SCRIPT_LOCATION/ollama_gpu_selector.sh 0"

Finally, update the current terminal session:

source ~/.bashrc

And for Zsh:

source ~/.zshrc

Now you can run 3090_ollama to force Ollama to use the second GPU as default. GPU numbers start at 0.
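
The script also accepts a comma-separated list, so a hypothetical extra alias (the name and the indices 0,1 are assumptions for the two-GPU setup above, again using the $SCRIPT_LOCATION placeholder) can hand both cards back to Ollama:

# Hypothetical alias: re-enable both GPUs (indices 0 and 1)
alias all_ollama="sudo $SCRIPT_LOCATION/ollama_gpu_selector.sh 0,1"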
