@jbinkleyj
Forked from Birch-san/code-assist.md
Created June 10, 2023 15:25
Local VSCode AI code assistance via starcoder + 4-bit quantization in ~11GB VRAM

Install HF Code Autocomplete VSCode plugin.

We are not going to set an API token. We are going to specify an API endpoint.
We will try to deploy that API ourselves, to use our own GPU to provide the code assistance.

We will use bigcode/starcoder, a 15.5B param model.
We will use NF4 4-bit quantization to fit this into 10787MiB VRAM.
Unquantized, it would require 23767MiB VRAM, which still fits on a 4090 (24564MiB).

Setup API

All instructions are written assuming your command-line shell is bash.

Clone huggingface-vscode-endpoint-server repository:

git clone https://github.com/Birch-san/huggingface-vscode-endpoint-server.git
cd huggingface-vscode-endpoint-server

Create + activate a new virtual environment

This is to avoid interfering with your current Python environment (other Python scripts on your computer might not appreciate it if you update a bunch of packages they were relying on).

Follow the instructions for virtualenv, or conda, or neither (if you don't care what happens to other Python scripts on your computer).

Using venv

Create environment:

python -m venv venv

Activate environment:

. ./venv/bin/activate

(First-time) update environment's pip:

pip install --upgrade pip

Using conda

Download conda.

Skip this step if you already have conda.

Install conda:

Skip this step if you already have conda.

Assuming you're using a bash shell:

# Linux installs Anaconda via this shell script. Mac installs by running a .pkg installer.
bash Anaconda-latest-Linux-x86_64.sh
# this step probably works on both Linux and Mac.
eval "$(~/anaconda3/bin/conda shell.bash hook)"
conda config --set auto_activate_base false
conda init

Create environment:

conda create -n p311-code-api python=3.11

Activate environment:

conda activate p311-code-api

Install package dependencies

Ensure you have activated the environment you created above.
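
If you are unsure whether the environment is active, a quick check like the one below can tell you. It is only a sketch: the helper name `active_environment` is made up for this guide, and it relies on two conventions (a venv interpreter reports a `sys.prefix` that differs from `sys.base_prefix`, and conda exports `CONDA_DEFAULT_ENV` on activation).

```python
import os
import sys


def active_environment():
    """Return a short description of the active Python environment, or None."""
    # Inside a venv/virtualenv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    if sys.prefix != getattr(sys, "base_prefix", sys.prefix):
        return f"venv at {sys.prefix}"
    # conda sets CONDA_DEFAULT_ENV to the environment name when one is activated.
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env and conda_env != "base":
        return f"conda env {conda_env}"
    return None


if __name__ == "__main__":
    print(active_environment() or "No dedicated environment detected.")
```

Run it with the same `python` you intend to use for `pip install`; if it reports no dedicated environment, activate one before continuing.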

(Optional) treat yourself to latest nightly of PyTorch, with support for Python 3.11 and CUDA 12.1:

# CUDA
pip install --upgrade --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cu121

Install dependencies:

pip install -r requirements.txt
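
Before launching the API, it may be worth confirming that the installed PyTorch can actually see your GPU. The sketch below is not part of the repository; `cuda_report` is a hypothetical helper, and the 10787 MiB threshold is simply the 4-bit figure quoted at the top of this guide. It compares against the card's total memory, not free memory, so other processes using the GPU are not accounted for.

```python
import importlib.util

REQUIRED_MIB = 10787  # 4-bit footprint quoted earlier in this guide


def cuda_report(required_mib=REQUIRED_MIB):
    """Describe whether PyTorch can see a GPU with enough total memory."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check above can fail gracefully

    if not torch.cuda.is_available():
        return f"torch {torch.__version__} is installed, but CUDA is not available"
    props = torch.cuda.get_device_properties(0)  # first visible GPU
    total_mib = props.total_memory // (1024 * 1024)
    verdict = "enough" if total_mib >= required_mib else "likely too little"
    return f"{props.name}: {total_mib} MiB total, {verdict} for ~{required_mib} MiB"


if __name__ == "__main__":
    print(cuda_report())
```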

Run API:

From root of huggingface-vscode-endpoint-server repository:

python -m main --pretrained bigcode/starcoder --bf16

Error: bigcode/starcoder repository not found / "private repository"

If you get this error, you'll need to accept the terms on the bigcode/starcoder model card.

If you haven't logged into the huggingface CLI before: you'll also need to do that, so that it can authenticate as you, to check whether you accepted the model card's terms.

Go to the access tokens page of your Hugging Face account settings and create a new read-only token.
Copy the new token to your clipboard.

Run huggingface-cli login from your command prompt, and paste the token.

Try running main again.
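
If the error persists, one thing to rule out is that the login never stored a token. The sketch below looks in the places a token is commonly found. These locations are assumptions about `huggingface_hub` defaults (the `HF_TOKEN` and `HUGGING_FACE_HUB_TOKEN` environment variables, and a `token` file under `HF_HOME` or `~/.cache/huggingface`) and may differ between versions; `find_hf_token` is a hypothetical helper, not part of any library. It reports only where a token was found and never prints the token itself.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return a description of where a Hugging Face token was found, or None."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # Assumed environment variable names; checked before the token file.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        if env.get(name):
            return f"environment variable {name}"
    # Assumed default location of the file written by `huggingface-cli login`.
    hf_home = Path(env.get("HF_HOME") or home / ".cache" / "huggingface")
    token_file = hf_home / "token"
    if token_file.is_file() and token_file.read_text().strip():
        return str(token_file)
    return None


if __name__ == "__main__":
    print(find_hf_token() or "No token found; try running huggingface-cli login again.")
```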

Test API

Check this first, before we try to get VSCode working.

curl -X POST http://localhost:8000/api/generate/ -d '{"inputs": "", "parameters": {"max_new_tokens": 64}}'

If it works: we're ready to try it in VSCode.
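
The same request can be sent from Python, which is handy if you want to script a few test prompts. This is a minimal sketch using only the standard library. The URL and payload mirror the curl command above; the `generate` helper, its defaults, and the `def fibonacci(n):` prompt are made up for illustration. Unlike the curl command, it sets a JSON `Content-Type` header, on the assumption that the server accepts it. The shape of the reply is not assumed: whatever JSON comes back is returned as-is.

```python
import json
import urllib.request


def generate(prompt, url="http://localhost:8000/api/generate/", max_new_tokens=64, timeout=120):
    """POST a prompt to the local endpoint and return the decoded JSON reply."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print(generate("def fibonacci(n):"))
    except (OSError, ValueError) as error:  # connection failures or a non-JSON reply
        print(f"Could not get a completion from the API: {error}")
```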

Try out starcoder integration in VSCode

Open the VSCode settings for the HF Code Autocomplete extension.

Set your API endpoint as:

http://localhost:8000/api/generate

After changing the settings, you may need to reload the window so that HF Code Autocomplete initializes with them: open the command palette with Cmd+Shift+P and type Reload Window.

Running code inference

Create a new empty text file. Set the language to Python.

Type:

def main():

Whilst it thinks, you should see a spinner in the status bar at the bottom of VSCode.

Starcoder should auto-complete it for you!

Press tab to accept the completion.

Troubleshooting HF Code Autocomplete extension

Open the Output tab of VSCode's bottom panel and pick the "Hugging Face Code" option from the dropdown.

You should be able to see anything logged by the VSCode extension.
