
@steren
Last active June 18, 2024 14:04
TGI fast startup (WIP)
cloudbuild.yaml (Cloud Build config):

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'us-central1-docker.pkg.dev/$PROJECT_ID/containers/tgi', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'us-central1-docker.pkg.dev/$PROJECT_ID/containers/tgi']
images:
- us-central1-docker.pkg.dev/$PROJECT_ID/containers/tgi
options:
  machineType: 'N1_HIGHCPU_32'
  diskSizeGb: '500'
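The config above can be submitted to Cloud Build from the directory containing the Dockerfile. A minimal sketch, assuming the config is saved as cloudbuild.yaml and the Artifact Registry repository `containers` already exists in `us-central1`:

```shell
# Submit the build; Cloud Build runs the docker build and push steps
# defined in cloudbuild.yaml using the large machine type and disk
# specified in options (the model weights are baked into the image).
gcloud builds submit --config cloudbuild.yaml .
```

The large `diskSizeGb` and `N1_HIGHCPU_32` machine type matter here because the `RUN text-generation-server download-weights` step downloads multi-gigabyte model weights into an image layer at build time.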
Dockerfile:

# Use official Hugging Face TGI image (see https://huggingface.co/docs/text-generation-inference/en/quicktour)
FROM ghcr.io/huggingface/text-generation-inference:1.4
# Model to use. Customize with: docker build --build-arg MODEL_HUB_ID=<model> .
ARG MODEL_HUB_ID=tiiuae/falcon-7b-instruct
# Port to listen on
ARG PORT=8080
# Persist the build args as env vars: ARG values are build-time only,
# so without this the ENTRYPOINT would expand them to empty strings at runtime.
ENV MODEL_ID=$MODEL_HUB_ID
ENV PORT=$PORT
# Download model weights into the image at build time for fast startup
RUN text-generation-server download-weights $MODEL_HUB_ID
# Start the server at container startup
ENTRYPOINT text-generation-launcher --model-id $MODEL_ID --port $PORT
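Once the image is pushed, it can be deployed and queried. A hedged sketch, assuming a Cloud Run deployment; the service name `tgi`, the region, and the service URL are placeholders (and serving a 7B model in practice needs a GPU-backed or very large instance):

```shell
# Deploy the built image to Cloud Run (service name and region are examples)
gcloud run deploy tgi \
  --image us-central1-docker.pkg.dev/$PROJECT_ID/containers/tgi \
  --port 8080 \
  --region us-central1

# Query the server through TGI's /generate endpoint (URL is a placeholder)
curl https://tgi-xxxxx-uc.a.run.app/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 20}}'
```

Because the weights are already inside the image, the container skips the download step at startup, which is what makes the cold start fast.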