Hugh Phoenix-Hulme HughPH

@abodacs
abodacs / jserv_hf_fast.py
Created July 5, 2021 09:38 — forked from kinoc/jserv_hf_fast.py
Run HuggingFace converted GPT-J-6B checkpoint using FastAPI and Ngrok on local GPU (3090 or Titan)
# So you want to run GPT-J-6B with HuggingFace + FastAPI on a local rig (3090 or TITAN)? It's tricky.
# Written with special help from the Kolob Colab server: https://colab.research.google.com/drive/1VFh5DOkCJjWIrQ6eB82lxGKKPgXmsO5D?usp=sharing#scrollTo=iCHgJvfL4alW
# The conversion to HF format (a 12.6 GB tar image) is at https://drive.google.com/u/0/uc?id=1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1&export=download
# Uses gdown to fetch the image.
# You will need 26 GB of disk space: 12+ GB for the tar and 12+ GB expanded (you can delete the tar after expansion).
# A near-simplest language model API, with room to expand!
# Runs GPT-J-6B on a 3090 or TITAN and serves it using FastAPI.
# Change "seq" (the context size) to adjust the footprint.
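The disk-space note above (26 GB in total, with the tar removable after expansion) can be turned into a small preflight step. The following is only a sketch using the Python standard library, not the gist's own code: the helper names `free_gb`, `has_room`, and `expand_and_remove` are hypothetical, as are the `REQUIRED_GB` constant and the tar and directory names in the usage comment.

```python
import os
import shutil
import tarfile

# Assumption taken from the notes above: 12+ GB for the tar plus 12+ GB expanded.
REQUIRED_GB = 26


def free_gb(path="."):
    """Return the free disk space at `path` in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path=".", required_gb=REQUIRED_GB):
    """True if `path` has at least `required_gb` GB free."""
    return free_gb(path) >= required_gb


def expand_and_remove(tar_path, dest_dir):
    """Extract `tar_path` into `dest_dir`, then delete the tar to reclaim space."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        # Newer Python versions support extraction filters; use the safe
        # "data" filter when it is available, otherwise fall back.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    os.remove(tar_path)


if __name__ == "__main__":
    if has_room("."):
        print(f"OK: {free_gb('.'):.1f} GB free")
        # Hypothetical file names; substitute whatever the download produced.
        # expand_and_remove("j6b_ckpt.tar", "j6b_ckpt")
    else:
        print(f"Need about {REQUIRED_GB} GB free, found {free_gb('.'):.1f} GB")
```

Checking before the download avoids filling the disk halfway through a 12+ GB transfer, and deleting the tar only after a successful extraction keeps the peak usage at the stated 26 GB.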
@kinoc
kinoc / jserv_hf_fast.py
Created June 21, 2021 10:54
Run HuggingFace converted GPT-J-6B checkpoint using FastAPI and Ngrok on local GPU (3090 or Titan)