Hugh Phoenix-Hulme (@HughPH)
HughPH / jserv_hf_fast.py
Last active September 9, 2021 21:23 — forked from abodacs/jserv_hf_fast.py
Run a HuggingFace-converted GPT-J-6B checkpoint using FastAPI and Ngrok on a local GPU (3090 or Titan)
# So you want to run GPT-J-6B using HuggingFace+FastAPI on a local rig (3090 or TITAN) ... tricky.
# With special help from the Kolob Colab notebook: https://colab.research.google.com/drive/1VFh5DOkCJjWIrQ6eB82lxGKKPgXmsO5D?usp=sharing#scrollTo=iCHgJvfL4alW
# Conversion to HF format (12.6GB tar image) found at https://drive.google.com/u/0/uc?id=1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1&export=download
# Uses gdown to fetch the image (see the download sketch at the end of these notes)
# You will need 26 GB of space: 12+ GB for the tar and 12+ GB expanded (you can nuke the tar after expansion)
# HPPH: Not sure where you'll find this file; the links I found didn't work, and gdown was returning unauthorised errors. Maybe I'll make it a torrent.
# HPPH: I also dropped the Kobold endpoint, and added an endpoint for getting token counts so you can prune your prompt if necessary (see the API sketch below).
# HPPH: And finally: the prompt now goes in the POST body, which simplifies matters significantly.
# Near-simplest language model API, with room to expand!
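
# --- Download sketch (illustrative, not from the original gist) ---
# A minimal sketch of the gdown step described above, assuming the Drive link
# still resolves; the local file and directory names here are assumptions.
import tarfile

import gdown

# Drive ID taken from the conversion link in the notes above.
gdown.download(
    "https://drive.google.com/uc?id=1NXP75l1Xa5s9K18yf3qLoZcR6p4Wced1",
    "gpt-j-6b-hf.tar",
    quiet=False,
)

# Expand the ~12.6 GB tar; you can nuke the tar afterwards to reclaim space.
with tarfile.open("gpt-j-6b-hf.tar") as tar:
    tar.extractall("gpt-j-6b-hf")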
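
# --- API sketch (illustrative, not necessarily the original's routes) ---
# A minimal sketch of the two changes described above: the prompt arrives in
# the POST body, and a separate endpoint returns token counts so clients can
# prune their prompts. The route names, request shape, and model path are
# assumptions, not the gist's actual code.
import torch
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "gpt-j-6b-hf"  # assumption: the directory expanded from the tar

tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_DIR, torch_dtype=torch.float16
).to("cuda")

app = FastAPI()


class PromptBody(BaseModel):
    prompt: str
    max_new_tokens: int = 64


@app.post("/tokens")
def token_count(body: PromptBody):
    # Token count only, so the client can prune its prompt if necessary.
    return {"count": len(tokenizer(body.prompt).input_ids)}


@app.post("/generate")
def generate(body: PromptBody):
    # The prompt comes in the POST body rather than the query string.
    inputs = tokenizer(body.prompt, return_tensors="pt").to("cuda")
    output = model.generate(
        **inputs, max_new_tokens=body.max_new_tokens, do_sample=True
    )
    return {"text": tokenizer.decode(output[0], skip_special_tokens=True)}

# Run with e.g.: uvicorn jserv_hf_fast:app --port 8000
# then expose the port via ngrok as in the description.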