AI demos
  • Deploy a GPU instance; g2.large.x86 seems to be all we have. Here's a cloud-init script:
  • The GPU driver may need version bumps as the CUDA version required by the stable diffusion project increases.
#!/bin/bash
export DEBIAN_FRONTEND=noninteractive
apt-get update
# Bootloader update, python symlink, NVIDIA driver, and venv support
apt-get install grub2 python-is-python3 nvidia-driver-515-server python3-venv -y
useradd cprivitere -m -s /usr/bin/bash
  • Now reboot the server so the new NVIDIA driver loads.
su - cprivitere
bash <(wget -qO- https://raw.githubusercontent.com/AUTOMATIC1111/stable-diffusion-webui/master/webui.sh) --listen

To start it up again later, run webui.sh; to serve it on the public interface, press Ctrl-C and relaunch with --listen:

cd stable-diffusion-webui
./webui.sh --listen
  • If you'd like to lock down the server with a firewall:
apt-get install ufw
ufw allow 7860
ufw allow OpenSSH
ufw enable
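Once ufw is enabled, a quick way to confirm the WebUI port is still reachable is a socket probe from your own machine. A minimal sketch; the address below is a placeholder, not an IP from this gist:

import socket

SERVER_IP = "203.0.113.10"  # placeholder: use your instance's public address
with socket.create_connection((SERVER_IP, 7860), timeout=5):
    print("port 7860 reachable")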
  • Deploy a GPU instance; g2.large.x86 seems to be all we have. This second recipe sets up a Jupyter environment for the transformers demos.
apt-get update; apt-get upgrade
apt-get install grub2
  • You’ll need to do some cleanup of grub (purge whichever grub packages apt reports as conflicting; the exact list varies).
apt purge
apt install python-is-python3 ufw
ufw allow 8888
ufw allow OpenSSH
ufw enable
apt install nvidia-driver-515-server

This may need version bumps as the CUDA version used by the project below increases. (A quick way to verify the driver works is the Python check after the pip install step below.)

apt install python3-venv
useradd cprivitere -m -s /usr/bin/bash
  • Set up the second disk (/dev/sdb) as a dedicated volume for the model cache:

pvcreate /dev/sdb
vgcreate vgcache /dev/sdb
lvcreate -l 100%FREE -n lvcache vgcache
mkfs.ext4 /dev/vgcache/lvcache
mkdir /home/cprivitere/cache
mount /dev/vgcache/lvcache /home/cprivitere/cache

echo '/dev/vgcache/lvcache /home/cprivitere/cache ext4 defaults 0 1' >> /etc/fstab
mount -a
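A quick way to confirm the cache directory really landed on the big volume (a minimal sketch; runs with any Python 3 on the server):

import shutil

total, used, free = shutil.disk_usage("/home/cprivitere/cache")
print(f"cache volume: {total / 2**30:.0f} GiB total, {free / 2**30:.0f} GiB free")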
su - cprivitere

cat << EOF >> /home/cprivitere/.bashrc
export TRANSFORMERS_CACHE=/home/cprivitere/cache/
export HF_HOME=/home/cprivitere/cache/
export HUGGINGFACE_HUB_CACHE=/home/cprivitere/cache/
EOF
. /home/cprivitere/.bashrc

mkdir transformers
cd transformers
python -m venv .venv
. .venv/bin/activate
pip install -U pip
pip install torch torchvision torchaudio transformers jupyterlab ipywidgets
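Before launching Jupyter, a quick sanity check (a minimal sketch) confirms the NVIDIA driver is paired with a working CUDA runtime and that the cache env vars from .bashrc made it into the session:

import os

import torch

# False here usually means the driver install or the reboot step didn't take.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))

# All three should point at the LVM-backed cache directory.
for var in ("TRANSFORMERS_CACHE", "HF_HOME", "HUGGINGFACE_HUB_CACHE"):
    print(var, "=", os.environ.get(var))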

jupyter lab --ip 0.0.0.0
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import torch
import time
checkpoint="EleutherAI/gpt-j-6B"
start_time = time.time()
#model = AutoModelForCausalLM.from_pretrained(checkpoint,revision="float16", torch_dtype=torch.float16).cuda()
#torch.save(model, "gptj.pt")
model = torch.load("gptj.pt")
end_time = time.time()
time_taken = round(end_time - start_time,2)
print(f"Time taken: {time_taken} seconds")
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
prompt = (
    "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
    "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
    "researchers was the fact that the unicorns spoke perfect English."
)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
start_time = time.time()
gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=200,
)
end_time = time.time()
time_taken = round(end_time - start_time,2)
print(f"Time taken: {time_taken} seconds")
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
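The pipeline import above goes unused; as an aside, the same generation can be driven through the high-level pipeline API instead. A minimal sketch, assuming the fp16 revision fits on a single GPU as before:

generator = pipeline(
    "text-generation",
    model=checkpoint,
    revision="float16",
    torch_dtype=torch.float16,
    device=0,  # first CUDA device
)
print(generator(prompt, do_sample=True, temperature=0.9, max_length=200)[0]["generated_text"])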
from transformers import GPTNeoXForCausalLM, GPTNeoXTokenizerFast, GPTNeoXConfig
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from huggingface_hub import hf_hub_download
import torch
import time
checkpoint="EleutherAI/gpt-neox-20b"
weights = hf_hub_download(repo_id=checkpoint, filename="pytorch_model.bin.index.json")
config = GPTNeoXConfig.from_pretrained(checkpoint)
with init_empty_weights():
model = GPTNeoXForCausalLM(config=config)
start_time = time.time()
model=load_checkpoint_and_dispatch(
model,
weights,
device_map="auto",
no_split_module_classes=["GPTNeoXLayer"],
dtype=torch.float16,
)
end_time = time.time()
time_taken = round(end_time - start_time,2)
print(f"Time taken to load model: {time_taken} seconds")
tokenizer = GPTNeoXTokenizerFast.from_pretrained(checkpoint)
prompt = (
    "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
    "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
    "researchers was the fact that the unicorns spoke perfect English."
)
inputs = tokenizer(prompt, return_tensors="pt")
inputs = inputs.to("cuda")
start_time = time.time()
output = model.generate(
    inputs["input_ids"],
    do_sample=True,
    temperature=0.9,
    max_length=500,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
end_time = time.time()
time_taken = round(end_time - start_time,2)
print(f"Time taken to generate output: {time_taken} seconds")
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
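With the 20B model dispatched, a quick check (a minimal sketch) of how the fp16 weights ended up spread over the GPUs; any remainder spills to CPU RAM under device_map="auto":

for i in range(torch.cuda.device_count()):
    gib = torch.cuda.memory_allocated(i) / 2**30
    print(f"cuda:{i}: {gib:.1f} GiB allocated")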