BigsnarfDude bigsnarfdude

Sun May 5 18:06:21 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla V100-SXM2-16GB           On  | 00000000:00:04.0 Off |                    0 |
| N/A   51C    P0              66W / 300W |  16071MiB / 16384MiB |      0%      Default |
@bigsnarfdude
bigsnarfdude / nvidia-llama3-chat-rag-doc.py
Last active May 3, 2024 03:35
nvidia-llama3-chat-rag-doc.py
Thu May 2 20:35:44 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    Off |   00000000:01:00.0 Off |                  N/A |
|  0%   52C    P2              77W / 285W |    15490MiB / 16376MiB |     99%      Default |
./build/bin/main -m ./models/llama3_alpaca_dpo_GGUF-unsloth.F16.gguf -p '''Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nWhy is AI like the Industrial Revolution?\n\n### Input:\n\n\n### Response:\n''' -ngl 35 -n 400 -e
<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Why is AI like the Industrial Revolution?
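The prompt passed via `-p` above follows the Alpaca instruction layout. A minimal sketch of assembling that string in Python is shown here; `ALPACA_TEMPLATE` and `build_alpaca_prompt` are hypothetical names introduced for illustration and are not part of the original gist.

```python
# Hypothetical helper (not from the original gist): builds the Alpaca-style
# prompt text that the llama.cpp command passes with -p.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)


def build_alpaca_prompt(instruction: str, input_text: str = "", response: str = "") -> str:
    """Fill the template; leave response empty when prompting for generation."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction, input=input_text, response=response
    )


prompt = build_alpaca_prompt("Why is AI like the Industrial Revolution?")
print(prompt)
```

With an empty input and response, the result ends in `### Input:\n\n\n### Response:\n`, which matches the escaped string in the command line (the `-e` flag tells llama.cpp to process the `\n` escapes).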
@bigsnarfdude
bigsnarfdude / codellama_34b_unsloth.py
Created April 30, 2024 18:22
codellama_34b_unsloth.py
Every 1.0s: nvidia-smi 129-146-124-202: Tue Apr 30 18:21:29 2024
Tue Apr 30 18:21:29 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
@bigsnarfdude
bigsnarfdude / r_and_b.py
Last active April 30, 2024 14:10
ROUGE and BLEU for testing paraphrased synthetic data
(harness) vincent@virus:~/Downloads$ cat bleu_text.py
from nltk.translate.bleu_score import sentence_bleu
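reference = [
    'this is a dog'.split(),
    'it is dog'.split(),
    'dog it is'.split(),
    'a dog, it is'.split()  # whitespace split keeps the comma: 'dog,' is one token
]
candidate = 'it is dog'.split()
# sentence_bleu defaults to uniform 1- to 4-gram weights; this 3-token candidate
# has no 4-grams, so NLTK warns and the score comes out at or near zero.
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))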
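To see why a three-token candidate behaves oddly under 4-gram BLEU, here is a standard-library sketch of sentence-level BLEU (clipped n-gram precision times a brevity penalty). It is an illustration only, not NLTK's implementation: NLTK's handling of zero counts and smoothing differs, and the function names below are assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, candidate, n):
    """Clipped precision: a candidate n-gram counts at most as often as it
    appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0  # candidate is shorter than n tokens
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(references, candidate):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    r = min((len(ref) for ref in references), key=lambda rl: (abs(rl - c), rl))
    return 1.0 if c > r else math.exp(1 - r / c)


def simple_bleu(references, candidate, max_n=4):
    """Unsmoothed BLEU with uniform weights over 1..max_n grams."""
    precisions = [modified_precision(references, candidate, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # the geometric mean collapses if any precision is zero
    log_avg = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(references, candidate) * math.exp(log_avg)


reference = [s.split() for s in
             ['this is a dog', 'it is dog', 'dog it is', 'a dog, it is']]
candidate = 'it is dog'.split()
print(simple_bleu(reference, candidate, max_n=4))  # 0.0: no 4-grams exist
print(simple_bleu(reference, candidate, max_n=3))  # 1.0: exact match with a reference
```

For short paraphrases, lowering the maximum n-gram order (or applying a smoothing function in NLTK) keeps the score from collapsing just because the sentence is shorter than four tokens.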
@bigsnarfdude
bigsnarfdude / kaist_orpo.py
Last active April 29, 2024 23:38
kaist_orpo.py
# Requires an A100 40GB; uses about 30 GB of VRAM
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda"
model = AutoModelForCausalLM.from_pretrained("kaist-ai/mistral-orpo-capybara-7k").to(device)
tokenizer = AutoTokenizer.from_pretrained("kaist-ai/mistral-orpo-capybara-7k")
query = [{'role': 'user', 'content': 'Tell me how AI is like the Industrial Revolution'}]
prompt = tokenizer.apply_chat_template(query, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt').to(device)
@bigsnarfdude
bigsnarfdude / finetune_gpt2.py
Last active April 29, 2024 15:20
finetune_gpt2.py
import os
import time
import datetime
import pandas as pd
import seaborn as sns
import numpy as np
import random
import matplotlib.pyplot as plt
@bigsnarfdude
bigsnarfdude / sft.ipynb
Created April 28, 2024 03:33
sft.ipynb
@bigsnarfdude
bigsnarfdude / every_frame.sh
Created April 27, 2024 14:16
every_frame.sh
#!/bin/zsh
# Check if the video filename is provided as an argument
if [ $# -eq 0 ]; then
    echo "Please provide the video filename as an argument."
    exit 1
fi
video_filename="$1"
@bigsnarfdude
bigsnarfdude / llama3_alpaca_dpo.py
Last active May 1, 2024 13:22
llama3_alpaca_dpo.py unsloth 4070 ti super
'''
#Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nWhat are the three primary colors?\n\n### Input:\n\n\n### Response:\nThe three primary colors are red, blue, and yellow. These colors are called primary because they cannot be created by mixing other colors and all other colors can be made by combining them in various proportions. In the additive color system, used for light, the primary colors are red, green, and blue (RGB).
'''
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.
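The `dtype` comment above gives a rule of thumb: float16 on Tesla T4 and V100, bfloat16 on Ampere and newer. A minimal sketch of that rule, keyed on the CUDA compute capability major version, is shown below. `pick_dtype_name` is a hypothetical helper for illustration; what `dtype = None` actually does inside unsloth is not shown in this gist and is not assumed here.

```python
def pick_dtype_name(compute_capability_major: int) -> str:
    """Return 'bfloat16' for Ampere or newer (compute capability 8.x and up),
    otherwise 'float16'. Illustrative rule only."""
    return "bfloat16" if compute_capability_major >= 8 else "float16"


# Compute capability major versions: V100 = 7.0, T4 = 7.5, A100 = 8.0,
# RTX 4070 Ti Super (Ada) = 8.9.
for name, major in [("Tesla V100", 7), ("Tesla T4", 7),
                    ("A100", 8), ("RTX 4070 Ti Super", 8)]:
    print(f"{name}: {pick_dtype_name(major)}")
```

With PyTorch installed, the major version can be read from `torch.cuda.get_device_capability()`, and `torch.cuda.is_bf16_supported()` offers a direct check.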