Keunwoo Choi (keunwoochoi)

rain-1 / llama-home.md
Last active April 28, 2024 18:42
How to run Llama 13B with a 6GB graphics card

This worked on 14/May/23. The instructions will probably require updating in the future.

llama is a text prediction model similar to GPT-2, and to the version of GPT-3 that has not been fine-tuned yet. It is also possible to run fine-tuned versions with this (like Alpaca or Vicuna, I think; those versions are more focused on answering questions).

Note: I have been told that this does not support multiple GPUs. It can only use a single GPU.

It is now possible to run Llama 13B with a 6GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change is CUDA/cuBLAS support, which lets you pick an arbitrary number of transformer layers to run on the GPU. This is perfect for low VRAM; a rough build-and-run sketch follows the steps below.

  • Clone llama.cpp from git; I am on commit 08737ef720f0510c7ec2aa84d7f70c691073c35d.
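
For reference, a minimal sketch of what the build and run might look like, assuming the CUDA toolkit is installed; the model path and the -ngl value are placeholders you will need to adjust:

# build llama.cpp with cuBLAS support
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
git checkout 08737ef720f0510c7ec2aa84d7f70c691073c35d
make LLAMA_CUBLAS=1

# run a quantized 13B model, offloading some transformer layers to the GPU;
# lower -ngl (--n-gpu-layers) until the model fits in your 6GB of VRAM
./main -m models/13B/ggml-model-q4_0.bin -ngl 18 -p "Hello, my name is"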
kastnerkyle / fancy_youtube_encode.sh
Last active April 23, 2024 06:22
Fancy encoding of a wav file (or possibly others in the future) to youtube format
#!/bin/bash
# Based on the example here: https://trac.ffmpeg.org/wiki/Encode/YouTube
# Renders a vectorscope, spectrogram, and waveform over the audio and labels
# the video with the input file's basename.
text=$(basename "$1" .wav)
ffmpeg -i "$1" -filter_complex \
"[0:a]avectorscope=s=640x518,pad=1280:720[vs]; \
[0:a]showspectrum=mode=separate:color=intensity:scale=cbrt:s=640x518[ss]; \
[0:a]showwaves=s=1280x202:mode=line[sw]; \
[vs][ss]overlay=w[bg]; \
[bg][sw]overlay=0:H-h,drawtext=fontfile=/usr/share/fonts/truetype/fonts-japanese-gothic.ttf:fontcolor=white:x=10:y=10:text=$text[out]" \
-map "[out]" -map 0:a -c:v libx264 -preset fast -crf 18 -c:a copy "$text.mkv"
karpathy / pg-pong.py
Created May 30, 2016 22:50
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels
""" Trains an agent with (stochastic) Policy Gradients on Pong. Uses OpenAI Gym. """
import numpy as np
try:
  import cPickle as pickle # Python 2, as in the original gist
except ImportError:
  import pickle # Python 3 fallback
import gym
# hyperparameters
H = 200 # number of hidden layer neurons
batch_size = 10 # every how many episodes to do a param update?
learning_rate = 1e-4
gamma = 0.99 # discount factor for reward
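# --- illustrative sketch, not necessarily the gist's exact code ---
# gamma is the discount factor: each reward is propagated backwards so that
# earlier actions get credit for later points. A typical helper looks like:
def discount_rewards(r):
  """ take a 1D float array of per-step rewards and compute discounted returns """
  discounted_r = np.zeros_like(r)
  running_add = 0.0
  for t in reversed(range(r.size)):
    if r[t] != 0: running_add = 0.0 # reset at game boundaries (Pong scores +1/-1 per point)
    running_add = running_add * gamma + r[t]
    discounted_r[t] = running_add
  return discounted_r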