@Birch-san
Last active July 10, 2023 17:05
Perf-testing bitsandbytes `0.39.1` vs `0.40.0`

Hardware: NVIDIA RTX 4090, CUDA 12.1

seed=64

Evaluated using evaluate.py:

python -m evaluate --model_name_or_path huggyllama/llama-7b --tokenizer_model_name_or_path huggyllama/llama-7b --bf16 --overrun_countermeasures False --prompt_style bare

python -m evaluate --model_name_or_path huggyllama/llama-13b --tokenizer_model_name_or_path huggyllama/llama-13b --bf16 --overrun_countermeasures False --prompt_style bare

python -m evaluate --model_name_or_path huggyllama/llama-30b --tokenizer_model_name_or_path huggyllama/llama-30b --bf16 --overrun_countermeasures False --prompt_style bare
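
The three invocations differ only in the model size. A minimal sketch of generating them programmatically follows; `build_command` and `SIZES` are hypothetical names used for illustration and are not part of evaluate.py, while the flags are copied verbatim from the commands above.

```python
import shlex

SIZES = ["7b", "13b", "30b"]  # model sizes benchmarked in this gist


def build_command(size: str) -> str:
    """Return the evaluate invocation for one huggyllama checkpoint (hypothetical helper)."""
    model = f"huggyllama/llama-{size}"
    args = [
        "python", "-m", "evaluate",
        "--model_name_or_path", model,
        "--tokenizer_model_name_or_path", model,
        "--bf16",
        "--overrun_countermeasures", "False",
        "--prompt_style", "bare",
    ]
    # shlex.join quotes arguments only where the shell would need it
    return shlex.join(args)


if __name__ == "__main__":
    for size in SIZES:
        print(build_command(size))
```

Printing the commands rather than executing them keeps the sketch side-effect free; each line can be pasted into a shell as-is.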

Prompted with:

I would just have to apprehend the culprit. He did not know that I was a magical girl.

bitsandbytes 0.39.1

Lacks inference-optimized 4-bit kernels

llama 7b

I was under the effect of a counterspell, so none of the superpower-wielding monsters could see me anyway. My eyes snapped, trying to find the culprit who did this to me. I heard a snicker from behind. "I don't know why you don't just fight back and kill them all." "You, you're in the magical girl club, too, right?" I asked while feeling very disappointed. "Well, that's just a little secret we're keeping for ourselves." She laughed and waved at me. I snapped my eyes back shut. "You'll have to excuse me, I'm rather sleepy." "What? Are you sleepy? I'm not sleepy yet." "I was not being rhetorical. My eyes are currently closed." "But that doesn't make any sense. You said this was a secret club and you wanted to be our magical girl." "If you're a magical girl, then you're a magical girl from a different world, which makes this a different scenario from the show Magical Girl Clubs are part of."

ctx length: 25
tokens out: 256
duration: 17.42 secs
speed: 14.69 tokens/sec
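
The speed figure appears to be tokens out divided by duration. That is an assumption, since evaluate.py's exact calculation is not shown here, but the reported numbers are consistent with it. A minimal sketch, with `tokens_per_sec` as a hypothetical helper name:

```python
def tokens_per_sec(tokens_out: int, duration_secs: float) -> float:
    """Generation throughput: tokens produced divided by wall-clock seconds."""
    if duration_secs <= 0:
        raise ValueError("duration_secs must be positive")
    return tokens_out / duration_secs


if __name__ == "__main__":
    # Figures from the llama 7b run on bitsandbytes 0.39.1
    speed = tokens_per_sec(256, 17.42)
    print(f"speed: {speed:.2f} tokens/sec")
```

Because the reported duration is itself rounded to two decimals, recomputing the speed this way can differ from the reported value in the last digit.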

llama 13b

I was under the effect of a spell at the moment. No one would recognise my face, which had been changed to that of an eighteen-year-old girl at the time when it hadn't really been a transformation. I was walking toward the room of my new roommate. Before doing so, I looked back at the road I had crossed through, with a rather absent look. 'Teleportation' isn't a cheap magic. It's not like an illusion. In order to use a teleportation spell, one must possess both a very elaborate spell and a powerful magic circle. It is also a very dangerous spell, for one's body cannot stand more than seven consecutive teleports. If one does a teleport without a magic circle, the spell will damage the user's magic potential. I do not know what kind of spell 'Teleportation' is, but its effect on me was more than surprising. First, it changed my body to the appearance of 18- year- old girl. I used this magic often, but I do not know how to do it. I never had the opportunity to find out. It was the end of the battle. I do

ctx length: 25
tokens out: 256
duration: 33.66 secs
speed: 7.61 tokens/sec

[bonus] llama 13b + alpaca + stepwise reasoning

See Birchlabs/llama-13b-stepwise-adapter

python -m evaluate \
--model_name_or_path huggyllama/llama-13b \
--base_lora_model_name_or_path chansung/alpaca-lora-13b \
--tokenizer_model_name_or_path Birchlabs/llama-13b-stepwise-tokenizer \
--lora_model_name_or_path Birchlabs/llama-13b-stepwise-adapter \
--input_embedding_path embed_tokens.pt \
--output_embedding_path lm_head.pt \
--bf16 \
--use_bos_token_in_prompt \
--overrun_countermeasures False

[seed=64]$ How can I climb a rope without my hands?
I need to find a way to counter the downward force of my weight and keep my arms straight. <|step_end|> <|step_start|> I also need to avoid the rope moving freely horizontally or vertically. <|step_end|> <|step_start|> One way to do that is to fix the rope to a horizontal bar that I can use as a foothold. <|step_end|> <|step_start|> That way, the rope will move with me as I climb, but horizontally only. <|step_end|> <|step_start|> I also made sure to have a big enough bar to keep my weight from moving the rope horizontally. <|step_end|> <|step_start|> <|answer_start|> I think this is a clever idea! I'll try it and see if it works. <|answer_end|>

ctx length: 49
tokens out: 138
duration: 21.07 secs
speed: 6.55 tokens/sec

llama 30b

I was under the effect of a spell at the moment. No one would suspect anything different of me if I was to leave my home and go out chasing the criminal. Then the policemen would be able to do their jobs. I would soon be on the case. I was to do so in order to protect people against magic-using criminals. My mission began here, with that thought in mind. What I did first was to gather all the information I could. There were two kinds. One was the knowledge of the area, and the other was the information on the culprit. I knew the area well enough, thanks to being from here, but the culprit was an unknown party. I was unsure of their whereabouts or if they were even around, but there is a rumor that they are a magic-using person who is not necessarily a magical girl. The only thing that could be used as evidence were some magic powder used in the ritual, as well as the magical item itself. The police had examined it in an attempt to find out the culprit, but to no avail. The next step would be to find the source of the ritual item. Unfortunately, the culprit

ctx length: 25
tokens out: 256
duration: 83.74 secs
speed: 3.06 tokens/sec

bitsandbytes 0.40.0

Has inference-optimized 4-bit kernels!

conda create -n p311-cu121-bnb-opt python=3.11
conda activate p311-cu121-bnb-opt
pip install 'bitsandbytes>=0.40.0' 'transformers>=4.30.2' 'accelerate>=0.20.3' 'einops==0.6.1' 'evaluate==0.4.0' 'scikit-learn==1.2.2' 'sentencepiece==0.1.99' 'wandb==0.15.3' 'peft@git+https://github.com/huggingface/peft'
# Use a nightly torch build to get CUDA 12.1. This isn't the latest nightly, just one I had already downloaded and wanted to re-use.
pip install --upgrade --pre 'torch==2.1.0.dev20230522+cu121' --extra-index-url https://download.pytorch.org/whl/nightly/cu121
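
Before benchmarking, it can help to confirm which bitsandbytes release the environment actually resolved to. The sketch below is illustrative only: the helper names are hypothetical, and the `0.40.0` threshold is taken from this comparison rather than from any bitsandbytes API.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

# Assumption from this gist: 0.40.0 is the first release with the optimized kernels
MIN_OPTIMIZED = (0, 40, 0)


def parse_release(version_str: str) -> Tuple[int, ...]:
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_str)
    if match is None:
        raise ValueError(f"unrecognised version: {version_str!r}")
    return tuple(int(part) for part in match.groups())


def has_optimized_kernels(version_str: str) -> bool:
    """True if the release is at least 0.40.0."""
    return parse_release(version_str) >= MIN_OPTIMIZED


def installed_bnb_version() -> Optional[str]:
    """Return the installed bitsandbytes version, or None if it is not installed."""
    try:
        return version("bitsandbytes")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_bnb_version()
    if found is None:
        print("bitsandbytes is not installed")
    else:
        print(f"bitsandbytes {found}: optimized kernels = {has_optimized_kernels(found)}")
```

Reading the version from package metadata avoids importing bitsandbytes itself, so the check also works on a machine without a GPU.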

llama 7b

I was under the effect of a counterspell, so none of the superpower-wielding monsters could see me anyway. My eyes had begun to change as a result of my battle with Melvin. The transformation was complete. I was now in true wolf form, but I could not shift into human form using the ring of transform. My wolf's eyes were bright red, and I had a lot of white fur on my body. I had to catch the creep. He had stolen a bag of gold from a gnome. That did not matter. What mattered was that I go get him and take his punishment. I was almost to the point that I would shift into my wolf form, but I could not make the change without the ring. I kept walking, but I stopped every so often and waited to detect if someone was watching me. I decided to change in a clearing in the forest when I noticed an enormous foot on a path. As I moved down the path, I became aware of a whole group of feet. They were not my foots, as I've been saying. I saw that this wasn't the feet of just one creature, but a number of them.

ctx length: 25
tokens out: 256
duration: 9.87 secs
speed: 25.92 tokens/sec

1.76x the speed of the 0.39.1 kernels (25.92 vs 14.69 tokens/sec)

llama 13b

I was under the effect of a spell at the moment. No one would recognise my face, which had been changed to that of an eighteen-year-old girl at the time when it hadn't really been a transformation. I was walking toward the room of my new roommate. Before doing so, I looked back at the garden. "S-so, I'm gonna go sleep, too. My shift is over next morning." The security guard standing opposite the pool was saying that to me, a robed woman. That bodyguard of Mistress Mio's was standing still and calm after I had passed by. "Yeah, that's true. Tomorrow, we're supposed to go sightseeing, right?" I asked that to the woman who was not even glancing at me. My eyes were shining under my eyeglasses, but there were no expressions of feelings toward her on my face. As a result, she nodded slightly. She then took a step back. The guard, who was still standing on his plinth, looked over us. That looked very comfortable. As soon as I turned the corner, my heartbeat acceler

ctx length: 25
tokens out: 256
duration: 11.41 secs
speed: 22.43 tokens/sec

2.95x the speed of the 0.39.1 kernels (22.43 vs 7.61 tokens/sec)

[bonus] llama 13b + alpaca + stepwise reasoning

See Birchlabs/llama-13b-stepwise-adapter

python -m evaluate \
--model_name_or_path huggyllama/llama-13b \
--base_lora_model_name_or_path chansung/alpaca-lora-13b \
--tokenizer_model_name_or_path Birchlabs/llama-13b-stepwise-tokenizer \
--lora_model_name_or_path Birchlabs/llama-13b-stepwise-adapter \
--input_embedding_path embed_tokens.pt \
--output_embedding_path lm_head.pt \
--bf16 \
--use_bos_token_in_prompt \
--overrun_countermeasures False

[seed=64]$ How can I climb a rope without my hands?
I need to find a way to counter the downward force of my weight and keep my arms straight. <|step_end|> <|step_start|> I also need to avoid the rope moving freely horizontally or vertically. <|step_end|> <|step_start|> One way to do that is to fix the rope to a horizontal bar that I can use as a foothold. <|step_end|> <|step_start|> That way, the rope will move with me as I climb, but horizontally only. <|step_end|> <|step_start|> I also made sure to have a big enough bar to keep my weight from moving the rope horizontally. <|step_end|> <|step_start|> <|answer_start|> I think this is a clever idea! I'll try it and see if it works. <|answer_end|>

ctx length: 49
tokens out: 138
duration: 8.63 secs
speed: 15.99 tokens/sec

2.44x the speed of the 0.39.1 kernels (15.99 vs 6.55 tokens/sec)

llama 30b

I was under the effect of a spell at the moment. No one would suspect anything different of me if I was to leave my home and go out chasing the criminal. Then the policemen would be able to do their jobs. I would soon be on the case. I was to do so in order to protect people against magic-using criminals. My mission began here, with that thought in mind. What I did first was to gather all the information I could. There were two kinds. One was the knowledge of the area, and the other was the information on the culprit. While I did not have any information on the culprit, I was able to find out that the only public rest room in the area had been closed by the time the vandalism occurred. I also learned the fact that a private rest room was located not even a hundred meters from the vandalized place. One could only access it via the key that was used to lock the rest room door on the inside. So, if the culprit used this area as his toilet, he used this rest room. I learned about as much as that. The private rest room had been closed on the evening of the vandalism. However, I could not be

ctx length: 25
tokens out: 256
duration: 16.87 secs
speed: 15.18 tokens/sec

4.96x the speed of the 0.39.1 kernels (15.18 vs 3.06 tokens/sec)
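
The speedup figures are the ratio of the 0.40.0 throughput to the 0.39.1 throughput. A short sketch that recomputes them from the speeds reported above (`RESULTS` and `speedup` are illustrative names, and the numbers are copied from this gist):

```python
# (0.39.1, 0.40.0) tokens/sec as reported in this gist
RESULTS = {
    "llama 7b": (14.69, 25.92),
    "llama 13b": (7.61, 22.43),
    "llama 13b + alpaca + stepwise": (6.55, 15.99),
    "llama 30b": (3.06, 15.18),
}


def speedup(old_tps: float, new_tps: float) -> float:
    """Ratio of new throughput to old throughput."""
    return new_tps / old_tps


if __name__ == "__main__":
    for name, (old_tps, new_tps) in RESULTS.items():
        print(f"{name}: {speedup(old_tps, new_tps):.2f}x")
```

In these runs the ratio grows with model size, from 1.76x on 7b to 4.96x on 30b.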
