Benchmark code and full details: https://github.com/EvilFreelancer/benchmarking-llms
- Graphics card: RTX 4090 (24 GB)
- CUDA version: 11.7 (for the ruGPT3 family) and 11.8 (for all other models)
- Python version: 3.11.4
Generation parameters used for all runs:

```python
max_new_tokens=1024,
top_k=20,
top_p=0.9,
repetition_penalty=1.1,
do_sample=True,
use_cache=False
```
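For context, a minimal sketch of how parameters like these are typically bundled into keyword arguments for a Hugging Face `generate()`-style call, together with a simple timer that yields the per-run tokens/second figures reported below. The `timed_generate` helper and the stub generator are illustrative, not part of the benchmark code.

```python
import time

# Generation parameters from the benchmark (same values as above).
GENERATION_KWARGS = dict(
    max_new_tokens=1024,
    top_k=20,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True,
    use_cache=False,
)

def timed_generate(generate_fn, prompt):
    """Run one generation and return (tokens, elapsed seconds, tokens/sec).

    `generate_fn` stands in for something like `model.generate` wrapped with
    a tokenizer; here it just needs to accept a prompt plus the kwargs above
    and return a sequence of generated tokens.
    """
    start = time.perf_counter()
    tokens = generate_fn(prompt, **GENERATION_KWARGS)
    elapsed = time.perf_counter() - start
    return tokens, elapsed, len(tokens) / elapsed
```

Averaging `elapsed` and `len(tokens)` over repeated runs gives the "Avg gen time", "Avg tokens", and "Avg t/s" columns of the table below.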
Name | Size | Context | VRAM (GB) | Max init RAM (GB) | Avg gen time (s) | Avg tokens | Avg t/s |
---|---|---|---|---|---|---|---|
StableBeluga 7b | 7b | 4096 | ~22.5 | ~22.7 | ~31.25 | ~529.7 | ~16.9 |
LLaMA 7b | 7b | 4096 | ~22.47 | ~22.7 | ~34.52 | ~545.5 | ~15.8 |
LLaMA 2 7b | 7b | 4096 | | | | | |
LLaMA 2 7b 32k | 7b-32k | 32768 | ~21.5 | ~22.7 | ~56.63 | ~868.5 | ~15.3 |
MosaicML 7b | 7b | 8192 | ~22.6 (~13.7) | ~9.8 | ~87.27 | ~1046.2 | ~12.0 |
MosaicML 7b-storywriter | 7b-storywriter | 65536 | ~22.9 | ~10.4 | ~109.12 | ~1048.2 | ~9.6 |
MosaicML 7b-instruct | 7b-instruct | 4096 | ~22.93 | ~9.8 | ~110.47 | ~1045.2 | ~9.5 |
MosaicML 7b-instruct-8k | 7b-instruct-8k | 8192 | ~22.66 | ~10.5 | ~84.32 | ~1045.5 | ~12.4 |
ruGPT 3 small | 125m | 2048 | ~6.18 | ~1.3 | ~6.4 | ~1041.8 | ~162.7 |
ruGPT 3 medium | 410m | 2048 | ~6.66 | ~2.6 | ~12.74 | ~1044.3 | ~82.0 |
ruGPT 3 large | 750m | 2048 | ~7.48 | ~5.2 | ~15.19 | ~1045.5 | ~68.8 |
ruGPT 3 xl | 1.3b | 2048 | ~13.76 | ~4.7 | ~13.38 | ~567.1 | ~42.4 |
ruGPT 3.5 13b | 13b | 2048 | | | | | |
mGPT | 1.3b | 2048 | ~22.96 (~4.11) | ~7.01 | ~24.72 | ~1046.8 | ~42.3 |
mGPT 13b | 13b | 2048 | | | | | |
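The throughput column is simply average tokens divided by average generation time, which can be verified against a few rows of the table (values copied from the table above):

```python
# Sanity check: Avg t/s should equal Avg tokens / Avg gen time.
# (model name: (avg tokens, avg gen time in s, reported avg t/s))
rows = {
    "StableBeluga 7b": (529.7, 31.25, 16.9),
    "LLaMA 7b": (545.5, 34.52, 15.8),
    "ruGPT 3 small": (1041.8, 6.4, 162.7),
}

for name, (tokens, seconds, reported_tps) in rows.items():
    computed = tokens / seconds
    print(f"{name}: {computed:.1f} t/s (reported ~{reported_tps})")
```

The computed values match the reported column to within rounding.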