Commercial-friendly, permissively licensed Open Source Large Language Models
| Model | Arch | License | Params | Seq Len | FP Format | VRAM | Infer Lib | Tokenizer | Comments | Other flavours |
|---|---|---|---|---|---|---|---|---|---|---|
| bigcode/starcoder | GPT | OpenRAIL-Mv1 | 15B | 8k | fp32 | 60Gb fp32, ~30Gb fp/bf16, ~16Gb 8bit, ~8Gb 4bit | Megatron-LM fork | GPT2Tokenizer 49152 | FlashAttn, MQA, FIM, 1T=250Bx4 tokens of StarCoderData | starcoderplus, starchat-beta (humanEval 📉) |
| Salesforce/codegen2 | GPT (J) | Apache 2.0 | 1B, 3.7B, 7B, 16B | 1k | fp32 | 7b 28Gb | JaxFormer | GPT2Tokenizer 51200 | RoPE, FIM, The Stack dedup v1.1 | -instruct research-only |
| Salesforce/codegen2.5 | LLaMA | Apache 2.0 | 7B | 2k | fp32 | 28Gb | JaxFormer | Tiktoken 51200 | FlashAttn, Triton, FIM, 1.4T=300Bx4+ tokens of StarCoderData | -mono Python, -instruct research-only |
| Salesforce/xgen-7b-8k-base | LLaMA | Apache 2.0 | 7B | 8k | | | | Tiktoken | | -inst research-only |
| OpenLLaMA v2 | LLaMA | Apache 2.0 | 3B, 7B, 13B | 2k | fp16 | 7b 14Gb | PyTorch/HF, JAX/EasyLM | HF (fast) tokenizer 32k | 1T tokens of RedPajama + StarCoderData + Falcon | |
| LLaMA 2 | LLaMA | Fb CLA (<700M MAU, no knowledge distillation) | 7B, 13B, 70B | 4k | fp16 | 7b 14Gb | PyTorch/HF | SentencePiece 32k + digits (LlamaTokenizer) | RoPE, grouped-query attention (GQA) in 70B, 2T tokens | -chat |
| CodeLlama | LLaMA | Fb CLA | 7B, 13B, 34B | 4k-16k (HF) | bf16 | 7b 14Gb | PyTorch, HF (partial from 4.33) | SentencePiece 32k + FIM (LlamaCodeTokenizer) | RoPE (+scaling), FIM | -Python, -Instruct |
| Mistral 7b v0.1 | | Apache 2.0 | 7B | 8k | | | PyTorch/HF, xFormers | SentencePiece 32k (LlamaTokenizer) | GQA, SWA, FlashAttn 2 | -Instruct |
| replit-code-v1-3b | MPT | CC BY-SA 4.0 | 2.7B | 2k | fp32 | 10Gb | Mosaic LLM Foundry | SentencePiece 32768 | FlashAttn, Triton, ALiBi, FasterTransformer, The Stack dedup v1.2 | |
| MPT | MPT | Apache 2.0 | 7B, 30B | 8k | | | LLM Foundry | | FlashAttn, Triton, ALiBi, FT, 1T tokens | -instruct CC-By-SA-3.0, -chat CC-By-NC-SA-4.0 |
| Falcon | GPT (RW) | Apache 2.0 | 7B, 40B | 2k | bf16 | 7B ~15Gb, 40B ~90Gb | | TokenizerFast 65024 | 1T RefinedWeb, GPTQ | -instruct Apache 2.0 (!) |
| StableLM | GPT (NeoX) | CC-By-SA-4.0 | 3B, 7B | 4k | fp32 | 3b fp32 16Gb, fp16 12Gb; 7b 24Gb | GPT-NeoX | | RoPE, 1.5T tokens of The Pile | -tuned CC-By-NC-SA-4.0 |
| RedPajama-INCITE-Base-7B-v0.1 | GPT (NeoX) | Apache 2.0 | 7B | 2k | fp16 | 16Gb | | 50432 | RoPE | base, instruct, chat |
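
The VRAM column assumes full- or half-precision weights; most of the HF-hosted checkpoints above can also be loaded in 8-bit or 4-bit via `transformers` + `bitsandbytes`, which is roughly where the ~16Gb/~8Gb figures for StarCoder come from. A minimal sketch is below; the model id and quantization settings are illustrative, and any other HF-hosted model from the table can be substituted. Note that gated models (StarCoder's OpenRAIL-M, LLaMA 2 / CodeLlama's Fb CLA) require accepting the license on the Hub and authenticating before `from_pretrained` will download them.

```python
# Sketch: load a model from the table with 8-bit/4-bit weights so that a 15B
# checkpoint like bigcode/starcoder fits in ~16Gb (8-bit) or ~8Gb (4-bit)
# of VRAM instead of ~30Gb in fp16.
# Assumes: pip install transformers accelerate bitsandbytes, plus a CUDA GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "bigcode/starcoder"  # illustrative; any HF-hosted model above works

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # or load_in_8bit=True for ~16Gb
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16, store 4-bit weights
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers across available GPUs/CPU
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```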
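Several rows list FIM (fill-in-the-middle) support. For the StarCoder family this is exposed through sentinel tokens in the tokenizer vocabulary; the sketch below shows the prefix-suffix-middle prompt format, assuming the model and tokenizer loaded in the previous snippet. The sentinel names are the ones shipped with bigcode/starcoder; CodeLlama and CodeGen2 use different sentinels.

```python
# Sketch: fill-in-the-middle (FIM) prompting with the StarCoder sentinels
# <fim_prefix>, <fim_suffix>, <fim_middle>; the model generates the code that
# belongs between the given prefix and suffix.
prefix = "def average(numbers):\n    "
suffix = "\n    return total / len(numbers)\n"
fim_prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
# Only the newly generated tokens are the "middle" that fills the gap.
middle = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(prefix + middle + suffix)
```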