- GPT : Generative Pretrained Transformer
- LLaMA : Large Language Model Meta AI, Meta
- GLM : General Language Model, Tsinghua University
- LoRA : Low-Rank Adaptation, Microsoft
- LiGO : Linear Growth Operator, MIT
- PEFT : Parameter-Efficient Fine-Tuning, Hugging Face
- LoRA + DeepSpeed + CPU offloading
- RPTQ : Reorder-based Post-Training Quantization
- DeepSpeed
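The LoRA and PEFT entries above refer to low-rank adaptation: instead of updating a full weight matrix W (d×k), only two small factors B (d×r) and A (r×k) are trained, and at inference time the merged weight is W + (alpha/r)·BA. A minimal pure-Python sketch of that merge (all names, sizes, and values are illustrative, not taken from any of the linked repos):

```python
# Minimal LoRA-style weight merge in pure Python (illustrative only).
# The full weight W stays frozen; only the low-rank factors B and A train.

def matmul(X, Y):
    """Naive matrix multiply for small lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_merge(W, B, A, alpha, r):
    """Return W + (alpha / r) * B @ A, the merged inference-time weight."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 frozen weight, rank-1 adapter (r = 1, alpha = 1)
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]          # d x r
A = [[3.0, 4.0]]            # r x k
merged = lora_merge(W, B, A, alpha=1.0, r=1)
print(merged)  # [[4.0, 4.0], [6.0, 9.0]]
```

The point of the trick: the adapter trains d·r + r·k parameters instead of d·k, which is why repos like alpaca-lora can fine-tune LLaMA on consumer hardware.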
- Large language model
- 2023/04/16 Understanding Large Language Models
- 2023/04/05 StackLLaMA: A hands-on guide to train LLaMA with RLHF
- 2023/04/01 Foundation Models: Scaling Large Language Models
- 2023/03/30 List of Open Sourced Fine-Tuned Large Language Models (LLM)
- 2023/03/29 ColossalChat: An Open-Source Solution for Cloning ChatGPT With a Complete RLHF Pipeline
- 2023/03/25 Review — MT-NLG: Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
- 2023/03/24 How To Create Your Own AI Chatbot Server With Raspberry Pi 4
- 2023/03/24 Hello Dolly: Democratizing the magic of ChatGPT with open models
- 2023/03/23 How to use Alpaca-LoRA to fine-tune a model like ChatGPT
- 2023/03/22 Learning to grow machine-learning models
- 2023/03/21 Updating Deep Learning Models Right on the Mobile Device — Transfer Learning and Fine-Tuning
- 2023/03/20 Fine-Tuning Large Language Models with Hugging Face and DeepSpeed
- 2023/03/09 Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU
- 2023/01/26 What Are Large Language Models Used For?
- 2022/11/29 Better Language Models Without Massive Compute
- 2022/08/06 Understand BLOOM, the Largest Open-Access AI, and Run It on Your Local Computer
- 2022/05/17 Train 18-billion-parameter GPT models with a single GPU on your personal computer! Open source project Colossal-AI has added new features
- 2022/03/28 Introducing xTuring: Fast, Efficient, and Simple Fine-Tuning for LLMs.
- 2021/09/21 DeepSpeed ZeRO-1, 2, 3 and ZeRO-Infinity: Microsoft's game changer for large-model training
- https://github.com/nomic-ai/gpt4all - a chatbot trained on a massive collection of clean assistant data including code, stories and dialogue
- https://github.com/ztjhz/BetterChatGPT - Play and chat smarter with Better ChatGPT - an amazing open-source web app with a better UI for exploring OpenAI's ChatGPT API! (Website + Windows + MacOS + Linux)
- https://github.com/stochasticai/xTuring - Build and control your own LLMs
- https://github.com/ggerganov/whisper.cpp - Port of OpenAI's Whisper model in C/C++
- https://github.com/bigscience-workshop/Megatron-DeepSpeed - Ongoing research training transformer language models at scale, including: BERT & GPT-2
- https://github.com/Torantulino/Auto-GPT - An experimental open-source attempt to make GPT-4 fully autonomous.
- https://github.com/facebookresearch/llama - Inference code for LLaMA models
- https://github.com/ggerganov/llama.cpp - Port of Facebook's LLaMA model in C/C++
- https://github.com/cornelk/llama-go - Port of Facebook's LLaMA in Golang with embedded C/C++
- https://github.com/rustformers/llama-rs - Run LLaMA inference on CPU, with Rust 🦀🚀🦙
- https://github.com/jankais3r/LLaMA_MPS - Run LLaMA (and Stanford-Alpaca) inference on Apple Silicon GPUs.
- https://github.com/nebuly-ai/nebullvm/tree/main/apps/accelerate/chatllama - ChatLLaMA
- https://github.com/SWHL/LLaMADemo - 🎉LLaMA Demo 7B🎉
- https://github.com/jerryjliu/llama_index - LlamaIndex (GPT Index) provides a central interface to connect your LLMs with external data
- https://github.com/lxe/simple-llama-finetuner - Simple UI for LLaMA Model Finetuning
- https://github.com/Lightning-AI/lit-llama
- https://github.com/ZrrSkywalker/LLaMA-Adapter - Fine-tuning LLaMA to follow instructions within 1 Hour and 1.2M Parameters
- Dalai - Run LLaMA and Alpaca on your computer
- https://github.com/hpcaitech/ColossalAI - Making large AI models cheaper, faster and more accessible
- https://github.com/abetlen/llama-cpp-python - Python bindings for llama.cpp
- https://github.com/go-skynet/go-llama.cpp - LLama.cpp golang bindings
- https://github.com/gotzmann/llama.go - llama.go is like llama.cpp in pure Golang!
- https://github.com/tatsu-lab/stanford_alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data.
- https://github.com/tloen/alpaca-lora - Instruct-tune LLaMA on consumer hardware
- https://github.com/Beomi/KoAlpaca - KoAlpaca: Korean Alpaca Model based on Stanford Alpaca (feat. LLAMA and Polyglot-ko)
- https://github.com/deep-diver/Alpaca-LoRA-Serve - Alpaca-LoRA as Chatbot service
- https://github.com/ymcui/Chinese-LLaMA-Alpaca - Chinese LLaMA & Alpaca LLMs
- https://github.com/kunishou/Japanese-Alpaca-LoRA
- https://github.com/antimatter15/alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM
- https://github.com/THUDM/ChatGLM-6B - ChatGLM-6B: An Open Bilingual Dialogue Language Model
- https://github.com/ssbuild/chatglm_finetuning - ChatGLM-6B fine-tuning and Alpaca fine-tuning
- https://github.com/Akegarasu/ChatGLM-webui - A WebUI for ChatGLM-6B
- https://github.com/lich99/ChatGLM-finetune-LoRA - Code for fine-tuning ChatGLM-6B using low-rank adaptation (LoRA)
- https://github.com/THUDM/GLM-130B - GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
- https://github.com/hahnyuan/RPTQ4LLM - Reorder-based post-training quantization for large language model
- https://github.com/project-baize/baize-chatbot - Let ChatGPT teach your own chatbot in hours with a single GPU!
- https://github.com/databrickslabs/dolly - Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
- https://github.com/tensorchord/awesome-open-source-llmops - An awesome & curated list of best open source MLOps/LLMOps tools for data scientists
- https://github.com/htqin/awesome-model-quantization - A list of papers, docs, codes about model quantization. This repo is aimed to provide the info for model quantization research
- https://github.com/huggingface/peft - 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning
- https://github.com/cloneofsimo/lora - Using Low-rank adaptation to quickly fine-tune diffusion models.
- https://github.com/microsoft/LoRA - Code for loralib, an implementation of LoRA
- https://github.com/microsoft/DeepSpeed - A deep learning optimization library that makes distributed training and inference easy, efficient, and effective
- https://github.com/microsoft/Megatron-DeepSpeed - Ongoing research training transformer language models at scale, including: BERT & GPT-2
- https://github.com/kuleshov/minillm - minimal system for running modern LLMs on consumer-grade GPUs
- https://hal.science/hal-04014493/document - SlowLLM: large language models on consumer hardware
- https://github.com/nlpodyssey/verbaflow - Neural Language Model for Go
- https://github.com/huggingface/blog - Public repo for HF blog posts
- NVIDIA NeMo Service - Cloud service for enterprise hyper-personalization and at-scale deployment of intelligent large language models
- https://github.com/PiotrNawrot/nanoT5 - Fast & Simple repository for pre-training and fine-tuning T5-style models
- https://github.com/karpathy/nanoGPT - The simplest, fastest repository for training/finetuning medium-sized GPTs
- https://github.com/JonasGeiping/cramming - Cramming the training of a (BERT-type) language model into limited compute
- https://github.com/mryab/efficient-dl-systems - Efficient Deep Learning Systems course materials (HSE, YSDA)
- https://github.com/j0sephsasson/fine-tune-LLMs - A no-code application that enables companies to create intelligent digital assistants.
- https://github.com/nat/openplayground - An LLM playground you can run on your laptop
- https://github.com/agiresearch/OpenAGI - OpenAGI: When LLM Meets Domain Experts
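Many entries above (RPTQ4LLM, awesome-model-quantization, llama.cpp, alpaca.cpp) are about post-training quantization: mapping float weights to small integers so models fit in consumer memory. A minimal pure-Python sketch of symmetric per-tensor int8 quantization; this is the textbook scheme, not the reorder-based method RPTQ itself, and all values are illustrative:

```python
# Symmetric int8 post-training quantization (illustrative only):
# each weight w is approximated as scale * q with q an integer in [-127, 127].

def quantize_int8(weights):
    """Quantize a flat list of floats to int8 codes plus one scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [qi * scale for qi in q]

w = [0.4, -1.0, 0.25, 0.9]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# per-element reconstruction error is bounded by about scale / 2
```

Real LLM quantizers refine this per channel or per group (and, in RPTQ's case, reorder channels before grouping) to shrink that error, but the scale-and-round core is the same.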
Awesome!