@YaKaiLi
Created June 12, 2023 12:15

from https://github.com/underlines/awesome-marketing-datascience/blob/master/llm-tools.md

Tools

Native GUIs

openAI

  • chatgptui/desktop
  • chatbox is a Windows, Mac & Linux native ChatGPT Client
  • BingGPT Desktop application of new Bing's AI-powered chat
  • cheetah Speech-to-text for remote coding interviews, giving you hints from GPT-3/4

Local LLMs

cpp / ggml:

  • llama.cpp runs ggml models, including 4-bit quantized ones, natively on Mac, Linux and Windows. Supports the new ggmlv3 format and runs on CPU and GPU. Allows mixed CPU/GPU use through BLAS libraries such as cuBLAS and CLBlast.
  • Alpaca.cpp
  • koboldcpp llama.cpp with a fancy UI, persistent stories, editing tools, memory etc. Supporting ggmlv3 and old ggml, CLBlast and llama, RWKV, GPT-NeoX, Pythia models
  • Serge chat interface based on llama.cpp for running Alpaca models. Entirely self-hosted, no API keys needed
  • llama MPS inference on the Apple Silicon GPU, using much less power but running slightly slower than llama.cpp, which uses the CPU
  • bloomz.cpp Inference of HuggingFace's BLOOM-like models in pure C/C++
  • RWKV.cpp CPU only port of BlinkDL/RWKV-LM to ggerganov/ggml. Supports FP32, FP16 and quantized INT4.
  • RWKV Cuda a torchless, c++ rwkv implementation with 8bit quantization written in cuda
  • secondbrain Multi-platform desktop app to download and run LLMs locally in your computer
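
Several entries above mention 4-bit and 8-bit quantized models. As a rough illustration of the idea, here is a minimal sketch of symmetric block-wise 4-bit quantization in plain Python. The block size of 32 and the function names are assumptions made for this example; this is not ggml's actual file format or the code of any project listed here.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block length for this sketch


def quantize_block(values: List[float]) -> Tuple[float, List[int]]:
    """Map floats to signed 4-bit integers in [-8, 7] with one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, codes


def dequantize_block(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from the scale and the 4-bit codes."""
    return [scale * c for c in codes]


def quantize(weights: List[float]) -> List[Tuple[float, List[int]]]:
    """Split a flat weight list into blocks and quantize each one independently."""
    return [
        quantize_block(weights[i:i + BLOCK_SIZE])
        for i in range(0, len(weights), BLOCK_SIZE)
    ]


def dequantize(blocks: List[Tuple[float, List[int]]]) -> List[float]:
    """Concatenate the dequantized blocks back into one flat list."""
    restored: List[float] = []
    for scale, codes in blocks:
        restored.extend(dequantize_block(scale, codes))
    return restored
```

Each block stores one float scale plus one small integer per weight, which is where the memory saving comes from; the price is a rounding error of at most half a scale step per weight.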

gpt4all:

others:

  • Lit-llama training, fine tuning and inference of llama
  • Dalai LLaMA-based ChatGPT for single GPUs
  • ChatLLaMA LLaMA-based ChatGPT for single GPUs
  • mlc-llm run any LLM on any hardware (iPhones, Android, Win, Linux, Mac, WebGPU, Metal, NVidia, AMD)
  • faraday.dev Run open-source LLMs on your Win/Mac. Completely offline. Zero configuration.
  • ChatALL concurrently sends prompts to multiple LLM-based AI bots both local and APIs and displays the results
  • pyllama hacked version of LLaMA based on Meta's implementation, optimized for Single GPUs
  • gmessage visually pleasing chatbot that uses a locally running LLM server and supports multiple themes, chat history search, text to speech, JSON file export, and OpenAI API compatible Python code
  • selfhostedAI one-click deployment of RWKV, ChatGLM, llama.cpp models for substituting the openAI API to a locally hosted API

Web GUIs

openAI

Local LLMs

  • Text Generation Webui An all purpose UI to run LLMs of all sorts with optimizations (running LLaMA-13b on 6GB VRAM, HN Thread)
  • Text Generation Webui Ph0rk0z fork supporting all GPTQ versions and max context of 8192 instead of 4096 (because some models support longer context now)
  • Alpaca-LoRa-Serve
  • chat petals web app + HTTP and Websocket endpoints for BLOOM-176B inference with the Petals client
  • Alpaca-Turbo Web UI to run alpaca model locally on Win/Mac/Linux
  • FreedomGPT Web app that executes the FreedomGPT LLM locally
  • HuggingChat open source chat interface for transformer based LLMs by Huggingface
  • openplayground enables running LLM models on a laptop using a full UI, supporting various APIs and local HuggingFace cached models
  • gpt4all Web UI user friendly all-in-one interface, runs gpt_j, gptq, ggml and other model types
  • RWKV-Runner Easy installation and running of RWKV Models, providing a local OpenAI API, GUI and custom CUDA kernel acceleration. Supports 2gb up to 32gb VRAM
  • BrainChulo Chat App with vector based Long-Term Memory supporting one-shot, few-shot and Tool capable agents

Voice Assistants

openAI

Local LLMs

Information retrieval

openAI

  • sqlchat Use OpenAI GPT3/4 to chat with your database
  • chat-with-github-repo which uses streamlit, gpt3.5-turbo and deep lake to answer questions about a git repo
  • mpoon/gpt-repository-loader uses Git and GPT-4 to convert a repository into a text format for various tasks, such as code review or documentation generation.

Local LLMs

  • LlamaIndex provides a central interface to connect your LLMs with external data
  • Llama-lab home of llama_agi and auto_llama using LlamaIndex
  • PrivateGPT a standalone question-answering system using LangChain, GPT4All, LlamaCpp and embeddings models to enable offline querying of documents
  • Spyglass tests an Alpaca integration for a self-hosted personal search app. Select the llama-rama feature branch. Discussion on reddit
  • local_llama chatting with your PDFs offline. gpt_chatwithPDF alternative with the ultimate goal of using llama instead of chatGPT
  • Sidekick Information retrieval for LLMs
  • DB-GPT SQL generation, private domain Q&A, data processing, unified vector storage/indexing, and support for various plugins and LLMs
  • localGPT a privateGPT-inspired document question-answering solution that runs on GPU instead of CPU and uses InstructorEmbeddings instead of LlamaEmbeddings, because the former perform better according to leaderboards
  • LocalDocs plugin for GPT4All
  • annoy_ltm extension to add long term memory to chatbots using a nearest neighbor vector DB for memory retrieval
  • ChatDocs PrivateGPT + Web UI + GPU Support + ggml, transformers, webui
  • PAutoBot document question-answering engine developed with LangChain, GPT4All, LlamaCpp, ChromaDB, PrivateGPT, CPU only

Model Agnostic

  • Paper QA LLM Chain for answering questions from documents with citations, using OpenAI Embeddings or local llama.cpp, langchain and FAISS Vector DB
  • BriefGPT document summarization and querying using OpenAI and locally run LLMs via LlamaCpp or GPT4ALL, with embeddings stored as a FAISS index, built using Langchain.
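
Most of the document question-answering tools in this section follow the same broad pattern: split documents into chunks, pick the chunks most relevant to a question, and hand them to the model as context. The sketch below shows that pattern with the standard library only. The word-overlap score is a stand-in for real embeddings, and all function names and the prompt wording are assumptions for illustration, not taken from any listed project.

```python
import re
from typing import List, Set


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its distinct alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(question: str, chunk: str) -> int:
    """Toy relevance score: number of distinct question words found in the chunk."""
    return len(tokenize(question) & tokenize(chunk))


def build_prompt(question: str, chunks: List[str], top_k: int = 2) -> str:
    """Rank chunks by score and place the best `top_k` in front of the question."""
    ranked = sorted(chunks, key=lambda c: score(question, c), reverse=True)
    context = "\n---\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

The returned string would then be sent to whichever model is in use, local or hosted. Swapping `score` for a cosine similarity over embedding vectors gives the vector-store variant that several entries describe.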

Browser Extensions

openAI

  • sider chrome side-bar for chatGPT and OpenAI API supporting custom prompts and text highlighting
  • chathub-dev/chathub
  • Glarity open-source chrome extension to write summaries for various websites including custom ones and YouTube videos. Extensible
  • superpower-chatgpt chrome extension / firefox addon to add missing features like Folders, Search, and Community Prompts to ChatGPT

Local LLMs

  • chatGPTBox add useful LLM chat-boxes to github and other websites, supporting self-hosted model (RWKV, llama.cpp, ChatGLM)

Agents / Automatic GPT

openAI

  • Auto GPT
  • AgentGPT Deploy autonomous AI agents, using vectorDB memory, web browsing via LangChain, website interaction and more including a GUI
  • microGPT Autonomous GPT-3.5/4 agent, can analyze stocks, create art, order pizza, and perform network security tests
  • Auto GPT Plugins
  • AutoGPT-Next-Web An AgentGPT fork as a Web GUI
  • AutoGPT Web
  • AutoGPT.js
  • LoopGPT a re-implementation of AutoGPT as a proper python package, modular and extensible
  • Camel-AutoGPT Communication between Agents like BabyAGI and AutoGPT
  • BabyAGIChatGPT is a fork of BabyAGI to work with OpenAI's GPT, pinecone and google search
  • GPT Assistant An autonomous agent that can access and control a chrome browser via Puppeteer
  • gptchat a client which uses GPT-4, adding long term memory, can write its own plugins and can fulfill tasks
  • Chrome-GPT AutoGPT agent employing Langchain and Selenium to interact with a Chrome browser session, enabling Google search, webpage description, element interaction, and form input
  • autolang Another take on BabyAGI, focused on workflows that complete. Powered by langchain.
  • ai-legion A framework for autonomous agents who can work together to accomplish tasks.
  • generativeAgent_LLM Generative Agents with Guidance, Langchain, and local LLMs, implementation of the "Generative Agents: Interactive Simulacra of Human Behavior" paper, blogpost

Local LLMs

  • Auto Vicuna Butler Baby-AGI fork / AutoGPT alternative to run with local LLMs
  • BabyAGI AI-Powered Task Management for OpenAI + Pinecone or Llama.cpp
  • Agent-LLM Webapp to control an agent-based Auto-GPT alternative, supporting GPT4, Kobold, llama.cpp, FastChat, Bard, Oobabooga textgen
  • auto-llama-cpp fork of Auto-GPT with added support for locally running llama models through llama.cpp
  • AgentOoba autonomous AI agent extension for Oobabooga's web ui
  • RecurrentGPT Interactive Generation of (Arbitrarily) Long Text. Uses LSTM, prompt-engineered recurrence, maintains short and long-term memories, and updates these using semantic search and paragraph generation.
  • SuperAGI open-source framework that enables developers to build, manage, and run autonomous agents. Supports tools extensions, concurrent agents, GUI, console, vector DBs, multi modal, telemetry and long term memory
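
The agent projects in this section differ widely in detail, but many of them are described as looping over a list of tasks. As a rough illustration only, here is a minimal task-queue loop with stand-in functions where a real agent would call a model. The names `run_agent`, `fake_execute` and `fake_plan` are hypothetical and do not come from AutoGPT, BabyAGI or any other project listed above.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def run_agent(
    objective: str,
    first_task: str,
    execute: Callable[[str, str], str],
    plan: Callable[[str, str, str], List[str]],
    max_steps: int = 5,
) -> List[Tuple[str, str]]:
    """Work through a task queue; after each step `plan` may append follow-up tasks."""
    tasks: Deque[str] = deque([first_task])
    history: List[Tuple[str, str]] = []
    while tasks and len(history) < max_steps:
        task = tasks.popleft()
        result = execute(objective, task)  # normally an LLM call
        history.append((task, result))
        for new_task in plan(objective, task, result):  # normally another LLM call
            if new_task not in tasks:  # skip tasks that are already queued
                tasks.append(new_task)
    return history


# Stand-in functions so the sketch runs without any model.
def fake_execute(objective: str, task: str) -> str:
    return f"done: {task}"


def fake_plan(objective: str, task: str, result: str) -> List[str]:
    follow_ups = {"research": ["outline"], "outline": ["write"]}
    return follow_ups.get(task, [])
```

The `max_steps` cap matters in practice: without it, a planner that keeps inventing tasks never terminates.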

Multi Modal

Code generation

  • FauxPilot open source Copilot alternative using Triton Inference Server
  • Turbopilot open source LLM code completion engine and Copilot alternative
  • Tabby Self hosted Github Copilot alternative
  • starcoder.cpp
  • GPTQ-for-SantaCoder 4bit quantization for SantaCoder
  • supercharger Write Software + unit tests for you, based on Baize-30B 8bit, using model parallelism
  • Autodoc toolkit that auto-generates codebase documentation using GPT-4 or Alpaca, and can be installed in a git repository in about 5 minutes.
  • smol-ai developer a personal junior developer that scaffolds an entire codebase with a human-centric and coherent whole program synthesis approach using <200 lines of Python and Prompts.
  • locai kobold/oobabooga-compatible API for VSCode
  • oasis local LLaMA models in VSCode

Libraries and Wrappers

openAI

  • acheong08/ChatGPT Python reverse-engineered ChatGPT API
  • gpt4free Use reverse-engineered GPT-3.5/4 APIs of other websites
  • GPTCache, serve cached results based on embeddings in a vector DB, before querying the OpenAI API.
  • kitt TTS + GPT4 + STT to create a conference call audio bot
  • Marvin simplifies AI integration in software development with easy creation of AI functions and bots managed through a conversational interface
  • chatgpt.js client-side JavaScript library for ChatGPT
  • ChatGPT-Bridge use chatGPT plus' GPT-4 as a local API
  • Powerpointer connects to the OpenAI API (GPT-3.5) and creates a PowerPoint out of your content
  • EdgeGPT Reverse engineered API of Microsoft's Bing Chat using Edge browser

Local LLMs

  • FastLLaMA Python wrapper for llama.cpp
  • WebGPT Inference in pure javascript
  • TokenHawk performs hand-written LLaMA inference using WebGPU, utilizing th.cpp, th-llama.cpp, and th-llama-loader.cpp, with minimal dependencies
  • WasmGPT ChatGPT-like chatbot in browser using ggml and emscripten
  • AutoGPTQ easy-to-use model GPTQ quantization package with user-friendly CLI
  • gpt-llama.cpp Replace OpenAI's GPT APIs with llama.cpp's supported models locally
  • llama-node JS client library for llama (or llama based) LLMs built on top of llama-rs and llama.cpp.
  • TALIS serves a LLaMA-65b API, optimized for speed utilizing dual RTX 3090/4090 GPUs on Linux
  • Powerpointer-For-Local-LLMs connects to oobabooga's API and creates a powerpoint out of your content
  • OpenChatKit open-source project that provides a base to create both specialized and general purpose chatbots and extensible retrieval system, using GPT-NeoXT-Chat-Base-20B as a base model
  • webgpu-torch Tensor computation with WebGPU acceleration
  • llama-api-server that uses llama.cpp and emulates an openAI API
  • CTransformers python bindings for transformer models in C/C++ using GGML library, supporting GPT-2/J/NeoX, StableLM, LLaMA, MPT, Dollyv2, StarCoder
  • basaran GUI and API as a drop-in replacement of the OpenAI text completion API. Broad HF eco system support (not only llama)
  • CodeTF one-stop Python transformer-based library for code LLMs and code intelligence, training and inferencing on code summarization, translation, code generation

Model agnostic

Fine Tuning & Training

Frameworks

  • Vicuna FastChat
  • SynapseML (previously known as MMLSpark), an open-source library that simplifies the creation of massively scalable machine learning (ML) pipelines
  • Microsoft guidance efficient Framework for Enhancing Control and Structure in Modern Language Model Interactions. Demo project by paolorechia for local text-generation-webui. reddit thread. guidance fork and llama-cpp-python fork how-to on reddit
  • Microsoft semantic-kernel a lightweight SDK enabling integration of AI Large Language Models (LLMs) with conventional programming languages
  • Colossal-AI unified deep learning system that provides a collection of parallel components for distributed deep learning models. Provides data parallelism, pipeline parallelism, and tensor parallelism

Resources

Data sets

Research

Other awesome resources

Product Showcases

Optimization

Benchmarking

Leaderboards

Benchmark Suites

  • Big-bench a collaborative benchmark featuring over 200 tasks for evaluating the capabilities of LLMs
  • Pythia interpretability analysis for autoregressive transformers during training
  • AlpacaEval automatic evaluation for instruction following LLMs, validated against 20k human annotations, reddit announcement

AI DevOps

Databases for ML

  • Pinecone proprietary vector search for semantic search, recommendations and information retrieval
  • FAISS Library for Efficient Similarity Search and Clustering using vectors
  • Weaviate open source vector DB for services like OpenAI, HF etc for text, image, Q&A etc.
  • vespa.ai one of the few scalable vector DBs that support multiple vectors per schema field
  • LanceDB free open-source serverless vector DB with support for langchain, llamaindex and multi-modal data
  • Deeplake Vector Database for audio, text, vectors, video
  • milvus open-source cloud-native vector DB focusing on embedding vectors converted from unstructured data
  • chroma open-source embedding database