Goals: add links to reasonable, well-written explanations of how things work. No hype, and no vendor content where possible. Practical first-hand accounts of running models in production are eagerly sought.
""" To use: install Ollama, clone OpenVoice, run this script in the OpenVoice directory | |
brew install portaudio | |
brew install git-lfs | |
git lfs install | |
git clone https://github.com/myshell-ai/OpenVoice | |
cd OpenVoice | |
git clone https://huggingface.co/myshell-ai/OpenVoice | |
cp -r OpenVoice/* . | |
""" To use: install LLM studio (or Ollama), clone OpenVoice, run this script in the OpenVoice directory | |
git clone https://github.com/myshell-ai/OpenVoice | |
cd OpenVoice | |
git clone https://huggingface.co/myshell-ai/OpenVoice | |
cp -r OpenVoice/* . | |
pip install whisper pynput pyaudio | |
""" | |
from openai import OpenAI | |
import time |
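The script talks to the local model through the OpenAI-compatible API that both Ollama and LM Studio expose. A minimal sketch of that client setup, assuming Ollama's default endpoint (the base URL, dummy API key, and model name below are assumptions, not taken from the original script):

```python
from openai import OpenAI

# Ollama serves an OpenAI-compatible API at localhost:11434 by default;
# LM Studio's default is localhost:1234. The api_key is unused locally,
# but the client requires a non-empty value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3",  # any model you have pulled locally
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```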
```python
# [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752)
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, Dataset
from torch.nn import functional as F
from einops import rearrange, repeat
from tqdm import tqdm
```
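For orientation, the core of Mamba is the selective state-space recurrence h_t = A_bar * h_{t-1} + B_bar * x_t, y_t = C * h_t, where the step size delta and the projections B, C are computed from the input (the "selective" part). Below is a minimal, sequential reference sketch of that scan using the paper's simplified discretization; the shapes and function name are my own for readability, and the real implementation uses a fused parallel-scan CUDA kernel instead of a Python loop.

```python
import torch

def selective_scan(x, delta, A, B, C):
    """Sequential reference for the selective SSM scan.

    x:     (batch, length, d_inner)  input sequence
    delta: (batch, length, d_inner)  input-dependent step sizes
    A:     (d_inner, d_state)        state matrix
    B, C:  (batch, length, d_state)  input-dependent projections
    """
    batch, length, d_inner = x.shape
    d_state = A.shape[1]
    h = torch.zeros(batch, d_inner, d_state, device=x.device)
    ys = []
    for t in range(length):
        # Discretize: A_bar = exp(delta * A), B_bar ~= delta * B (paper's simplification).
        dA = torch.exp(delta[:, t].unsqueeze(-1) * A)          # (b, d_inner, d_state)
        dB = delta[:, t].unsqueeze(-1) * B[:, t].unsqueeze(1)  # (b, d_inner, d_state)
        h = dA * h + dB * x[:, t].unsqueeze(-1)                # state update
        ys.append((h * C[:, t].unsqueeze(1)).sum(-1))          # y_t = C h_t
    return torch.stack(ys, dim=1)                              # (b, length, d_inner)
```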
```python
import inspect
import json
import re
import typing
from inspect import isclass, getdoc
from types import NoneType
from pydantic import BaseModel, Field
from pydantic.fields import FieldInfo
from typing import Any, Type, List, get_args, get_origin, Tuple, Union, Optional
```
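These imports (inspect, typing introspection, Pydantic fields) are the usual ingredients of a helper that derives JSON schemas from Python models or callables, e.g. for LLM function calling. A minimal sketch of that idea, assuming Pydantic v2's `model_json_schema()`; the `Weather` model and `to_tool_schema` wrapper are illustrative, not from the original script:

```python
from pydantic import BaseModel, Field

class Weather(BaseModel):
    """Get the current weather for a city."""
    city: str = Field(description="City name, e.g. 'Berlin'")
    unit: str = Field(default="celsius", description="'celsius' or 'fahrenheit'")

def to_tool_schema(model: type[BaseModel]) -> dict:
    # Pydantic v2's model_json_schema() emits standard JSON Schema,
    # which most LLM function-calling APIs accept as `parameters`.
    return {
        "name": model.__name__.lower(),
        "description": (model.__doc__ or "").strip(),
        "parameters": model.model_json_schema(),
    }

print(to_tool_schema(Weather))
```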
""" | |
GPU Monitor with Email and Execution | |
This script monitors the usage of GPUs on a system and, when there are enough free GPUs, execute a specified function. | |
The function run a bash script by default but could be any other executable code. | |
This script uses the GPUtil library to monitor GPU usage. | |
Preparation: | |
1. `pip install GPUtil` | |
2. define your own `func` if needed |
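A minimal sketch of the polling loop such a monitor typically uses, built on GPUtil's `getGPUs()`; the thresholds, the default `func`, and the script name are illustrative, and the email part is omitted:

```python
import subprocess
import time

import GPUtil

def func():
    # Illustrative default: launch a bash script once GPUs are free.
    subprocess.run(["bash", "run_job.sh"], check=True)

def wait_for_gpus(n_required=2, max_load=0.05, max_mem=0.05, poll_s=60):
    """Block until at least n_required GPUs are essentially idle."""
    while True:
        free = [
            g for g in GPUtil.getGPUs()
            if g.load < max_load and g.memoryUtil < max_mem
        ]
        if len(free) >= n_required:
            return [g.id for g in free]
        time.sleep(poll_s)

gpu_ids = wait_for_gpus()
print(f"Free GPUs: {gpu_ids}")
func()
```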
```python
import weaviate
import csv
import openai
from weaviate.util import generate_uuid5, get_valid_uuid
from uuid import uuid4

OPENAI_API_KEY = "YOUR KEY"
WEAVIATE_URL = "YOUR URL"
openai.api_key = "YOUR KEY"
```
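The imports point at a CSV-to-Weaviate ingestion script. A minimal sketch of that pattern with the v3-style `weaviate.Client` API that matches these imports; the class name `Document`, the CSV columns, and the choice of `title` as the UUID seed are all assumptions:

```python
import csv

import weaviate
from weaviate.util import generate_uuid5

client = weaviate.Client(WEAVIATE_URL)  # v3-style client, matching the imports above
client.batch.configure(batch_size=100)

with open("data.csv", newline="") as f, client.batch as batch:
    for row in csv.DictReader(f):
        # Deterministic UUIDs (uuid5 of a stable field) make re-runs
        # idempotent: the same row always maps to the same object.
        batch.add_data_object(
            data_object={"title": row["title"], "body": row["body"]},
            class_name="Document",
            uuid=generate_uuid5(row["title"]),
        )
```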
This worked on 14/May/23; the instructions will probably need updating in the future.

LLaMA is a base text-prediction model, similar to GPT-2 or to GPT-3 before instruction fine-tuning. It should also be possible to run fine-tuned versions with this (such as Alpaca or Vicuna, which are more focused on answering questions).

Note: I have been told that this does not support multiple GPUs; it can only use a single GPU.

It is now possible to run LLaMA 13B with a 6 GB graphics card (e.g. an RTX 2060), thanks to the amazing work on llama.cpp. The latest change adds CUDA/cuBLAS support, which lets you pick an arbitrary number of transformer layers to run on the GPU. This is perfect for low VRAM.

Tested at llama.cpp commit `08737ef720f0510c7ec2aa84d7f70c691073c35d`.
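In the llama.cpp CLI this layer offload is the `--n-gpu-layers` (`-ngl`) flag; the same knob is exposed in the llama-cpp-python bindings as `n_gpu_layers`. A minimal sketch of the bindings version; the model path and layer count here are illustrative, so raise or lower `n_gpu_layers` until the model fits your VRAM:

```python
from llama_cpp import Llama

# n_gpu_layers sets how many transformer layers are offloaded to the GPU;
# the rest run on the CPU. With ~6 GB of VRAM, offloading only a subset of
# a 13B model's layers is exactly the use case described above.
llm = Llama(model_path="models/llama-13b.Q4_K_M.gguf", n_gpu_layers=20)

out = llm("Q: What is the capital of France? A:", max_tokens=16)
print(out["choices"][0]["text"])
```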
```python
from gradio_client import Client

API_URL = "https://sanchit-gandhi-whisper-jax.hf.space/"

# set up the Gradio client
client = Client(API_URL)

def transcribe_audio(audio_path, task="transcribe", return_timestamps=False):
    # Body completed from the whisper-jax README example; the api_name is an assumption.
    text, runtime = client.predict(audio_path, task, return_timestamps, api_name="/predict_1")
    return text
```
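Usage is then a single call (the file name is a placeholder):

```python
print(transcribe_audio("audio.mp3"))
```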