Steffen Röcker (sroecker)

@tvst
tvst / streamlit_app.py
Last active May 23, 2024 15:47
Simple way to run heavy computations without slowing down other Streamlit users
import streamlit as st
import concurrent.futures  # We'll do computations in separate processes!
import mymodule  # This is where you'll do the computation

# Your st calls must go inside this IF block.
if __name__ == '__main__':
    st.write("Starting a long computation on another process")

    # Pick max number of concurrent processes. Depends on how heavy your
    # computation is, and how powerful your machine is.
    MAX_WORKERS = 4

    # `mymodule.some_heavy_computation` is a placeholder for your own function.
    with concurrent.futures.ProcessPoolExecutor(max_workers=MAX_WORKERS) as executor:
        future = executor.submit(mymodule.some_heavy_computation)
        st.write("Result:", future.result())
@Artefact2
Artefact2 / README.md
Last active June 25, 2024 19:00
GGUF quantizations overview

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggerganov/llama.cpp#5962

In the meantime, use the largest quantization that fully fits in your GPU. If you can comfortably fit Q4_K_S, try a model with more parameters instead. A rough fit check is sketched below.
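
As a rough feasibility check (a sketch of mine, not part of the gist): the quantized file size is a decent proxy for the VRAM the weights need, plus some overhead for the KV cache and compute buffers. The overhead value below is an assumption you should tune for your context length.

import os

def fits_in_vram(gguf_path, vram_gib, overhead_gib=1.5):
    # Heuristic: weights (roughly the file size) plus assumed overhead must fit in VRAM.
    file_gib = os.path.getsize(gguf_path) / 2**30
    return file_gib + overhead_gib <= vram_gib

# Example: could an 8 GiB card comfortably hold this (hypothetical) Q4_K_S file?
# fits_in_vram("mistral-7b.Q4_K_S.gguf", vram_gib=8.0)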

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

@kalomaze
kalomaze / llm_samplers_explained.md
Last active June 16, 2024 18:19
LLM Samplers Explained

Every time a large language model makes a prediction, each of the thousands of tokens in its vocabulary is assigned some degree of probability, from almost 0% to almost 100%. There are different ways you can decide to choose from those predictions. This process is known as "sampling", and there are various strategies you can use, which I will cover here.

OpenAI Samplers

Temperature

  • Temperature is a way to control the overall confidence of the model's scores (the logits). With a value below 1.0, the gaps between token scores widen, sharpening the distribution (more deterministic); with a value above 1.0, the gaps shrink, flattening the distribution (less deterministic). A minimal sketch follows this list.
  • A temperature of 1.0 leaves the scores unchanged, so it reproduces the original distribution the model was trained to optimize for.
  • Graph demonstration with voiceover: https://files.catbox.moe/6ht56x.mp4
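
To make the scaling concrete, here is a minimal sketch (mine, not the gist's) of temperature sampling over raw logits; numpy and the toy logit values are assumptions for illustration.

import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    # Divide logits by the temperature, softmax, then draw one token id.
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Toy example: a lower temperature concentrates mass on the top token.
logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, temperature=0.7))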
@jrknox1977
jrknox1977 / ollama_dspy.py
Created February 9, 2024 18:06
ollama+DSPy using OpenAI APIs.
# install DSPy: pip install dspy
import dspy

# Ollama is now compatible with OpenAI APIs.
#
# To get this to work you must include `model_type='chat'` in the `dspy.OpenAI` call.
# If you do not include this you will get an error.
#
# I have also found that `stop='\n\n'` is required to stop generation after the
# answer is complete (at least with mistral).

# Assumes a local Ollama server on its default port with a mistral model pulled.
mistral_ollama = dspy.OpenAI(
    api_base='http://localhost:11434/v1/',
    api_key='ollama',  # Ollama ignores the key, but the client expects one
    model='mistral',
    model_type='chat',
    stop='\n\n',
)
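
A quick usage sketch (assumed, not shown in the gist preview): make this LM the default and run a one-off prediction.

dspy.settings.configure(lm=mistral_ollama)

qa = dspy.Predict('question -> answer')
print(qa(question="What is the capital of France?").answer)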
@bjsi
bjsi / main.py
Created February 4, 2024 10:38
Deploying RAGatouille on Modal Labs
from typing import List, Optional, TypedDict

import modal
from modal import gpu, build, enter, exit, method


class Document(TypedDict):
    content: str
    metadata: dict
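
For orientation, a minimal sketch (an assumed shape, not the gist's actual serving code; the class and method names are placeholders) of how these Modal imports are typically wired together:

stub = modal.Stub("ragatouille-demo")

@stub.cls(gpu=gpu.T4())
class Searcher:
    @build()  # runs once at image build time, e.g. to download model weights
    def download_model(self):
        pass

    @enter()  # runs at container startup, e.g. to load an index into memory
    def setup(self):
        self.docs: List[Document] = []

    @method()  # callable remotely, e.g. Searcher().search.remote("query")
    def search(self, query: str, k: Optional[int] = 5) -> List[Document]:
        return self.docs[:k]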
@sayakpaul
sayakpaul / coco_30k_hf_datasets.py
Created January 31, 2024 10:34
Randomly samples 30k images from the COCO 2014 validation set.
from datasets import Dataset, Features
from datasets import Image as ImageFeature
from datasets import Value
import pandas as pd
import os
# CSV comes from the notebook above.
df = pd.read_csv("coco_30k_randomly_sampled_2014_val.csv")
root_path = "val2014"
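
The preview stops here; below is a sketch (mine) of how these pieces typically come together with datasets, where the CSV column names "file_name" and "caption" and the Hub repo id are assumptions:

features = Features({
    "image": ImageFeature(),
    "caption": Value("string"),
})

ds = Dataset.from_dict(
    {
        # With the Image feature, string file paths are decoded into images.
        "image": [os.path.join(root_path, name) for name in df["file_name"]],
        "caption": df["caption"].tolist(),
    },
    features=features,
)
ds.push_to_hub("coco-30k-2014-val")  # hypothetical repo id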
@virattt
virattt / rag-reranking-gpt-colbert-mistral.ipynb
Last active April 3, 2024 04:23
rag-reranking-gpt-colbert-mistral.ipynb
@rauchg
rauchg / p.sh
Last active May 18, 2024 13:05
Perplexity CLI in pure shell
#!/usr/bin/env bash
function p() {
  jq -n \
    --arg content "$*" \
    '{
      "model": "pplx-7b-online",
      "messages": [
        { "role": "system", "content": "Be precise and concise." },
        { "role": "user", "content": $content }
      ]
    }' |
    # Non-streaming variant; assumes PERPLEXITY_API_KEY is exported.
    curl --silent https://api.perplexity.ai/chat/completions \
      --header "Authorization: Bearer $PERPLEXITY_API_KEY" \
      --header "Content-Type: application/json" \
      --data @- |
    jq -r '.choices[0].message.content'
}
# Usage: p how tall is the eiffel tower
@veekaybee
veekaybee / normcore-llm.md
Last active June 29, 2024 03:29
Normcore LLM Reads

Anti-hype LLM reading list

Goals: add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical, first-hand accounts of models in prod are eagerly sought.

Foundational Concepts

Pre-Transformer Models