Differences Between Frequency and Presence Penalty

Frequency Penalty:

(based on the text generated so far)

The frequency penalty parameter makes the model less likely to repeat tokens that have already appeared in the text so far, in proportion to how many times each one has appeared. It does not depend on how often a word occurred in the training data.

This encourages the model to use novel or less common words. It works by lowering the logits (unnormalized log probabilities) of tokens by an amount that grows with each occurrence, making it less likely for the model to repeat the same words (and thus the same phrases) verbatim.
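
A minimal sketch of how such penalties are commonly applied at sampling time, following the adjustment described in OpenAI's API documentation: each token's logit is reduced by its count in the text so far times the frequency penalty, plus a flat presence penalty if it has appeared at all. The function name `apply_penalties` and the dict-based logits are illustrative assumptions, not any library's API.

```python
from collections import Counter


def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` (token -> score) adjusted for repetition.

    Hypothetical helper: real implementations work on tensors of token ids.
    """
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            # Frequency penalty scales with the count; the presence penalty
            # is applied once to any token that has appeared at least once.
            adjusted[token] -= count * frequency_penalty + presence_penalty
    return adjusted


logits = {"the": 2.0, "cat": 1.0, "dog": 1.0}
adjusted = apply_penalties(logits, ["the", "the", "cat"],
                           frequency_penalty=0.5, presence_penalty=0.25)
print(adjusted)
```

Here "the" is penalized most because it has appeared twice, "cat" less, and "dog" not at all since it has not been generated yet.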

Presence Penalty:

whjms /
Last active April 7, 2023 16:35
Instructions for running KoboldAI in 8-bit mode

Running KoboldAI in 8-bit mode

tl;dr: use Linux, install bitsandbytes (either globally or in KAI's conda env), and add load_in_8bit=True, device_map="auto" to the model pipeline creation calls.

Many people are unable to load models because of their GPU's limited VRAM. These models contain billions of parameters (model weights and biases), each of which is stored as a 32-bit (or 16-bit) float. Thanks to the hard work of some researchers [1], it's possible to run these models using 8-bit numbers, which halves the required amount of VRAM compared to running in half precision. For example, if a model requires 16GB of VRAM in half precision, running it with 8-bit inference requires only about 8GB.
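
The arithmetic behind that claim can be sketched as follows. This is a rough estimate for the weights only; it ignores activations, caches, and framework overhead, and the 6-billion parameter count is a hypothetical example rather than a figure from this guide.

```python
def weights_vram_gib(num_params, bits_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


num_params = 6_000_000_000  # hypothetical model size, for illustration only
for bits in (32, 16, 8):
    print(f"{bits:>2}-bit: {weights_vram_gib(num_params, bits):.1f} GiB")
```

Each halving of the bits per parameter halves the estimate, which is why 8-bit loading can fit a model that does not fit in half precision.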

This guide was written for KoboldAI 1.19.1 and tested on Ubuntu 20.04. These instructions are based on work by Gmin in KoboldAI's Discord server and on Hugging Face's efficient LM inference guide.


# install torch with CUDA support. See for more instructions if this fails.
pip install torch --extra-index-url
# check if torch supports GPU; this must output "True". You need CUDA 11. installed for this. You might be able to use
# a different version, but this is what I tested.
python -c "import torch; print(torch.cuda.is_available())"
# clone web ui and go into its directory
git clone
cd anifusion-sd-webui
mmtrt / fixdd
Last active April 2, 2024 12:48
[ROOT] [Magisk] [Service.d] [Script] [Fix] DriveDroid on Android 9+
# run while loop for boot_completed status & sleep 10 needed for magisk service.d
while [ "$(getprop sys.boot_completed | tr -d '\r')" != "1" ]; do sleep 1; done
sleep 10
# save currently active function name
ls -al /config/usb_gadget/g1/configs/b.1/ | grep -Eo 'f1.*' | awk '{print $3}' | cut -d/ -f8 > /data/adb/.fixdd
# loop
haxscramper /
Last active March 16, 2024 19:45
genotrance / sciter.nim
Last active July 13, 2020 03:38
Nim wrapper for sciter using nimterop
import nimterop/[cimport]
const intrin = "<x86intrin.h>"
{.localPassC: "-msse4.2".}
type
  M128i {.importc: "__m128i", header: intrin, bycopy.} = object
const
  SIDD_CMP_RANGES = 0b0000_0100'i32
  SIDD_NEGATIVE_POLARITY = 0b0001_0000'i32
# nim c -r --threads:on --gc:orc
import cpuinfo, os, random, locks, deques
type
  WorkReq = ref object
    id: int
  WorkRes = ref object
    id: int
varqox /
Last active March 7, 2024 17:59
How to record multiple applications and microphone into one audio file on Linux using PulseAudio

How to record multiple applications and microphone into one audio file on Linux

Step 0. Terminology

Sinks are for output; sources are for input. To stream a source to a sink, a loopback must be created. More shall you find there.

Step 1. Create output sink that will be recorded

Our output sink will be named recording.

pacmd load-module module-null-sink sink_name=recording sink_properties=device.description=recording
obscurerichard /
Last active March 26, 2021 15:19

Pull request poetry

by Richard Bullington-McGuire @obscurerichard on GitHub and Twitter

Use these as comments in pull requests in order to charm the project owner into taking action on the pull request.

Initial ticklers

lonely pull request
the completist in me pines
for its prompt closure