isgallagher

## WUT.md

      
LiteLLM adapters for generating images with local Stable Diffusion APIs

Created May 31, 2025 15:15, forked from the-crypt-keeper/WUT.md (3 files)

This is a raw dump of litellm image generation adapters for kobold (which is automatic1111-compatible), sd-server and llamabox.
Only the /v1/images/generations endpoint is supported by the adapters.

To use:

1. Register the custom handlers in custom_provider_map
2. Define endpoints for kobold/automatic1111, sd-server and llamabox as needed
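
The adapters themselves are not shown in this preview. As a rough sketch of the kind of translation such an adapter has to do, the snippet below maps an OpenAI-style /v1/images/generations request body onto the fields of an automatic1111-style txt2img request, and wraps the base64 images that come back. The function names (`parse_size`, `to_txt2img_payload`, `to_images_response`) and the 512x512 default are assumptions for illustration, not the gist's code, and no HTTP call is made here.

```python
import time


def parse_size(size: str) -> tuple[int, int]:
    """Split an OpenAI-style "WIDTHxHEIGHT" string into two integers."""
    width, height = size.lower().split("x")
    return int(width), int(height)


def to_txt2img_payload(request: dict) -> dict:
    """Map an images/generations request body onto txt2img fields.

    Hypothetical helper: the 512x512 default is an assumption.
    """
    width, height = parse_size(request.get("size", "512x512"))
    return {
        "prompt": request["prompt"],
        "width": width,
        "height": height,
        "batch_size": int(request.get("n", 1)),
    }


def to_images_response(txt2img_result: dict) -> dict:
    """Wrap base64 images from a txt2img result in an OpenAI-style body."""
    return {
        "created": int(time.time()),
        "data": [{"b64_json": img} for img in txt2img_result.get("images", [])],
    }
```

A handler registered in custom_provider_map would presumably call something like this around its HTTP request to the local backend.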

Things I hate about this:

  
## backup.py
import os
import hashlib
import requests
import json
import argparse

# Config
BASE_URL = "https://civitai.com/api/v1"
EXTENSION = ".safetensors"
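
The preview cuts off after the config block, so what the script does next is not visible. Going only by the imports and the two constants, one plausible reading is that it hashes local .safetensors files and looks them up on Civitai. The sketch below is a guess along those lines: the helper names are hypothetical, and the use of the `model-versions/by-hash` route is an assumption about the script, not something the preview confirms. It only builds the URL and does not send a request.

```python
import hashlib
from pathlib import Path

BASE_URL = "https://civitai.com/api/v1"
EXTENSION = ".safetensors"


def find_models(directory: Path) -> list[Path]:
    """Return every file under `directory` that ends in EXTENSION."""
    return sorted(p for p in directory.rglob(f"*{EXTENSION}") if p.is_file())


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large models never sit fully in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def lookup_url(file_hash: str) -> str:
    """Build the (assumed) by-hash lookup URL for a model file."""
    return f"{BASE_URL}/model-versions/by-hash/{file_hash}"
```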

## DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md

      
Aggregate throughput just over 2 tok/sec on R1 671B with 8 concurrent generations.

Created February 8, 2025 23:47, forked from ubergarm/DeepSeek-R1-Quantized-GGUF-Gaming-Rig-Inferencing-Fast-NVMe-SSD.md (1 file)

tl;dr;

You can run the real deal big boi R1 671B locally off a fast NVMe SSD even without enough RAM+VRAM to hold the 212GB dynamically quantized weights. No, it is not swap, and it won't kill your SSD's read/write cycle lifetime (the weights are only read, never written). No, this is not a distill model. It works fairly well despite quantization (check the unsloth blog for details on how they did that).
The basic idea is that most of the model is not loaded into RAM on startup but mmap'd. The KV cache then takes up some RAM, and most of your system RAM is left available to serve as disk cache for whatever experts/weights are currently most used. I can see the model slow down while the cache dumps and refills when the model switches over to counting words, for example.
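
To make the mmap idea concrete, here is a minimal Python sketch (not the inference engine's code). It maps a file read-only and reads one slice of it; the OS only pulls in the pages that are actually touched and keeps them in the page cache. The small generated file stands in for a large weights file, and both helper names are made up for the example.

```python
import mmap
import os
import tempfile


def make_demo_file(size: int) -> str:
    """Create a throwaway file standing in for a large weights file."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as handle:
        handle.write(bytes(i % 256 for i in range(size)))
    return path


def read_slice(path: str, offset: int, length: int) -> bytes:
    """Read a byte range through a read-only memory map."""
    with open(path, "rb") as handle:
        # Mapping only reserves address space; pages are read from disk
        # lazily, when the slice below touches them.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]
```

Reading a different region later touches different pages, which is the same effect as the cache refill described above when the set of hot experts changes.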
It is faster on my system using the GPU, but not by much. In theory it may be faster overall to dedicate the GPU's PCIe lanes to more NVMe storage. Curious if anyone has an array with fast enough read IOPS to try?
Notes and example generations below.
Model Reference


## hls.sh
#!/bin/bash

# Function to display usage information
usage() {
    echo "Usage: $0 /path/to/input.mp4 [ /path/to/output_directory ]"
    exit 1
}

# Check if at least one argument (input file) is provided
if [ $# -lt 1 ]; then

## rpi4_zfs.md

      
Install Arch Linux aarch64 + dm-crypt + ZFS root on Raspberry Pi 4

Created July 26, 2024 00:05, forked from skystar-p/rpi4_zfs.md (1 file)

Installing Arch Linux ARM (aarch64) with ZFS root on RPi4

Prerequisites

- Arch Linux aarch64 live system. Refer to https://archlinuxarm.org/platforms/armv8/broadcom/raspberry-pi-4 .
- USB keyboard
- HDMI cable (or access to a serial console)

Boot up your aarch64 Arch Linux. Plug in the SD card adapter and ensure that the disk shows up on your system.
lsblk