Skip to content

Instantly share code, notes, and snippets.

View cosmojg's full-sized avatar
:electron:
Overthinking

Cosmo cosmojg

:electron:
Overthinking
View GitHub Profile
@netixc
netixc / main.py
Last active February 3, 2025 09:38
chatbot ollama/tts kokoro/stt speaches
import pyaudio
import wave
import time
import warnings
from openai import OpenAI
import json
import sys
import io
import numpy
import requests
@willccbb
willccbb / grpo_demo.py
Last active February 9, 2025 12:58
GRPO Llama-1B
# train_grpo.py
import re
import torch
from datasets import load_dataset, Dataset
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer
# Load and prep dataset
@hackermondev
hackermondev / research.md
Last active February 8, 2025 08:32
Unique 0-click deanonymization attack targeting Signal, Discord and hundreds of platform

hi, i'm daniel. i'm a 15-year-old high school junior. in my free time, i hack billion dollar companies and build cool stuff.

3 months ago, I discovered a unique 0-click deanonymization attack that allows an attacker to grab the location of any target within a 250 mile radius. With a vulnerable app installed on a target's phone (or as a background application on their laptop), an attacker can send a malicious payload and deanonymize you within seconds--and you wouldn't even know.

I'm publishing this writeup and research as a warning, especially for journalists, activists, and hackers, about this type of undetectable attack. Hundreds of applications are vulnerable, including some of the most popular apps in the world: Signal, Discord, Twitter/X, and others. Here's how it works:

Cloudflare

By the numbers, Cloudflare is easily the most popular CDN on the market. It beats out competitors such as Sucuri, Amazon CloudFront, Akamai, and Fastly. In 2019, a major Cloudflare outage k

@deepfates
deepfates / convert_oai_to_sharegpt.py
Created November 17, 2024 20:26
Convert a fine-tuning dataset from OpenAI format to ShareGPT format
import json
import argparse
def convert_oai_to_sharegpt(input_file: str, output_file: str):
with open(input_file, 'r') as infile, open(output_file, 'w') as outfile:
for line in infile:
conversation = json.loads(line)
# Skip system messages
for message in conversation["messages"]:
if message.get("role") == "system":
@deepfates
deepfates / convert_archive.py
Created November 17, 2024 19:33
Convert your twitter archive into a training dataset and markdown files
import argparse
import json
import logging
import os
import re
import shutil
from concurrent.futures import ProcessPoolExecutor, as_completed
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict, List, Literal, Optional, Tuple
@VictorTaelin
VictorTaelin / fast_scansum.cu
Created April 9, 2024 21:22
Fast CUDA block-local prefix sum (scamsun) using warp sync primitives (__shfl_up_sync)
// Fast block-local prefix-sum on CUDA, using warp-syncs.
// The input is an array of u32. It is mutated in place. Example:
// arr = [1,1,1,1,...]
// Becomes:
// arr = [1,2,3,4,...]
// The number of elements must be equal to threads per block (TPB).
#include <stdio.h>
#include <cuda_runtime.h>
@VictorTaelin
VictorTaelin / a_b_challenge.md
Last active January 12, 2025 02:04
A::B Prompting Challenge: $10k to prove me wrong!

CHALLENGE

Develop an AI prompt that solves random 12-token instances of the A::B problem (defined here), with 90%+ success rate.

RULES

1. The AI will be given a <problem/> to solve.

We'll use your prompt as the SYSTEM PROMPT, and a specific instance of problem as the PROMPT, inside XML tags. Example:

@Artefact2
Artefact2 / README.md
Last active February 9, 2025 10:27
GGUF quantizations overview

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggerganov/llama.cpp#5962

In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

@kalomaze
kalomaze / llm_samplers_explained.md
Last active February 5, 2025 06:19
LLM Samplers Explained

LLM Samplers Explained

Everytime a large language model makes predictions, all of the thousands of tokens in the vocabulary are assigned some degree of probability, from almost 0%, to almost 100%. There are different ways you can decide to choose from those predictions. This process is known as "sampling", and there are various strategies you can use which I will cover here.

OpenAI Samplers

Temperature

  • Temperature is a way to control the overall confidence of the model's scores (the logits). What this means is that, if you use a lower value than 1.0, the relative distance between the tokens will become larger (more deterministic), and if you use a larger value than 1.0, the relative distance between the tokens becomes smaller (less deterministic).
  • 1.0 Temperature is the original distribution that the model was trained to optimize for, since the scores remain the same.
  • Graph demonstration with voiceover: https://files.catbox.moe/6ht56x.mp4
MiniModel
minimodel
A self contained hyper short post (I limit myself to 1024 characters, 2048 if I absolutely need it) which is intended to transmit a complete but not necessarily comprehensive model of some phenomena, skill, etc.
The MiniModel format fell out of three things:
1. My dissatisfaction with essays and blog posts.
2. My experimentation with microblogging as a way of getting my ideas out faster and more incrementally.
3. [Maia Pasek's published notes page](https://web.archive.org/web/20170821010721/https://squirrelinhell.github.io/).