Brendan Dolan-Gavitt moyix
@moyix
moyix / pybefore.py
Created April 28, 2024 21:03
Script to list the most recent version of a PyPI package released before a particular date
#!/usr/bin/env python3
import sys
import requests
from datetime import datetime, timezone
# Ok I'll be honest ChatGPT wrote the vast majority of this
# Use at your own risk
def get_latest_version_before_date(package_name, cutoff_date):
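The preview above is cut off before the function body. As a rough sketch of the idea only (not the gist's actual code), the selection step could look like the following. It assumes the `releases` mapping has the shape of the `releases` field returned by the PyPI JSON API (`https://pypi.org/pypi/<package>/json`), where each version maps to a list of files carrying an `upload_time_iso_8601` string. The name `latest_version_before` and the sample data are hypothetical, and no network request is made here.

```python
from datetime import datetime, timezone


def latest_version_before(releases, cutoff):
    """Return (version, upload_time) for the newest release uploaded
    strictly before `cutoff`, or None if there is no such release."""
    best = None
    for version, files in releases.items():
        # Parse every file's upload time; "Z" is rewritten so that
        # datetime.fromisoformat accepts it on older Python versions.
        times = [
            datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in files
        ]
        if not times:
            continue  # a version with no uploaded files has no release date
        first_upload = min(times)
        if first_upload < cutoff and (best is None or first_upload > best[1]):
            best = (version, first_upload)
    return best


# Hypothetical data in the assumed shape of the API's "releases" field.
sample = {
    "0.9": [],
    "1.0": [{"upload_time_iso_8601": "2023-01-10T00:00:00.000000Z"}],
    "1.1": [{"upload_time_iso_8601": "2023-06-01T12:30:00.000000Z"}],
    "2.0": [{"upload_time_iso_8601": "2024-02-01T00:00:00.000000Z"}],
}
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(latest_version_before(sample, cutoff)[0])  # prints 1.1
```

This sketch ranks releases by upload time rather than by version number, so a backported patch release uploaded late would win over a higher version uploaded earlier. Whether that is the desired behavior depends on the use case.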
@moyix
moyix / README.md
Created March 8, 2024 22:45
Claude 3 writes a fuzzer for VRML files

C++ files are from this GitHub repository, with a small modification by me to allow the parser to accept a filename on the command line:

https://github.com/alepapadop/vrml

genvrml_v*.py written by Claude 3 Opus.

The conversation was:

Initial Prompt

@moyix
moyix / Makefile
Created March 8, 2024 05:26
Claude 3 writes a fuzzer
all: gifread gifread.asan gifread.ubsan gifread.coverage
gifread: gifdec.c gifread.c gifdec.h
	$(CC) $(CFLAGS) -o $@ gifdec.c gifread.c $(LDFLAGS)
gifread.asan: gifdec.c gifread.c gifdec.h
	$(CC) $(CFLAGS) -g -fsanitize=address -o $@ gifdec.c gifread.c $(LDFLAGS)
gifread.ubsan: gifdec.c gifread.c gifdec.h
	$(CC) $(CFLAGS) -g -fsanitize=undefined -o $@ gifdec.c gifread.c $(LDFLAGS)
@moyix
moyix / gengif_spec.py
Created March 8, 2024 20:57
Claude's random GIF generator, based only on the GIF89a spec
from typing import BinaryIO
import random
import struct
def generate_random_input(out: BinaryIO):
    # Generate Header
    out.write(b'GIF89a') # GIF signature and version
    # Generate Logical Screen Descriptor
    screen_width = random.randint(1, 65535)
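The preview stops partway through the Logical Screen Descriptor. For orientation, here is a minimal sketch of how the 6-byte header and the 7-byte descriptor could be written, following the field layout in the GIF89a spec. This is not Claude's generator: the function name `write_header_and_lsd` and the choice to randomize every field are assumptions made for illustration.

```python
import io
import random
import struct


def write_header_and_lsd(out, rng):
    """Write a GIF89a header plus a random Logical Screen Descriptor.

    Returns (gct_flag, gct_size) so a caller knows whether a Global
    Color Table of 3 * 2**(gct_size + 1) bytes should follow.
    """
    out.write(b"GIF89a")  # signature + version, 6 bytes

    width = rng.randint(1, 65535)
    height = rng.randint(1, 65535)

    # Packed byte: bit 7 = Global Color Table flag, bits 6-4 = color
    # resolution, bit 3 = sort flag, bits 2-0 = Global Color Table size.
    gct_flag = rng.randint(0, 1)
    color_res = rng.randint(0, 7)
    sort_flag = rng.randint(0, 1)
    gct_size = rng.randint(0, 7)
    packed = (gct_flag << 7) | (color_res << 4) | (sort_flag << 3) | gct_size

    bg_index = rng.randint(0, 255)  # background color index
    aspect = rng.randint(0, 255)    # pixel aspect ratio

    # Little-endian: two 16-bit dimensions followed by three single bytes.
    out.write(struct.pack("<HHBBB", width, height, packed, bg_index, aspect))
    return gct_flag, gct_size


buf = io.BytesIO()
write_header_and_lsd(buf, random.Random(0))
print(len(buf.getvalue()))  # prints 13
```

Passing in a `random.Random` instance instead of using the module-level functions makes a generated input reproducible from its seed, which tends to help when a fuzzing run needs to be replayed.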
@moyix
moyix / gengif_nocode.py
Created March 8, 2024 16:13
Claude's random GIF generator, without seeing the parser code
from typing import BinaryIO
import random
import struct
def generate_random_input(out: BinaryIO):
    # Generate a random width and height (between 1 and 1000)
    width = random.randint(1, 1000)
    height = random.randint(1, 1000)
    # Write GIF header
@moyix
moyix / ensure_fpu.py
Last active March 5, 2024 10:55
Some handy utils for messing with MXCSR (x86-64 SSE FPU control register)
#!/usr/bin/env python
import sys, os
import platform
import ctypes as ct
import mmap
from enum import Enum
import importlib
import functools
import errno
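Only the imports are visible in this preview. As background on the register itself, the sketch below decodes a raw MXCSR value into its documented fields: exception flags in bits 0-5, DAZ in bit 6, exception masks in bits 7-12, rounding control in bits 13-14, and FTZ in bit 15. It does not read or write the real register, and `decode_mxcsr` and the field names are hypothetical helpers, not functions from the gist.

```python
# Order matches bit positions 0-5 (flags) and 7-12 (masks).
EXCEPTIONS = (
    "invalid", "denormal", "divide_by_zero",
    "overflow", "underflow", "precision",
)
# Rounding-control encodings 0b00 through 0b11.
ROUNDING = ("nearest", "down", "up", "toward_zero")


def decode_mxcsr(value):
    """Split a 32-bit MXCSR value into named fields."""
    return {
        "flags": [n for i, n in enumerate(EXCEPTIONS) if value & (1 << i)],
        "daz": bool(value & (1 << 6)),    # denormals-are-zero
        "masked": [n for i, n in enumerate(EXCEPTIONS) if value & (1 << (7 + i))],
        "rounding": ROUNDING[(value >> 13) & 0b11],
        "ftz": bool(value & (1 << 15)),   # flush-to-zero
    }


# 0x1F80 is the power-on default: every exception masked, round to nearest.
print(decode_mxcsr(0x1F80)["rounding"])  # prints nearest
```

A value such as `0x9FC0` decodes with both `ftz` and `daz` set, the configuration that some numerical libraries leave behind when they enable fast denormal handling.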
@moyix
moyix / killbutmakeitlooklikeanaccident.sh
Created February 5, 2022 22:51
Script to inject an exit(0) syscall into a running process. NB: only x86_64 for now!
#!/bin/bash
gdb -p "$1" -batch -ex 'set {short}$rip = 0x050f' -ex 'set $rax=231' -ex 'set $rdi=0' -ex 'cont'
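To see why the one-liner works: gdb writes the 16-bit value `0x050f` at `$rip`, and because x86-64 is little-endian that lands in memory as the bytes `0f 05`, which encode the `syscall` instruction. With `rax` set to 231 and `rdi` set to 0, resuming the process executes that instruction with a zero exit status. On Linux x86-64, syscall number 231 is `exit_group`, which ends every thread in the process (plain `exit` is number 60). The small check below only confirms the byte order; the constant names are illustrative.

```python
import struct

SYSCALL_OPCODE = bytes([0x0F, 0x05])  # encoding of the x86-64 `syscall` instruction
SYS_EXIT_GROUP = 231                  # Linux x86-64 syscall number loaded into rax

# gdb's `set {short}$rip = 0x050f` stores a 16-bit little-endian value.
patched = struct.pack("<H", 0x050F)
print(patched.hex())              # prints 0f05
print(patched == SYSCALL_OPCODE)  # prints True
```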
@moyix
moyix / DecompileToJson.java
Created January 27, 2024 06:16
Ghidra scripts to produce JSON files with decompilation / disassembly for each function in a binary
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.HashMap;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import ghidra.app.script.GhidraScript;
import ghidra.app.decompiler.DecompInterface;
Given the following program:
```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUFFERSIZE 200
#define TRUE 1
#define FALSE 0
```
@moyix
moyix / CodeGen_GPTJ_Conversion.md
Last active January 5, 2024 12:50
How to convert the SalesForce CodeGen models to GPT-J

Using Linear Algebra to Convert a Large Code Model

Background

The SalesForce CodeGen models are a family of large language models trained on a large amount of natural language data and then fine-tuned on specialized datasets of code. Models of size 350M, 2B, 6B, and 16B parameters are provided in three flavors:

  • nl, the base model trained on The Pile, a large natural language dataset compiled by EleutherAI
  • multi, which is fine-tuned from the nl model on a dataset of code in multiple languages, scraped from GitHub, and
  • mono, which is fine-tuned from the multi model on Python code only.
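
The sizes and flavors above form a small grid, and the flavors form a fine-tuning chain (nl, then multi, then mono). The sketch below enumerates that grid and walks the chain. The `Salesforce/codegen-<size>-<flavor>` identifier pattern is an assumption about how the checkpoints are named on the Hugging Face Hub, and `lineage` is a hypothetical helper; nothing here downloads or loads a model.

```python
from itertools import product

SIZES = ("350M", "2B", "6B", "16B")

# Each flavor maps to the flavor it was fine-tuned from (None = base model).
FINE_TUNED_FROM = {"nl": None, "multi": "nl", "mono": "multi"}


def lineage(flavor):
    """Return the training chain ending at `flavor`, base model first."""
    chain = []
    while flavor is not None:
        chain.append(flavor)
        flavor = FINE_TUNED_FROM[flavor]
    return chain[::-1]


# Assumed naming pattern for the published checkpoints.
model_ids = [
    f"Salesforce/codegen-{size}-{flavor}"
    for size, flavor in product(SIZES, FINE_TUNED_FROM)
]

print(len(model_ids))   # prints 12
print(lineage("mono"))  # prints ['nl', 'multi', 'mono']
```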