Brendan Dolan-Gavitt moyix

moyix / ds_config_AdamW_16B_reduce_mem.json
Created Oct 7, 2022
Training command line and deepspeed config for CodeGen 16B, 3xA100 GPUs
{
  "fp16": {
    "enabled": true,
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "optimizer": {
moyix / top_fp_all.txt
Last active Sep 27, 2022
Floating point (SSE/SSE2) instruction usage rates among projects in oss-fuzz
Total instructions: 48093488942
Total SSE instructions: 100105422
Total XMM instructions: 877832653
Totals by sanitizer:
ASAN: SSE: 39197160, XMM: 308790743
MSAN: SSE: 29922931, XMM: 342062480
UBSAN: SSE: 30985331, XMM: 226979430
All projects per sanitizer, sorted by percent of SSE instructions:
ASAN: SSE Instr / Total = Pct ↓ Wilson
simd : 1122000 / 63479115 = 1.77 % ( 1.76 %)

The ffast and the Furious

This is a small and admittedly contrived demo showing how some weird but safe code could become vulnerable if run in an environment where some shared library has changed the FPU's FTZ/DAZ bits to force denormals to zero.
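To make the failure mode concrete, here is a minimal sketch in Python. It is not the demo from this gist: `flush_to_zero` is a hypothetical software model of what FTZ/DAZ does in hardware, and the values `a` and `b` are chosen only for illustration. With gradual underflow, two distinct doubles always have a nonzero difference; once subnormal results are flushed, that guarantee disappears.

```python
import math
import sys

# Smallest positive normal double; nonzero values below this are subnormal (denormal).
MIN_NORMAL = sys.float_info.min


def flush_to_zero(x: float) -> float:
    """Hypothetical model of FTZ/DAZ: replace a subnormal value with a signed zero."""
    if x != 0.0 and abs(x) < MIN_NORMAL:
        return math.copysign(0.0, x)
    return x


# Two distinct normal numbers whose difference is subnormal.
a = 1.5 * MIN_NORMAL
b = MIN_NORMAL

diff_ieee = a - b                 # gradual underflow keeps this nonzero
diff_ftz = flush_to_zero(a - b)   # what the same subtraction yields once flushed
```

Code that checks `a != b` and then divides by `a - b` is safe under default IEEE 754 behaviour, but divides by zero if something has switched the flush-to-zero behaviour on behind its back.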

To run it:

# Create an empty file
$ touch gofast.c
moyix /
Created Sep 5, 2022 for jump2db, which drops a bunch of stuff into $HOME
import shutil
from setuptools import find_packages, setup
from os.path import exists,join,relpath
import os
import stat
moyix /
Last active Sep 22, 2022
Some handy utils for messing with MXCSR (x86-64 SSE FPU control register)
#!/usr/bin/env python
import sys, os
import platform
import ctypes as ct
import mmap
from enum import Enum
import importlib
import functools
import errno
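As a small illustration of the register these utilities deal with, the sketch below decodes the two MXCSR bits relevant here. It is not taken from the gist: the constant and function names are hypothetical, and reading the live register (which needs native code) is deliberately left out. The bit positions and the `0x1F80` reset value follow the x86-64 architecture definition.

```python
# MXCSR bit masks (x86-64 SSE control/status register).
MXCSR_DAZ = 1 << 6     # denormals-are-zero: subnormal inputs are treated as zero
MXCSR_FTZ = 1 << 15    # flush-to-zero: subnormal results are replaced by zero
MXCSR_DEFAULT = 0x1F80  # reset value: all exceptions masked, FTZ and DAZ clear


def describe_mxcsr(value: int) -> dict:
    """Report whether FTZ and DAZ are set in a raw MXCSR value (hypothetical helper)."""
    return {
        "FTZ": bool(value & MXCSR_FTZ),
        "DAZ": bool(value & MXCSR_DAZ),
    }


def with_fast_math_bits(value: int) -> int:
    """Return the MXCSR value with both FTZ and DAZ set, i.e. OR-ed with 0x8040."""
    return value | MXCSR_FTZ | MXCSR_DAZ
```

For example, `describe_mxcsr(with_fast_math_bits(MXCSR_DEFAULT))` reports both bits as set, while the default value reports both as clear.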
import sys
import os
import re
import json
import zipfile
from collections import defaultdict, namedtuple
from import Mapping
from email.parser import HeaderParser
from email.policy import compat32
from base64 import urlsafe_b64decode
#!/usr/bin/env python
import os
import sys
import subprocess as sp
import tempfile
import hashlib
script_dir = os.path.dirname(os.path.realpath(__file__))
from fast_check_for_ffast_math import check_file
moyix /
Created Sep 2, 2022
A faster check to see if a binary has a constructor that enables FTZ/DAZ that just does byte matching
import sys
import mmap
from elftools.elf.elffile import ELFFile, ELFError
import struct
set_fast_math_code = bytes.fromhex('0fae5c24fc814c24fc408000000fae5424fcc3')
def load_bytes_from_elf(bindata, elf, vaddr, size):
paddr = next(iter(elf.address_offsets(vaddr)))
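The byte-matching idea can be sketched without any ELF parsing at all. The version below is a simplification, not the gist's method: it scans a raw byte string for the same 19-byte sequence rather than reading constructor addresses through `ELFFile`, so it can report matches that are never actually called. The function name `find_pattern_offsets` is hypothetical.

```python
from typing import List

# Same byte sequence as set_fast_math_code above. It decodes to:
#   stmxcsr -0x4(%rsp)
#   orl     $0x8040, -0x4(%rsp)   # set FTZ (bit 15) and DAZ (bit 6)
#   ldmxcsr -0x4(%rsp)
#   ret
SET_FAST_MATH_CODE = bytes.fromhex('0fae5c24fc814c24fc408000000fae5424fcc3')


def find_pattern_offsets(data: bytes, pattern: bytes = SET_FAST_MATH_CODE) -> List[int]:
    """Return every offset in data at which pattern starts."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets
```

On a real binary this could be applied to an `mmap` of the file, since `mmap` objects also provide `find`; an empty result means the exact sequence is absent, not that the binary never sets FTZ/DAZ some other way.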
$ objdump -s -j .init_array ./jaxlib/ | sed -e '1,/Contents/ d' | cut -c 10-44 | xxd -r -p | od -A none -w8 -t x8 --endian=little | addr2line -a -f -e ./jaxlib/ | paste -sd ' \n' | c++filt
0x000000000084c5e0 __cpu_indicator_init /dt9-src/libgcc/config/i386/cpuinfo.c:434
0x000000000084ca20 frame_dummy crtstuff.c:?
moyix /
Last active Oct 29, 2022
Hacky script to check for the set_fast_math constructor in an executable/shared library using objdump
#!/usr/bin/env python
import subprocess
import re
import sys
def get_init_array(filename):
# Call objdump -s -j .init_array <filename> to get the contents of the .init_array section
objdump_output = subprocess.check_output(['objdump', '-s', '-j', '.init_array', filename], stderr=subprocess.STDOUT)
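# A minimal sketch of one way the objdump output could be turned into constructor
# addresses. This is an assumption, not the rest of the original script:
# parse_init_array and SAMPLE_OUTPUT are hypothetical, and the regex assumes the
# usual `objdump -s` data-line layout (space, hex address, space, 35-character
# hex column holding up to 16 bytes).

```python
import re
import struct
from typing import List

# One space, a hex address, one space, then the 35-character hex column.
_DATA_LINE = re.compile(r'^ ([0-9a-f]+) ([0-9a-f ]{35})')


def parse_init_array(objdump_output: str) -> List[int]:
    """Return the 64-bit little-endian pointers found in an `objdump -s` dump."""
    raw = bytearray()
    for line in objdump_output.splitlines():
        match = _DATA_LINE.match(line)
        if match:
            # Drop the spaces between groups (and any padding on a short last line).
            raw += bytes.fromhex(match.group(2).replace(' ', ''))
    usable = len(raw) - len(raw) % 8  # ignore a trailing partial pointer, if any
    return [struct.unpack_from('<Q', raw, off)[0] for off in range(0, usable, 8)]


# Hypothetical sample: the section address 3d8d0 and the file name are made up;
# the two pointers are the addresses shown in xla_constructors.txt above.
SAMPLE_OUTPUT = (
    "\n"
    "lib.so:     file format elf64-x86-64\n"
    "\n"
    "Contents of section .init_array:\n"
    " 3d8d0 e0c58400 00000000 20ca8400 00000000  ........ .......\n"
)

pointers = parse_init_array(SAMPLE_OUTPUT)
```

# With real output, check_output returns bytes, so it would need decoding
# (for example .decode('ascii', 'replace')) before being passed to the parser.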