@tbenthompson
tbenthompson / attn_mask_bug.py
Last active October 16, 2023 22:11
Investigation of discrepancies between vLLM and Huggingface Llama 2 generation
"""
An explanation for discrepancies between three different ways of generating tokens with Llama-2-7b-chat-hf:
1. Huggingface's `model.generate` defaults to using a mask with a zero in the first position and ones elsewhere.*
2. Huggingface `model.forward` defaults to using a mask with all ones.
3. vLLM defaults to using a mask with all ones, matching Huggingface `model.forward` but not `model.generate`.
* Why? My guess is that HF `generate` is excluding the BOS `<s>` token from attention, but I have not confirmed this.
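To see why the two mask conventions above can lead to different tokens, here is a toy sketch in plain Python. It is not the Huggingface or vLLM code path: `masked_attention_weights` and the score values are hypothetical, and position 0 merely stands in for the BOS `<s>` token. It only shows that zeroing the first mask entry redistributes attention weight over the remaining positions.

```python
import math


def masked_attention_weights(scores, mask):
    """Softmax over `scores`, ignoring positions where `mask` is 0."""
    # Masked positions are set to -inf so they receive exactly zero weight.
    masked = [s if m == 1 else float("-inf") for s, m in zip(scores, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical attention scores of one query over four key positions.
scores = [2.0, 1.0, 0.5, 0.1]

all_ones = masked_attention_weights(scores, [1, 1, 1, 1])    # all-ones mask
first_zero = masked_attention_weights(scores, [0, 1, 1, 1])  # first position masked out

print(all_ones)
print(first_zero)
```

With the first position masked, its weight is zero and every other position gets a larger share, so the hidden states (and eventually the sampled tokens) can drift apart between the two settings.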
I ran with:
- transformers 4.34.0
@tbenthompson
tbenthompson / config.py
Last active May 30, 2023 00:23
dataclass and YAML configurator on top of typer
"""
Usage:
A dataclass/YAML/CLI config system:
- write a @dataclass with your config options
- make sure every option has a default value
- include a `config: str = ""` option in the dataclass.
- write a main function that takes a single argument of the dataclass type
- decorate your main function with @dataclass_cli
- make sure your main function has a docstring.
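The checklist above could look roughly like the following sketch. It is an assumption-laden stand-in, not the gist's implementation: `dataclass_cli_sketch` is a hypothetical decorator built on the standard-library `argparse` rather than typer, and the YAML loading driven by the `config` option is omitted. It only handles `str`, `int`, and `float` options.

```python
import argparse
import dataclasses
from dataclasses import dataclass


def dataclass_cli_sketch(func):
    """Hypothetical stand-in for @dataclass_cli: build CLI flags from dataclass fields."""
    # The config type is read from the annotation of the function's single argument.
    annotations = {k: v for k, v in func.__annotations__.items() if k != "return"}
    (config_type,) = annotations.values()

    def wrapper(argv=None):
        # The main function's docstring becomes the CLI description.
        parser = argparse.ArgumentParser(description=func.__doc__)
        for field in dataclasses.fields(config_type):
            # Every option has a default; its Python type doubles as the flag's parser.
            # (bool options would need special handling and are not covered here.)
            parser.add_argument(
                f"--{field.name}", type=type(field.default), default=field.default
            )
        args = parser.parse_args(argv)
        return func(config_type(**vars(args)))

    return wrapper


@dataclass
class Config:
    lr: float = 0.1
    steps: int = 10
    config: str = ""  # path to a YAML file in the original design; loading is omitted here


@dataclass_cli_sketch
def main(c: Config):
    """Toy entry point that just returns the parsed config."""
    return c


result = main(["--steps", "5"])
print(result)
```

Passing an explicit argument list, as in the last lines, keeps the sketch easy to test; calling `main()` with no arguments would read `sys.argv` instead.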
@tbenthompson
tbenthompson / function_serialize.cpp
Last active December 12, 2022 09:28
Serializing functions in C++
#include <iostream>
#include <sstream>
#include <fstream>
/* Serialize a function by writing out a pointer to its location in memory.
* This will only work between two processes running identical binaries.
*
* One difficulty is ASLR:
* Address space layout randomization (ASLR) puts functions in a different
* place in memory every time a program is loaded. Within a given binary