craigtrim / llm-input-sanitizer.py
Created January 8, 2026 18:51
LLM Input Sanitization Toolkit
"""
LLM Input Sanitization Toolkit

A defense layer against Unicode tricks, token boundary attacks, and other
adversarial inputs targeting language model tokenizers.

These attacks exploit the gap between what humans see and what models process.
Tokenizers transform text into integer sequences using learned vocabularies;
when that transformation produces unexpected results, the model's behavior
becomes unpredictable as well.
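For example (an illustrative sketch, not this module's API), a homoglyph
substitution swaps visually identical characters that compare, and tokenize,
differently:

    >>> latin = "paypal"            # all Latin letters
    >>> spoof = "p\u0430yp\u0430l"  # Cyrillic "а" (U+0430) in place of Latin "a"
    >>> latin == spoof              # renders identically, but the code points differ
    False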