Jonathan Mamou jmamou

## demo_inference_damage.py
"""
Demonstrates the actual output damage caused by the STANDALONE vocab-mismatch
bug during inference — without requiring model weights or a GPU.

The STANDALONE speculative decoding loop assembles output token-by-token:
  - Accepted draft tokens  → decoded by TARGET tokenizer  (CORRUPTED if vocab differs)
  - Rejected positions      → resampled by target          (correct)

This script simulates that assembly:
  1. "Target output" is produced by encoding the ground-truth continuation
	"""
	Demonstrates the actual output damage caused by the STANDALONE vocab-mismatch
	bug during inference — without requiring model weights or a GPU.

	The STANDALONE speculative decoding loop assembles output token-by-token:
	- Accepted draft tokens → decoded by TARGET tokenizer (CORRUPTED if vocab differs)
	- Rejected positions → resampled by target (correct)

	This script simulates that assembly:
	1. "Target output" is produced by encoding the ground-truth continuation