@bhatti
Created November 4, 2025 04:06
| Method | Memory | Compression | Quality Loss | Works On |
|---|---|---|---|---|
| FP16 (baseline) | 19.3 GB | - | 0% | All GPUs |
| AWQ | 5.2 GB | 3.7× | ~2% | L4, A100 |
| FP8 | ~9.7 GB | - | ~1% | H100 only |
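
The compression column appears to be the FP16 baseline footprint divided by the quantized footprint. A minimal sketch of that arithmetic, assuming this definition and using only the memory figures from the table (the names `BASELINE_FP16_GB`, `quantized_gb`, and `compression_ratio` are illustrative, not from the original gist):

```python
# Memory figures in GB, copied from the table above.
# The FP8 value is approximate ("~9.7 GB") in the source.
BASELINE_FP16_GB = 19.3
quantized_gb = {"AWQ": 5.2, "FP8": 9.7}


def compression_ratio(baseline_gb: float, compressed_gb: float) -> float:
    """Return how many times smaller the compressed model is than the baseline."""
    return baseline_gb / compressed_gb


# Assumed definition: ratio = FP16 footprint / quantized footprint, rounded to one decimal.
ratios = {
    method: round(compression_ratio(BASELINE_FP16_GB, gb), 1)
    for method, gb in quantized_gb.items()
}

for method, ratio in ratios.items():
    print(f"{method}: {ratio}x")
```

Under this assumption the AWQ figure reproduces the 3.7× listed in the table, and the approximate FP8 footprint works out to roughly half the baseline.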