amorehead / _06_fused_attention_blockptr_jvp.py
Created September 2, 2025 02:09 — forked from Birch-san/_06_fused_attention_blockptr_jvp.py
Triton fused attention tutorial, updated with JVP support, albeit with atol=1e-3 accuracy on the JVP.
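The atol=1e-3 figure refers to how closely the fused kernel's JVP matches an eager-mode reference. A minimal sketch of such a check, assuming the gist's kernel is callable as attention(q, k, v) (a hypothetical name; the actual entry point may differ):

import torch

def ref_attention(q, k, v):
    # Plain-PyTorch attention used as a differentiable reference;
    # torch.func.jvp can push tangents through it analytically.
    scale = q.shape[-1] ** -0.5
    return torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1) @ v

# Small float32 CPU tensors keep this sketch runnable anywhere; the gist
# itself exercises float16 CUDA tensors through the Triton kernel.
q, k, v = (torch.randn(1, 2, 32, 16) for _ in range(3))
primals = (q, k, v)
tangents = tuple(torch.randn_like(t) for t in primals)

ref_out, ref_jvp = torch.func.jvp(ref_attention, primals, tangents)

# Hypothetical comparison against the fused kernel's JVP (`attention`
# stands in for the gist's entry point):
# fused_out, fused_jvp = torch.func.jvp(attention, primals, tangents)
# torch.testing.assert_close(fused_jvp, ref_jvp, atol=1e-3, rtol=0)

Comparing JVPs rather than only forward outputs is what surfaces the 1e-3 tolerance: the tangent path accumulates extra rounding error through the online-softmax recurrence.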
from __future__ import annotations
"""
Fused Attention
===============
This is a Triton implementation of the Flash Attention v2 algorithm from Tri Dao (https://tridao.me/publications/flash2/flash2.pdf)
Credits: OpenAI kernel team
Extra Credits: