FlashAttention-4 will not run on the NVIDIA RTX 5090 (SM120, "desktop Blackwell"), and no amount of software patching can fix that. Despite sharing the "Blackwell" brand with data center GPUs like the B200 (SM100), the RTX 5090 uses a fundamentally different tensor core architecture.

SM100 has a dedicated tensor memory (TMEM) subsystem with its own instruction family (UTCHMMA, UTMALDG, etc.), which FA4's warp-specialized kernel design requires. SM120 uses the older HMMA instruction family (the same register-to-register MMA approach used since Volta/Ampere), and the TMEM hardware is physically absent from the GB202 die.

This is not a software lock, not a fuse bit, and not a toolchain oversight: it is a silicon-level architectural difference. FA2 via Triton remains the best available attention kernel for the RTX 5090.
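The practical consequence is a dispatch decision: a library has to gate FA4-style kernels on the device's compute capability rather than on the "Blackwell" marketing name. The sketch below is illustrative only, not FlashAttention's actual code; the function names and the exact capability mapping are assumptions. (On PyTorch, `torch.cuda.get_device_capability()` would supply the `(major, minor)` pair: an RTX 5090 reports compute capability 12.0, a B200 reports 10.0.)

```python
# Illustrative sketch of capability-based kernel dispatch.
# Helper names and the SM -> feature mapping are assumptions for
# illustration, not FlashAttention's real API.

def supports_tmem_kernels(major: int, minor: int) -> bool:
    """True only for data center Blackwell (SM100, compute capability
    10.0), which has the tensor memory (TMEM) subsystem that FA4's
    warp-specialized kernels require."""
    return (major, minor) == (10, 0)

def pick_attention_backend(major: int, minor: int) -> str:
    """Fall back to an HMMA-era kernel (e.g. FA2 via Triton) on parts
    without TMEM, such as desktop Blackwell (SM120, capability 12.0)."""
    return "fa4" if supports_tmem_kernels(major, minor) else "fa2-triton"

# B200 (SM100) can take the TMEM path; RTX 5090 (SM120) cannot.
print(pick_attention_backend(10, 0))   # fa4
print(pick_attention_backend(12, 0))   # fa2-triton
```

The key design point is that the check is on the capability tuple, not the architecture brand: "Blackwell" spans both SM100 and SM120, and only the former has the hardware FA4 targets.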