Skip to content

Instantly share code, notes, and snippets.

View fxkamd's full-sized avatar

Felix Kuehling fxkamd

  • AMD
View GitHub Profile
@fxkamd
fxkamd / TinyGrad-notes.md
Last active April 26, 2024 15:34
Observations about HSA and KFD backends in TinyGrad

This is Felix Kuehling, long time KFD driver architect. I started looking into the TinyGrad source code yesterday, focusing on ops_kfd.py, ops_hsa.py and driver/hsa.py, to understand how TinyGrad talks to our HW and help with the ongoing debugging effort from the top down. This analysis is based on this commit: https://github.com/tinygrad/tinygrad/tree/3de855ea50d72238deac14fc05cda2a611497778

I'm intrigued by the use of Python for low-level programming. I think I can learn something from your use of ctypes and clang2py for fast prototyping and test development. I want to share some observations based on my initial review.

ops_kfd looks pretty new, and I see many problems with it based on my long experience working on KFD. I think it's interesting, but probably not relevant for the most pressing problems at hand, so I'll cover that last.

ops_hsa uses ROCr APIs to manage GPU memory, create a user mode AQL queue for GPU kernel dispatch, async SDMA copies, and signal-based synchronization with barrier packets