Bhavdeep Arora bhavv04

"""
Reproduces the "grokking" phenomenon from:
"Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
Power et al., 2022 (see also the mechanistic interpretability follow-up by Nanda et al.)
The task: learn modular addition (a + b) mod P
- Input: two integers a, b in [0, P)
- Output: (a + b) mod P (P-way classification)
What you'll see: