Bhavdeep Arora bhavv04

## grokking.py
"""
Reproduces the "grokking" phenomenon from:
  "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets"
  Power et al., 2022  (see also the mechanistic interpretability follow-up by Nanda et al., 2023)

The task: learn modular addition  (a + b) mod P
  - Input:  two integers a, b in [0, P)
  - Output: (a + b) mod P  (P-way classification)

What you'll see: