Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
| # train_grpo.py | |
| # | |
| # See https://github.com/willccbb/verifiers for ongoing developments | |
| # | |
| """ | |
| citation: | |
| @misc{brown2025grpodemo, | |
| title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models}, | |
| author={Brown, William}, |
Warning This is SEVERELY outdated, the current jupyter version is > 6.X, please refer to your current jupyter notebook installation!
Disclaimer : I just copied those shortcuts from Jupyter Menú > Help > Keyboard Shortcuts, I didn't wrote them myself.
Check your current shortcuts in your Help, shortcuts coule have been modified by extensions or your past self.