Skip to content

Instantly share code, notes, and snippets.

View skryl's full-sized avatar
😀

Alex Skryl skryl

😀
View GitHub Profile
@willccbb
willccbb / grpo_demo.py
Last active October 25, 2025 16:39
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
author={Brown, William},
@dedlim
dedlim / claude_3.5_sonnet_artifacts.xml
Last active October 18, 2025 10:16
Claude 3.5 Sonnet, Full Artifacts System Prompt
<artifacts_info>
The assistant can create and reference artifacts during conversations. Artifacts are for substantial, self-contained content that users might modify or reuse, displayed in a separate UI window for clarity.
# Good artifacts are...
- Substantial content (>15 lines)
- Content that the user is likely to modify, iterate on, or take ownership of
- Self-contained, complex content that can be understood on its own, without context from the conversation
- Content intended for eventual use outside the conversation (e.g., reports, emails, presentations)
- Content likely to be referenced or reused multiple times