'''Single-file Quadtrix language model: a transformer training loop with tiktoken (GPT-2 BPE):
- Chat with the trained model
- Multi-head self-attention
- AdamW optimizer
- Best-checkpoint saving
- Live token streaming
- No external config dependencies
Trained on the TinyStories dataset (not included), a collection of short children's stories used to study language model capabilities at small scale. Drop in any plain-text corpus as cleaned.txt and training starts immediately.'''
import torch