Timestamps

(00:00:00) – How batch size affects token cost and speed
(00:32:09) – How MoE models are laid out across GPU racks
(00:47:12) – How pipeline parallelism moves model layers across racks
(01:03:37) – Why Ilya said, “As we now know, pipelining is not wise.”
(01:18:59) – Because of RL, models may be 100x over-trained beyond Chinchilla-optimal
(01:33:02) – Deducing long context memory costs from API pricing
(02:04:02) – Convergent evolution between neural nets and cryptography

caretechai / music-apis-and-dbs.md
Created March 4, 2025 14:19 — forked from 0xdevalias/music-apis-and-dbs.md
A collection of music APIs, databases, and related tools