Mann Patel manncodes

@Chillee
Chillee / softmax_quack.py
Created July 10, 2025 21:07
Random Kernel Microbenchmarks
import argparse
import time
from typing import Type
import torch
import torch.nn.functional as F
import torch._inductor.config
torch._inductor.config.triton.multi_kernel = True
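The preview above only shows the setup; as a rough illustration of the warmup-then-measure pattern such kernel microbenchmarks rely on, here is a minimal CPU-only sketch using only the stdlib (the `softmax` here is a plain-Python stand-in for illustration, not the gist's Triton/inductor kernel):

```python
import math
import time

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def bench(fn, arg, warmup=10, iters=100):
    # Warm up first so one-time costs (caches, compilation, allocation)
    # don't skew the measurement, then average over many iterations.
    for _ in range(warmup):
        fn(arg)
    start = time.perf_counter()
    for _ in range(iters):
        fn(arg)
    return (time.perf_counter() - start) / iters

xs = [float(i % 7) for i in range(1024)]
avg = bench(softmax, xs)
print(f"softmax on {len(xs)} floats: {avg * 1e6:.1f} us/iter")
```

For GPU kernels the same pattern needs device synchronization around the timer (e.g. `torch.cuda.synchronize()`), since launches are asynchronous.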
@tokenbender
tokenbender / train_modal_standalone.py
Last active October 12, 2025 06:57
standalone serverless simple character level transformer
import os
import sys
import time
import math
import pickle
from contextlib import nullcontext
from pathlib import Path
import subprocess
from dataclasses import dataclass
import inspect
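The script above is a character-level transformer trainer; the tokenization side of such a setup (not visible in the truncated preview) typically amounts to a char-to-int vocabulary, roughly like this hypothetical sketch:

```python
# Build a character-level vocabulary from a corpus, as char-level
# trainers in the nanoGPT style typically do (names here are illustrative).
text = "hello modal"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> int
itos = {i: ch for ch, i in stoi.items()}      # int -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = encode("hello")
assert decode(ids) == "hello"
print(len(chars), "vocab symbols;", ids)
```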

Learning LLMs in 2025

So you know how the transformer works, you know basic ML/DL, and you want to learn more about LLMs. One way to go is to look into the various "algorithmic" topics (optimization algorithms, RL, DPO, etc.). There are lots of materials on that. But the interesting stuff is (in my opinion, at least) not there.

This is an attempt to collect a list of academic (or academic-like) materials that explore LLMs from other directions, and focus on the non-ML-algorithmic aspects.

Courses

  • David Chiang's Theory of Neural Networks course.
  • This is not primarily about LLMs, but it does have a substantial section on Transformers. Formal/theory; more of a book than a course.
@qunash
qunash / grpo_qwen-0-5b_single_t4.ipynb
Last active October 15, 2025 03:21
grpo_qwen-0-5b_single_t4.ipynb
@willccbb
willccbb / grpo_demo.py
Last active October 25, 2025 16:39
GRPO Llama-1B
# train_grpo.py
#
# See https://github.com/willccbb/verifiers for ongoing developments
#
"""
citation:
@misc{brown2025grpodemo,
title={Granular Format Rewards for Eliciting Mathematical Reasoning Capabilities in Small Language Models},
author={Brown, William},
Begin by enclosing all thoughts within <thinking> tags, exploring multiple angles and approaches.
Break down the solution into clear steps within <step> tags. Start with a 20-step budget, requesting more for complex problems if needed.
Use <count> tags after each step to show the remaining budget. Stop when reaching 0.
Continuously adjust your reasoning based on intermediate results and reflections, adapting your strategy as you progress.
Regularly evaluate progress using <reflection> tags. Be critical and honest about your reasoning process.
Assign a quality score between 0.0 and 1.0 using <reward> tags after each reflection. Use this to guide your approach:
0.8+: Continue current approach
0.5-0.7: Consider minor adjustments
Below 0.5: Seriously consider backtracking and trying a different approach
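The reward thresholds above map directly to a control decision; as a literal sketch of that logic (the function name is illustrative, not from the gist):

```python
def next_action(reward: float) -> str:
    # Map a self-assigned quality score to the prompt's guidance.
    if reward >= 0.8:
        return "continue"    # 0.8+: keep the current approach
    if reward >= 0.5:
        return "adjust"      # 0.5-0.7: consider minor adjustments
    return "backtrack"       # below 0.5: try a different approach

for r in (0.9, 0.6, 0.3):
    print(r, "->", next_action(r))
```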
@nreHieW
nreHieW / transformer.py
Created July 9, 2024 13:36
2024 Noam Transformer
"""
The 2024 Transformer (the Noam Transformer):
- RMSNorm
- GQA or some combination
- Sliding window attention
- Swiglu
- RoPE (Rotary Positional Embedding)
LLM Arches:
hidden | MLP mult. | n_layers | rope_theta | GQA Group Size | GLU Act. | ops
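Of the components listed, RMSNorm is the simplest to show in isolation; a pure-Python sketch of the math (torch-free for brevity, and omitting the learnable per-channel gain the real layer multiplies by):

```python
import math

def rms_norm(xs, eps=1e-6):
    # RMSNorm: divide each element by the root-mean-square of the vector.
    # Unlike LayerNorm, there is no mean subtraction and no bias term.
    rms = math.sqrt(sum(x * x for x in xs) / len(xs) + eps)
    return [x / rms for x in xs]

out = rms_norm([1.0, 2.0, 3.0, 4.0])
# The normalized vector has RMS ~= 1 by construction.
print(out)
```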
@manncodes
manncodes / script.js
Last active April 27, 2024 09:26
data scraper js script for extracting ISIN numbers from the site `https://www.isin.com/isin-database/`
// 1. Log in to https://www.isin.com/, then head over to the https://www.isin.com/isin-database/ page.
// 2. Open the Chrome console by pressing F12 and paste the script into the console.
// Output: a JSON file named `all-data.json` containing a list of dictionaries with info about the companies.
const org = ["nobiskrug", "swm", "vattenfall",
// and all the other companies you need to extract information about
];
function downloadJSON(data) {
const dataStr = JSON.stringify(data, null, 2);
const blob = new Blob([dataStr], { type: 'application/json' });
@manncodes
manncodes / fast-tensor-dataloader.py
Created September 12, 2022 17:52
20x speed up for tabular tensor data
import torch
class FastTensorDataLoader:
"""
A DataLoader-like object for a set of tensors that can be much faster than
TensorDataset + DataLoader, because the default DataLoader grabs individual
indices of the dataset and calls cat on them (slow).
Source: https://discuss.pytorch.org/t/dataloader-much-slower-than-manual-batching/27014/6
"""
def __init__(self, *tensors, batch_size=32, shuffle=False):
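The preview cuts off at `__init__`; the core idea, iterating by slicing whole batches at once instead of fetching one index at a time, can be sketched torch-free with plain lists (the original gist does the same with tensor slicing):

```python
class FastListDataLoader:
    """Batch by slicing the underlying sequences directly, instead of
    fetching one element per index and concatenating (the slow path
    the gist above avoids)."""

    def __init__(self, *seqs, batch_size=32):
        assert all(len(s) == len(seqs[0]) for s in seqs)
        self.seqs = seqs
        self.batch_size = batch_size

    def __iter__(self):
        for start in range(0, len(self.seqs[0]), self.batch_size):
            # One slice per sequence yields a whole batch in a single step.
            yield tuple(s[start:start + self.batch_size] for s in self.seqs)

xs = list(range(10))
ys = [x * x for x in xs]
batches = list(FastListDataLoader(xs, ys, batch_size=4))
print([len(bx) for bx, _ in batches])  # batch sizes: 4, 4, 2
```

With tensors, each slice is a single view/copy, which is where the reported ~20x speedup over per-index fetching comes from.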
@parmentf
parmentf / GitCommitEmoji.md
Last active October 24, 2025 22:54
Git Commit message Emoji