deep-programmer

## PyTorch_bucket_by_sequence_length.py
"""
PyTorch has pack_padded_sequence, but it doesn’t work with dense layers. For sequence data with high variance in length,
the best way to minimize padding and masking within a batch is to feed in data that is already grouped by sequence length
(while still shuffling it somewhat). Here is my current solution in numpy.
I will need to convert every function to torch so that it can run on the GPU, and I am sure there are many other
ways to optimize it further. I hope this helps others, and maybe it can become a new PyTorch batch sampler someday.

General approach to how it works:

Decide what your bucket boundaries for the data are.
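
The preview above is cut off after the first step, so the following is only a minimal sketch of the general idea it describes (group by length, shuffle within groups, shuffle batch order). It is not the author's implementation: it uses the Python standard library rather than numpy, and the function name `bucket_batches`, its parameters, and the example boundaries are all hypothetical.

```python
import bisect
import random
from collections import defaultdict


def bucket_batches(lengths, boundaries, batch_size, seed=0):
    """Return batches of example indices grouped by sequence length.

    lengths:    sequence length of each example
    boundaries: sorted bucket edges (hypothetical values chosen by the caller);
                a length equal to an edge goes into the bucket above it
    """
    rng = random.Random(seed)

    # Assign every example index to the bucket its length falls into.
    buckets = defaultdict(list)
    for idx, length in enumerate(lengths):
        buckets[bisect.bisect_right(boundaries, length)].append(idx)

    # Shuffle inside each bucket, then cut each bucket into batches.
    batches = []
    for bucket_id in sorted(buckets):
        indices = buckets[bucket_id]
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batches.append(indices[start:start + batch_size])

    # Shuffle the batch order so lengths are not seen in sorted order.
    rng.shuffle(batches)
    return batches


# Illustrative data only: eight sequences, buckets <10, 10..19, >=20.
lengths = [3, 12, 7, 25, 14, 9, 30, 11]
batches = bucket_batches(lengths, boundaries=[10, 20], batch_size=2)
for batch in batches:
    print(batch, [lengths[i] for i in batch])
```

Every batch then needs padding only up to the longest sequence in its own bucket. The trade-off is that the last batch of each bucket may be smaller than `batch_size`.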

## GPUOptimizationForGameDev.md

      
silvesthu / GPUOptimizationForGameDev.md
Last active November 14, 2025 20:57

GPU Optimization for GameDev

Graphics Pipeline / GPU Architecture Overview


2011 - A trip through the Graphics Pipeline 2011
2013 - Performance Optimization Guidelines and the GPU Architecture behind them
2015 - Life of a triangle - NVIDIA's logical pipeline
2015 - Render Hell 2.0
2016 - How bad are small triangles on GPU and why?
2017 - GPU Performance for Game Artists
2019 - Understanding the anatomy of GPUs using Pokémon


## openmesh-python-cheat-sheet.py
# OpenMesh-Python Cheat Sheet
'''
documentation: https://openmesh-python.readthedocs.io/en/latest/
OpenMesh-Python official repository:
  https://www.graphics.rwth-aachen.de:9000/OpenMesh/openmesh-python
C++ documentation:
  https://www.openmesh.org/media/Documentations/OpenMesh-Doc-Latest/a04099.html

Installation:
* pip install openmesh  # https://pypi.org/project/openmesh/

## init.vim
" Specify a directory for plugins
call plug#begin('~/.vim/plugged')

Plug 'neoclide/coc.nvim', {'branch': 'release'}
Plug 'scrooloose/nerdtree'
"Plug 'tsony-tsonev/nerdtree-git-plugin'
Plug 'Xuyuanp/nerdtree-git-plugin'
Plug 'tiagofumo/vim-nerdtree-syntax-highlight'
Plug 'ryanoasis/vim-devicons'
Plug 'airblade/vim-gitgutter'

## segfault.py
class E(BaseException):
    def __new__(cls, *args, **kwargs):
        return cls
def a(): yield
a().throw(E)

## lisp-simple.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <assert.h>

// We'll have 128 tokens. Each token can be up to 32 characters long.
char token[128][32];

int lexer(char* input) {

## fast-strlen.c
// Compile and run with:
//
//   gcc -O3 -march=native fast-strlen.c -lpthread -o fast-strlen
//       && ./fast-strlen
//
// Use gcc because clang is too smart and optimizes away parts of the
// benchmark. Results on Xeon(R) CPU E5-2650 v4 @ 2.20GHz with gcc
// 9.4.0:
//
//   Scanning 10 times over 4.00GB...

## vanity.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <openssl/sha.h>

#define TARGET_PREFIX "20250327"
#define MAX_WORDS 256
#define MAX_TEXT 2048
#define MAX_ATTEMPTS (1ULL << 32) // 2^32 attempts (~4.3B, enough for 8-char prefix)
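
The comment on `MAX_ATTEMPTS` can be checked with a little arithmetic. The sketch below assumes, since the preview does not show it, that `TARGET_PREFIX` is matched against the first 8 hexadecimal characters of a hash and that hash outputs behave as uniformly random. Under those assumptions 2^32 attempts equals the size of the search space, which gives roughly a 63% chance of a hit rather than a guarantee.

```python
import math

PREFIX_HEX_CHARS = 8   # length of TARGET_PREFIX, assumed to be hex digits
ATTEMPTS = 1 << 32     # MAX_ATTEMPTS from the snippet

# Each hex character carries 4 bits, so an 8-character prefix has 2^32 values.
search_space = 16 ** PREFIX_HEX_CHARS

# Chance that one uniformly random hash starts with the target prefix.
p_single = 1 / search_space

# Chance of at least one match within ATTEMPTS independent tries:
# 1 - (1 - p)^N, written with log1p/expm1 to keep precision for tiny p.
p_success = -math.expm1(ATTEMPTS * math.log1p(-p_single))

print(f"search space: {search_space}")
print(f"P(at least one match): {p_success:.3f}")  # about 0.632, i.e. 1 - 1/e
```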