sortie@sortie-pc1 ~/source-sdk-2013/sp/src/utils $ cd vbsp
sortie@sortie-pc1 ~/source-sdk-2013/sp/src/utils/vbsp $ make
g++ -m32 -D_DLL_EXT=.so -DDX_TO_GL_ABSTRACTION -D_EXTERNAL_DLL_EXT=.so -DGL_GLEXT_PROTOTYPES -DGNUC -D_LINUX -DLINUX -DNDEBUG -DNO_HOOK_MALLOC -DNO_MALLOC_OVERRIDE -DNO_STRING_T -D_POSIX -DPOSIX -DPROTECTED_THINGS_ENABLE -D_snprintf=use_Q_snprintf_instead -Dstrncpy=use_Q_strncpy_instead -DUSE_SDL -DUSE_WEBM_FOR_REPLAY -DVECTOR -DVERSION_SAFE_STEAM_API_INTERFACES -DVPROF_LEVEL=1 -I../../public -I../../public/tier0 -I../../public/tier1 -I../common -g -O2 -Werror -Wall -Wno-invalid-offsetof -Wno-unused-local-typedefs -Wno-unused-but-set-variable -Wno-unused-function -Wno-unused-value -Wno-unused-variable -Wextra -Wno-unused-parameter -Wno-ignored-qualifiers -Wno-missing-field-initializers -Wno-type-limits -c ../../public/builddisp.cpp -o ../../public/builddisp.o
g++ -m32 -D_DLL_EXT=.so -DDX_TO_GL_ABSTRACTION -D_EXTERNAL_DLL_EXT=.so -DGL_GLEXT_PROTOTYPES -DGNUC -D_LINUX -DLINUX -DNDEBUG -DNO_H
@aras-p
aras-p / D3D9ByteCode.cpp
Last active January 27, 2024 09:53
D3D9 Shader Bytecode Patching for Half-Pixel Fixup
#include "UnityPrefix.h"
#include "D3D9ByteCode.h"
// D3D9 shader bytecode format on MSDN: https://msdn.microsoft.com/en-us/library/windows/hardware/ff552891.aspx
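// Version token high word for vertex shaders (0xFFFE); the low word carries the major/minor version.
const UInt32 kD3D9ShaderTypeVertex = 0xFFFE0000;
// The source swizzle occupies bits 16..23 of a source parameter token, two bits per component.
const UInt32 kD3D9SwizzleShift = 16;
// Identity swizzle (.xyzw): x=0, y=1, z=2, w=3, which evaluates to 0x00E40000.
const UInt32 kD3D9NoSwizzle = ((0 << kD3D9SwizzleShift) | (1 << (kD3D9SwizzleShift + 2)) | (2 << (kD3D9SwizzleShift + 4)) | (3 << (kD3D9SwizzleShift + 6)));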
@marcan
marcan / linux.sh
Last active July 21, 2024 14:00
Linux kernel initialization, translated to bash
#!/boot/bzImage
# Linux kernel userspace initialization code, translated to bash
# (Minus floppy disk handling, because seriously, it's 2017.)
# Not 100% accurate, but gives you a good idea of how kernel init works
# GPLv2, Copyright 2017 Hector Martin <marcan@marcan.st>
# Based on Linux 4.10-rc2.
# Note: pretend chroot is a builtin and affects the current process
# Note: kernel actually uses major/minor device numbers instead of device name
@lynn
lynn / random-rhymes.py
Last active January 7, 2018 21:09
Turn English text into nonsense that sounds like the input
from collections import defaultdict
import fileinput
import random
import re
common = """the of and to a in for is on that by this with
i you it not or be are from at as your all have an was we
will can us i'm it you're i've my of""".split()
pronounce = {}
@zeux
zeux / cone-culling-experiments.log
Last active February 19, 2024 08:38
Comparison of backface culling efficiency for cluster cone culling with 64-triangle clusters and triangle mask culling (6 64-bit masks per cluster).
Algorithms used for Cone* preprocess the mesh in some way, then split sequentially into 64-triangle clusters:
ConeBase: optimize mesh for transform cache
ConeSort: split mesh into large planar connected clusters, bin clusters into 6 buckets by cardinal axes, optimize each bucket for transform cache
ConeAcmr: optimize mesh for transform cache, split sequentially into variable length clusters that are relatively planar, sort clusters by avg normal
ConeCash: optimize mesh for transform cache, picking triangles that reduce ACMR but prioritizing those that keep current cluster planar
MaskBase: split sequentially into 64-triangle clusters, store a 64-bit conservative triangle mask for 6 frustums (cube faces)
ManyConeN: split sequentially into 64-triangle clusters, store N (up to 4) cones for each cluster and a cone id per triangle (2 bit)
Note that all Cone* solutions get significantly worse results with 128 or 256 triangle clusters; it doesn't matter much for Mask.
The biggest challenge with Cone* algorithms is t
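To make the cluster cone test behind the Cone* variants concrete, here is a minimal Python sketch. It is not taken from the gist. It assumes unit-length triangle normals and a single unit view direction pointing from the camera into the scene (an orthographic-style simplification that ignores the cone apex used for perspective views). The helper names `build_cone` and `cluster_is_backfacing` are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def normalize(v: Vec3) -> Optional[Vec3]:
    length = math.sqrt(dot(v, v))
    if length < 1e-12:
        return None  # degenerate: no meaningful direction
    return (v[0] / length, v[1] / length, v[2] / length)


def build_cone(normals: List[Vec3]) -> Optional[Tuple[Vec3, float]]:
    """Return (axis, cutoff) for a cluster, or None if no usable cone exists.

    axis is the normalized average normal; cutoff is sin(half_angle), where
    half_angle is the largest angle between the axis and any triangle normal.
    """
    summed = (
        sum(n[0] for n in normals),
        sum(n[1] for n in normals),
        sum(n[2] for n in normals),
    )
    axis = normalize(summed)
    if axis is None:
        return None
    cos_half = min(dot(axis, n) for n in normals)
    if cos_half <= 0.0:
        return None  # normals spread over 90 degrees or more: never cullable
    sin_half = math.sqrt(max(0.0, 1.0 - cos_half * cos_half))
    return axis, sin_half


def cluster_is_backfacing(cone: Optional[Tuple[Vec3, float]], view_dir: Vec3) -> bool:
    """Conservative test: True only if every triangle faces away from the camera."""
    if cone is None:
        return False
    axis, cutoff = cone
    # All normals lie within half_angle of axis, so every normal has a positive
    # dot product with view_dir when dot(axis, view_dir) > sin(half_angle).
    return dot(axis, view_dir) > cutoff
```

The wider the spread of normals inside a cluster, the larger the cutoff and the less often the cluster can be rejected. That is why the variants above try to keep clusters planar.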
// ==UserScript==
// @name annotate twitter usernames
// @namespace https://twitter.com/chordbug
// @version 0.1
// @description annotate twitter usernames
// @author lynn
// @match https://twitter.com/*
// @grant GM_addStyle
// ==/UserScript==
@gshen42
gshen42 / VectorClock.v
Last active May 9, 2020 15:10
Proof that under causal broadcast deliverable condition, merge and tick are equivalent.
Require Import Coq.Init.Nat.
Require Import Coq.Arith.PeanoNat.
(*
A vector clock [vclock] is a total map from index to clock and is represented
as a function from natural numbers to natural numbers. For an index not in the
map, the default clock is 0.
*)
Definition vclock : Type := nat -> nat.
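The claim in the title can be illustrated outside Coq with a small Python sketch. The preview does not show the gist's own definitions of `merge`, `tick`, or the deliverable condition, so the versions below are assumptions based on the usual causal broadcast formulation: `merge` is the pointwise maximum, `tick` increments one index, and a message from `sender` is deliverable when its clock is exactly one ahead at `sender` and not ahead anywhere else. A dict with missing keys read as 0 stands in for the total map.

```python
from typing import Dict

VClock = Dict[int, int]  # a missing index is read as clock 0


def get(vc: VClock, i: int) -> int:
    return vc.get(i, 0)


def tick(vc: VClock, i: int) -> VClock:
    """Increment the clock at index i, leaving the input unchanged."""
    out = dict(vc)
    out[i] = get(vc, i) + 1
    return out


def merge(a: VClock, b: VClock) -> VClock:
    """Pointwise maximum of two vector clocks."""
    return {k: max(get(a, k), get(b, k)) for k in set(a) | set(b)}


def deliverable(local: VClock, msg: VClock, sender: int) -> bool:
    """Assumed causal broadcast condition for delivering msg at local."""
    if get(msg, sender) != get(local, sender) + 1:
        return False
    return all(get(msg, k) <= get(local, k) for k in msg if k != sender)


def same(a: VClock, b: VClock) -> bool:
    """Extensional equality, treating missing entries as 0."""
    return all(get(a, k) == get(b, k) for k in set(a) | set(b))


local = {0: 2, 1: 1}
msg = {0: 1, 1: 2}
if deliverable(local, msg, sender=1):
    # Under the condition, merging the message equals ticking the sender's index.
    assert same(merge(local, msg), tick(local, 1))
```

A single example is not a proof; it only shows the shape of the statement that the Coq development establishes for all clocks.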
@alexjc
alexjc / reading-list.rst
Last active December 6, 2022 03:09
Reading List on Texture Synthesis
@fxkamd
fxkamd / TinyGrad-notes.md
Last active April 26, 2024 15:34
Observations about HSA and KFD backends in TinyGrad

This is Felix Kuehling, long-time KFD driver architect. I started looking into the TinyGrad source code yesterday, focusing on ops_kfd.py, ops_hsa.py and driver/hsa.py, to understand how TinyGrad talks to our HW and help with the ongoing debugging effort from the top down. This analysis is based on this commit: https://github.com/tinygrad/tinygrad/tree/3de855ea50d72238deac14fc05cda2a611497778

I'm intrigued by the use of Python for low-level programming. I think I can learn something from your use of ctypes and clang2py for fast prototyping and test development. I want to share some observations based on my initial review.

ops_kfd looks pretty new, and I see many problems with it based on my long experience working on KFD. I think it's interesting, but probably not relevant for the most pressing problems at hand, so I'll cover that last.

ops_hsa uses ROCr APIs to manage GPU memory, create a user mode AQL queue for GPU kernel dispatch, async SDMA copies, and signal-based synchronization with barrier packets