mmozeiko / upng.h
Last active November 3, 2023 18:29
uncompressed png writer & reader
View upng.h
#pragma once
// uncompressed png writer & reader
// supports only 8-bit and 16-bit formats
// Performance comparison for 8192x8192 BGRA8 image (256MB)
// Compiled with "clang -O2", AVX2 requires extra "-mavx2" or "/arch:AVX2" argument
// The libpng (compressed) case uses default libpng/zlib compression settings
// For the libpng (uncompressed) case, the following two functions are used:
d7samurai /
Last active November 23, 2023 22:11
Minimal D3D11 bonus material: pixel art antialiasing

A minimal Direct3D 11 implementation of "antialiased point sampling", useful for smooth fractional movement and non-integer scaling of pixel art AKA "fat pixel" aesthetics.

Also see the side-by-side point sampling comparison on YouTube (the video is zoomed in to counter implicit downsampling and compression artifacts and to make the AA effect more apparent), or check out the Shadertoy.


The actual sampler is set to bilinear filtering (the default D3D11 sampler state) so that hardware interpolation can be used with a single texture read. The shader then emulates point sampling and applies AA only at the fat pixel boundaries. Use with premultiplied alpha textures* and keep a one-pixel transparent border around each sprite/tile.
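The coordinate adjustment behind this idea can be sketched in one dimension. This is a minimal C illustration of a common formulation of the technique, not the gist's shader code: `aa_point_coord`, `clampf`, and the `dudv` parameter (standing in for what `fwidth` would return in a pixel shader) are names assumed here for illustration. The returned coordinate would then be passed to an ordinary bilinear fetch.

```c
#include <assert.h>

/* Clamp v to the range [lo, hi]. */
static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Hypothetical helper, one axis only.
 *
 * pixel: sample position in texel units (u * texture_width), must be >= 0.
 *        Texel i covers [i, i + 1) and has its center at i + 0.5.
 * dudv:  how many texels one screen pixel covers along this axis
 *        (assumed to come from fwidth(pixel) in a real shader), must be > 0.
 *
 * Returns the adjusted position in texel units. Away from a texel boundary
 * the result snaps to a texel center, so bilinear filtering returns that
 * texel unblended (point sampling). Within half a screen pixel of a
 * boundary the result slides linearly across it, so bilinear filtering
 * blends the two neighbours (the antialiased edge).
 */
float aa_point_coord(float pixel, float dudv)
{
    assert(pixel >= 0.0f && dudv > 0.0f);
    float seam = (float)(int)(pixel + 0.5f);   /* nearest texel boundary */
    return seam + clampf((pixel - seam) / dudv, -0.5f, 0.5f);
}
```

With a 10x magnification (`dudv = 0.1f`), a sample at texel position 3.2 is moved to 3.5, the center of texel 3, while a sample exactly on the boundary at 3.0 stays at 3.0 and receives an even blend of texels 2 and 3.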

d7samurai /
Last active November 19, 2023 15:31
Minimal D3D11 pt2

Follow-up to Minimal D3D11, adding instanced rendering. As before: An uncluttered Direct3D 11 setup & basic rendering primer / API familiarizer. Complete, runnable Windows application contained in a single function and laid out in a linear, step-by-step fashion. No modern C++ / OOP / obscuring cruft.

The main difference is that the hollow cube is now rendered using DrawIndexedInstanced. This saves a lot of vertices compared to the original, so the model data is small enough to be included in the source without getting in the way. All trigonometry and matrix math has also moved to the vertex shader, further simplifying the main code.

Each instance is merely a single piece of geometry consisting of 4 triangles, which is then repeated 24 times, rotated and colored.
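The counts above (4 triangles per instance, 24 instances) correspond to the first two arguments of `ID3D11DeviceContext::DrawIndexedInstanced`, namely `IndexCountPerInstance` and `InstanceCount`. The following C sketch only works through that arithmetic; it assumes a triangle-list topology, and `InstancedDraw`, `hollow_cube_draw`, and `total_triangles` are hypothetical helpers rather than anything from the gist or the D3D11 API.

```c
#include <assert.h>

/* Hypothetical struct mirroring the first two DrawIndexedInstanced arguments. */
typedef struct {
    unsigned index_count_per_instance;  /* IndexCountPerInstance */
    unsigned instance_count;            /* InstanceCount */
} InstancedDraw;

/* Draw parameters for the geometry described in the text:
 * 4 triangles per instance (3 indices each, triangle list assumed),
 * repeated 24 times. */
InstancedDraw hollow_cube_draw(void)
{
    InstancedDraw d;
    d.index_count_per_instance = 4 * 3;
    d.instance_count = 24;
    return d;
}

/* Total triangles the GPU rasterizes for one instanced draw call. */
unsigned total_triangles(InstancedDraw d)
{
    assert(d.index_count_per_instance % 3 == 0);  /* whole triangles only */
    return (d.index_count_per_instance / 3) * d.instance_count;
}
```

Under these assumptions the index buffer holds only 12 indices, yet a single draw call produces 96 triangles; the per-instance rotation and color would be derived in the vertex shader.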

ibireme / kpc_demo.c
Last active October 26, 2023 00:22
A demo showing how to read Intel or Apple M1 CPU performance counters on macOS.
View kpc_demo.c
// =============================================================================
// XNU kperf/kpc demo
// Available for 64-bit Intel/Apple Silicon, macOS/iOS, with root privileges
// Demo 1 (profile a function in current thread):
// 1. Open directory '/usr/share/kpep/', find your CPU PMC database.
// For M1 (Pro/Max), the database file is '/usr/share/kpep/a14.plist'.
// 2. Select a few events that you are interested in,
// add their names to the `profile_events` array below.
View value_speculation.c
// Estimating CPU frequency...
// CPU frequency: 4.52 GHz
// sum1: value = 15182118497126522709, 0.31 secs, 5.14 cycles/elem
// sum2: value = 15182118497126522709, 0.17 secs, 2.93 cycles/elem
#define RW(x) asm("" : "+r"(x))
typedef struct Node {
u64 value;
struct Node *next;
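The preview above is cut off before the two summing functions, so the following is only a sketch of the general value-speculation pattern, assuming the file follows it: guess that the next node sits directly after the current one in memory, so the CPU's branch predictor can run ahead of the `next` pointer load. The names `sum_plain` and `sum_speculative` are assumptions made here, and the `RW` macro is written with `__asm__` and needs GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Empty inline asm that claims to read and write x, so the optimizer
 * can no longer assume anything about x's value afterwards. */
#define RW(x) __asm__("" : "+r"(x))

typedef struct Node {
    uint64_t value;
    struct Node *next;
} Node;

/* Baseline: every iteration waits for the load of node->next. */
uint64_t sum_plain(const Node *node)
{
    uint64_t value = 0;
    while (node) {
        value += node->value;
        node = node->next;
    }
    return value;
}

/* Value speculation: predict that the next node is node + 1. When the
 * prediction holds, the loop continues from an address computed without
 * the load; when it fails, the loaded pointer is used instead. The result
 * is the same either way. */
uint64_t sum_speculative(const Node *node)
{
    uint64_t value = 0;
    while (node) {
        value += node->value;
        const Node *predicted_next = node + 1;
        const Node *next = node->next;
        if (next == predicted_next) {
            /* Keep the compiler from folding both branches into
             * "node = next" by pretending predicted_next changes here. */
            RW(predicted_next);
            node = predicted_next;
        } else {
            node = next;
        }
    }
    return value;
}
```

Both functions return identical sums for any well-formed list. The speculative version only helps when nodes are mostly laid out contiguously, which is the situation in which a gap like the 5.14 versus 2.93 cycles/elem figures quoted above would be expected.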
NoelFB / routine.h
Last active November 7, 2022 07:19
Simple C++ Coroutine using a switch statement internally
View routine.h
#pragma once
namespace YourNamespace
struct Routine
// Current "waiting time" before we run the next block
float wait_for = 0;
// Used during `rt_for`, which repeats the given block for X time
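The preview stops before the macros, so here is a minimal sketch of the general switch-based coroutine idea, written in plain C rather than the gist's C++. Only `wait_for` comes from the preview; the `line` field, the `RT_BEGIN` / `RT_WAIT` / `RT_END` macros, and `demo_update` are assumptions made for illustration and are not the gist's API.

```c
#include <assert.h>

typedef struct {
    int   line;      /* resume point; 0 = start, -1 = finished */
    float wait_for;  /* seconds left before the next block runs */
} Routine;

/* The switch jumps straight to the case label left by the last RT_WAIT. */
#define RT_BEGIN(rt)   switch ((rt)->line) { case 0:
#define RT_WAIT(rt, t) do { (rt)->wait_for = (t); (rt)->line = __LINE__; return 1; case __LINE__:; } while (0)
#define RT_END(rt)     } (rt)->line = -1; return 0

/* Runs one step of a three-block routine. Appends 1, 2, 3 to trace as
 * each block executes. Returns 1 while still running, 0 once finished. */
int demo_update(Routine *rt, float dt, int *trace, int *count)
{
    assert(rt && trace && count);
    if (rt->line < 0) return 0;             /* already finished */
    if (rt->wait_for > 0.0f) {
        rt->wait_for -= dt;
        if (rt->wait_for > 0.0f) return 1;  /* still waiting */
    }

    RT_BEGIN(rt);
    trace[(*count)++] = 1;
    RT_WAIT(rt, 1.0f);   /* pause for one second */
    trace[(*count)++] = 2;
    RT_WAIT(rt, 0.5f);   /* pause for half a second */
    trace[(*count)++] = 3;
    RT_END(rt);
}
```

Calling `demo_update` once per frame with `dt = 0.5f` runs the first block on the first call, waits through the second call, runs the second block on the third call, and finishes on the fourth. Because the body lives inside a `switch`, local variables do not survive a wait and a nested `switch` cannot contain an `RT_WAIT`; any state that must persist belongs in the `Routine` struct.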
View terrible lerp.cpp
template <class _Ty>
_NODISCARD /* constexpr */ _Ty _Common_lerp(const _Ty _ArgA, const _Ty _ArgB, const _Ty _ArgT) noexcept {
// on a line intersecting {(0.0, _ArgA), (1.0, _ArgB)}, return the Y value for X == _ArgT
const int _Finite_mask = (int{isfinite(_ArgA)} << 2) | (int{isfinite(_ArgB)} << 1) | int{isfinite(_ArgT)};
if (_Finite_mask == 0b111) {
// 99% case, put it first; this block comes from P0811R3
if ((_ArgA <= 0 && _ArgB >= 0) || (_ArgA >= 0 && _ArgB <= 0)) {
// exact, monotonic, bounded, determinate, and (for _ArgA == _ArgB == 0) consistent:
return _ArgT * _ArgB + (1 - _ArgT) * _ArgA;
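The preview is truncated inside the all-finite branch. For orientation, the finite-input algorithm from P0811R3 that the comment refers to can be sketched in C as below. This is a sketch under the assumption that all three arguments are finite; it omits the infinity and NaN handling that the `_Finite_mask` cases in the original cover, and `lerp_finite` is a name chosen here.

```c
#include <assert.h>

/* Linear interpolation for finite a, b, t, following P0811R3.
 * Returns a at t == 0 and b at t == 1 exactly. */
double lerp_finite(double a, double b, double t)
{
    /* a and b on opposite sides of zero (or one of them is zero):
     * this form cannot overflow and is exact at both endpoints. */
    if ((a <= 0 && b >= 0) || (a >= 0 && b <= 0))
        return t * b + (1 - t) * a;

    if (t == 1)
        return b;                      /* exact at the far endpoint */

    /* Same sign: b - a cannot overflow. Exact at t == 0. */
    const double x = a + t * (b - a);

    /* Rounding in x could overshoot or undershoot b near t == 1;
     * clamping against b keeps the result monotonic there. */
    if ((t > 1) == (b > a))
        return b < x ? x : b;          /* max(b, x) */
    return x < b ? x : b;              /* min(b, x) */
}
```

For example, `lerp_finite(0.0, 10.0, 0.5)` takes the first branch and returns 5.0, while `lerp_finite(2.0, 4.0, 2.0)` extrapolates past `b` and returns 6.0.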
native-m / async_diffgpu.cpp
Created July 9, 2019 19:24
Performing an async task on a different GPU using DirectX 11
View async_diffgpu.cpp
#include <Windows.h>
#include <d3d11.h>
#include <iostream>
#include <vector>
#include <thread>
#include <mutex>
#include <d3dcompiler.h>
#pragma comment(lib, "dxgi.lib")
#pragma comment(lib, "d3d11.lib")
willscott / Makefile
Created June 11, 2019 14:39
Wasm source map compilation
View Makefile
.PHONY: all
all: rot13.wasm
%.wasm.full: %.c
	clang $< -g -o $@
%.wasm.dwarf: %.wasm.full
	llvm-dwarfdump $< > $@
%.wasm: %.wasm.full %.wasm.dwarf
View virtual_memory.hpp
#pragma once
#if defined(_WIN32)
# include <Windows.h>
#elif defined(__APPLE__) || defined(LINUX)
# include <sys/mman.h>
#include <optional>
#include "result.hpp"