@scLLptr
scLLptr / div5.inc
Last active June 2, 2026 20:46
M33 ARM fast div5, for 0..63 input
@ Fast div5 = idx / 5 without a divide instruction, valid for idx in 0..63
@ div5 = (idx * 52) >> 8
@ Uses an 8-bit approximation of the reciprocal 1 / 5:
@ (1 / 5) * 256 = 51.2, rounded up => magic5 = 52
@ (( 0 * 52) >> 8) = 0 >> 8 = 0
@ ...
@ (( 4 * 52) >> 8) = 208 >> 8 = 0
@ (( 5 * 52) >> 8) = 260 >> 8 = 1
@ ...
@ (( 9 * 52) >> 8) = 468 >> 8 = 1
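The multiply-and-shift trick above can be checked exhaustively in C. The following is a minimal sketch, not the gist's assembly; the names `div5_fast` and `div5_first_mismatch` are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Multiply-and-shift replacement for idx / 5.
 * 52 is (1/5) * 256 = 51.2 rounded up, so the result is only
 * exact for a limited input range. */
static inline uint32_t div5_fast(uint32_t idx)
{
    return (idx * 52u) >> 8;
}

/* Hypothetical helper: returns the first idx in [0, limit) where the
 * approximation differs from true integer division, or -1 if none. */
static int div5_first_mismatch(uint32_t limit)
{
    for (uint32_t idx = 0; idx < limit; ++idx) {
        if (div5_fast(idx) != idx / 5u)
            return (int)idx;
    }
    return -1;
}
```

The rounded-up constant overestimates 1/5 by 0.003125 per unit of input. That error first pushes a result over an integer boundary at idx = 64 (13 instead of 12), which is why the stated input range stops at 63.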
scLLptr / kbd.c
Created May 31, 2026 18:53
SIMD-style (SWAR) Bit-Packed 8-Bit Keyboard Matrix Debouncer
// Mx8 keyboard matrix scan with debouncing, depth = 4
// Modify the number of history stages for a different debounce depth
#include <stdint.h>
#define M 7 // Number of columns in your matrix
// 4-stage history and stable state per column, one bit per row
typedef struct {
    uint8_t h0, h1, h2, h3; // four raw samples of the column
    uint8_t state;          // debounced state of the column
} debounce_col_4_t;
debounce_col_4_t matrix_state[M];
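The preview stops before the update routine, so the following is only a sketch of how a 4-stage history like this is commonly updated SWAR-style, eight rows at once. It assumes `h0` holds the newest sample and that a bit changes state only after four identical samples; `debounce_col_update` is a hypothetical name, not taken from the gist.

```c
#include <stdint.h>

typedef struct {
    uint8_t h0, h1, h2, h3; /* assumed order: h0 newest, h3 oldest */
    uint8_t state;          /* debounced state, one bit per row */
} debounce_col_4_t;

/* Push one raw column sample and update all 8 rows in parallel.
 * Returns a mask of the rows whose debounced state changed. */
static uint8_t debounce_col_update(debounce_col_4_t *c, uint8_t raw)
{
    /* Shift the history by one stage. */
    c->h3 = c->h2;
    c->h2 = c->h1;
    c->h1 = c->h0;
    c->h0 = raw;

    uint8_t all_on = c->h0 & c->h1 & c->h2 & c->h3; /* 1 in all 4 samples */
    uint8_t any_on = c->h0 | c->h1 | c->h2 | c->h3; /* 1 in any sample */

    uint8_t prev = c->state;
    /* Set bits that were on four times in a row, clear bits that were
     * off four times in a row, and keep every other bit as it was. */
    c->state = (uint8_t)((prev | all_on) & any_on);
    return (uint8_t)(prev ^ c->state);
}
```

Because every operation is a bitwise AND, OR, or XOR on the packed byte, the cost per column is the same whether one row or all eight are bouncing.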
scLLptr / euc_div_mod.inc
Last active May 31, 2026 18:44
Euclidean modulus with division for ARM M33
@ ==============================================================================
@ euc_div_mod
@ ==============================================================================
@ Computes a floored quotient and a strictly non-negative remainder together
@ in a single branchless sequence using bitmasks.
@
@ Input / Output Arguments:
@ \quot [Updated] Destination register for the floored quotient
@ \rem [Updated] Destination register for the non-negative remainder
@
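The macro body is not shown in the preview, so the following C sketch only illustrates the general branchless correction the header describes. It assumes a strictly positive divisor, where the floored quotient and the non-negative (Euclidean) remainder agree; `divmod_t` and `euc_div_mod_sketch` are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    int32_t quot; /* floored quotient */
    int32_t rem;  /* remainder in [0, d) */
} divmod_t;

/* Assumes d > 0. C division truncates toward zero, so a negative
 * remainder is corrected with a mask instead of a branch. */
static divmod_t euc_div_mod_sketch(int32_t n, int32_t d)
{
    int32_t q = n / d;                /* truncated quotient */
    int32_t r = n % d;                /* sign follows n */
    int32_t mask = -(int32_t)(r < 0); /* all ones if r is negative, else 0 */

    r += d & mask; /* add d only when r was negative */
    q += mask;     /* subtract 1 only when r was negative */

    divmod_t out = { q, r };
    return out;
}
```

For example, n = -7 and d = 5 truncate to q = -1, r = -2; the mask correction yields q = -2, r = 3, and -2 * 5 + 3 = -7 still holds.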
#include <stdint.h>
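uint16_t fast_gamma_16b(uint32_t* l2_lut, uint32_t g_q24, int n) {
    if (n == 0) return 0;
    // Scale the table entry by g_q24; the 64-bit intermediate holds the
    // full product before the shift back to Q24
    uint32_t p_q24 = ((uint64_t)l2_lut[n] * g_q24) >> 24;
    // Negate, then split into an integer part and a 24-bit fraction
    int32_t c_fixed = -(int32_t)p_q24;
    int i = c_fixed >> 24; // relies on an arithmetic right shift for negative values
    uint32_t f_31 = (c_fixed & 0xFFFFFF) << 7; // fraction widened from 24 to 31 bits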