

Wunk Wunkolo

View gnfi-shift.cpp
#include <cstdint>
#include <cstdio>
#include <bitset>
#include <immintrin.h>
// Attempts at implementing _mm_srai_epi8, _mm_slli_epi8, and _mm_srli_epi8
// using affine Galois field transformations (_mm_gf2p8affine_epi64_epi8, GFNI)
// Wed Nov 4 05:34:35 PM PST 2020 -
inline __m128i _mm_srai_epi8(__m128i a, std::uint8_t imm8)
Wunkolo /
Last active May 31, 2019
AVX512-BITALG base2 decoding/encoding

This is a short writeup on some anticipatory code that I eventually want to test and benchmark on the upcoming Intel Ice Lake architecture.

The pext instruction is a particularly useful BMI2 instruction: the programmer provides an integer bit-mask with 1 bits set at the positions of interest, and pext extracts those bits in parallel and compacts them all against the least-significant bits of the result.

Given a bitmask and an input, pext will select the input bits wherever there is a
set bit in the mask, and compress them together to produce a new result.

|0000100000001111100000100001111100010000000010000001001000010000|  < Operand A
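
To make the selection rule concrete, here is a minimal portable sketch that models what pext computes one mask bit at a time. It is not the hardware instruction and not code from the original writeup; the name `PextReference` is a hypothetical helper introduced only for illustration.

```cpp
#include <cstdint>

// Portable reference model of pext: walks the mask from its lowest set bit
// upward and packs the selected source bits into the low end of the result.
// "PextReference" is a hypothetical name used only for this sketch.
inline std::uint64_t PextReference(std::uint64_t Source, std::uint64_t Mask)
{
	std::uint64_t Result = 0;
	// OutBit marks the next free least-significant position in the result.
	// Each iteration clears the lowest set bit of the mask (Mask &= Mask - 1).
	for( std::uint64_t OutBit = 1; Mask != 0; Mask &= Mask - 1 )
	{
		// Isolate the lowest set bit still remaining in the mask
		const std::uint64_t Lowest = Mask & (~Mask + 1);
		if( Source & Lowest )
		{
			Result |= OutBit;
		}
		OutBit <<= 1;
	}
	return Result;
}
```

For instance, `PextReference(0b10110010, 0b11110000)` selects the upper four bits of the source and yields `0b1011`. On hardware with BMI2, the `_pext_u64` intrinsic from `<immintrin.h>` should produce the same result in a single instruction.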
Wunkolo / pclmulqdq.cpp
Last active May 28, 2019
Fun with pclmulqdq(carryless multiply)
View pclmulqdq.cpp
#include <cstdint>
#include <cstdio>
#include <bitset>
#include <immintrin.h>
int main()
const std::uint64_t Bits
= 0b00011000100000000001000000000010000000001000000100000001000000001;
# Wunkolo<> - 8/3/2018
# Dumps the archive files for Avalanche's APEX Engine 2 ".tab/.arc" file pairs
# Drag this python script into your "theHunter Call of the Wild\archives_win64\"
# directory and run the script. It will create some worker threads and dump each
# archive into its own folder. Each dumped file will be named using its file
# identifier(a numerical hash of the actual filename). Other games such as
# Just Cause 3 were mistakenly shipped with a .txt file that paired each hash
# with a full filename but they fixed up theHunter so all we got to work
# with are numerical file IDs
Wunkolo /
Last active Mar 21, 2018
Greedy Powers of 2

Greedy SIMD algorithms

When making an algorithm, I at some point wanted to measure just how much better it is than the serial method. With SIMD you can process multiple elements of an array in parallel, at the granularity of the size of the vector registers. So if I have algorithms that can process 16, 8, 4, 2, or 1 elements in parallel, what is the optimal way to fire off each one to process an array? Since this is basically a greedy algorithm: divide N (the size of the array) by 16, then divide what's left over by 8, then divide what's left over by 4, and so on until you've touched every part of the array. So you'd fire off the 16-wide algorithm as many times as you can before having to resort to the smaller ones, eventually reaching 1, where you're processing one element at a time.
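
The divide-and-carry-the-remainder idea can be sketched as below. This is only an illustration under the assumption that the available widths are exactly 16, 8, 4, 2, and 1; `Widths` and `GreedySplit` are hypothetical names, not part of the original gist.

```cpp
#include <array>
#include <cstddef>

// Widths from the text, widest first (assumed set for this sketch)
constexpr std::array<std::size_t, 5> Widths = { 16, 8, 4, 2, 1 };

// Returns how many times each width's routine would run for an array of
// N elements. "GreedySplit" is a hypothetical helper name.
inline std::array<std::size_t, 5> GreedySplit(std::size_t N)
{
	std::array<std::size_t, 5> Counts = {};
	for( std::size_t i = 0; i < Widths.size(); ++i )
	{
		Counts[i] = N / Widths[i]; // as many passes as fit at this width
		N %= Widths[i];            // hand the left-over to the next width
	}
	return Counts;
}
```

For an array of 91 elements this gives 5 passes of 16, 1 pass of 8, 0 passes of 4, 1 pass of 2, and 1 pass of 1, which adds back up to 80 + 8 + 2 + 1 = 91.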


Say you have an array of 91 elements, and can onl

from PIL import Image
from PIL import ImageDraw

def qHilbertSOA(Width, Distances):
    Level = 1
    PositionsX = [0] * len(Distances)
    PositionsY = [0] * len(Distances)
    CurDistances = Distances
    for i in range(Width.bit_length() - 1):
        # Determine Regions
Wunkolo / Mystery.js
Last active Feb 6, 2021
Spiderman home-coming mystery code solved
View Mystery.js
targetPos = thisComp.layer("target").toComp([0,0]); // Compensating for the missing "targetPos" variable
box = thisComp.layer("box");
boxTopLeft = box.toComp([0,0]);
boxBottomRight = box.toComp([box.width,box.height]);
// This is erroneous on their part: "deltaX" and "xDistanceToEdge" do not exist yet, so it is commented out.
boxAnchor = box.toComp(box.anchorPoint);// xRatio = deltaX/xDistanceToEdge;
deltaVec = sub(targetPos, boxAnchor);
deltaX = deltaVec[0];
View aphelion.html
<title>{Title}{block:PostTitle} | {PostTitle}{/block:PostTitle}{block:PostSummary} | {PostSummary}{/block:PostSummary}</title>
<meta charset="utf-8">
<meta name="color:Text" content="#a2a2a2" />
<meta name="color:Trim" content="#313032" />
<meta name="color:Background" content="#0e0e0f" />
<meta name="color:BackLight" content="#1f1e21" />
<meta name="color:Posts" content="#19181a" />
<link rel="shortcut icon" href="{Favicon}">
Wunkolo /
Last active Jun 15, 2021
Platinum games "DAT" file dumper

DAT/DTT files are general containers found within the compressed .cpk files

struct Header
	std::uint32_t Magic; // 'DAT\x00'
	std::uint32_t FileCount;
	std::uint32_t FileTableOffset;
	std::uint32_t ExtensionTableOffset;
	std::uint32_t NameTableOffset;
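
As a rough illustration of how the leading fields might be read from a raw buffer, here is a minimal sketch. It covers only the five fields visible in the listing above; the little-endian byte order, the on-disk magic bytes `D`, `A`, `T`, `\0`, and the names `DatHeaderPrefix`, `ReadU32LE`, and `ReadDatHeaderPrefix` are all assumptions made for this example rather than details confirmed by the original dumper.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

// Only the five fields shown in the listing; hypothetical struct name
struct DatHeaderPrefix
{
	std::uint32_t Magic;
	std::uint32_t FileCount;
	std::uint32_t FileTableOffset;
	std::uint32_t ExtensionTableOffset;
	std::uint32_t NameTableOffset;
};

// Assumption: every field is stored little-endian
inline std::uint32_t ReadU32LE(const std::uint8_t* Bytes)
{
	return std::uint32_t(Bytes[0])
		| (std::uint32_t(Bytes[1]) << 8)
		| (std::uint32_t(Bytes[2]) << 16)
		| (std::uint32_t(Bytes[3]) << 24);
}

// Returns the five leading fields, or nothing if the buffer is too short
// or does not start with the assumed magic bytes 'D','A','T','\0'
inline std::optional<DatHeaderPrefix> ReadDatHeaderPrefix(
	const std::uint8_t* Data, std::size_t Size
)
{
	constexpr std::size_t PrefixSize = 5 * sizeof(std::uint32_t);
	if( Size < PrefixSize || std::memcmp(Data, "DAT\0", 4) != 0 )
	{
		return std::nullopt;
	}
	DatHeaderPrefix Header;
	Header.Magic                = ReadU32LE(Data + 0);
	Header.FileCount            = ReadU32LE(Data + 4);
	Header.FileTableOffset      = ReadU32LE(Data + 8);
	Header.ExtensionTableOffset = ReadU32LE(Data + 12);
	Header.NameTableOffset      = ReadU32LE(Data + 16);
	return Header;
}
```

Reading the fields byte by byte avoids depending on the host's alignment or endianness, at the cost of a little verbosity compared to casting the buffer to the struct directly.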
View Console.cpp
#include "Console.hpp"
#include <Windows.h>
#include <conio.h> // _getch()
#pragma warning(disable:4996)
#include <io.h>
#include <cctype> //isgraph