Wunk Wunkolo

#include <cstdint>
#include <cstdio>
#include <bitset>
#include <immintrin.h>
// Attempts at implementing _mm_srai_epi8, _mm_slli_epi8, and _mm_srli_epi8
// using affine Galois field transformations (_mm_gf2p8affine_epi64_epi8, GFNI)
// Wed Nov 4 05:34:35 PM PST 2020 - wunkolo@gmail.com
inline __m128i _mm_srai_epi8(__m128i a, std::uint8_t imm8)
@Wunkolo
Wunkolo / Base2Bitshuffle.md
Last active November 2, 2022 01:36
AVX512-BITALG base2 decoding/encoding

This is a little writeup on some anticipatory code to eventually test and benchmark on the upcoming Intel Ice Lake architecture.

The pext instruction is a particularly useful BMI2 instruction: the programmer provides an integer bit-mask with 1 bits set at the positions of interest, and pext extracts those bits in parallel and compacts them against the least-significant bits of the result.

Given a bitmask and an input, pext will select the bits wherever there is a
set bit in the mask, and compress them together to produce a new result.

|0000100000001111100000100001111100010000000010000001001000010000|  < Operand A
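
As a rough illustration of the extract-and-compact behaviour described above, here is a minimal portable sketch that models pext one mask bit at a time. The name `pext_soft` is hypothetical and is not part of the original writeup; on BMI2 hardware the `_pext_u64` intrinsic from `<immintrin.h>` performs the same operation in a single instruction.

```cpp
#include <cstdint>

// Hypothetical software model of pext (not from the original gist).
// Walks the mask from its lowest set bit upward; each selected source bit
// is packed into the next free low-order position of the result.
std::uint64_t pext_soft(std::uint64_t src, std::uint64_t mask)
{
	std::uint64_t result = 0;
	for( std::uint64_t out = 1; mask != 0; out <<= 1 )
	{
		// Isolate the lowest set bit of the mask
		const std::uint64_t lowest = mask & (~mask + 1);
		// Copy the matching source bit into the current output position
		if( src & lowest ) result |= out;
		// Clear that mask bit and move on to the next one
		mask &= mask - 1;
	}
	return result;
}
```

For example, with a mask of `0xF0` the upper nibble of the input byte is moved down into the lower nibble, so `pext_soft(0xB2, 0xF0)` yields `0xB`.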
Wunkolo / pclmulqdq.cpp
Last active May 28, 2019 23:09
Fun with pclmulqdq(carryless multiply)
#include <cstdint>
#include <cstdio>
#include <bitset>
#include <immintrin.h>
int main()
{
	const std::uint64_t Bits
		= 0b00011000100000000001000000000010000000001000000100000001000000001;
	std::puts(("Bits:\t"+std::bitset<64>(Bits).to_string()).c_str());
# Wunkolo<wunkolo@gmail.com> - 8/3/2018
# Dumps the archive files for Avalanche's APEX Engine 2 ".tab/.arc" file pairs
# Drag this python script into your "theHunter Call of the Wild\archives_win64\"
# directory and run the script. It will create some worker threads and dump each
# archive into its own folder. Each dumped file will be named using its file
# identifier (a numerical hash of the actual filename). Other games such as
# Just Cause 3 were mistakenly shipped with a .txt file that paired each hash
# with a full filename, but they fixed up theHunter, so all we have to work
# with are numerical file IDs.
Wunkolo / GreedyPower2.md
Last active March 21, 2018 03:43
Greedy Powers of 2

Greedy SIMD algorithms

When making an algorithm, I at some point wanted to measure just how much better it is than the serial method. With SIMD you can process multiple elements of an array in parallel, at the granularity of the vector-register size. So if I have an algorithm that can process 16, 8, 4, 2, or 1 elements in parallel, what is the optimal way to fire off each variant to process an array? Since this is basically a greedy algorithm, divide N (the size of the array) by 16, then divide what's left over by 8, then divide what's left over by 4, and so on until you've touched every part of the array. So you'd fire off the 16-wide variant as many times as you can before having to resort to the smaller ones, eventually reaching 1, where you're processing one element at a time.
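
The greedy split described above can be sketched in a few lines. This is only an illustration under the stated widths of 16, 8, 4, 2, and 1; the function name `GreedySplit` is made up here and does not come from the original writeup.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not from the original gist): returns how many times
// each variant would run, in the order of widths 16, 8, 4, 2, 1.
std::array<std::size_t, 5> GreedySplit(std::size_t n)
{
	constexpr std::array<std::size_t, 5> Widths = {16, 8, 4, 2, 1};
	std::array<std::size_t, 5> Counts = {};
	for( std::size_t i = 0; i < Widths.size(); ++i )
	{
		// Run the widest remaining variant as many times as it fits
		Counts[i] = n / Widths[i];
		// Hand whatever is left over to the next, narrower variant
		n %= Widths[i];
	}
	return Counts;
}
```

For an array of 91 elements this gives five 16-wide passes (80 elements), one 8-wide pass, no 4-wide pass, one 2-wide pass, and one single-element pass. Because the widths are powers of two, every variant narrower than 16 runs at most once.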

avx512

Say you have an array of 91 elements, and can onl

from PIL import Image
from PIL import ImageDraw
def qHilbertSOA(Width, Distances):
	Level = 1
	PositionsX = [0] * len(Distances)
	PositionsY = [0] * len(Distances)
	CurDistances = Distances
	for i in range(Width.bit_length() - 1):
		# Determine Regions
Wunkolo / Mystery.js
Last active July 14, 2022 02:07
Spiderman home-coming mystery code solved
targetPos = thisComp.layer("target").toComp([0,0]); // Compensating for the missing "targetPos" variable
box = thisComp.layer("box");
boxTopLeft = box.toComp([0,0]);
boxBottomRight = box.toComp([box.width,box.height]);
// This is erroneous on their part: "deltaX" and "xDistanceToEdge" do not exist yet, so the line is commented out.
boxAnchor = box.toComp(box.anchorPoint); // xRatio = deltaX/xDistanceToEdge;
deltaVec = sub(targetPos, boxAnchor);
deltaX = deltaVec[0];
<html>
<head>
<title>{Title}{block:PostTitle} | {PostTitle}{/block:PostTitle}{block:PostSummary} | {PostSummary}{/block:PostSummary}</title>
<meta charset="utf-8">
<meta name="color:Text" content="#a2a2a2" />
<meta name="color:Trim" content="#313032" />
<meta name="color:Background" content="#0e0e0f" />
<meta name="color:BackLight" content="#1f1e21" />
<meta name="color:Posts" content="#19181a" />
<link rel="shortcut icon" href="{Favicon}">
Wunkolo / DAT.md
Last active October 12, 2022 22:19
Platinum games "DAT" file dumper

DAT/DTT files are general containers found within the compressed .cpk files

struct Header
{
	std::uint32_t Magic; // 'DAT\x00'
	std::uint32_t FileCount;
	std::uint32_t FileTableOffset;
	std::uint32_t ExtensionTableOffset;
	std::uint32_t NameTableOffset;
#include "Console.hpp"
#include <Windows.h>
#include <conio.h> // _getch()
#pragma warning(disable:4996)
#include <io.h>
#include <cctype> //isgraph