Yasser Arguelles Snape RealNeGate

////////////////////////////////
// NBHS - Non-blocking hashset
////////////////////////////////
// You wanna intern lots of things on lots of cores? This is for you. It's
// inspired by Cliff Click's non-blocking hashmap.
#ifndef NBHS_H
#define NBHS_H
typedef void* (*NBHS_AllocZeroMem)(size_t size);
typedef void (*NBHS_FreeMem)(void* ptr, size_t size);
// combined "Peeps + ConstProp + GVN" solver (extended version of the one in Combining Analyses, Combining Optimizations 7.3.2)
//
// returns isomorphic node (might be the same node).
Node* peephole_node(Optimizer* opt, Node* n) {
Node* k;
bool progress = false;
// potential mutations from ideal() mean we need to rehash.
hashset_remove(opt->gvn_nodes, n);
import sun.misc.Unsafe;
import java.lang.reflect.Field;
import java.nio.file.*;
import java.io.*;
class mur {
public static Unsafe getUnsafe() {
try {
final Field fld = Unsafe.class.getDeclaredField("theUnsafe");
import java.nio.file.*;
import java.io.*;
import java.util.*;
import java.nio.charset.StandardCharsets;
class csv {
// i don't want all my fucking int array elements boxed.
static class IntArray {
int cnt;
// This is a PoC for incremental compilation for Zig, the idea is that we
// can break down all of type checking into a simple term rewriting system
// which has one step Church-Rosser.
//
// Church-Rosser (i'll shorten to CR):
// CR states that, given our reductions terminate, the order of reduction is irrelevant. More
// importantly, if we reduce to our normal form in one step per term, then our
// total reductions are linear in the size of the IR.
//
// Term?
// Mostly portable example of creating an object file to embed a binary.
// Requires stb_ds.h (https://github.com/nothings/stb/blob/master/stb_ds.h)
// Arguments:
//
// coff_embed [name] [input]
//
// name must be a valid C identifier if you wanna actually import it into C. The
// idea is that the 'input' file gets embedded into an object file
// under the symbol name 'name'. The content's length is a size_t stored under
// the same name with _length appended.

because of the new APX stuff i got to thinking about different changes to x86 over time and how they affect code, like general code not just the manual intrinsics case...

VEX was introduced with AVX and brought ternary (three-operand) forms, which shortened the code.

; pre-VEX
movaps xmm1, xmm2
subss xmm1, xmm3
; VEX
vsubss xmm1, xmm2, xmm3 ; fewer bytes, mostly

Let's compare Clang and Cuik/TB a bit to get a picture of what i need to do.

Given this C code:

#include <stddef.h>

void max_array(size_t n, float* x, float* y) {
    for (size_t i = 0; i < n; i++) {
        x[i] = x[i] > y[i] ? x[i] : y[i];
    }
}
// We do a little too much trolling...
//
// Ever wanted to pretend divisions by zero just didn't happen? Here you go...
// This is a meme, don't try to make it work with C because the optimizer will
// fight you on it. You can apply it to your own language if you really want to.
#include <stdint.h>
#include <stdio.h>
#define WIN32_LEAN_AND_MEAN
#include <windows.h>
// Compile with clang or MSVC (WINDOWS ONLY RN)
//
// Implementing a POC green threads system using safepoints to show how cheaply and simply it can
// be done. All you need to do is call SAFEPOINT_POLL in your own language at the top of every
// loop and function body (you can loosen up on this depending on how much pause latency you're
// willing to pay). Safepoint polling is cheap because it's a load without a use site:
// nothing waits on the result, so it doesn't introduce a stall and costs less than a cycle
// (it wastes resources, sure, but doesn't block up the rest of execution).
//
// # safepoint poll