Skip to content

Instantly share code, notes, and snippets.

View mmozeiko's full-sized avatar

Mārtiņš Možeiko mmozeiko

View GitHub Profile
@timothyham
timothyham / ipv6guide.md
Last active July 19, 2024 01:14
A Short IPv6 Guide for Home IPv4 Admins

A Short IPv6 Guide for Home IPv4 Admins

This guide is for homelab admins who understand IPv4s well but find setting up IPv6 hard or annoying because things work differently. In some ways, managing an IPv6 network can be simpler than IPv4, one just needs to learn some new concepts and discard some old ones.

Let’s begin.

First of all, there are some concepts that one must unlearn from ipv4:

Concept 1

// ==UserScript==
// @name Intel Intrinsics Guide - Instruction Width Filter
// @version 1.0
// @description Add filters to the Intel Intrinsics Guide
// @author cloud11665
// @homepage https://twitter.com/cloud11665
// @match https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html
// @grant GM_addStyle
// ==/UserScript==
@zingaburga
zingaburga / sve2.md
Last active June 4, 2024 08:54
ARM’s Scalable Vector Extensions: A Critical Look at SVE2 For Integer Workloads

ARM’s Scalable Vector Extensions: A Critical Look at SVE2 For Integer Workloads

Scalable Vector Extensions (SVE) is ARM’s latest SIMD extension to their instruction set, which was announced back in 2016. A follow-up SVE2 extension was announced in 2019, designed to incorporate all functionality from ARM’s current primary SIMD extension, NEON (aka ASIMD).

Despite being announced 5 years ago, there is currently no generally available CPU which supports any form of SVE (which excludes the [Fugaku supercomputer](https://www.fujitsu.com/global/about/innovation/

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#if defined(__x86_64__)
#define BREAK asm("int3")
#else
#error Implement macros for your CPU.
#endif
@christianparpart
christianparpart / terminal-synchronized-output.md
Last active May 30, 2024 21:30
Terminal Spec: Synchronized Output

Synchronized Output

Synchronized output is merely implementing the feature as inspired by iTerm2 synchronized output, except that it's not using the rare DCS but rather the well known SM ? and RM ?. iTerm2 has now also adopted to use the new syntax instead of using DCS.

Semantics

When rendering the screen of the terminal, the Emulator usually iterates through each visible grid cell and renders its current state. With applications updating the screen a at higher frequency this can cause tearing.

This mode attempts to mitigate that.

@dondragmer
dondragmer / PrefixSort.compute
Created January 20, 2021 23:32
An optimized GPU counting sort
#pragma use_dxc //enable SM 6.0 features, in Unity this is only supported on version 2020.2.0a8 or later with D3D12 enabled
#pragma kernel CountTotalsInBlock
#pragma kernel BlockCountPostfixSum
#pragma kernel CalculateOffsetsForEachKey
#pragma kernel FinalSort
uint _FirstBitToSort;
int _NumElements;
int _NumBlocks;
bool _ShouldSortPayload;
@animetosho
animetosho / gf2p8affineqb-articles.md
Last active February 2, 2024 11:53
A list of articles documenting uses of the GF2P8AFFINE instruction

Unexpected Uses for the Galois Field Affine Transformation Instruction

Intel added the Galois Field instruction set (GFNI) extensions to their Sunny Cove and Tremont cores. What’s particularly interesting is that GFNI is the only new SIMD extension that came with SSE and VEX/AVX encodings (in addition to EVEX/AVX512), to allow it to be supported on all future Intel cores, including those which don’t support AVX512 (such as the Atom line, as well as Celeron/Pentium branded “big” cores).

I suspect GFNI was aimed at accelerating SM4 encryption, however, one of the instructions can be used for many other purposes. The extension includes three instructions, but of particular interest here is the Affine Transformation (GF2P8AFFINEQB), aka bit-matrix multiply, instruction.

There have been various articles which discuss out-of-band

Perfect Quantization of DXT endpoints
-------------------------------------
One of the issues that affect the quality of most DXT compressors is the way floating point colors are rounded.
For example, stb_dxt does:
max16 = (unsigned short)(stb__sclamp((At1_r*yy - At2_r*xy)*frb+0.5f,0,31) << 11);
max16 |= (unsigned short)(stb__sclamp((At1_g*yy - At2_g*xy)*fg +0.5f,0,63) << 5);
max16 |= (unsigned short)(stb__sclamp((At1_b*yy - At2_b*xy)*frb+0.5f,0,31) << 0);
@JarkkoPFC
JarkkoPFC / sphere_screen_extents.h
Last active June 13, 2023 21:27
Calculates view space 3D sphere extents on the screen
struct vec3f {float x, y, z;};
struct vec4f {float x, y, z, w;};
struct mat44f {vec4f x, y, z, w;};
//============================================================================
// sphere_screen_extents
//============================================================================
// Calculates the exact screen extents xyzw=[left, bottom, right, top] in
// normalized screen coordinates [-1, 1] for a sphere in view space. For
// performance, the projection matrix (v2p) is assumed to be setup so that
@vurtun
vurtun / x11_clipboard.c
Last active June 13, 2023 21:28
X11 clipboard
static char*
sys_clip_get(struct sys *s, Atom selection, Atom target)
{
assert(s);
struct sys_x11 *x11 = s->platform;
/* blocking wait for clipboard data */
XEvent notify;
XConvertSelection(x11->dpy, selection, target, selection, x11->helper, CurrentTime);
while (!XCheckTypedWindowEvent(x11->dpy, x11->helper, SelectionNotify, &notify)) {