Skip to content

Instantly share code, notes, and snippets.

why doesn't radfft support AVX on PC?

So there's two separate issues here: using instructions added in AVX and using 256-bit wide vectors. The former turns out to be much easier than the latter for our use case.

Problem number 1 was that you positively need to put AVX code in a separate file with different compiler settings (/arch:AVX for VC++, -mavx for GCC/Clang) that make all SSE code emitted also use VEX encoding, and at the time radfft was written there was no way in CDep to set compiler flags for just one file, just for the overall build.

[There's the GCC "target" annotations on individual funcs, which in principle fix this, but I ran into nasty problems with this for several compiler versions, and VC++ has no equivalent, so we're not currently using that and just sticking with different compilation units.]

The other issue is to do with CPU power management.

@dwilliamson
dwilliamson / Reflection.h
Created February 3, 2018 23:20
Basic compile-time reflection for C++11
#pragma once
#include <typestring/typestring.hh>
namespace rfl
{
// Compile-time data member description
@twoscomplement
twoscomplement / TransientFunction.h
Last active May 4, 2025 04:21
TransientFunction: A light-weight alternative to std::function [C++11]
// TransientFuction: A light-weight alternative to std::function [C++11]
// Pass any callback - including capturing lambdas - cheaply and quickly as a
// function argument
//
// Based on:
// https://deplinenoise.wordpress.com/2014/02/23/using-c11-capturing-lambdas-w-vanilla-c-api-functions/
//
// - No instantiation of called function at each call site
// - Simple to use - use TransientFunction<> as the function argument
// - Low cost: cheap setup, one indirect function call to invoke
@TheRealMJP
TheRealMJP / Tex2DCatmullRom.hlsl
Last active August 23, 2025 09:05
An HLSL function for sampling a 2D texture with Catmull-Rom filtering, using 9 texture samples instead of 16
// The following code is licensed under the MIT license: https://gist.github.com/TheRealMJP/bc503b0b87b643d3505d41eab8b332ae
// Samples a texture with Catmull-Rom filtering, using 9 texture fetches instead of 16.
// See http://vec3.ca/bicubic-filtering-in-fewer-taps/ for more details
float4 SampleTextureCatmullRom(in Texture2D<float4> tex, in SamplerState linearSampler, in float2 uv, in float2 texSize)
{
// We're going to sample a a 4x4 grid of texels surrounding the target UV coordinate. We'll do this by rounding
// down the sample location to get the exact center of our "starting" texel. The starting texel will be at
// location [1, 1] in the grid, where [0, 0] is the top left corner.
float2 samplePos = uv * texSize;
@bkaradzic
bkaradzic / orthodoxc++.md
Last active November 3, 2025 04:11
Orthodox C++

Orthodox C++

This article has been updated and is available here.

@acapola
acapola / aes-ni.c
Created August 31, 2015 14:42
AES128 how-to using GCC and Intel AES-NI
#include <stdint.h> //for int8_t
#include <string.h> //for memcmp
#include <wmmintrin.h> //for intrinsics for AES-NI
//compile using gcc and following arguments: -g;-O0;-Wall;-msse2;-msse;-march=native;-maes
//internal stuff
//macros
#define DO_ENC_BLOCK(m,k) \
do{\
@rtt
rtt / tinder-api-documentation.md
Last active October 6, 2025 20:20
Tinder API Documentation

Tinder API documentation

Note: this was written in April/May 2014 and the API may has definitely changed since. I have nothing to do with Tinder, nor its API, and I do not offer any support for anything you may build on top of this. Proceed with caution

http://rsty.org/

I've sniffed most of the Tinder API to see how it works. You can use this to create bots (etc) very trivially. Some example python bot code is here -> https://gist.github.com/rtt/5a2e0cfa638c938cca59 (horribly quick and dirty, you've been warned!)