@dondragmer
dondragmer / CuteSort.hlsl
Created December 5, 2020 00:11
A very fast GPU sort for sorting values within a wavefront
Buffer<uint> Input;
RWBuffer<uint> Output;
//returns the index that this value should be moved to to sort the array
uint CuteSort(uint value, uint laneIndex)
{
uint smallerValuesMask = 0;
uint equalValuesMask = ~0;
//don't need to test every bit if your value is constrained to a smaller range
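The preview above is cut off before the loop body, so the following is only a CPU-side sketch of the general idea the two masks suggest: rank each value by comparing it bit by bit against a ballot of all lanes. It is written in Python rather than HLSL so it can be run anywhere. The function name `cute_sort_ranks`, the `bits` parameter, and the emulated ballot are assumptions for illustration, not the gist's actual code.

```python
def cute_sort_ranks(values, bits=32):
    """Return, for each lane, the index its value would occupy in sorted order.

    Hypothetical CPU emulation: `values` plays the role of one wavefront,
    and the inner loop over `values` stands in for a wave ballot.
    """
    all_lanes = (1 << len(values)) - 1
    ranks = []
    for lane, value in enumerate(values):
        smaller = 0          # lanes known to hold a strictly smaller value
        equal = all_lanes    # lanes still equal to us in all bits seen so far
        for bit in reversed(range(bits)):  # most significant bit first
            # Emulated ballot: bit i is set if lane i has this bit set.
            ballot = 0
            for i, v in enumerate(values):
                if (v >> bit) & 1:
                    ballot |= 1 << i
            if (value >> bit) & 1:
                # Lanes equal so far that have a 0 here are strictly smaller.
                smaller |= equal & ~ballot
                equal &= ballot
            else:
                # Lanes with a 1 here are larger; drop them from the equal set.
                equal &= ~ballot
        # Ties are broken by lane index so every lane gets a unique slot.
        lower_lanes = (1 << lane) - 1
        ranks.append(bin(smaller).count("1") + bin(equal & lower_lanes).count("1"))
    return ranks
```

Lowering `bits` mirrors the comment about not testing every bit when the values are known to fit in a smaller range.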
@pixelsnafu
pixelsnafu / CloudsResources.md
Last active May 2, 2024 13:46
Useful Resources for Rendering Volumetric Clouds

Volumetric Clouds Resources List

  1. A. Schneider, "Real-Time Volumetric Cloudscapes," in GPU Pro 7: Advanced Rendering Techniques, 2016, pp. 97-127. (Follow up presentations here, and here.)

  2. S. Hillaire, "Physically Based Sky, Atmosphere and Cloud Rendering in Frostbite" in Physically Based Shading in Theory and Practice course, SIGGRAPH 2016. [video] [course notes] [scatter integral shadertoy]

  3. [R. Högfeldt, "Convincing Cloud Rendering – An Implementation of Real-Time Dynamic Volumetric Clouds in Frostbite"](https://odr.chalmers.se/hand

/*
Loading C:\Users\connor\Downloads\models\lucy.obj.
Duration: 1948.9ms
Verts: 14027872
Faces: 28055728
Verts/s: 7197976.0
Faces/s: 14395943.9
MB/s: 605.4
*/
@mattatz
mattatz / Quaternion.hlsl
Last active May 6, 2024 17:10
Quaternion structure for HLSL
#ifndef __QUATERNION_INCLUDED__
#define __QUATERNION_INCLUDED__
#define QUATERNION_IDENTITY float4(0, 0, 0, 1)
#ifndef PI
#define PI 3.14159265359f
#endif
// Quaternion multiplication
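The preview stops at the multiplication comment. As a rough illustration only, here is the standard Hamilton product in Python, assuming the `(x, y, z, w)` component order implied by `QUATERNION_IDENTITY float4(0, 0, 0, 1)`. The name `qmul` and the tuple representation are assumptions for this sketch, not taken from the gist.

```python
QUATERNION_IDENTITY = (0.0, 0.0, 0.0, 1.0)  # (x, y, z, w), w is the scalar part


def qmul(q1, q2):
    """Hamilton product q1 * q2 for quaternions stored as (x, y, z, w)."""
    x1, y1, z1, w1 = q1
    x2, y2, z2, w2 = q2
    return (
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
    )
```

Multiplying by the identity returns the other operand unchanged, and the product is not commutative: `qmul(i, j)` gives `k` while `qmul(j, i)` gives `-k`.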
@graphitemaster
graphitemaster / openglbb.md
Created July 6, 2017 19:13
OpenGL Black Bible

OpenGL Black Bible

Author: Dale Weiler

Preface

The following is a writeup of things I've independently discovered or been told over the years on how to use OpenGL effectively. I don't guarantee that everything in here is factually correct, though several others share ideas similar to those presented here. Take these at your own peril; like everything else, nothing is absolute, so profile to be sure and check the standards if something here seems incorrect. These things presented here I've come to accept as being a safe

API Design: Coroutines APIs (January-2017)

I am currently dealing with a lot of libraries at work, both third-party ones and libraries I have written or am in the process of writing. I absolutely love writing and working with libraries, especially if they bring me to a new or different approach to solving a problem, or at least provide a different view.

Over time, however, I noticed that quite regularly we had to decide that we could not use a third-party library. Often it is for the usual reason.

@understeer
understeer / latency.txt
Created January 12, 2017 14:04 — forked from eshelman/latency.txt
HPC-oriented Latency Numbers Every Programmer Should Know
Latency Comparison Numbers
--------------------------
L1 cache reference/hit 1.5 ns 4 cycles
Floating-point add/mult/FMA operation 1.5 ns 4 cycles
L2 cache reference/hit 5 ns 12 ~ 17 cycles
Branch mispredict 6 ns 15 ~ 20 cycles
L3 cache hit (unshared cache line) 16 ns 42 cycles
L3 cache hit (shared line in another core) 25 ns 65 cycles
Mutex lock/unlock 25 ns
L3 cache hit (modified in another core) 29 ns 75 cycles
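As a quick sanity check on the rows above that list both nanoseconds and a single cycle count, dividing cycles by nanoseconds gives the clock rate the table implicitly assumes. The snippet below is only a sketch of that arithmetic; the `rows` list and the `implied_ghz` helper are names introduced here, and the numbers are copied from the table.

```python
# (label, latency in ns, latency in cycles), copied from the table rows above
rows = [
    ("L1 cache reference/hit", 1.5, 4),
    ("L3 cache hit (unshared cache line)", 16, 42),
    ("L3 cache hit (shared line in another core)", 25, 65),
    ("L3 cache hit (modified in another core)", 29, 75),
]


def implied_ghz(ns, cycles):
    """Cycles per nanosecond is numerically the clock frequency in GHz."""
    return cycles / ns


for label, ns, cycles in rows:
    print(f"{label}: {implied_ghz(ns, cycles):.2f} GHz")
```

All four rows land in a narrow band around 2.6 GHz, so the two columns are consistent with a single fixed clock.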
@reinsteam
reinsteam / atmosphere_clouds_rendering.md
Last active March 1, 2024 14:43
A collection of links to various materials on atmosphere / clouds rendering

Atmosphere / Clouds Rendering

Research papers

Atmosphere

  • A fast, simple method to render sky color using gradients maps [[Abad06]]
  • A Framework for the Experimental Comparison of Solar and Skydome Illumination [[Kider14]]
  • A Method for Modeling Clouds based on Atmospheric Fluid Dynamics [[Miyazaki01]]
  • A Physically-Based Night Sky Model [[Jensen01]]
@TheRealMJP
TheRealMJP / Tex2DCatmullRom.hlsl
Last active May 7, 2024 07:11
An HLSL function for sampling a 2D texture with Catmull-Rom filtering, using 9 texture samples instead of 16
// The following code is licensed under the MIT license: https://gist.github.com/TheRealMJP/bc503b0b87b643d3505d41eab8b332ae
// Samples a texture with Catmull-Rom filtering, using 9 texture fetches instead of 16.
// See http://vec3.ca/bicubic-filtering-in-fewer-taps/ for more details
float4 SampleTextureCatmullRom(in Texture2D<float4> tex, in SamplerState linearSampler, in float2 uv, in float2 texSize)
{
// We're going to sample a 4x4 grid of texels surrounding the target UV coordinate. We'll do this by rounding
// down the sample location to get the exact center of our "starting" texel. The starting texel will be at
// location [1, 1] in the grid, where [0, 0] is the top left corner.
float2 samplePos = uv * texSize;
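The preview ends right after `samplePos` is computed. The sketch below illustrates, in one dimension and in Python, how a Catmull-Rom filter can fold its two middle taps into a single linear fetch, which is what reduces a 4x4 grid to 3x3 (9 samples instead of 16) in 2D. The helper names and the list-based "texture" are assumptions for illustration, not the gist's code.

```python
import math


def starting_texel(sample_pos):
    """Centre of the texel at grid slot 1, plus the fractional offset from it."""
    tex_pos1 = math.floor(sample_pos - 0.5) + 0.5
    return tex_pos1, sample_pos - tex_pos1


def catmull_rom_weights(f):
    """Catmull-Rom weights for the four taps around fractional offset f in [0, 1]."""
    w0 = f * (-0.5 + f * (1.0 - 0.5 * f))
    w1 = 1.0 + f * f * (-2.5 + 1.5 * f)
    w2 = f * (0.5 + f * (2.0 - 1.5 * f))
    w3 = f * f * (-0.5 + 0.5 * f)
    return w0, w1, w2, w3


def lerp_fetch(texels, pos):
    """Emulate a hardware linear fetch; texel i has its centre at i + 0.5."""
    i = math.floor(pos - 0.5)
    t = pos - 0.5 - i
    return texels[i] * (1.0 - t) + texels[i + 1] * t


def sample_4_taps(texels, sample_pos):
    """Reference: four point fetches weighted individually."""
    tex_pos1, f = starting_texel(sample_pos)
    i1 = int(tex_pos1 - 0.5)
    weights = catmull_rom_weights(f)
    return sum(w * texels[i1 - 1 + k] for k, w in enumerate(weights))


def sample_3_taps(texels, sample_pos):
    """Same result with the two middle taps merged into one linear fetch."""
    tex_pos1, f = starting_texel(sample_pos)
    i1 = int(tex_pos1 - 0.5)
    w0, w1, w2, w3 = catmull_rom_weights(f)
    w12 = w1 + w2          # combined weight of the two middle taps
    offset12 = w2 / w12    # where between them the linear fetch should land
    return (w0 * texels[i1 - 1]
            + w12 * lerp_fetch(texels, tex_pos1 + offset12)
            + w3 * texels[i1 + 2])
```

The merge works because both middle weights are non-negative for `f` in `[0, 1]`, so `offset12` stays between the two texel centres and the linear filter reproduces `w1 * t1 + w2 * t2` once scaled by `w12`.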