@tiye
Last active December 18, 2023 08:28

@compute.toys is a playground for WebGPU compute shaders. Everything here is written in WGSL, which is WebGPU's native shader language. For up-to-date information on WGSL, please see the WGSL draft specification. You can also take a tour of WGSL.

Inputs

@compute.toys supplies keyboard input, mouse input, selectable input textures, custom values controlled by sliders, and the current frame and elapsed time.

Mouse input can be accessed from the mouse struct:

mouse.pos: vec2i  
mouse.click: i32

Timing information is in the time struct:

time.frame: u32  
time.elapsed: f32

Custom uniforms are in the custom struct:

custom.my_custom_uniform_0: f32  
custom.my_custom_uniform_1: f32

Two selectable textures can be accessed from channel0 and channel1:

textureSampleLevel(channel0, bilinear, uv, lod)  
textureSampleLevel(channel1, bilinear, uv, lod)

Keyboard input can be accessed from the provided keyDown(keycode: u32) helper function:

keyDown(32) // returns true when the spacebar is pressed
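
As a rough sketch of how these inputs might be combined in a single entrypoint, the shader below samples channel0, modulates it with time.elapsed, and reacts to the mouse and the spacebar. The entrypoint name main_image, the 16x16 workgroup size, the 20-pixel radius, and the reading of mouse.click as "positive while a button is held" are assumptions made for illustration, not something stated above.

```wgsl
@compute @workgroup_size(16, 16)
fn main_image(@builtin(global_invocation_id) id: vec3u) {
    let size = textureDimensions(screen);
    // Skip invocations that fall outside the screen.
    if (id.x >= size.x || id.y >= size.y) { return; }

    // Normalised coordinates of the pixel centre.
    let uv = (vec2f(id.xy) + 0.5) / vec2f(size);

    // channel0 is a plain 2D texture, so no layer index is passed.
    var col = textureSampleLevel(channel0, bilinear, uv, 0.0).rgb;

    // Pulse the brightness over time.
    col *= 0.75 + 0.25 * sin(time.elapsed);

    // Assumption: mouse.click is positive while a button is held.
    let d = distance(vec2f(id.xy), vec2f(mouse.pos));
    if (mouse.click > 0 && d < 20.0) {
        col = vec3f(1.0, 0.0, 0.0);
    }

    // Invert the image while the spacebar (keycode 32) is down.
    if (keyDown(32u)) {
        col = vec3f(1.0) - col;
    }

    textureStore(screen, id.xy, vec4f(col, 1.0));
}
```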

Outputs

For compute shader input and output @compute.toys provides:
one input texture array pass_in,
one output storage texture array pass_out,
and one output screen storage texture screen.

The shader can write to pass_out, which will be copied into pass_in after the current entrypoint has returned. pass_in will always contain whatever has been written to pass_out during all of the previous entrypoints. The contents of pass_in will not change while an entrypoint is running. pass_in and pass_out are both texture arrays with 4 texture layers. For example, you can access the third layer of pass_in at LOD 0 and coordinate (1,1) by using the built-in helper function:

passLoad(2, vec2i(1,1), 0)
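
The copy from pass_out to pass_in can be used for frame-to-frame feedback. The following is a minimal sketch, assuming a single entrypoint (so that pass_in holds what the same entrypoint wrote on the previous frame); the entrypoint name, the choice of layer 0, and the 0.05 blend factor are illustrative.

```wgsl
@compute @workgroup_size(16, 16)
fn main_image(@builtin(global_invocation_id) id: vec3u) {
    let size = textureDimensions(screen);
    if (id.x >= size.x || id.y >= size.y) { return; }
    let coord = vec2i(id.xy);

    // What was stored in layer 0 before this entrypoint started.
    let previous = passLoad(0, coord, 0);

    // A new colour for this frame.
    let uv = (vec2f(id.xy) + 0.5) / vec2f(size);
    let current = vec4f(uv, 0.5 + 0.5 * sin(time.elapsed), 1.0);

    // Blend slowly towards the new colour, leaving a fading trail.
    let blended = mix(previous, current, 0.05);

    // Written to pass_out; readable from pass_in only after this entrypoint returns.
    passStore(0, coord, blended);
    textureStore(screen, id.xy, blended);
}
```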

Preprocessor

@compute.toys also provides an experimental WGSL preprocessor. It currently allows the use of a handful of basic directives:

  • #define NAME VALUE for simple macros (function-like parameter substitution is not yet supported)
  • #include "PATH" for accessing built-in libraries
  • #workgroup_count ENTRYPOINT X Y Z for specifying how many workgroups should be dispatched for an entrypoint
  • #dispatch_count ENTRYPOINT N for dispatching an entrypoint multiple times in a row
  • #storage NAME TYPE for declaring a storage buffer
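
As a hedged sketch of how #define and #workgroup_count might be used together, the shader below fills a small checkerboard in one entrypoint and tiles it across the screen in a second one. The names GRID and fill_grid, the placement of the directives at the top of the file, and the assumption that entrypoints run in source order are illustrative and not confirmed by the text above.

```wgsl
#define GRID 64
#workgroup_count fill_grid 4 4 1

// 4 x 4 workgroups of 16 x 16 threads cover a GRID x GRID region.
@compute @workgroup_size(16, 16)
fn fill_grid(@builtin(global_invocation_id) id: vec3u) {
    // Checkerboard with 8-pixel squares.
    let on = f32((id.x / 8u + id.y / 8u) % 2u);
    passStore(0, vec2i(id.xy), vec4f(on, on, on, 1.0));
}

@compute @workgroup_size(16, 16)
fn main_image(@builtin(global_invocation_id) id: vec3u) {
    let size = textureDimensions(screen);
    if (id.x >= size.x || id.y >= size.y) { return; }

    // Tile the GRID x GRID region written by fill_grid across the screen.
    let cell = vec2i(id.xy) % GRID;
    textureStore(screen, id.xy, passLoad(0, cell, 0));
}
```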

Storage

Read-write storage buffers can be declared using the #storage directive. For example, you can create a buffer of atomic counters:

#storage atomic_storage array<atomic<i32>>

You can use WGSL's built-in atomic functions (such as atomicAdd) on this buffer from any thread and in any order, which lets you safely split work across many threads and accumulate the result in one place. Note that writes to read-write storage buffers are immediately visible to subsequent reads (unlike pass_in and pass_out, where writes only become readable after the entrypoint returns).
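
A minimal sketch of accumulating into such a buffer, using the atomic_storage declaration from above: every thread whose channel0 pixel is brighter than 0.5 increments a shared counter. The entrypoint name and the threshold are illustrative, and the counter is never reset here, so it keeps growing for as long as the buffer contents persist.

```wgsl
#storage atomic_storage array<atomic<i32>>

@compute @workgroup_size(16, 16)
fn main_image(@builtin(global_invocation_id) id: vec3u) {
    let size = textureDimensions(screen);
    if (id.x >= size.x || id.y >= size.y) { return; }

    let uv = (vec2f(id.xy) + 0.5) / vec2f(size);
    let lum = textureSampleLevel(channel0, bilinear, uv, 0.0).r;

    // Many threads may hit this at once; atomicAdd keeps the count consistent.
    if (lum > 0.5) {
        atomicAdd(&atomic_storage[0], 1);
    }

    textureStore(screen, id.xy, vec4f(lum, lum, lum, 1.0));
}
```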

The final visual output of every shader is written to the screen storage texture, which displays the result in the canvas on this page.

Debugging assertions are supported with an assert helper function:

assert(0, isfinite(col.x))  
assert(1, isfinite(col.y))
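
As a small sketch of where such assertions might sit in a shader, the example below checks that a computed colour stays within the 0 to 1 range before it is written. The entrypoint name and the range checks are illustrative; the prelude only shows that each failed assertion increments a counter at the given index, and how those counters are reported is not described here.

```wgsl
@compute @workgroup_size(16, 16)
fn main_image(@builtin(global_invocation_id) id: vec3u) {
    let size = textureDimensions(screen);
    if (id.x >= size.x || id.y >= size.y) { return; }

    let uv = (vec2f(id.xy) + 0.5) / vec2f(size);
    let col = vec3f(uv, 0.5 + 0.5 * sin(time.elapsed));

    // Index 0 counts pixels with a negative component,
    // index 1 counts pixels with a component above 1.
    assert(0, all(col >= vec3f(0.0)));
    assert(1, all(col <= vec3f(1.0)));

    textureStore(screen, id.xy, vec4f(col, 1.0));
}
```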

Examples

Simple single pass shader

Preprocessor #include

Terminal overlay

Storage usage

Workgroup shared memory

Preprocessor #dispatch_count

Preprocessor #workgroup_count

Assert

Prelude

Every shader begins with a common prelude. The prelude contains the data inputs and outputs for this shader, as well as a few helper functions and type definitions to make working with @compute.toys a more streamlined and familiar process. Please refer to the prelude for a complete listing of the available data in your shader.

Here are the current contents of this shader's prelude:

alias int = i32;
alias uint = u32;
alias float = f32;
alias int2 = vec2<i32>;
alias int3 = vec3<i32>;
alias int4 = vec4<i32>;
alias uint2 = vec2<u32>;
alias uint3 = vec3<u32>;
alias uint4 = vec4<u32>;
alias float2 = vec2<f32>;
alias float3 = vec3<f32>;
alias float4 = vec4<f32>;
alias bool2 = vec2<bool>;
alias bool3 = vec3<bool>;
alias bool4 = vec4<bool>;
alias float2x2 = mat2x2<f32>;
alias float2x3 = mat2x3<f32>;
alias float2x4 = mat2x4<f32>;
alias float3x2 = mat3x2<f32>;
alias float3x3 = mat3x3<f32>;
alias float3x4 = mat3x4<f32>;
alias float4x2 = mat4x2<f32>;
alias float4x3 = mat4x3<f32>;
alias float4x4 = mat4x4<f32>;

struct Time { frame: uint, elapsed: float, delta: float }
struct Mouse { pos: uint2, click: int }
struct DispatchInfo { id: uint }
struct Custom {
    _dummy: float,
};
struct Data {
    _dummy: array<u32,1>,
};


@group(0) @binding(2) var<uniform> time: Time;
@group(0) @binding(3) var<uniform> mouse: Mouse;
@group(0) @binding(4) var<uniform> _keyboard: array<vec4<u32>,2>;
@group(0) @binding(5) var<uniform> custom: Custom;
@group(0) @binding(6) var<storage,read> data: Data;
@group(0) @binding(7) var<storage,read_write> _assert_counts: array<atomic<u32>>;
@group(0) @binding(8) var<uniform> dispatch: DispatchInfo;
@group(0) @binding(9) var screen: texture_storage_2d<rgba16float,write>;
@group(0) @binding(10) var pass_in: texture_2d_array<f32>;
@group(0) @binding(11) var pass_out: texture_storage_2d_array<rgba16float,write>;
@group(0) @binding(12) var channel0: texture_2d<f32>;
@group(0) @binding(13) var channel1: texture_2d<f32>;
@group(0) @binding(14) var nearest: sampler;
@group(0) @binding(15) var bilinear: sampler;
@group(0) @binding(16) var trilinear: sampler;
@group(0) @binding(17) var nearest_repeat: sampler;
@group(0) @binding(18) var bilinear_repeat: sampler;
@group(0) @binding(19) var trilinear_repeat: sampler;
fn keyDown(keycode: uint) -> bool {
    return ((_keyboard[keycode / 128u][(keycode % 128u) / 32u] >> (keycode % 32u)) & 1u) == 1u;
}

fn assert(index: int, success: bool) {
    if (!success) {
        atomicAdd(&_assert_counts[index], 1u);
    }
}

fn passStore(pass_index: int, coord: int2, value: float4) {
    textureStore(pass_out, coord, pass_index, value);
}

fn passLoad(pass_index: int, coord: int2, lod: int) -> float4 {
    return textureLoad(pass_in, coord, pass_index, lod);
}

fn passSampleLevelBilinearRepeat(pass_index: int, uv: float2, lod: float) -> float4 {
    return textureSampleLevel(pass_in, bilinear, fract(uv), pass_index, lod);
}

Note: Matrix types in WGSL are stored in column-major order, and the type name gives the column count first. A matrix of type mat2x3<f32> (aka mat2x3f or float2x3) is constructed from 2 column vectors of type vec3<f32> (aka vec3f or float3), so it has 2 columns and 3 rows. This is the reverse of HLSL and of the usual mathematical convention, where a 2x3 matrix has 2 rows and 3 columns.
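
The column-first convention can be illustrated with a short sketch; the function name matrix_demo is hypothetical and only serves to hold the expressions.

```wgsl
fn matrix_demo() -> vec3f {
    // Two columns, each a vec3f: 3 rows by 2 columns in mathematical notation.
    let m = mat2x3f(
        vec3f(1.0, 2.0, 3.0),  // column 0
        vec3f(4.0, 5.0, 6.0)   // column 1
    );

    // Indexing selects a column first, then a row within it.
    let first_column = m[0];   // vec3f(1.0, 2.0, 3.0)
    let top_right = m[1][0];   // 4.0: column 1, row 0

    // A 2-column matrix multiplies a vec2f and yields a vec3f:
    // 1.0 * column 0 + 1.0 * column 1 = vec3f(5.0, 7.0, 9.0).
    return m * vec2f(1.0, 1.0);
}
```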

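Here is the compute.toys WGSL overview, translated from the Chinese version:

@compute.toys is a playground for WebGPU compute shaders. Everything here is written in WGSL, the native shading language of WebGPU. For the latest information on WGSL, see the WGSL draft specification. You can also take the tour of WGSL.

Inputs

@compute.toys provides keyboard input, mouse input, selectable input textures, custom values controlled by sliders, and the current frame and elapsed time.

Mouse input is available from the mouse struct:

mouse.pos: vec2i  
mouse.click: i32

Timing information is in the time struct:

time.frame: u32  
time.elapsed: f32

Custom uniforms are in the custom struct:

custom.my_custom_uniform_0: f32  
custom.my_custom_uniform_1: f32

Two selectable textures are available as channel0 and channel1:

textureSampleLevel(channel0, bilinear, uv, lod)  
textureSampleLevel(channel1, bilinear, uv, lod)

Keyboard input is available through the provided keyDown(keycode: u32) helper function:

keyDown(32) // returns true while the spacebar is pressed

Outputs

For compute shader input and output, @compute.toys provides:
one input texture array pass_in,
one output storage texture array pass_out,
and one output screen storage texture screen.

The shader can write to pass_out, which is copied into pass_in after the current entrypoint returns. pass_in always contains whatever all previous entrypoints wrote to pass_out. The contents of pass_in do not change while an entrypoint is running. pass_in and pass_out are both texture arrays with 4 texture layers. For example, you can read the third layer of pass_in at LOD 0 and coordinate (1,1) with the built-in helper function:

passLoad(2, vec2i(1,1), 0)

Preprocessor

@compute.toys also provides an experimental WGSL preprocessor. It currently supports a handful of basic directives:

  • #define NAME VALUE for simple macros (function-like parameter substitution is not yet supported)
  • #include "PATH" for accessing built-in libraries
  • #workgroup_count ENTRYPOINT X Y Z for specifying how many workgroups are dispatched for an entrypoint
  • #dispatch_count ENTRYPOINT N for dispatching an entrypoint several times in a row
  • #storage NAME TYPE for declaring a storage buffer

Storage

Read-write storage buffers can be declared with the #storage directive. For example, you can create a buffer of atomic counters:

#storage atomic_storage array<atomic<i32>>

You can use WGSL's built-in functions to perform atomic operations on this buffer in any order, which lets you safely carry out work across many threads and accumulate the result in one place. Note that writes to any read-write storage buffer are immediately visible to subsequent reads (unlike the situation with pass_in and pass_out).

Examples

Simple single pass shader

Preprocessor #include

Terminal overlay

Storage usage

Workgroup shared memory

Preprocessor #dispatch_count

Preprocessor #workgroup_count

Assert

Prelude

Every shader begins with a common prelude. The prelude contains the data inputs and outputs for this shader, along with a few helper functions and type definitions that make working with @compute.toys smoother and more familiar. Please refer to the prelude for a complete list of the data available in your shader.

Here are the current contents of this shader's prelude: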

alias int = i32;
alias uint = u32;
alias float = f32;
alias int2 = vec2<i32>;
alias int3 = vec3<i32>;
alias int4 = vec4<i32>;
alias uint2 = vec2<u32>;
alias uint3 = vec3<u32>;
alias uint4 = vec4<u32>;
alias float2 = vec2<f32>;
alias float3 = vec3<f32>;
alias float4 = vec4<f32>;
alias bool2 = vec2<bool>;
alias bool3 = vec3<bool>;
alias bool4 = vec4<bool>;
alias float2x2 = mat2x2<f32>;
alias float2x3 = mat2x3<f32>;
alias float2x4 = mat2x4<f32>;
alias float3x2 = mat3x2<f32>;
alias float3x3 = mat3x3<f32>;
alias float3x4 = mat3x4<f32>;
alias float4x2 = mat4x2<f32>;
alias float4x3 = mat4x3<f32>;
alias float4x4 = mat4x4<f32>;

struct Time { frame: uint, elapsed: float, delta: float }
struct Mouse { pos: uint2, click: int }
struct DispatchInfo { id: uint }
struct Custom {
    _dummy: float,
};
struct Data {
    _dummy: array<u32,1>,
};


@group(0) @binding(2) var<uniform> time: Time;
@group(0) @binding(3) var<uniform> mouse: Mouse;
@group(0) @binding(4) var<uniform> _keyboard: array<vec4<u32>,2>;
@group(0) @binding(5) var<uniform> custom: Custom;
@group(0) @binding(6) var<storage,read> data: Data;
@group(0) @binding(7) var<storage,read_write> _assert_counts: array<atomic<u32>>;
@group(0) @binding(8) var<uniform> dispatch: DispatchInfo;
@group(0) @binding(9) var screen: texture_storage_2d<rgba16float,write>;
@group(0) @binding(10) var pass_in: texture_2d_array<f32>;
@group(0) @binding(11) var pass_out: texture_storage_2d_array<rgba16float,write>;
@group(0) @binding(12) var channel0: texture_2d<f32>;
@group(0) @binding(13) var channel1: texture_2d<f32>;
@group(0) @binding(14) var nearest: sampler;
@group(0) @binding(15) var bilinear: sampler;
@group(0) @binding(16) var trilinear: sampler;
@group(0) @binding(17) var nearest_repeat: sampler;
@group(0) @binding(18) var bilinear_repeat: sampler;
@group(0) @binding(19) var trilinear_repeat: sampler;
fn keyDown(keycode: uint) -> bool {
    return ((_keyboard[keycode / 128u][(keycode % 128u) / 32u] >> (keycode % 32u)) & 1u) == 1u;
}

fn assert(index: int, success: bool) {
    if (!success) {
        atomicAdd(&_assert_counts[index], 1u);
    }
}

fn passStore(pass_index: int, coord: int2, value: float4) {
    textureStore(pass_out, coord, pass_index, value);
}

fn passLoad(pass_index: int, coord: int2, lod: int) -> float4 {
    return textureLoad(pass_in, coord, pass_index, lod);
}

fn passSampleLevelBilinearRepeat(pass_index: int, uv: float2, lod: float) -> float4 {
    return textureSampleLevel(pass_in, bilinear, fract(uv), pass_index, lod);
}

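Note: Matrix types in WGSL are stored in column-major order. This means a matrix of type mat2x3<f32> (also known as mat2x3f or float2x3) is constructed from 2 column vectors of type vec3<f32> (also known as vec3f or float3), so it has 2 columns and 3 rows. This is the reverse of the convention in HLSL and in mathematics, where a 2x3 matrix has 2 rows and 3 columns.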
