@unitycoder
Last active March 29, 2021 07:26
GPU Compute Model Terminology / Quick Reference / CheatSheet

From Landon Thomas: https://landonthomas.net/docs/gpu_compute_model_terms_quick_ref.pdf


| Compute Abstraction Hierarchy | Microsoft HLSL | Khronos GLSL | Khronos OpenCL C | Nvidia CUDA C | AMD HIP | Apple MSL |
| --- | --- | --- | --- | --- | --- | --- |
| Entire Kernel/Shader Compute Space | Dispatch | Dispatch, Compute Space | NDRange, Index Space | Grid | Grid | Grid |
| Major Compute Group | Group, Thread Group | Workgroup, Local Workgroup | Work-group | Block, Thread Block | Block | Threadgroup |
| Minor Compute Group (& Device Minor SIMD Unit) | Wave | Subgroup | Sub-group | Warp | Wavefront, Wave, Warp | SIMD-group |
| Quad Compute Object | Quad Wave | Subgroup Quad | ? | Quad | Quad | Quad-group |
| Single Compute Object | Thread | Invocation | Work-item | Thread | Thread | Thread |

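As a rough illustration of how the levels in the table above nest, the following Python sketch splits a flat thread index into block, warp, lane and quad coordinates. It assumes a 1-D dispatch, a block size that is a multiple of the warp size, 32 lanes per warp (typical on Nvidia hardware; AMD wavefronts may be 32 or 64 wide) and 4 lanes per quad. `ThreadCoords` and `decompose` are hypothetical names used only for this example, not part of any API.

```python
from typing import NamedTuple


class ThreadCoords(NamedTuple):
    block: int  # major compute group (CUDA block / OpenCL work-group)
    warp: int   # minor compute group inside the block (warp / sub-group)
    lane: int   # position of the thread inside its warp
    quad: int   # group of 4 adjacent lanes inside the warp
    local: int  # thread index inside the block


def decompose(global_id: int, block_size: int = 256, warp_size: int = 32) -> ThreadCoords:
    """Split a flat 1-D global thread index into its hierarchy coordinates.

    Assumes block_size is a positive multiple of warp_size.
    """
    if global_id < 0:
        raise ValueError("global_id must be non-negative")
    if block_size <= 0 or block_size % warp_size != 0:
        raise ValueError("block_size must be a positive multiple of warp_size")
    block, local = divmod(global_id, block_size)
    warp, lane = divmod(local, warp_size)
    return ThreadCoords(block, warp, lane, lane // 4, local)


if __name__ == "__main__":
    print(decompose(1000))
```

With the default sizes, thread 1000 falls in block 3, warp 7 of that block, lane 8, quad 2.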

| Hardware Abstraction Hierarchy | Microsoft HLSL | Khronos GLSL | Khronos OpenCL C | Nvidia CUDA C | AMD HIP | Apple MSL |
| --- | --- | --- | --- | --- | --- | --- |
| GPU / Compute Device | Device | Physical Device | Compute Device | Device | Device | Device |
| Major SIMD / Multi Processor Unit | SIMD Processor | Compute Unit (CU) | Compute Unit (CU) | Streaming Multiprocessor (SM) | Compute Unit (CU) | Compute Unit (CU) |
| Minor SIMD Unit (& Compute Minor Group) | Wave | Subgroup | Sub-group | Warp | Wavefront, Wave, Warp | SIMD-group |
| Single SIMD Processor | Lane | ? | Processing Element (PE) | Streaming Processor (SP), Lane | Processing Element (PE), Lane | Lane |

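One way to use the hardware table above is as a lookup for translating a term between APIs. The sketch below copies the four hardware rows into a Python list; the short API keys (`"HLSL"`, `"CUDA"`, ...) and the `translate` helper are assumptions made for this example. A `?` cell in the table is returned as `None`.

```python
from typing import Optional

APIS = ("HLSL", "GLSL", "OpenCL", "CUDA", "HIP", "MSL")

# One tuple per hardware level, in the same column order as APIS.
HARDWARE_ROWS = [
    ("Device", "Physical Device", "Compute Device", "Device", "Device", "Device"),
    ("SIMD Processor", "Compute Unit (CU)", "Compute Unit (CU)",
     "Streaming Multiprocessor (SM)", "Compute Unit (CU)", "Compute Unit (CU)"),
    ("Wave", "Subgroup", "Sub-group", "Warp", "Wavefront, Wave, Warp", "SIMD-group"),
    ("Lane", "?", "Processing Element (PE)", "Streaming Processor (SP), Lane",
     "Processing Element (PE), Lane", "Lane"),
]


def translate(term: str, src: str, dst: str) -> Optional[str]:
    """Return the dst API's cell for the row where src uses `term`."""
    s, d = APIS.index(src), APIS.index(dst)
    for row in HARDWARE_ROWS:
        # A cell may list several synonyms separated by commas.
        names = [name.strip() for name in row[s].split(",")]
        if term in names:
            return None if row[d] == "?" else row[d]
    raise KeyError(f"{term!r} is not a {src} term in the table")


if __name__ == "__main__":
    print(translate("Warp", "CUDA", "OpenCL"))
    print(translate("Wavefront", "HIP", "MSL"))
```

The lookup matches on exact spelling, so `"Sub-group"` (OpenCL) and `"Subgroup"` (GLSL) are treated as distinct terms, as in the table.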

| Memory Abstraction Hierarchy | Microsoft HLSL | Khronos GLSL | Khronos OpenCL C | Nvidia CUDA C | AMD HIP | Apple MSL |
| --- | --- | --- | --- | --- | --- | --- |
| Contiguous Device Memory, ~L2+ cache | Device Memory | Buffer, Image Memory | Global Memory | Global Memory | Global Data Share (GDS), Global Segment | Device Memory / Address Space |
| Faster, Partitioned Memory, ~L1 cache | Thread Group Shared Memory (TGSM) | Shared Memory | Local Memory | Shared Memory | Local Data Share (LDS), Local Segment | Threadgroup Memory / Address Space |
| Fastest, Smallest Memory, ~L0 cache | Temporary Registers | Subgroup Memory | Private Memory | Local Memory | L0 vector cache, Private Segment | Thread Memory / Address Space |
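Some names in the memory table above are reused at different levels by different APIs, which is an easy source of confusion when porting code. The sketch below scans the three memory rows for any term that appears at more than one level. The level labels (`"device"`, `"group"`, `"thread"`) and the `ambiguous_terms` helper are assumptions introduced for this example.

```python
from collections import defaultdict

APIS = ("HLSL", "GLSL", "OpenCL", "CUDA", "HIP", "MSL")

# Memory rows from the table, keyed by an illustrative level label.
MEMORY_ROWS = {
    "device": ("Device Memory", "Buffer, Image Memory", "Global Memory", "Global Memory",
               "Global Data Share (GDS), Global Segment", "Device Memory / Address Space"),
    "group": ("Thread Group Shared Memory (TGSM)", "Shared Memory", "Local Memory",
              "Shared Memory", "Local Data Share (LDS), Local Segment",
              "Threadgroup Memory / Address Space"),
    "thread": ("Temporary Registers", "Subgroup Memory", "Private Memory", "Local Memory",
               "L0 vector cache, Private Segment", "Thread Memory / Address Space"),
}


def ambiguous_terms(rows):
    """Map each term used at more than one level to its (api, level) uses."""
    uses = defaultdict(list)
    for level, cells in rows.items():
        for api, cell in zip(APIS, cells):
            for name in cell.split(","):
                uses[name.strip()].append((api, level))
    return {
        term: where
        for term, where in uses.items()
        if len({level for _, level in where}) > 1
    }


if __name__ == "__main__":
    for term, where in ambiguous_terms(MEMORY_ROWS).items():
        print(term, where)
```

On this table the only such term is "Local Memory": OpenCL uses it for memory shared within a work-group, while CUDA uses it for per-thread memory.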