The different layout rules all solve the same problem: mapping data structures that are more complex than scalars (e.g. u32, f32) to memory (a byte array), each with its own space/time tradeoffs.
Accessing data in memory requires knowing its byte offset (relative to the start of the memory).
The most important properties of a data structure are alignment and size.
The alignment is a number that must evenly divide any byte offset at which the given data structure resides (i.e. offset % alignment == 0).
Alignment is always a power of 2 and, for performance reasons, is often greater than 1 (an alignment of 1 is usually referred to as unaligned access) because of how CPUs and GPUs perform data accesses at the hardware level.
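The alignment rule above can be checked with a one-line predicate. This is only a minimal sketch; the name `is_aligned` is made up for illustration and is not part of any API.

```python
def is_aligned(offset: int, alignment: int) -> bool:
    """Return True if `offset` is a valid placement for `alignment`."""
    # Alignments are positive powers of two (1, 2, 4, 8, 16, ...).
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return offset % alignment == 0


print(is_aligned(32, 16))  # True
print(is_aligned(24, 16))  # False
print(is_aligned(7, 1))    # True: an alignment of 1 accepts any offset
```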
The `SS` constant denotes the inherent size of the (inner) scalar.
The `roundUp` function (returns n rounded up to a multiple of k) is defined for positive integers k and n as:
- roundUp(k, n) = ⌈n ÷ k⌉ × k
The `po2` function (returns n rounded up to a power of 2) is defined for positive integer n as:
- po2(n) = 2^⌈log2(n)⌉
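Both helper functions can be written with integer arithmetic only, which avoids floating point rounding in the ceiling and logarithm. The following is a sketch; the Python names `round_up` and `po2` simply mirror the definitions above.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k (k and n positive)."""
    # -(-n // k) is ceiling division using only integers.
    return -(-n // k) * k


def po2(n: int) -> int:
    """Round n up to the next power of two (n positive)."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1.
    return 1 << (n - 1).bit_length()


print(round_up(16, 12))  # 16
print(round_up(4, 12))   # 12
print(po2(12))           # 16
print(po2(16))           # 16
```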
ty | scalar align | scalar size | std430 align | std430 size | std140 align | std140 size |
---|---|---|---|---|---|---|
scalar S | SS | SS | SS | SS | SS | SS |
vecN<S> | SS | SS * N | po2(SS * N) | SS * N | po2(SS * N) | SS * N |
matCxR<S> | SS | SS * C * R | po2(SS * R) | alignOf(self) * C | roundUp(16, SS * R) | alignOf(self) * C |
array<E, N> | alignOf(E) | sizeOf(E) * N | alignOf(E) | roundUp(alignOf(E), sizeOf(E)) * N | roundUp(16, alignOf(E)) | roundUp(alignOf(self), sizeOf(E)) * N |
struct with members M1...MN | max(alignOf(M1)...alignOf(MN)) | roundUp(alignOf(self), offsetOf(MN) + sizeOf(MN)) | max(alignOf(M1)...alignOf(MN)) | roundUp(alignOf(self), offsetOf(MN) + sizeOf(MN)) | max(16, alignOf(M1)...alignOf(MN)) | roundUp(alignOf(self), offsetOf(MN) + sizeOf(MN)) |
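To make the vector and matrix rows of the table concrete, here is a small sketch that evaluates them for a given scalar size. The function names and the string layout identifiers (`"scalar"`, `"std430"`, `"std140"`) are assumptions made for this example only.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k."""
    return -(-n // k) * k


def po2(n: int) -> int:
    """Round n up to the next power of two."""
    return 1 << (n - 1).bit_length()


def vec_layout(ss: int, n: int, layout: str):
    """Return (alignment, size) of vecN<S> where ss is the scalar size."""
    if layout not in ("scalar", "std430", "std140"):
        raise ValueError(f"unknown layout: {layout}")
    size = ss * n
    align = ss if layout == "scalar" else po2(size)
    return align, size


def mat_layout(ss: int, c: int, r: int, layout: str):
    """Return (alignment, size) of matCxR<S>: C columns of R scalars each."""
    if layout == "scalar":
        return ss, ss * c * r
    if layout == "std430":
        align = po2(ss * r)
    elif layout == "std140":
        align = round_up(16, ss * r)
    else:
        raise ValueError(f"unknown layout: {layout}")
    # Every column occupies one alignment-sized slot.
    return align, align * c


print(vec_layout(4, 3, "std430"))     # (16, 12)  vec3<f32>
print(mat_layout(4, 3, 3, "scalar"))  # (4, 36)   mat3x3<f32>
print(mat_layout(4, 2, 2, "std430"))  # (8, 16)   mat2x2<f32>
print(mat_layout(4, 2, 2, "std140"))  # (16, 32)  mat2x2<f32>
```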
only relevant for laying out vectors inside structs
Same std140/std430 layout rules as above, with the only change being that vectors now have scalar alignment (i.e. vecN alignment = SS) as long as the rules below are met.
Pseudocode
// start offset (must be a multiple of the scalar size)
F = SS * k
if sizeOf(vecN) < 16 {
    // first and last byte need to lie in the same 16 byte block
    L = F + sizeOf(vecN) - 1
    assert(floor(F / 16) == floor(L / 16))
} else {
    // start offset needs to be aligned to 16 bytes
    assert(F % 16 == 0)
}
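The vector-relaxed placement rule can be sketched as a predicate over a candidate offset. This sketch assumes that L is the offset of the vector's last byte (F + size - 1), which is how the Vulkan specification words its straddling rule; `relaxed_vector_offset_ok` is a hypothetical name.

```python
def relaxed_vector_offset_ok(offset: int, ss: int, n: int) -> bool:
    """Check whether vecN<S> (scalar size ss) may start at `offset`."""
    size = ss * n
    # The start offset must still be scalar aligned.
    if offset % ss != 0:
        return False
    if size < 16:
        # First and last byte must share one 16 byte block.
        last = offset + size - 1
        return offset // 16 == last // 16
    # Vectors of 16 bytes or more must start on a 16 byte boundary.
    return offset % 16 == 0


print(relaxed_vector_offset_ok(4, 4, 3))   # True:  vec3<f32> covers bytes 4..15
print(relaxed_vector_offset_ok(8, 4, 3))   # False: vec3<f32> covers bytes 8..19
print(relaxed_vector_offset_ok(8, 4, 2))   # True:  vec2<f32> covers bytes 8..15
print(relaxed_vector_offset_ok(4, 4, 4))   # False: vec4<f32> needs offset % 16 == 0
```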
Elements of arrays are laid out according to the following algorithm
Pseudocode
// Note: Array alignment differs between layouts but is always a multiple of the element alignment
// Stride is the aligned size of an element
stride = roundUp(alignOf(array), sizeOf(E))
for i in array.length() {
// Offset at which the element resides
array[i].offset = stride * i
}
// This is the return value of sizeOf(array)
array.size = stride * array.length()
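The array algorithm translates almost directly into runnable code. In this sketch the element is described only by its alignment and size, and the layout is selected by a string; both choices are assumptions made for illustration.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k."""
    return -(-n // k) * k


def array_layout(elem_align: int, elem_size: int, length: int, layout: str):
    """Return alignment, stride, element offsets and size of array<E, N>."""
    if layout == "std140":
        # std140 rounds the array alignment up to a multiple of 16.
        align = round_up(16, elem_align)
    elif layout in ("scalar", "std430"):
        align = elem_align
    else:
        raise ValueError(f"unknown layout: {layout}")
    # Stride is the element size rounded up to the array alignment.
    stride = round_up(align, elem_size)
    offsets = [stride * i for i in range(length)]
    return {"align": align, "stride": stride,
            "offsets": offsets, "size": stride * length}


# array<f32, 4>: tightly packed in std430, one element per 16 bytes in std140
print(array_layout(4, 4, 4, "std430")["size"])     # 16
print(array_layout(4, 4, 4, "std140")["offsets"])  # [0, 16, 32, 48]
# array<vec3<f32>, 2>: std430 pads each element to 16 bytes, scalar does not
print(array_layout(16, 12, 2, "std430")["size"])   # 32
print(array_layout(4, 12, 2, "scalar")["size"])    # 24
```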
Members of structs are laid out according to the following algorithm
Pseudocode
// This is the return value of alignOf(struct)
// (std140 additionally raises this to at least 16, see the table above)
struct.alignment = max(struct.members.map(alignOf))
// Byte offset from the start of the struct
current_offset = 0
for member in struct.members {
// Align offset for member
current_offset = roundUp(alignOf(member), current_offset)
// Offset at which the member resides
// This is the return value of offsetOf(member)
struct[member].offset = current_offset
current_offset += sizeOf(member)
}
// This is the return value of sizeOf(struct)
struct.size = roundUp(alignOf(struct), current_offset)
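The struct algorithm can be sketched the same way. Members are passed as hypothetical `(name, alignment, size)` tuples, and the `min_align` parameter is an assumption of this example used to model the std140 rule that struct alignment is at least 16.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k."""
    return -(-n // k) * k


def struct_layout(members, min_align: int = 1):
    """Return (offsets, alignment, size) for a list of (name, align, size)."""
    align = max([min_align] + [m_align for _, m_align, _ in members])
    offsets = {}
    current = 0
    for name, m_align, m_size in members:
        # Align the running offset for this member, then place it.
        current = round_up(m_align, current)
        offsets[name] = current
        current += m_size
    # The struct size is padded to a multiple of its own alignment.
    return offsets, align, round_up(align, current)


# std430: struct { a: f32, b: vec3<f32>, c: f32 }
members = [("a", 4, 4), ("b", 16, 12), ("c", 4, 4)]
print(struct_layout(members))
# ({'a': 0, 'b': 16, 'c': 28}, 16, 32)

# std140: struct { x: f32, y: f32 } is padded to 16 bytes
print(struct_layout([("x", 4, 4), ("y", 4, 4)], min_align=16))
# ({'x': 0, 'y': 4}, 16, 16)
```

Note how `c` lands at offset 28: it fits into the 4 bytes left over after the 12 byte `vec3`, because the vector's size is not rounded up to its alignment.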
The default layout is std430. The extra requirements for the uniform address space have to be explicitly met.
- std430
- std140; with the caveat that matrices of the form `matCx2` have an alignment of 8 instead of 16 and therefore also a size of C * 8 instead of C * 16
- matrices are column-major
- `align` and `size` attributes can be used to change the alignment and size of struct members
- std430
- std140
SSBOs require OpenGL 4.3 / OpenGL 4.0 + ARB_shader_storage_buffer_object
- std140
- matrices are column-major (can be overridden to be row-major in buffers via the `row_major` layout qualifier; added in GLSL 1.4)
- `offset` and `align` layout qualifiers can be used to change the offset and alignment of struct members (added in GLSL 4.4 / GLSL 1.4 + `ARB_enhanced_layouts`)
4.1. StorageBuffer Storage Class / PushConstant Storage Class / Uniform Storage Class with BufferBlock Decoration
- std140
- std430; default
- scalar; via `scalarBlockLayout` in Vulkan v1.2 or `VK_EXT_scalar_block_layout`
- vector-relaxed std140 / std430; since Vulkan v1.1 or via `VK_KHR_relaxed_block_layout`
- std140; default
- std430; via `uniformBufferStandardLayout` in Vulkan v1.2 or `VK_KHR_uniform_buffer_standard_layout`
- scalar; via `scalarBlockLayout` in Vulkan v1.2 or `VK_EXT_scalar_block_layout`
- vector-relaxed std140 / std430; since Vulkan v1.1 or via `VK_KHR_relaxed_block_layout`
- `Offset` decoration is required on struct members
- `ArrayStride` decoration is required on array types
- `MatrixStride` and either `ColMajor` or `RowMajor` decorations are required for matrices
- Even if scalar alignment is supported, it is generally more performant to use the base alignment.
Vulkan Shader Memory Layout Guide
SPIR-V Specification (Decorations)
SPIR-V Specification (Shader Validation)
- scalar
- vector-relaxed std140; with the caveat that struct members of type matrix, array or struct don't round up their size to a multiple of their alignment
- scalar; via the `-no-legacy-cbuf-layout` DXC flag
- matrices are column-major in buffers by default (can be overridden via the `row_major` modifier) but are row-major in shaders: notation (i.e. `float4x3` is a 3 column 4 row matrix), construction and access are all row-major
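The caveat for the vector-relaxed std140 variant (member sizes not being rounded up to their alignment) is easiest to see with an array followed by a scalar. The sketch below assumes the caveat means the trailing padding of the last array element is dropped, so that a following member can reuse it; `cbuffer_array_size` is a hypothetical helper, not an HLSL or DXC API.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k."""
    return -(-n // k) * k


def cbuffer_array_size(elem_size: int, length: int) -> int:
    """Size of an array member when the last element is not padded."""
    stride = round_up(16, elem_size)
    # All elements but the last take a full stride; the last takes its real size.
    return stride * (length - 1) + elem_size


# struct { float a[2]; float b; }
a_size = cbuffer_array_size(4, 2)
b_offset = round_up(4, a_size)
print(a_size, b_offset)  # 20 20: b reuses the padding after a[1]

# Plain std140 pads the array to a full multiple of the stride instead.
std140_a_size = round_up(16, 4) * 2
print(std140_a_size, round_up(4, std140_a_size))  # 32 32
```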
HLSL Constant Buffer Packing Rules
DXC HLSL to SPIR-V Feature Mapping
- std430; with the caveat that vector 3's size is 16 instead of 12 (however a packed vector 3 with the alignas specifier = 16 can be used instead)
- provides extra packed vectors (scalar layout)
- matrices are column-major
- `alignas` specifier can be used to change the alignment (can be applied to structs or struct members)
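The effect of the vector 3 caveat shows up as soon as a scalar follows a vector 3 in a struct. The sketch below compares three variants of the same two-member struct. Members are hypothetical `(name, alignment, size)` tuples, and the 12 byte size with 4 byte alignment for the packed vector is an assumption based on the scalar layout mentioned above.

```python
def round_up(k: int, n: int) -> int:
    """Round n up to the next multiple of k."""
    return -(-n // k) * k


def struct_layout(members):
    """Return (offsets, alignment, size) for a list of (name, align, size)."""
    align = max(m_align for _, m_align, _ in members)
    offsets, current = {}, 0
    for name, m_align, m_size in members:
        current = round_up(m_align, current)
        offsets[name] = current
        current += m_size
    return offsets, align, round_up(align, current)


# struct { vec3 a; float b; } with three different vector 3 flavours
regular = struct_layout([("a", 16, 16), ("b", 4, 4)])  # vector 3 with size 16
packed = struct_layout([("a", 4, 12), ("b", 4, 4)])    # packed vector 3
std430 = struct_layout([("a", 16, 12), ("b", 4, 4)])   # plain std430 vec3

print(regular)  # ({'a': 0, 'b': 16}, 16, 32)
print(packed)   # ({'a': 0, 'b': 12}, 4, 16)
print(std430)   # ({'a': 0, 'b': 12}, 16, 16)
```

With the 16 byte vector 3 the scalar can no longer reuse the vector's tail, which doubles the struct size in this case.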