@bzm3r
Created February 22, 2020 22:15
compiling kernel transpose-threadgroup-WGS=(1,32)...
num bms: 4096, num dispatch groups: 4096
GPU results verified!
task name:Vk-Threadgroup-TG=32
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 32
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 81.46 +/- 1.37 ms
instant stats (N = 101): 82.24 +/- 1.35 ms
compiling kernel transpose-threadgroup-WGS=(2,32)...
num bms: 4096, num dispatch groups: 2048
GPU results verified!
task name:Vk-Threadgroup-TG=64
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 64
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 68.65 +/- 1.33 ms
instant stats (N = 101): 69.48 +/- 1.41 ms
compiling kernel transpose-threadgroup-WGS=(4,32)...
num bms: 4096, num dispatch groups: 1024
GPU results verified!
task name:Vk-Threadgroup-TG=128
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 128
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 117.07 +/- 1.97 ms
instant stats (N = 101): 117.88 +/- 1.96 ms
compiling kernel transpose-threadgroup-WGS=(8,32)...
num bms: 4096, num dispatch groups: 512
GPU results verified!
task name:Vk-Threadgroup-TG=256
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 256
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 233.34 +/- 2.44 ms
instant stats (N = 101): 234.16 +/- 2.42 ms
compiling kernel transpose-threadgroup-WGS=(16,32)...
num bms: 4096, num dispatch groups: 256
GPU results verified!
task name:Vk-Threadgroup-TG=512
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 512
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 469.31 +/- 6.30 ms
instant stats (N = 101): 470.15 +/- 6.26 ms
compiling kernel transpose-threadgroup-WGS=(32,32)...
num bms: 4096, num dispatch groups: 128
GPU results verified!
task name:Vk-Threadgroup-TG=1024
device: Intel(R) HD Graphics 520
num BMs: 4096, TG size: 1024
CPU loops: 101, GPU loops: 1001
timestamp stats (N = 101): 468.31 +/- 1.49 ms
instant stats (N = 101): 469.23 +/- 1.84 ms
compiling kernel transpose-threadgroup-WGS=(64,32)...
num bms: 4096, num dispatch groups: 64
thread 'main' panicked at 'GPU result 0 incorrect!
input: [2277521164, 2036339247, 104423218, 2794261787, 576044908, 1451828223, 362967187, 1427078597, 1579786967, 529122017, 3983516031, 3564755033, 2677213563, 2775113600, 176948552, 2231291716, 341285145, 1161798225, 2769093821, 3787186290, 1168785104, 3326128833, 3177984051, 2267999131, 3166883889, 3212576212, 565979745, 1008086545, 3127131590, 3126507669, 643475865, 3159679044]
expected:[1843863530, 281548142, 2986640819, 1082481723, 1809784172, 88872502, 2520440752, 1924408288, 1384248509, 213033783, 437575837, 2089924764, 829499607, 4244243953, 2173217476, 3786125072, 991164, 1771803840, 2663892120, 1725886380, 3993737236, 383429958, 4083264531, 85512809, 115226307, 1923109693, 3422011373, 3141555970, 3141606370, 4283180058, 3804578, 3018636297]
got: [2277521164, 2036339247, 104423218, 2794261787, 576044908, 1451828223, 362967187, 1427078597, 1579786967, 529122017, 3983516031, 3564755033, 2677213563, 2775113600, 176948552, 2231291716, 341285145, 1161798225, 2769093821, 3787186290, 1168785104, 3326128833, 3177984051, 2267999131, 3166883889, 3212576212, 565979745, 1008086545, 3127131590, 3126507669, 643475865, 3159679044]', src\gpu.rs:374:21