@primenumber
Last active November 30, 2022 14:35
Intel Arc A770 benchmark result
Platform: Intel(R) OpenCL HD Graphics
Device: Intel(R) Graphics [0x56a0]
Driver version : 22.32.23937 (Linux x64)
Compute units : 512
Clock frequency : 2400 MHz
Global memory bandwidth (GBPS)
float : 396.17
float2 : 405.79
float4 : 409.32
float8 : 416.63
float16 : 420.57
Single-precision compute (GFLOPS)
float : 12873.72
float2 : 11001.59
float4 : 10289.40
float8 : 10187.91
float16 : 9592.90
Half-precision compute (GFLOPS)
half : 19329.85
half2 : 19262.51
half4 : 19312.80
half8 : 19239.40
half16 : 19118.04
No double precision support! Skipped
Integer compute (GIOPS)
int : 4308.55
int2 : 4973.93
int4 : 4410.29
int8 : 5070.73
int16 : 4266.05
Integer compute Fast 24bit (GIOPS)
int : 4302.23
int2 : 4962.64
int4 : 4391.14
int8 : 5067.01
int16 : 4251.69
Transfer bandwidth (GBPS)
enqueueWriteBuffer : 18.29
enqueueReadBuffer : 8.11
enqueueWriteBuffer non-blocking : 21.09
enqueueReadBuffer non-blocking : 8.62
enqueueMapBuffer(for read) : 20.08
memcpy from mapped ptr : 23.10
enqueueUnmap(after write) : 22.17
memcpy to mapped ptr : 22.99
Kernel launch latency : 10.76 us
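
For comparing these figures across devices or driver versions, it can help to load the listing into a data structure. The following is a minimal sketch, not part of the original gist: `parse_results` and `SAMPLE` are hypothetical names, and the parser assumes that any line without a colon is a section heading and that each remaining line has the form `label : value`. Under that assumption, a trailing row such as `Kernel launch latency` is filed under the most recent heading.

```python
def parse_results(text):
    """Group 'label : value' rows under the most recent heading line.

    Assumption: a line without ':' is a heading; everything else is a row.
    Numeric values become floats (a trailing unit such as 'us' or 'MHz' is
    dropped); anything else is kept as a string.
    """
    sections = {}
    current = "info"  # rows that appear before any heading land here
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if ":" not in line:
            current = line
            continue
        label, _, value = line.partition(":")
        label, value = label.strip(), value.strip()
        try:
            parsed = float(value.split()[0])
        except (ValueError, IndexError):
            parsed = value  # e.g. driver version or device name
        sections.setdefault(current, {})[label] = parsed
    return sections


# A few rows copied from the listing above.
SAMPLE = """\
Single-precision compute (GFLOPS)
float : 12873.72
Half-precision compute (GFLOPS)
half : 19329.85
Kernel launch latency : 10.76 us
"""

results = parse_results(SAMPLE)
fp32 = results["Single-precision compute (GFLOPS)"]["float"]
fp16 = results["Half-precision compute (GFLOPS)"]["half"]
print(f"half / float throughput ratio: {fp16 / fp32:.2f}")
```

The same function can be fed the full listing; metadata rows such as the driver version are kept as strings because they do not parse as a single number.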