@rarecoil
Last active December 9, 2019 14:16
AMD Radeon RX 5700 XT (Navi) PlaidML - plaidbench deep learning benchmarks
Running 1024 examples with mobilenet, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 6.513s (compile), 7.912s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
mobilenet            7.73 ms                   1.67 ms / 599.91 fps
Correctness: PASS, max_error: 7.314303729799576e-06, max_abs_error: 6.407499313354492e-07, fail_ratio: 0.0
Running 1024 examples with resnet50, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 5.678s (compile), 16.609s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
resnet50             16.22 ms                  7.96 ms / 125.69 fps
Correctness: PASS, max_error: 3.435083499425673e-06, max_abs_error: 1.7881393432617188e-07, fail_ratio: 0.0
Running 1024 examples with vgg16, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 3.232s (compile), 55.677s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
vgg16                54.37 ms                  50.40 ms / 19.84 fps
Correctness: PASS, max_error: 2.691625468287384e-06, max_abs_error: 4.0046870708465576e-08, fail_ratio: 0.0
Running 1024 examples with vgg19, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 3.281s (compile), 92.729s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
vgg19                90.56 ms                  55.29 ms / 18.09 fps
Correctness: PASS, max_error: 1.7339709756925004e-06, max_abs_error: 1.7136335372924805e-07, fail_ratio: 0.0
Running 1024 examples with xception, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.4/xception_weights_tf_dim_ordering_tf_kernels.h5
91889664/91884032 [==============================] - 13s 0us/step
Compiling network... Warming up... Running...
Example finished, elapsed: 8.406s (compile), 19.893s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
xception             19.43 ms                  9.93 ms / 100.68 fps
Correctness: PASS, max_error: 4.805629032489378e-06, max_abs_error: 4.6193599700927734e-07, fail_ratio: 0.0
Running 1024 examples with imdb_lstm, batch size 1, on backend plaid
INFO:plaidml:Opening device "opencl_amd_gfx1010.0"
Compiling network... Warming up... Running...
Example finished, elapsed: 6.976s (compile), 74.840s (execution)
-----------------------------------------------------------------------------------------
Network Name         Inference Latency         Time / FPS
-----------------------------------------------------------------------------------------
imdb_lstm            73.09 ms                  6.01 ms / 166.47 fps
Correctness: PASS, max_error: 0.0, max_abs_error: 0.0, fail_ratio: 0.0
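
For anyone cross-checking the summary rows: the figures above appear consistent with "Inference Latency" being the execution time divided by the 1024 examples, and with FPS being roughly 1000 divided by the "Time" column in milliseconds. That relationship is inferred from the numbers in this log, not taken from the plaidbench source, so treat the sketch below as an assumption. The helper names (`latency_ms`, `fps_from_time`) are hypothetical.

```python
# Figures copied from the plaidbench output above:
# network -> (execution seconds, reported latency ms, reported time ms, reported fps)
RESULTS = {
    "mobilenet": (7.912, 7.73, 1.67, 599.91),
    "resnet50": (16.609, 16.22, 7.96, 125.69),
    "vgg16": (55.677, 54.37, 50.40, 19.84),
    "vgg19": (92.729, 90.56, 55.29, 18.09),
    "xception": (19.893, 19.43, 9.93, 100.68),
    "imdb_lstm": (74.840, 73.09, 6.01, 166.47),
}

EXAMPLES = 1024  # "Running 1024 examples ..." in every run above


def latency_ms(execution_s, examples=EXAMPLES):
    """Assumed relationship: wall-clock execution time averaged per example."""
    return execution_s * 1000.0 / examples


def fps_from_time(time_ms):
    """Assumed relationship: throughput as the reciprocal of per-example time."""
    return 1000.0 / time_ms


for name, (execution_s, lat, time_ms, fps) in RESULTS.items():
    print(
        f"{name:<10} latency {latency_ms(execution_s):6.2f} ms (reported {lat:6.2f})  "
        f"fps {fps_from_time(time_ms):7.2f} (reported {fps:7.2f})"
    )
```

The recomputed FPS values differ slightly from the reported ones, presumably because the "Time" column is rounded to two decimals before printing.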
@rarecoil
Author

Number of platforms                               1
  Platform Name                                   AMD Accelerated Parallel Processing
  Platform Vendor                                 Advanced Micro Devices, Inc.
  Platform Version                                OpenCL 2.1 AMD-APP (2906.7)
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd cl_amd_event_callback cl_amd_offline_devices 
  Platform Host timer resolution                  1ns
  Platform Extensions function suffix             AMD

  Platform Name                                   AMD Accelerated Parallel Processing
Number of devices                                 1
  Device Name                                     gfx1010
  Device Vendor                                   Advanced Micro Devices, Inc.
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 2.0 AMD-APP (2906.7)
  Driver Version                                  2906.7 (PAL,LC)
  Device OpenCL C Version                         OpenCL C 2.0 
  Device Type                                     GPU
  Device Board Name (AMD)                         AMD Radeon RX 5700 XT
  Device Topology (AMD)                           PCI-E, 23:00.0
  Device Profile                                  FULL_PROFILE
  Device Available                                Yes
  Compiler Available                              Yes
  Linker Available                                Yes
  Max compute units                               20
  SIMD per compute unit (AMD)                     2
  SIMD width (AMD)                                32
  SIMD instruction width (AMD)                    1
  Max clock frequency                             2100MHz
  Graphics IP (AMD)                               10.10
  Device Partition                                (core)
    Max number of sub-devices                     20
    Supported partition types                     None
  Max work item dimensions                        3
  Max work item sizes                             1024x1024x1024
  Max work group size                             256
  Preferred work group size (AMD)                 256
  Max work group size (AMD)                       1024
  Preferred work group size multiple              32
  Wavefront width (AMD)                           32
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 85
model name	: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
stepping	: 4
microcode	: 0x200005e
cpu MHz		: 1200.335
cache size	: 14080 KB
physical id	: 0
siblings	: 20
core id		: 0
cpu cores	: 10
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req md_clear flush_l1d
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs
bogomips	: 6600.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
Linux hostname 4.15.0-62-generic #69-Ubuntu SMP x86_64 x86_64 x86_64 GNU/Linux
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.3 LTS
Release:	18.04
Codename:	bionic

@dmenig

dmenig commented Dec 8, 2019

Damn, that's not great :/ I thought the 5700 XT was supposed to compete against the 2070! But that is actually where I'd expect the 1660 Ti to be. Too bad!

@rarecoil
Author

rarecoil commented Dec 9, 2019

I have not found the RX 5700 XT's architecture to be very useful for anything other than gaming. For DL/ML, I ended up buying more Radeon VII cards, which are officially supported by ROCm and share the same underlying architecture as the Radeon Instinct MI50.
