Created
March 15, 2019 07:07
Number of platforms: 1 | |
Platform Profile: FULL_PROFILE | |
Platform Version: OpenCL 2.0 AMD-APP.internal (2814.0) | |
Platform Name: AMD Accelerated Parallel Processing | |
Platform Vendor: Advanced Micro Devices, Inc. | |
Platform Extensions: cl_khr_icd cl_amd_object_metadata cl_amd_event_callback | |
Platform Name: AMD Accelerated Parallel Processing | |
Number of devices: 1 | |
Device Type: CL_DEVICE_TYPE_GPU | |
Vendor ID: 1002h | |
Board name: Vega 10 XTX [Radeon Vega Frontier Edition] | |
Device Topology: PCI[ B#9, D#0, F#0 ] | |
Max compute units: 64 | |
Max work items dimensions: 3 | |
Max work items[0]: 1024 | |
Max work items[1]: 1024 | |
Max work items[2]: 1024 | |
Max work group size: 256 | |
Preferred vector width char: 4 | |
Preferred vector width short: 2 | |
Preferred vector width int: 1 | |
Preferred vector width long: 1 | |
Preferred vector width float: 1 | |
Preferred vector width double: 1 | |
Native vector width char: 4 | |
Native vector width short: 2 | |
Native vector width int: 1 | |
Native vector width long: 1 | |
Native vector width float: 1 | |
Native vector width double: 1 | |
Max clock frequency: 1600Mhz | |
Address bits: 64 | |
Max memory allocation: 14588628172 | |
Image support: No | |
Max size of kernel argument: 1024 | |
Alignment (bits) of base address: 1024 | |
Minimum alignment (bytes) for any datatype: 128 | |
Single precision floating point capability | |
Denorms: Yes | |
Quiet NaNs: Yes | |
Round to nearest even: Yes | |
Round to zero: Yes | |
Round to +ve and infinity: Yes | |
IEEE754-2008 fused multiply-add: Yes | |
Cache type: Read/Write | |
Cache line size: 64 | |
Cache size: 16384 | |
Global memory size: 17163091968 | |
Constant buffer size: 14588628172 | |
Max number of constant args: 8 | |
Local memory type: Scratchpad | |
Local memory size: 65536 | |
Max pipe arguments: 16 | |
Max pipe active reservations: 16 | |
Max pipe packet size: 1703726284 | |
Max global variable size: 14588628172 | |
Max global variable preferred total size: 17163091968 | |
Max read/write image args: 0 | |
Max on device events: 1024 | |
Queue on device max size: 8388608 | |
Max on device queues: 1 | |
Queue on device preferred size: 262144 | |
SVM capabilities: | |
Coarse grain buffer: Yes | |
Fine grain buffer: Yes | |
Fine grain system: No | |
Atomics: No | |
Preferred platform atomic alignment: 0 | |
Preferred global atomic alignment: 0 | |
Preferred local atomic alignment: 0 | |
Kernel Preferred work group size multiple: 64 | |
Error correction support: 0 | |
Unified memory for Host and Device: 0 | |
Profiling timer resolution: 1 | |
Device endianess: Little | |
Available: Yes | |
Compiler available: Yes | |
Execution capabilities: | |
Execute OpenCL kernels: Yes | |
Execute native function: No | |
Queue on Host properties: | |
Out-of-Order: No | |
Profiling : Yes | |
Queue on Device properties: | |
Out-of-Order: Yes | |
Profiling : Yes | |
Platform ID: 0x7fb8f190e4d0 | |
Name: gfx900 | |
Vendor: Advanced Micro Devices, Inc. | |
Device OpenCL C version: OpenCL C 2.0 | |
Driver version: 2814.0 (HSA1.1,LC) | |
Profile: FULL_PROFILE | |
Version: OpenCL 1.2 | |
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program |
HIP version : 1.5.0 | |
== hipconfig | |
HIP_PATH : /opt/rocm | |
HIP_PLATFORM : hcc | |
CPP_CONFIG : -D__HIP_PLATFORM_HCC__= -I/opt/rocm/include -I/opt/rocm/hcc/include | |
== hcc | |
HSA_PATH : /opt/rocm/hsa | |
HCC_HOME : /opt/rocm/hcc | |
HCC clang version 9.0.0 (https://github.com/RadeonOpenCompute/hcc-clang-upgrade.git c792478f19beee13540053f188094898a008d245) (https://github.com/RadeonOpenCompute/llvm.git 68584f0b7bc07d43af64f90b3726988b5a513bf9) (based on HCC 1.3.19092-1dcecffc-c792478f19-68584f0b7bc ) | |
Target: x86_64-unknown-linux-gnu | |
Thread model: posix | |
InstalledDir: /opt/rocm/hcc/bin | |
LLVM (http://llvm.org/): | |
LLVM version 9.0.0svn | |
Optimized build. | |
Default target: x86_64-unknown-linux-gnu | |
Host CPU: znver1 | |
Registered Targets: | |
amdgcn - AMD GCN GPUs | |
r600 - AMD GPUs HD2XXX-HD6XXX | |
x86 - 32-bit X86: Pentium-Pro and above | |
x86-64 - 64-bit X86: EM64T and AMD64 | |
HCC-cxxflags : -hc -std=c++amp -I/opt/rocm/hcc/include -I/opt/rocm/include | |
HCC-ldflags : -hc -std=c++amp -L/opt/rocm/hcc/lib -Wl,--rpath=/opt/rocm/hcc/lib -ldl -lm -lpthread -lhc_am -Wl,--whole-archive -lmcwamp -Wl,--no-whole-archive | |
=== Environment Variables | |
PATH=/opt/google-cloud-sdk/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/home/ubombi/esp/xtensa-esp32-elf/bin/:/home/ubombi/esp/xtensa-lx106-elf/bin/ | |
== Linux Kernel | |
Hostname : shreder | |
Linux shreder 5.0.0-mainline #1 SMP PREEMPT Wed Mar 6 21:12:30 EET 2019 x86_64 GNU/Linux | |
LSB Version: n/a | |
Distributor ID: ManjaroLinux | |
Description: Manjaro Linux | |
Release: 18.0.3 | |
Codename: Illyria |
===================== | |
HSA System Attributes | |
===================== | |
Runtime Version: 1.1 | |
System Timestamp Freq.: 1000.000000MHz | |
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count) | |
Machine Model: LARGE | |
System Endianness: LITTLE | |
========== | |
HSA Agents | |
========== | |
******* | |
Agent 1 | |
******* | |
Name: AMD Ryzen 7 2700X Eight-Core Processor | |
Vendor Name: CPU | |
Feature: None specified | |
Profile: FULL_PROFILE | |
Float Round Mode: NEAR | |
Max Queue Number: 0(0x0) | |
Queue Min Size: 0(0x0) | |
Queue Max Size: 0(0x0) | |
Queue Type: MULTI | |
Node: 0 | |
Device Type: CPU | |
Cache Info: | |
L1: 32768(0x8000) KB | |
Chip ID: 0(0x0) | |
Cacheline Size: 64(0x40) | |
Max Clock Frequency (MHz):3700 | |
BDFID: 0(0x0) | |
Compute Unit: 16(0x10) | |
Features: None | |
Pool Info: | |
Pool 1 | |
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED | |
Size: 16418992(0xfa88b0) KB | |
Allocatable: TRUE | |
Alloc Granule: 4KB | |
Alloc Alignment: 4KB | |
Acessible by all: TRUE | |
Pool 2 | |
Segment: GLOBAL; FLAGS: COARSE GRAINED | |
Size: 16418992(0xfa88b0) KB | |
Allocatable: TRUE | |
Alloc Granule: 4KB | |
Alloc Alignment: 4KB | |
Acessible by all: TRUE | |
ISA Info: | |
N/A | |
******* | |
Agent 2 | |
******* | |
Name: gfx900 | |
Vendor Name: AMD | |
Feature: KERNEL_DISPATCH | |
Profile: BASE_PROFILE | |
Float Round Mode: NEAR | |
Max Queue Number: 128(0x80) | |
Queue Min Size: 4096(0x1000) | |
Queue Max Size: 131072(0x20000) | |
Queue Type: MULTI | |
Node: 1 | |
Device Type: GPU | |
Cache Info: | |
L1: 16(0x10) KB | |
Chip ID: 26723(0x6863) | |
Cacheline Size: 64(0x40) | |
Max Clock Frequency (MHz):1600 | |
BDFID: 2304(0x900) | |
Compute Unit: 64(0x40) | |
Features: KERNEL_DISPATCH | |
Fast F16 Operation: FALSE | |
Wavefront Size: 64(0x40) | |
Workgroup Max Size: 1024(0x400) | |
Workgroup Max Size per Dimension: | |
x 1024(0x400) | |
y 1024(0x400) | |
z 1024(0x400) | |
Waves Per CU: 40(0x28) | |
Max Work-item Per CU: 2560(0xa00) | |
Grid Max Size: 4294967295(0xffffffff) | |
Grid Max Size per Dimension: | |
x 4294967295(0xffffffff) | |
y 4294967295(0xffffffff) | |
z 4294967295(0xffffffff) | |
Max number Of fbarriers Per Workgroup:32 | |
Pool Info: | |
Pool 1 | |
Segment: GLOBAL; FLAGS: COARSE GRAINED | |
Size: 16760832(0xffc000) KB | |
Allocatable: TRUE | |
Alloc Granule: 4KB | |
Alloc Alignment: 4KB | |
Acessible by all: FALSE | |
Pool 2 | |
Segment: GROUP | |
Size: 64(0x40) KB | |
Allocatable: FALSE | |
Alloc Granule: 0KB | |
Alloc Alignment: 0KB | |
Acessible by all: FALSE | |
ISA Info: | |
ISA 1 | |
Name: amdgcn-amd-amdhsa--gfx900 | |
Machine Models: HSA_MACHINE_MODEL_LARGE | |
Profiles: HSA_PROFILE_BASE | |
Default Rounding Mode: NEAR | |
Default Rounding Mode: NEAR | |
Fast f16: TRUE | |
Workgroup Max Size: 1024(0x400) | |
Workgroup Max Size per Dimension: | |
x 1024(0x400) | |
y 1024(0x400) | |
z 1024(0x400) | |
Grid Max Size: 4294967295(0xffffffff) | |
Grid Max Size per Dimension: | |
x 4294967295(0xffffffff) | |
y 4294967295(0xffffffff) | |
z 4294967295(0xffffffff) | |
FBarrier Max Size: 32 | |
*** Done *** |
import tensorflow as tf

# Pin the graph ops to the first GPU.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))  # same ihipException here
print(sess.run(c))
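
Once the ihipException is resolved, the printed result can be checked against a reference computed without TensorFlow or the GPU. The following is a minimal sketch in plain Python; the helper names `reshape` and `matmul` are illustrative assumptions and not part of the original test script.

```python
# CPU-only reference for the 2x3 by 3x2 product in the TensorFlow test above.
# `reshape` and `matmul` are hypothetical helpers written for this check.

def reshape(flat, rows, cols):
    """Split a flat list into `rows` rows of `cols` values (row-major)."""
    assert len(flat) == rows * cols, "size mismatch"
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def matmul(x, y):
    """Multiply two matrices stored as nested lists."""
    inner = len(y)
    assert all(len(row) == inner for row in x), "inner dimensions must match"
    return [
        [sum(row[k] * y[k][j] for k in range(inner)) for j in range(len(y[0]))]
        for row in x
    ]


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
a = reshape(values, 2, 3)  # same data and shape as tensor 'a'
b = reshape(values, 3, 2)  # same data and shape as tensor 'b'
expected = matmul(a, b)
print(expected)  # [[22.0, 28.0], [49.0, 64.0]]
```

If `sess.run(c)` returns a different matrix, the problem lies in the GPU stack rather than in the test script.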