harish2704/CUDA-Tesla-p100-Colab.txt

## CUDA-Tesla-p100-Colab.txt
```
python3 examples/image_ocr.py
Using TensorFlow backend.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4479: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4267: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
the_input (InputLayer)          (None, 128, 64, 1)   0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 128, 64, 16)  160         the_input[0][0]
__________________________________________________________________________________________________
max1 (MaxPooling2D)             (None, 64, 32, 16)   0           conv1[0][0]
__________________________________________________________________________________________________
conv2 (Conv2D)                  (None, 64, 32, 16)   2320        max1[0][0]
__________________________________________________________________________________________________
max2 (MaxPooling2D)             (None, 32, 16, 16)   0           conv2[0][0]
__________________________________________________________________________________________________
reshape (Reshape)               (None, 32, 256)      0           max2[0][0]
__________________________________________________________________________________________________
dense1 (Dense)                  (None, 32, 32)       8224        reshape[0][0]
__________________________________________________________________________________________________
gru1 (GRU)                      (None, 32, 512)      837120      dense1[0][0]
__________________________________________________________________________________________________
gru1_b (GRU)                    (None, 32, 512)      837120      dense1[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 512)      0           gru1[0][0]
                                                                 gru1_b[0][0]
__________________________________________________________________________________________________
gru2 (GRU)                      (None, 32, 512)      1574400     add_1[0][0]
__________________________________________________________________________________________________
gru2_b (GRU)                    (None, 32, 512)      1574400     add_1[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 32, 1024)     0           gru2[0][0]
                                                                 gru2_b[0][0]
__________________________________________________________________________________________________
dense2 (Dense)                  (None, 32, 28)       28700       concatenate_1[0][0]
__________________________________________________________________________________________________
softmax (Activation)            (None, 32, 28)       0           dense2[0][0]
==================================================================================================
Total params: 4,862,444
Trainable params: 4,862,444
Non-trainable params: 0
__________________________________________________________________________________________________
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4551: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1033: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1020: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3005: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

Epoch 1/20
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

2019-12-30 14:41:49.026347: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
2019-12-30 14:41:49.047770: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2000170000 Hz
2019-12-30 14:41:49.048317: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2ddad80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2019-12-30 14:41:49.048356: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2019-12-30 14:41:49.054308: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-30 14:41:49.269336: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:49.270173: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2ddaf40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2019-12-30 14:41:49.270202: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
2019-12-30 14:41:49.271473: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:49.272461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
pciBusID: 0000:00:04.0
2019-12-30 14:41:49.284335: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2019-12-30 14:41:49.505987: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2019-12-30 14:41:49.636105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2019-12-30 14:41:49.653096: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2019-12-30 14:41:49.921397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2019-12-30 14:41:49.940272: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2019-12-30 14:41:50.451087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-30 14:41:50.451270: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:50.451997: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:50.452577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-30 14:41:50.455852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2019-12-30 14:41:50.457216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-30 14:41:50.457246: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-12-30 14:41:50.457257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-12-30 14:41:50.458468: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:50.459135: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-30 14:41:50.459716: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2019-12-30 14:41:50.459756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15216 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:216: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:223: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

2019-12-30 14:41:58.521534: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2019-12-30 14:42:00.035858: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
400/400 [==============================] - 105s 262ms/step - loss: 42.1394 - val_loss: 47.7223

Out of 256 samples:  Mean edit distance:3.496 Mean normalized edit distance: 1.000
Epoch 2/20
400/400 [==============================] - 89s 221ms/step - loss: 42.0470 - val_loss: 47.7538

Out of 256 samples:  Mean edit distance:3.531 Mean normalized edit distance: 1.000
Epoch 3/20
400/400 [==============================] - 87s 217ms/step - loss: 42.0483 - val_loss: 47.7493

Out of 256 samples:  Mean edit distance:3.547 Mean normalized edit distance: 1.000
Epoch 4/20
400/400 [==============================] - 85s 213ms/step - loss: 42.0706 - val_loss: 47.6910

Out of 256 samples:  Mean edit distance:3.523 Mean normalized edit distance: 1.000
Epoch 5/20
400/400 [==============================] - 84s 209ms/step - loss: 42.0256 - val_loss: 47.7445

Out of 256 samples:  Mean edit distance:3.477 Mean normalized edit distance: 1.000
Epoch 6/20
400/400 [==============================] - 83s 207ms/step - loss: 42.0379 - val_loss: 47.7841

Out of 256 samples:  Mean edit distance:3.539 Mean normalized edit distance: 1.000
Epoch 7/20
400/400 [==============================] - 82s 206ms/step - loss: 42.0434 - val_loss: 47.6471

Out of 256 samples:  Mean edit distance:3.527 Mean normalized edit distance: 1.000
Epoch 8/20
400/400 [==============================] - 82s 205ms/step - loss: 42.0636 - val_loss: 47.7264

Out of 256 samples:  Mean edit distance:3.574 Mean normalized edit distance: 1.000
Epoch 9/20
400/400 [==============================] - 81s 204ms/step - loss: 42.0013 - val_loss: 47.6823

Out of 256 samples:  Mean edit distance:3.566 Mean normalized edit distance: 1.000
Epoch 10/20
400/400 [==============================] - 85s 212ms/step - loss: 42.0180 - val_loss: 47.7052

Out of 256 samples:  Mean edit distance:3.566 Mean normalized edit distance: 1.000
Epoch 11/20
400/400 [==============================] - 85s 212ms/step - loss: 42.0193 - val_loss: 47.7396

Out of 256 samples:  Mean edit distance:3.609 Mean normalized edit distance: 1.000
Epoch 12/20
400/400 [==============================] - 85s 212ms/step - loss: 42.0346 - val_loss: 47.6906

Out of 256 samples:  Mean edit distance:3.551 Mean normalized edit distance: 1.000
Epoch 13/20
400/400 [==============================] - 85s 211ms/step - loss: 42.0782 - val_loss: 47.7711

Out of 256 samples:  Mean edit distance:3.508 Mean normalized edit distance: 1.000
Epoch 14/20
400/400 [==============================] - 84s 210ms/step - loss: 42.0240 - val_loss: 47.8202

Out of 256 samples:  Mean edit distance:3.539 Mean normalized edit distance: 1.000
Epoch 15/20
400/400 [==============================] - 85s 211ms/step - loss: 42.0326 - val_loss: 47.6735

Out of 256 samples:  Mean edit distance:3.523 Mean normalized edit distance: 1.000
Epoch 16/20
400/400 [==============================] - 84s 211ms/step - loss: 42.0295 - val_loss: 47.7353

Out of 256 samples:  Mean edit distance:3.480 Mean normalized edit distance: 1.000
Epoch 17/20
400/400 [==============================] - 84s 210ms/step - loss: 42.0263 - val_loss: 47.7402

Out of 256 samples:  Mean edit distance:3.492 Mean normalized edit distance: 1.000
Epoch 18/20
400/400 [==============================] - 84s 210ms/step - loss: 41.9883 - val_loss: 47.6956

Out of 256 samples:  Mean edit distance:3.531 Mean normalized edit distance: 1.000
Epoch 19/20
400/400 [==============================] - 84s 210ms/step - loss: 42.0159 - val_loss: 47.7222

Out of 256 samples:  Mean edit distance:3.551 Mean normalized edit distance: 1.000
Epoch 20/20
400/400 [==============================] - 83s 209ms/step - loss: 42.0260 - val_loss: 47.7082

Out of 256 samples:  Mean edit distance:3.559 Mean normalized edit distance: 1.000
Model: "model_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
the_input (InputLayer)          (None, 512, 64, 1)   0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 512, 64, 16)  160         the_input[0][0]
__________________________________________________________________________________________________
max1 (MaxPooling2D)             (None, 256, 32, 16)  0           conv1[0][0]
__________________________________________________________________________________________________
conv2 (Conv2D)                  (None, 256, 32, 16)  2320        max1[0][0]
__________________________________________________________________________________________________
max2 (MaxPooling2D)             (None, 128, 16, 16)  0           conv2[0][0]
__________________________________________________________________________________________________
reshape (Reshape)               (None, 128, 256)     0           max2[0][0]
__________________________________________________________________________________________________
dense1 (Dense)                  (None, 128, 32)      8224        reshape[0][0]
__________________________________________________________________________________________________
gru1 (GRU)                      (None, 128, 512)     837120      dense1[0][0]
__________________________________________________________________________________________________
gru1_b (GRU)                    (None, 128, 512)     837120      dense1[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 128, 512)     0           gru1[0][0]
                                                                 gru1_b[0][0]
__________________________________________________________________________________________________
gru2 (GRU)                      (None, 128, 512)     1574400     add_2[0][0]
__________________________________________________________________________________________________
gru2_b (GRU)                    (None, 128, 512)     1574400     add_2[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 128, 1024)    0           gru2[0][0]
                                                                 gru2_b[0][0]
__________________________________________________________________________________________________
dense2 (Dense)                  (None, 128, 28)      28700       concatenate_2[0][0]
__________________________________________________________________________________________________
softmax (Activation)            (None, 128, 28)      0           dense2[0][0]
==================================================================================================
Total params: 4,862,444
Trainable params: 4,862,444
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 21/25
400/400 [==============================] - 303s 758ms/step - loss: 37.3943 - val_loss: 42.5088

Out of 256 samples:  Mean edit distance:3.520 Mean normalized edit distance: 1.000
Epoch 22/25
400/400 [==============================] - 301s 752ms/step - loss: 89.2964 - val_loss: 95.7861

Out of 256 samples:  Mean edit distance:7.988 Mean normalized edit distance: 1.000
Epoch 23/25
400/400 [==============================] - 305s 762ms/step - loss: 91.3408 - val_loss: 103.0809

Out of 256 samples:  Mean edit distance:8.066 Mean normalized edit distance: 1.000
Epoch 24/25
400/400 [==============================] - 318s 795ms/step - loss: 91.4558 - val_loss: 103.4552

Out of 256 samples:  Mean edit distance:8.223 Mean normalized edit distance: 1.000
Epoch 25/25
400/400 [==============================] - 325s 814ms/step - loss: 91.4825 - val_loss: 104.3913

Out of 256 samples:  Mean edit distance:8.230 Mean normalized edit distance: 1.000

```
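
The per-epoch summary lines in the log above can be pulled out with a short script so that timings and losses from different runs are easier to compare. This is only a sketch and not part of the original run: the function name `parse_epochs` and the regular expression are assumptions based on the line format visible in the log (`- 105s 262ms/step - loss: ... - val_loss: ...`). Keras rounds the per-step time (for example to `1s/step`), so the epoch wall time is likely the more reliable number.

```python
import re

# Assumed line shape, taken from the log itself:
#   400/400 [====...] - 105s 262ms/step - loss: 42.1394 - val_loss: 47.7223
EPOCH_RE = re.compile(
    r"-\s*(?P<secs>\d+)s\s+(?P<step>\d+)(?P<unit>ms|us|s)/step"
    r"\s*-\s*loss:\s*(?P<loss>[\d.]+)"
    r"\s*-\s*val_loss:\s*(?P<val_loss>[\d.]+)"
)

# Conversion factors from the printed unit to milliseconds.
TO_MS = {"s": 1000.0, "ms": 1.0, "us": 0.001}


def parse_epochs(log_text):
    """Return one dict per epoch summary line found in log_text."""
    rows = []
    for m in EPOCH_RE.finditer(log_text):
        rows.append({
            "epoch_seconds": int(m.group("secs")),
            "step_ms": int(m.group("step")) * TO_MS[m.group("unit")],
            "loss": float(m.group("loss")),
            "val_loss": float(m.group("val_loss")),
        })
    return rows


if __name__ == "__main__":
    # Two lines copied from the log above (epoch 1 and epoch 21).
    sample = (
        "400/400 [==============================] - 105s 262ms/step"
        " - loss: 42.1394 - val_loss: 47.7223\n"
        "400/400 [==============================] - 303s 758ms/step"
        " - loss: 37.3943 - val_loss: 42.5088\n"
    )
    for row in parse_epochs(sample):
        print(row)
```

Running `parse_epochs` over each log file in full and comparing the `epoch_seconds` columns would give a rough per-epoch timing comparison between the two GPUs.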

## Output-ROCm-Vega5.txt
```
[hari@localhost ~/Downloads/keras-2.3.1]$ python3 examples/image_ocr.py
Using TensorFlow backend.
2019-12-30 20:25:35.117548: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libhip_hcc.so
2019-12-30 20:25:35.157206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1629] Found device 0 with properties:
name: Vega 10 XL/XT [Radeon RX Vega 56/64]
AMDGPU ISA: gfx900
memoryClockRate (GHz) 1.59
pciBusID 0000:0a:00.0
2019-12-30 20:25:35.184890: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2019-12-30 20:25:35.185795: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
2019-12-30 20:25:35.186547: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
2019-12-30 20:25:35.186689: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
2019-12-30 20:25:35.186753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-30 20:25:35.186971: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-12-30 20:25:35.189942: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3593360000 Hz
2019-12-30 20:25:35.190280: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ada2477f40 executing computations on platform Host. Devices:
2019-12-30 20:25:35.190293: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-12-30 20:25:35.190389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1629] Found device 0 with properties:
name: Vega 10 XL/XT [Radeon RX Vega 56/64]
AMDGPU ISA: gfx900
memoryClockRate (GHz) 1.59
pciBusID 0000:0a:00.0
2019-12-30 20:25:35.190406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2019-12-30 20:25:35.190416: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
2019-12-30 20:25:35.190425: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
2019-12-30 20:25:35.190432: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
2019-12-30 20:25:35.190463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-30 20:25:35.190500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-30 20:25:35.190506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-12-30 20:25:35.190510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-12-30 20:25:35.190578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7524 MB memory) -> physical GPU (device: 0, name: Vega 10 XL/XT [Radeon RX Vega 56/64], pci bus id: 0000:0a:00.0)
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
the_input (InputLayer)          (None, 128, 64, 1)   0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 128, 64, 16)  160         the_input[0][0]
__________________________________________________________________________________________________
max1 (MaxPooling2D)             (None, 64, 32, 16)   0           conv1[0][0]
__________________________________________________________________________________________________
conv2 (Conv2D)                  (None, 64, 32, 16)   2320        max1[0][0]
__________________________________________________________________________________________________
max2 (MaxPooling2D)             (None, 32, 16, 16)   0           conv2[0][0]
__________________________________________________________________________________________________
reshape (Reshape)               (None, 32, 256)      0           max2[0][0]
__________________________________________________________________________________________________
dense1 (Dense)                  (None, 32, 32)       8224        reshape[0][0]
__________________________________________________________________________________________________
gru1 (GRU)                      (None, 32, 512)      837120      dense1[0][0]
__________________________________________________________________________________________________
gru1_b (GRU)                    (None, 32, 512)      837120      dense1[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 512)      0           gru1[0][0]
                                                                 gru1_b[0][0]
__________________________________________________________________________________________________
gru2 (GRU)                      (None, 32, 512)      1574400     add_1[0][0]
__________________________________________________________________________________________________
gru2_b (GRU)                    (None, 32, 512)      1574400     add_1[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 32, 1024)     0           gru2[0][0]
                                                                 gru2_b[0][0]
__________________________________________________________________________________________________
dense2 (Dense)                  (None, 32, 28)       28700       concatenate_1[0][0]
__________________________________________________________________________________________________
softmax (Activation)            (None, 32, 28)       0           dense2[0][0]
==================================================================================================
Total params: 4,862,444
Trainable params: 4,862,444
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 1/20
2019-12-30 20:25:44.418024: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
2019-12-30 20:25:44.432239: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
400/400 [==============================] - 124s 311ms/step - loss: 42.1394 - val_loss: 49.8171

Out of 256 samples:  Mean edit distance:3.496 Mean normalized edit distance: 1.000
Epoch 2/20
400/400 [==============================] - 115s 289ms/step - loss: 42.0860 - val_loss: 49.8461

Out of 256 samples:  Mean edit distance:3.531 Mean normalized edit distance: 1.000
Epoch 3/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0466 - val_loss: 44.5114

Out of 256 samples:  Mean edit distance:3.555 Mean normalized edit distance: 1.000
Epoch 4/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0138 - val_loss: 48.9335

Out of 256 samples:  Mean edit distance:3.523 Mean normalized edit distance: 1.000
Epoch 5/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0395 - val_loss: 46.2727

Out of 256 samples:  Mean edit distance:3.477 Mean normalized edit distance: 1.000
Epoch 6/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0567 - val_loss: 48.0469

Out of 256 samples:  Mean edit distance:3.539 Mean normalized edit distance: 1.000
Epoch 7/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0235 - val_loss: 46.7224

Out of 256 samples:  Mean edit distance:3.527 Mean normalized edit distance: 1.000
Epoch 8/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0072 - val_loss: 49.3743

Out of 256 samples:  Mean edit distance:3.574 Mean normalized edit distance: 1.000
Epoch 9/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0692 - val_loss: 45.8388

Out of 256 samples:  Mean edit distance:3.570 Mean normalized edit distance: 1.000
Epoch 10/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0262 - val_loss: 48.0622

Out of 256 samples:  Mean edit distance:3.574 Mean normalized edit distance: 1.000
Epoch 11/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0037 - val_loss: 48.9347

Out of 256 samples:  Mean edit distance:3.531 Mean normalized edit distance: 1.000
Epoch 12/20
400/400 [==============================] - 114s 286ms/step - loss: 42.0007 - val_loss: 47.6128

Out of 256 samples:  Mean edit distance:3.520 Mean normalized edit distance: 1.000
Epoch 13/20
400/400 [==============================] - 115s 286ms/step - loss: 41.9994 - val_loss: 48.9368

Out of 256 samples:  Mean edit distance:3.551 Mean normalized edit distance: 1.000
Epoch 14/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0511 - val_loss: 47.1542

Out of 256 samples:  Mean edit distance:3.559 Mean normalized edit distance: 1.000
Epoch 15/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0149 - val_loss: 48.0377

Out of 256 samples:  Mean edit distance:3.527 Mean normalized edit distance: 1.000
Epoch 16/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0511 - val_loss: 48.0680

Out of 256 samples:  Mean edit distance:3.480 Mean normalized edit distance: 1.000
Epoch 17/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0279 - val_loss: 48.4952

Out of 256 samples:  Mean edit distance:3.516 Mean normalized edit distance: 1.000
Epoch 18/20
400/400 [==============================] - 115s 287ms/step - loss: 42.0322 - val_loss: 44.9407

Out of 256 samples:  Mean edit distance:3.527 Mean normalized edit distance: 1.000
Epoch 19/20
400/400 [==============================] - 115s 286ms/step - loss: 42.0471 - val_loss: 46.2739

Out of 256 samples:  Mean edit distance:3.547 Mean normalized edit distance: 1.000
Epoch 20/20
400/400 [==============================] - 115s 286ms/step - loss: 42.0102 - val_loss: 47.5970

Out of 256 samples:  Mean edit distance:3.566 Mean normalized edit distance: 1.000
Model: "model_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
the_input (InputLayer)          (None, 512, 64, 1)   0
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 512, 64, 16)  160         the_input[0][0]
__________________________________________________________________________________________________
max1 (MaxPooling2D)             (None, 256, 32, 16)  0           conv1[0][0]
__________________________________________________________________________________________________
conv2 (Conv2D)                  (None, 256, 32, 16)  2320        max1[0][0]
__________________________________________________________________________________________________
max2 (MaxPooling2D)             (None, 128, 16, 16)  0           conv2[0][0]
__________________________________________________________________________________________________
reshape (Reshape)               (None, 128, 256)     0           max2[0][0]
__________________________________________________________________________________________________
dense1 (Dense)                  (None, 128, 32)      8224        reshape[0][0]
__________________________________________________________________________________________________
gru1 (GRU)                      (None, 128, 512)     837120      dense1[0][0]
__________________________________________________________________________________________________
gru1_b (GRU)                    (None, 128, 512)     837120      dense1[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 128, 512)     0           gru1[0][0]
                                                                 gru1_b[0][0]
__________________________________________________________________________________________________
gru2 (GRU)                      (None, 128, 512)     1574400     add_2[0][0]
__________________________________________________________________________________________________
gru2_b (GRU)                    (None, 128, 512)     1574400     add_2[0][0]
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 128, 1024)    0           gru2[0][0]
                                                                 gru2_b[0][0]
__________________________________________________________________________________________________
dense2 (Dense)                  (None, 128, 28)      28700       concatenate_2[0][0]
__________________________________________________________________________________________________
softmax (Activation)            (None, 128, 28)      0           dense2[0][0]
==================================================================================================
Total params: 4,862,444
Trainable params: 4,862,444
Non-trainable params: 0
__________________________________________________________________________________________________
Epoch 21/25
400/400 [==============================] - 442s 1s/step - loss: 37.3943 - val_loss: 44.3882

Out of 256 samples:  Mean edit distance:3.496 Mean normalized edit distance: 1.000
Epoch 22/25
400/400 [==============================] - 432s 1s/step - loss: 89.3359 - val_loss: 99.9920

Out of 256 samples:  Mean edit distance:8.180 Mean normalized edit distance: 1.000
Epoch 23/25
400/400 [==============================] - 433s 1s/step - loss: 91.5029 - val_loss: 106.0722

Out of 256 samples:  Mean edit distance:8.051 Mean normalized edit distance: 1.000
Epoch 24/25
400/400 [==============================] - 432s 1s/step - loss: 91.4113 - val_loss: 105.2302

Out of 256 samples:  Mean edit distance:8.215 Mean normalized edit distance: 1.000
Epoch 25/25
400/400 [==============================] - 432s 1s/step - loss: 91.4994 - val_loss: 95.5991

Out of 256 samples:  Mean edit distance:8.188 Mean normalized edit distance: 1.000
```

## Summary.md

      
            
          
### Specs from official websites

| Parameter | Nvidia Tesla-P100 | AMD Vega56 |
|---|---|---|
| Single-precision performance | 9.3 TFLOPS | 10.5 TFLOPS |
| Memory | 16 GB | 8 GB |
| Memory bandwidth | 732 GB/s | 410 GB/s |

### Time taken for the same training step

| Model | Nvidia Tesla-P100 | AMD Vega56 |
|---|---|---|
| Relatively small CRNN model | 210ms/step | 287ms/step |
| Relatively large CRNN model | 762ms/step | ~1080ms/step (432s per 400-step epoch; logged as `1s/step`) |

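
Keras rounds the per-step figure once it reaches a second, so the large-model Vega56 run is logged only as `432s 1s/step`. A more precise value can be recovered by dividing the epoch wall time by the step count. The snippet below is a minimal sketch of that arithmetic, not part of the benchmark itself: the helper name `ms_per_step` and the regular expression are assumptions made for illustration, and the two sample lines are copied from the P100 (epoch 23) and Vega56 (epoch 25) logs.

```python
import re

# Matches a finished Keras progress line such as
# "400/400 [=====] - 432s 1s/step - loss: ...".
# The pattern is an assumption based on the log format shown in this gist.
PROGRESS_RE = re.compile(r"^\s*(\d+)/(\d+) \[=+\] - (\d+)s ")


def ms_per_step(line):
    """Return epoch wall time divided by step count, in ms, or None if no match."""
    match = PROGRESS_RE.match(line)
    if match is None:
        return None
    steps_done, steps_total, seconds = (int(g) for g in match.groups())
    if steps_done != steps_total:
        # Only completed epochs carry the total wall time.
        return None
    return seconds * 1000.0 / steps_total


# Sample lines copied from the large-model runs in the logs.
p100_line = "400/400 [==============================] - 305s 762ms/step - loss: 91.3408 - val_loss: 103.0809"
vega_line = "400/400 [==============================] - 432s 1s/step - loss: 91.4994 - val_loss: 95.5991"

p100_ms = ms_per_step(p100_line)
vega_ms = ms_per_step(vega_line)
print(f"P100:   {p100_ms:.1f} ms/step")
print(f"Vega56: {vega_ms:.1f} ms/step")
print(f"Vega56 / P100 step-time ratio: {vega_ms / p100_ms:.2f}")
```

For these two lines the helper gives 762.5 and 1080.0 ms/step, so the P100 is roughly 1.4x faster on the large model by this measure. The epoch time is itself rounded to whole seconds, so the result is only accurate to a few milliseconds per step.
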
	```
	python3 examples/image_ocr.py
	Using TensorFlow backend.
	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:541: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4479: The name tf.truncated_normal is deprecated. Please use tf.random.truncated_normal instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4267: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4432: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:66: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

	Model: "model_1"
	__________________________________________________________________________________________________
	Layer (type)                    Output Shape         Param #     Connected to
	==================================================================================================
	the_input (InputLayer)          (None, 128, 64, 1)   0
	__________________________________________________________________________________________________
	conv1 (Conv2D)                  (None, 128, 64, 16)  160         the_input[0][0]
	__________________________________________________________________________________________________
	max1 (MaxPooling2D)             (None, 64, 32, 16)   0           conv1[0][0]
	__________________________________________________________________________________________________
	conv2 (Conv2D)                  (None, 64, 32, 16)   2320        max1[0][0]
	__________________________________________________________________________________________________
	max2 (MaxPooling2D)             (None, 32, 16, 16)   0           conv2[0][0]
	__________________________________________________________________________________________________
	reshape (Reshape)               (None, 32, 256)      0           max2[0][0]
	__________________________________________________________________________________________________
	dense1 (Dense)                  (None, 32, 32)       8224        reshape[0][0]
	__________________________________________________________________________________________________
	gru1 (GRU)                      (None, 32, 512)      837120      dense1[0][0]
	__________________________________________________________________________________________________
	gru1_b (GRU)                    (None, 32, 512)      837120      dense1[0][0]
	__________________________________________________________________________________________________
	add_1 (Add)                     (None, 32, 512)      0           gru1[0][0]
	                                                                 gru1_b[0][0]
	__________________________________________________________________________________________________
	gru2 (GRU)                      (None, 32, 512)      1574400     add_1[0][0]
	__________________________________________________________________________________________________
	gru2_b (GRU)                    (None, 32, 512)      1574400     add_1[0][0]
	__________________________________________________________________________________________________
	concatenate_1 (Concatenate)     (None, 32, 1024)     0           gru2[0][0]
	                                                                 gru2_b[0][0]
	__________________________________________________________________________________________________
	dense2 (Dense)                  (None, 32, 28)       28700       concatenate_1[0][0]
	__________________________________________________________________________________________________
	softmax (Activation)            (None, 32, 28)       0           dense2[0][0]
	==================================================================================================
	Total params: 4,862,444
	Trainable params: 4,862,444
	Non-trainable params: 0
	__________________________________________________________________________________________________
	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
	Instructions for updating:
	Use tf.where in 2.0, which has the same broadcast rule as np.where
	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:4551: The name tf.log is deprecated. Please use tf.math.log instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/optimizers.py:793: The name tf.train.Optimizer is deprecated. Please use tf.compat.v1.train.Optimizer instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1033: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:1020: The name tf.assign is deprecated. Please use tf.compat.v1.assign instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:3005: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

	Epoch 1/20
	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:190: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:197: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

	2019-12-30 14:41:49.026347: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
	2019-12-30 14:41:49.047770: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2000170000 Hz
	2019-12-30 14:41:49.048317: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2ddad80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
	2019-12-30 14:41:49.048356: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
	2019-12-30 14:41:49.054308: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
	2019-12-30 14:41:49.269336: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:49.270173: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2ddaf40 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
	2019-12-30 14:41:49.270202: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Tesla P100-PCIE-16GB, Compute Capability 6.0
	2019-12-30 14:41:49.271473: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:49.272461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
	name: Tesla P100-PCIE-16GB major: 6 minor: 0 memoryClockRate(GHz): 1.3285
	pciBusID: 0000:00:04.0
	2019-12-30 14:41:49.284335: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
	2019-12-30 14:41:49.505987: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
	2019-12-30 14:41:49.636105: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
	2019-12-30 14:41:49.653096: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
	2019-12-30 14:41:49.921397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
	2019-12-30 14:41:49.940272: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
	2019-12-30 14:41:50.451087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
	2019-12-30 14:41:50.451270: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:50.451997: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:50.452577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
	2019-12-30 14:41:50.455852: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
	2019-12-30 14:41:50.457216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
	2019-12-30 14:41:50.457246: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
	2019-12-30 14:41:50.457257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
	2019-12-30 14:41:50.458468: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:50.459135: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
	2019-12-30 14:41:50.459716: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
	2019-12-30 14:41:50.459756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15216 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0)
	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:207: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:216: The name tf.is_variable_initialized is deprecated. Please use tf.compat.v1.is_variable_initialized instead.

	WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:223: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

	2019-12-30 14:41:58.521534: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
	2019-12-30 14:42:00.035858: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
	400/400 [==============================] - 105s 262ms/step - loss: 42.1394 - val_loss: 47.7223

	Out of 256 samples: Mean edit distance:3.496 Mean normalized edit distance: 1.000
	Epoch 2/20
	400/400 [==============================] - 89s 221ms/step - loss: 42.0470 - val_loss: 47.7538

	Out of 256 samples: Mean edit distance:3.531 Mean normalized edit distance: 1.000
	Epoch 3/20
	400/400 [==============================] - 87s 217ms/step - loss: 42.0483 - val_loss: 47.7493

	Out of 256 samples: Mean edit distance:3.547 Mean normalized edit distance: 1.000
	Epoch 4/20
	400/400 [==============================] - 85s 213ms/step - loss: 42.0706 - val_loss: 47.6910

	Out of 256 samples: Mean edit distance:3.523 Mean normalized edit distance: 1.000
	Epoch 5/20
	400/400 [==============================] - 84s 209ms/step - loss: 42.0256 - val_loss: 47.7445

	Out of 256 samples: Mean edit distance:3.477 Mean normalized edit distance: 1.000
	Epoch 6/20
	400/400 [==============================] - 83s 207ms/step - loss: 42.0379 - val_loss: 47.7841

	Out of 256 samples: Mean edit distance:3.539 Mean normalized edit distance: 1.000
	Epoch 7/20
	400/400 [==============================] - 82s 206ms/step - loss: 42.0434 - val_loss: 47.6471

	Out of 256 samples: Mean edit distance:3.527 Mean normalized edit distance: 1.000
	Epoch 8/20
	400/400 [==============================] - 82s 205ms/step - loss: 42.0636 - val_loss: 47.7264

	Out of 256 samples: Mean edit distance:3.574 Mean normalized edit distance: 1.000
	Epoch 9/20
	400/400 [==============================] - 81s 204ms/step - loss: 42.0013 - val_loss: 47.6823

	Out of 256 samples: Mean edit distance:3.566 Mean normalized edit distance: 1.000
	Epoch 10/20
	400/400 [==============================] - 85s 212ms/step - loss: 42.0180 - val_loss: 47.7052

	Out of 256 samples: Mean edit distance:3.566 Mean normalized edit distance: 1.000
	Epoch 11/20
	400/400 [==============================] - 85s 212ms/step - loss: 42.0193 - val_loss: 47.7396

	Out of 256 samples: Mean edit distance:3.609 Mean normalized edit distance: 1.000
	Epoch 12/20
	400/400 [==============================] - 85s 212ms/step - loss: 42.0346 - val_loss: 47.6906

	Out of 256 samples: Mean edit distance:3.551 Mean normalized edit distance: 1.000
	Epoch 13/20
	400/400 [==============================] - 85s 211ms/step - loss: 42.0782 - val_loss: 47.7711

	Out of 256 samples: Mean edit distance:3.508 Mean normalized edit distance: 1.000
	Epoch 14/20
	400/400 [==============================] - 84s 210ms/step - loss: 42.0240 - val_loss: 47.8202

	Out of 256 samples: Mean edit distance:3.539 Mean normalized edit distance: 1.000
	Epoch 15/20
	400/400 [==============================] - 85s 211ms/step - loss: 42.0326 - val_loss: 47.6735

	Out of 256 samples: Mean edit distance:3.523 Mean normalized edit distance: 1.000
	Epoch 16/20
	400/400 [==============================] - 84s 211ms/step - loss: 42.0295 - val_loss: 47.7353

	Out of 256 samples: Mean edit distance:3.480 Mean normalized edit distance: 1.000
	Epoch 17/20
	400/400 [==============================] - 84s 210ms/step - loss: 42.0263 - val_loss: 47.7402

	Out of 256 samples: Mean edit distance:3.492 Mean normalized edit distance: 1.000
	Epoch 18/20
	400/400 [==============================] - 84s 210ms/step - loss: 41.9883 - val_loss: 47.6956

	Out of 256 samples: Mean edit distance:3.531 Mean normalized edit distance: 1.000
	Epoch 19/20
	400/400 [==============================] - 84s 210ms/step - loss: 42.0159 - val_loss: 47.7222

	Out of 256 samples: Mean edit distance:3.551 Mean normalized edit distance: 1.000
	Epoch 20/20
	400/400 [==============================] - 83s 209ms/step - loss: 42.0260 - val_loss: 47.7082

	Out of 256 samples: Mean edit distance:3.559 Mean normalized edit distance: 1.000
	Model: "model_3"
	__________________________________________________________________________________________________
	Layer (type)                    Output Shape         Param #     Connected to
	==================================================================================================
	the_input (InputLayer)          (None, 512, 64, 1)   0
	__________________________________________________________________________________________________
	conv1 (Conv2D)                  (None, 512, 64, 16)  160         the_input[0][0]
	__________________________________________________________________________________________________
	max1 (MaxPooling2D)             (None, 256, 32, 16)  0           conv1[0][0]
	__________________________________________________________________________________________________
	conv2 (Conv2D)                  (None, 256, 32, 16)  2320        max1[0][0]
	__________________________________________________________________________________________________
	max2 (MaxPooling2D)             (None, 128, 16, 16)  0           conv2[0][0]
	__________________________________________________________________________________________________
	reshape (Reshape)               (None, 128, 256)     0           max2[0][0]
	__________________________________________________________________________________________________
	dense1 (Dense)                  (None, 128, 32)      8224        reshape[0][0]
	__________________________________________________________________________________________________
	gru1 (GRU)                      (None, 128, 512)     837120      dense1[0][0]
	__________________________________________________________________________________________________
	gru1_b (GRU)                    (None, 128, 512)     837120      dense1[0][0]
	__________________________________________________________________________________________________
	add_2 (Add)                     (None, 128, 512)     0           gru1[0][0]
	                                                                 gru1_b[0][0]
	__________________________________________________________________________________________________
	gru2 (GRU)                      (None, 128, 512)     1574400     add_2[0][0]
	__________________________________________________________________________________________________
	gru2_b (GRU)                    (None, 128, 512)     1574400     add_2[0][0]
	__________________________________________________________________________________________________
	concatenate_2 (Concatenate)     (None, 128, 1024)    0           gru2[0][0]
	                                                                 gru2_b[0][0]
	__________________________________________________________________________________________________
	dense2 (Dense)                  (None, 128, 28)      28700       concatenate_2[0][0]
	__________________________________________________________________________________________________
	softmax (Activation)            (None, 128, 28)      0           dense2[0][0]
	==================================================================================================
	Total params: 4,862,444
	Trainable params: 4,862,444
	Non-trainable params: 0
	__________________________________________________________________________________________________
	Epoch 21/25
	400/400 [==============================] - 303s 758ms/step - loss: 37.3943 - val_loss: 42.5088

	Out of 256 samples: Mean edit distance:3.520 Mean normalized edit distance: 1.000
	Epoch 22/25
	400/400 [==============================] - 301s 752ms/step - loss: 89.2964 - val_loss: 95.7861

	Out of 256 samples: Mean edit distance:7.988 Mean normalized edit distance: 1.000
	Epoch 23/25
	400/400 [==============================] - 305s 762ms/step - loss: 91.3408 - val_loss: 103.0809

	Out of 256 samples: Mean edit distance:8.066 Mean normalized edit distance: 1.000
	Epoch 24/25
	400/400 [==============================] - 318s 795ms/step - loss: 91.4558 - val_loss: 103.4552

	Out of 256 samples: Mean edit distance:8.223 Mean normalized edit distance: 1.000
	Epoch 25/25
	400/400 [==============================] - 325s 814ms/step - loss: 91.4825 - val_loss: 104.3913

	Out of 256 samples: Mean edit distance:8.230 Mean normalized edit distance: 1.000

	```
	```
	[hari@localhost ~/Downloads/keras-2.3.1]$ python3 examples/image_ocr.py
	Using TensorFlow backend.
	2019-12-30 20:25:35.117548: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libhip_hcc.so
	2019-12-30 20:25:35.157206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1629] Found device 0 with properties:
	name: Vega 10 XL/XT [Radeon RX Vega 56/64]
	AMDGPU ISA: gfx900
	memoryClockRate (GHz) 1.59
	pciBusID 0000:0a:00.0
	2019-12-30 20:25:35.184890: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
	2019-12-30 20:25:35.185795: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
	2019-12-30 20:25:35.186547: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
	2019-12-30 20:25:35.186689: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
	2019-12-30 20:25:35.186753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
	2019-12-30 20:25:35.186971: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
	2019-12-30 20:25:35.189942: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3593360000 Hz
	2019-12-30 20:25:35.190280: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55ada2477f40 executing computations on platform Host. Devices:
	2019-12-30 20:25:35.190293: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
	2019-12-30 20:25:35.190389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1629] Found device 0 with properties:
	name: Vega 10 XL/XT [Radeon RX Vega 56/64]
	AMDGPU ISA: gfx900
	memoryClockRate (GHz) 1.59
	pciBusID 0000:0a:00.0
	2019-12-30 20:25:35.190406: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
	2019-12-30 20:25:35.190416: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
	2019-12-30 20:25:35.190425: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
	2019-12-30 20:25:35.190432: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
	2019-12-30 20:25:35.190463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
	2019-12-30 20:25:35.190500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
	2019-12-30 20:25:35.190506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
	2019-12-30 20:25:35.190510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
	2019-12-30 20:25:35.190578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7524 MB memory) -> physical GPU (device: 0, name: Vega 10 XL/XT [Radeon RX Vega 56/64], pci bus id: 0000:0a:00.0)
	Model: "model_1"
	__________________________________________________________________________________________________
	Layer (type)                    Output Shape         Param #     Connected to
	==================================================================================================
	the_input (InputLayer)          (None, 128, 64, 1)   0
	__________________________________________________________________________________________________
	conv1 (Conv2D)                  (None, 128, 64, 16)  160         the_input[0][0]
	__________________________________________________________________________________________________
	max1 (MaxPooling2D)             (None, 64, 32, 16)   0           conv1[0][0]
	__________________________________________________________________________________________________
	conv2 (Conv2D)                  (None, 64, 32, 16)   2320        max1[0][0]
	__________________________________________________________________________________________________
	max2 (MaxPooling2D)             (None, 32, 16, 16)   0           conv2[0][0]
	__________________________________________________________________________________________________
	reshape (Reshape)               (None, 32, 256)      0           max2[0][0]
	__________________________________________________________________________________________________
	dense1 (Dense)                  (None, 32, 32)       8224        reshape[0][0]
	__________________________________________________________________________________________________
	gru1 (GRU)                      (None, 32, 512)      837120      dense1[0][0]
	__________________________________________________________________________________________________
	gru1_b (GRU)                    (None, 32, 512)      837120      dense1[0][0]
	__________________________________________________________________________________________________
	add_1 (Add)                     (None, 32, 512)      0           gru1[0][0]
	                                                                 gru1_b[0][0]
	__________________________________________________________________________________________________
	gru2 (GRU)                      (None, 32, 512)      1574400     add_1[0][0]
	__________________________________________________________________________________________________
	gru2_b (GRU)                    (None, 32, 512)      1574400     add_1[0][0]
	__________________________________________________________________________________________________
	concatenate_1 (Concatenate)     (None, 32, 1024)     0           gru2[0][0]
	                                                                 gru2_b[0][0]
	__________________________________________________________________________________________________
	dense2 (Dense)                  (None, 32, 28)       28700       concatenate_1[0][0]
	__________________________________________________________________________________________________
	softmax (Activation)            (None, 32, 28)       0           dense2[0][0]
	==================================================================================================
	Total params: 4,862,444
	Trainable params: 4,862,444
	Non-trainable params: 0
	__________________________________________________________________________________________________
	Epoch 1/20
	2019-12-30 20:25:44.418024: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
	2019-12-30 20:25:44.432239: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
	400/400 [==============================] - 124s 311ms/step - loss: 42.1394 - val_loss: 49.8171

	Out of 256 samples: Mean edit distance:3.496 Mean normalized edit distance: 1.000
	Epoch 2/20
	400/400 [==============================] - 115s 289ms/step - loss: 42.0860 - val_loss: 49.8461

	Out of 256 samples: Mean edit distance:3.531 Mean normalized edit distance: 1.000
	Epoch 3/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0466 - val_loss: 44.5114

	Out of 256 samples: Mean edit distance:3.555 Mean normalized edit distance: 1.000
	Epoch 4/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0138 - val_loss: 48.9335

	Out of 256 samples: Mean edit distance:3.523 Mean normalized edit distance: 1.000
	Epoch 5/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0395 - val_loss: 46.2727

	Out of 256 samples: Mean edit distance:3.477 Mean normalized edit distance: 1.000
	Epoch 6/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0567 - val_loss: 48.0469

	Out of 256 samples: Mean edit distance:3.539 Mean normalized edit distance: 1.000
	Epoch 7/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0235 - val_loss: 46.7224

	Out of 256 samples: Mean edit distance:3.527 Mean normalized edit distance: 1.000
	Epoch 8/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0072 - val_loss: 49.3743

	Out of 256 samples: Mean edit distance:3.574 Mean normalized edit distance: 1.000
	Epoch 9/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0692 - val_loss: 45.8388

	Out of 256 samples: Mean edit distance:3.570 Mean normalized edit distance: 1.000
	Epoch 10/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0262 - val_loss: 48.0622

	Out of 256 samples: Mean edit distance:3.574 Mean normalized edit distance: 1.000
	Epoch 11/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0037 - val_loss: 48.9347

	Out of 256 samples: Mean edit distance:3.531 Mean normalized edit distance: 1.000
	Epoch 12/20
	400/400 [==============================] - 114s 286ms/step - loss: 42.0007 - val_loss: 47.6128

	Out of 256 samples: Mean edit distance:3.520 Mean normalized edit distance: 1.000
	Epoch 13/20
	400/400 [==============================] - 115s 286ms/step - loss: 41.9994 - val_loss: 48.9368

	Out of 256 samples: Mean edit distance:3.551 Mean normalized edit distance: 1.000
	Epoch 14/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0511 - val_loss: 47.1542

	Out of 256 samples: Mean edit distance:3.559 Mean normalized edit distance: 1.000
	Epoch 15/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0149 - val_loss: 48.0377

	Out of 256 samples: Mean edit distance:3.527 Mean normalized edit distance: 1.000
	Epoch 16/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0511 - val_loss: 48.0680

	Out of 256 samples: Mean edit distance:3.480 Mean normalized edit distance: 1.000
	Epoch 17/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0279 - val_loss: 48.4952

	Out of 256 samples: Mean edit distance:3.516 Mean normalized edit distance: 1.000
	Epoch 18/20
	400/400 [==============================] - 115s 287ms/step - loss: 42.0322 - val_loss: 44.9407

	Out of 256 samples: Mean edit distance:3.527 Mean normalized edit distance: 1.000
	Epoch 19/20
	400/400 [==============================] - 115s 286ms/step - loss: 42.0471 - val_loss: 46.2739

	Out of 256 samples: Mean edit distance:3.547 Mean normalized edit distance: 1.000
	Epoch 20/20
	400/400 [==============================] - 115s 286ms/step - loss: 42.0102 - val_loss: 47.5970

	Out of 256 samples: Mean edit distance:3.566 Mean normalized edit distance: 1.000
	Model: "model_3"
	__________________________________________________________________________________________________
	Layer (type)                    Output Shape         Param #     Connected to
	==================================================================================================
	the_input (InputLayer)          (None, 512, 64, 1)   0
	__________________________________________________________________________________________________
	conv1 (Conv2D)                  (None, 512, 64, 16)  160         the_input[0][0]
	__________________________________________________________________________________________________
	max1 (MaxPooling2D)             (None, 256, 32, 16)  0           conv1[0][0]
	__________________________________________________________________________________________________
	conv2 (Conv2D)                  (None, 256, 32, 16)  2320        max1[0][0]
	__________________________________________________________________________________________________
	max2 (MaxPooling2D)             (None, 128, 16, 16)  0           conv2[0][0]
	__________________________________________________________________________________________________
	reshape (Reshape)               (None, 128, 256)     0           max2[0][0]
	__________________________________________________________________________________________________
	dense1 (Dense)                  (None, 128, 32)      8224        reshape[0][0]
	__________________________________________________________________________________________________
	gru1 (GRU)                      (None, 128, 512)     837120      dense1[0][0]
	__________________________________________________________________________________________________
	gru1_b (GRU)                    (None, 128, 512)     837120      dense1[0][0]
	__________________________________________________________________________________________________
	add_2 (Add)                     (None, 128, 512)     0           gru1[0][0]
	                                                                 gru1_b[0][0]
	__________________________________________________________________________________________________
	gru2 (GRU)                      (None, 128, 512)     1574400     add_2[0][0]
	__________________________________________________________________________________________________
	gru2_b (GRU)                    (None, 128, 512)     1574400     add_2[0][0]
	__________________________________________________________________________________________________
	concatenate_2 (Concatenate)     (None, 128, 1024)    0           gru2[0][0]
	                                                                 gru2_b[0][0]
	__________________________________________________________________________________________________
	dense2 (Dense)                  (None, 128, 28)      28700       concatenate_2[0][0]
	__________________________________________________________________________________________________
	softmax (Activation)            (None, 128, 28)      0           dense2[0][0]
	==================================================================================================
	Total params: 4,862,444
	Trainable params: 4,862,444
	Non-trainable params: 0
	__________________________________________________________________________________________________
	Epoch 21/25
	400/400 [==============================] - 442s 1s/step - loss: 37.3943 - val_loss: 44.3882

	Out of 256 samples: Mean edit distance:3.496 Mean normalized edit distance: 1.000
	Epoch 22/25
	400/400 [==============================] - 432s 1s/step - loss: 89.3359 - val_loss: 99.9920

	Out of 256 samples: Mean edit distance:8.180 Mean normalized edit distance: 1.000
	Epoch 23/25
	400/400 [==============================] - 433s 1s/step - loss: 91.5029 - val_loss: 106.0722

	Out of 256 samples: Mean edit distance:8.051 Mean normalized edit distance: 1.000
	Epoch 24/25
	400/400 [==============================] - 432s 1s/step - loss: 91.4113 - val_loss: 105.2302

	Out of 256 samples: Mean edit distance:8.215 Mean normalized edit distance: 1.000
	Epoch 25/25
	400/400 [==============================] - 432s 1s/step - loss: 91.4994 - val_loss: 95.5991

	Out of 256 samples: Mean edit distance:8.188 Mean normalized edit distance: 1.000
	```
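
The parameter counts printed in the `model_3` summary above can be reproduced by hand. The sketch below is only a sanity check, not taken from the script itself: it assumes 3x3 convolution kernels and a GRU with a single bias vector per gate. Neither assumption is stated in the log, but both reproduce the printed numbers. The helper function names are made up for this illustration.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # kernel x kernel weights per (input, output) channel pair, plus one bias per filter
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_dim, out_dim):
    # weight matrix plus one bias per output unit
    return in_dim * out_dim + out_dim


def gru_params(in_dim, units):
    # three gates (update, reset, candidate), each with input weights,
    # recurrent weights and one bias vector (assumed GRU variant)
    return 3 * (units * (in_dim + units) + units)


counts = {
    "conv1": conv2d_params(3, 1, 16),     # assumed 3x3 kernel
    "conv2": conv2d_params(3, 16, 16),    # assumed 3x3 kernel
    "dense1": dense_params(256, 32),
    "gru1": gru_params(32, 512),
    "gru1_b": gru_params(32, 512),
    "gru2": gru_params(512, 512),
    "gru2_b": gru_params(512, 512),
    "dense2": dense_params(1024, 28),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n}")
print(f"total: {total}")
```

The four GRU layers account for almost all of the 4,862,444 parameters, so the recurrent part of the network is likely to dominate the step time.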
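
| Parameter | Nvidia Tesla-P100 | AMD Vega56 |
|---|---|---|
| Single-precision performance | 9.3 TFLOPS | 10.5 TFLOPS |
| Memory | 16 GB | 8 GB |
| Memory bandwidth | 732 GB/s | 410 GB/s |

| Model (time per training step) | Nvidia Tesla-P100 | AMD Vega56 |
|---|---|---|
| Relatively small CRNN model | 210 ms/step | 287 ms/step |
| Relatively large CRNN model | 762 ms/step | about 1080 ms/step (432 s per 400-step epoch) |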
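
The step times above can be turned into a relative slowdown with a few lines of arithmetic. This is a minimal sketch: it takes the 400 steps per epoch from the `400/400` progress bars in the log and derives the Vega56 step time for the large model from its 432 s epoch time, because Keras rounds the displayed value to `1s/step`. The function and variable names are made up for this illustration.

```python
STEPS_PER_EPOCH = 400  # from the "400/400" progress bars in the logs


def ms_per_step(epoch_seconds, steps=STEPS_PER_EPOCH):
    # convert a whole-epoch wall time into milliseconds per training step
    return epoch_seconds * 1000.0 / steps


# per-step times in ms as (P100, Vega56); the last value is derived, not logged
results = {
    "small CRNN": (210.0, 287.0),
    "large CRNN": (762.0, ms_per_step(432)),
}

slowdown = {name: vega / p100 for name, (p100, vega) in results.items()}

for name, ratio in slowdown.items():
    print(f"{name}: Vega56 takes {ratio:.2f}x the P100 step time")
```

On these numbers the Vega56 needs roughly 1.4 times as long per step as the P100 for both model sizes, despite its higher quoted single-precision throughput.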