AmosChenYQ/op-variant.log

## op-variant.log
2022-06-04 06:28:32.200335: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-04 06:28:32.347341: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0
2022-06-04 06:28:32.347417: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled
2022-06-04 06:28:32.347432: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
2022-06-04 06:28:32.347437: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set.
2022-06-04 06:28:32.348248: I tensorflow/core/util/util.cc:168] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2022-06-04 06:28:32.352741: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-04 06:28:32.389752: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0
2022-06-04 06:28:32.389794: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled
2022-06-04 06:28:32.389801: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set)
2022-06-04 06:28:32.389806: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set.
2022-06-04 06:28:33.063080: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer.so.7
2022-06-04 06:28:33.857699: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_XLA_FLAGS found arguments:
2022-06-04 06:28:33.857784: I tensorflow/compiler/xla/parse_flags_from_env.cc:199]   argv[0] = <argv[0]>
2022-06-04 06:28:33.857819: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_JITRT_FLAGS found arguments:
2022-06-04 06:28:33.857842: I tensorflow/compiler/xla/parse_flags_from_env.cc:199]   argv[0] = <argv[0]>
2022-06-04 06:28:33.857872: I tensorflow/compiler/jit/xla_cpu_device.cc:44] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA device creation not requested
2022-06-04 06:28:33.857941: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2022-06-04 06:28:33.934576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties:
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2022-06-04 06:28:33.934892: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties:
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2022-06-04 06:28:33.934934: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-04 06:28:33.934978: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-06-04 06:28:33.934998: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-06-04 06:28:33.936117: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2022-06-04 06:28:33.936378: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2022-06-04 06:28:33.937289: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2022-06-04 06:28:33.938048: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2022-06-04 06:28:33.938093: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2022-06-04 06:28:33.938870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1
2022-06-04 06:28:33.938898: I tensorflow/compiler/jit/xla_gpu_device.cc:48] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA devices creation not required
2022-06-04 06:28:33.940100: I ./tensorflow/core/common_runtime/mkl_cpu_allocator.h:178] MklCPUAllocator: Setting max_mem_bytes: 134837268480
2022-06-04 06:28:33.940133: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: mklcpu
2022-06-04 06:28:33.940147: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B
2022-06-04 06:28:33.940152: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B
2022-06-04 06:28:33.940159: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB
2022-06-04 06:28:33.940165: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB
2022-06-04 06:28:33.940170: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB
2022-06-04 06:28:33.940175: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB
2022-06-04 06:28:33.940181: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB
2022-06-04 06:28:33.940187: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB
2022-06-04 06:28:33.940192: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB
2022-06-04 06:28:33.940198: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB
2022-06-04 06:28:33.940203: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB
2022-06-04 06:28:33.940208: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB
2022-06-04 06:28:33.940214: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB
2022-06-04 06:28:33.940219: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB
2022-06-04 06:28:33.940224: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB
2022-06-04 06:28:33.940230: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB
2022-06-04 06:28:33.940235: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB
2022-06-04 06:28:33.940241: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB
2022-06-04 06:28:33.940246: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB
2022-06-04 06:28:33.940258: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB
2022-06-04 06:28:33.940264: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB
2022-06-04 06:28:33.940323: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-04 06:28:33.943141: I tensorflow/compiler/jit/xla_cpu_device.cc:58] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-06-04 06:28:34.155250: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties:
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2022-06-04 06:28:34.155519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties:
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s
2022-06-04 06:28:34.156166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1
2022-06-04 06:28:34.156202: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2022-06-04 06:28:34.631744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0
2022-06-04 06:28:35.005654: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0
2022-06-04 06:28:35.005715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1384] TensorFlow compiled with CUDA 11.2 and cuDNN 8.1.0
2022-06-04 06:28:35.005754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1396] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-06-04 06:28:35.005763: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402]      0 1
2022-06-04 06:28:35.005770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 0:   N N
2022-06-04 06:28:35.005774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 1:   N N
2022-06-04 06:28:35.006825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 0 TfDeviceId 0 on bus 1 numa: 0 pci: 0000:18:00.0 DeviceLocality: bus_id: 1
links {
}

2022-06-04 06:28:35.007053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 1 TfDeviceId 1 on bus 2 numa: 1 pci: 0000:86:00.0 DeviceLocality: bus_id: 2
numa_node: 1
links {
}

2022-06-04 06:28:35.007266: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_0_bfc
2022-06-04 06:28:35.007280: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B
2022-06-04 06:28:35.007285: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B
2022-06-04 06:28:35.007293: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB
2022-06-04 06:28:35.007299: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB
2022-06-04 06:28:35.007304: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB
2022-06-04 06:28:35.007309: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB
2022-06-04 06:28:35.007315: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB
2022-06-04 06:28:35.007320: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB
2022-06-04 06:28:35.007325: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB
2022-06-04 06:28:35.007331: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB
2022-06-04 06:28:35.007336: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB
2022-06-04 06:28:35.007342: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB
2022-06-04 06:28:35.007347: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB
2022-06-04 06:28:35.007353: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB
2022-06-04 06:28:35.007358: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB
2022-06-04 06:28:35.007363: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB
2022-06-04 06:28:35.007368: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB
2022-06-04 06:28:35.007373: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB
2022-06-04 06:28:35.007378: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB
2022-06-04 06:28:35.007384: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB
2022-06-04 06:28:35.007389: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB
2022-06-04 06:28:35.007429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9657 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5
2022-06-04 06:28:35.007457: I tensorflow/stream_executor/stream.cc:261] [stream=0x20490750,impl=0x20490050] Called Stream::Stream(parent=0x4e8f5f0)
2022-06-04 06:28:35.007472: I tensorflow/stream_executor/stream.cc:308] [stream=0x20490750,impl=0x20490050] Called Stream::Init()
2022-06-04 06:28:35.007527: I tensorflow/stream_executor/stream.cc:261] [stream=0x20435d70,impl=0x673fd70] Called Stream::Stream(parent=0x4e8f5f0)
2022-06-04 06:28:35.007536: I tensorflow/stream_executor/stream.cc:308] [stream=0x20435d70,impl=0x673fd70] Called Stream::Init()
2022-06-04 06:28:35.007549: I tensorflow/stream_executor/stream.cc:261] [stream=0x20436060,impl=0x2048fb40] Called Stream::Stream(parent=0x4e8f5f0)
2022-06-04 06:28:35.007555: I tensorflow/stream_executor/stream.cc:308] [stream=0x20436060,impl=0x2048fb40] Called Stream::Init()
2022-06-04 06:28:35.007567: I tensorflow/stream_executor/stream.cc:261] [stream=0x20436350,impl=0x2048fbd0] Called Stream::Stream(parent=0x4e8f5f0)
2022-06-04 06:28:35.007573: I tensorflow/stream_executor/stream.cc:308] [stream=0x20436350,impl=0x2048fbd0] Called Stream::Init()
2022-06-04 06:28:35.007589: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: gpu_host_bfc
2022-06-04 06:28:35.007596: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B
2022-06-04 06:28:35.007601: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B
2022-06-04 06:28:35.007606: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB
2022-06-04 06:28:35.007611: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB
2022-06-04 06:28:35.007616: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB
2022-06-04 06:28:35.007621: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB
2022-06-04 06:28:35.007627: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB
2022-06-04 06:28:35.007632: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB
2022-06-04 06:28:35.007637: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB
2022-06-04 06:28:35.007642: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB
2022-06-04 06:28:35.007648: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB
2022-06-04 06:28:35.007653: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB
2022-06-04 06:28:35.007658: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB
2022-06-04 06:28:35.007663: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB
2022-06-04 06:28:35.007668: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB
2022-06-04 06:28:35.007673: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB
2022-06-04 06:28:35.007678: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB
2022-06-04 06:28:35.007683: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB
2022-06-04 06:28:35.007688: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB
2022-06-04 06:28:35.007694: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB
2022-06-04 06:28:35.007699: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB
2022-06-04 06:28:35.008353: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_1_bfc
2022-06-04 06:28:35.008368: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B
2022-06-04 06:28:35.008373: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B
2022-06-04 06:28:35.008379: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB
2022-06-04 06:28:35.008384: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB
2022-06-04 06:28:35.008389: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB
2022-06-04 06:28:35.008394: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB
2022-06-04 06:28:35.008399: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB
2022-06-04 06:28:35.008404: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB
2022-06-04 06:28:35.008409: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB
2022-06-04 06:28:35.008414: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB
2022-06-04 06:28:35.008419: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB
2022-06-04 06:28:35.008424: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB
2022-06-04 06:28:35.008429: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB
2022-06-04 06:28:35.008434: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB
2022-06-04 06:28:35.008439: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB
2022-06-04 06:28:35.008443: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB
2022-06-04 06:28:35.008449: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB
2022-06-04 06:28:35.008453: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB
2022-06-04 06:28:35.008458: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB
2022-06-04 06:28:35.008463: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB
2022-06-04 06:28:35.008468: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB
2022-06-04 06:28:35.008491: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9657 MB memory:  -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5
2022-06-04 06:28:35.008505: I tensorflow/stream_executor/stream.cc:261] [stream=0x204328a0,impl=0x67411f0] Called Stream::Stream(parent=0x4de9320)
2022-06-04 06:28:35.008512: I tensorflow/stream_executor/stream.cc:308] [stream=0x204328a0,impl=0x67411f0] Called Stream::Init()
2022-06-04 06:28:35.008534: I tensorflow/stream_executor/stream.cc:261] [stream=0x20433b50,impl=0x6741220] Called Stream::Stream(parent=0x4de9320)
2022-06-04 06:28:35.008542: I tensorflow/stream_executor/stream.cc:308] [stream=0x20433b50,impl=0x6741220] Called Stream::Init()
2022-06-04 06:28:35.008554: I tensorflow/stream_executor/stream.cc:261] [stream=0x208ae070,impl=0x1b137dc0] Called Stream::Stream(parent=0x4de9320)
2022-06-04 06:28:35.008560: I tensorflow/stream_executor/stream.cc:308] [stream=0x208ae070,impl=0x1b137dc0] Called Stream::Init()
2022-06-04 06:28:35.008571: I tensorflow/stream_executor/stream.cc:261] [stream=0x208ae290,impl=0x2048f950] Called Stream::Stream(parent=0x4de9320)
2022-06-04 06:28:35.008577: I tensorflow/stream_executor/stream.cc:308] [stream=0x208ae290,impl=0x2048f950] Called Stream::Init()
2022-06-04 06:28:35.008900: I tensorflow/compiler/jit/xla_gpu_device.cc:79] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-06-04 06:28:35.008968: I tensorflow/core/common_runtime/process_util.cc:159] Session inter op parallelism threads: 32
2022-06-04 06:28:35.033662: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 20774.1us

2022-06-04 06:28:35.033715: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 9.349us

2022-06-04 06:28:35.033740: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice _EagerConst: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.033747: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [_EagerConst] on device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.033772: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.033796: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [_EagerConst] already set to: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.034780: I tensorflow/core/common_runtime/eager/execute.cc:823] signature {
  name: "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0"
  input_arg {
    name: "input"
    type_attr: "T"
  }
  output_arg {
    name: "output"
    type_attr: "T"
  }
  attr {
    name: "T"
    type: "type"
  }
}
node_def {
  name: "_EagerConst"
  op: "_EagerConst"
  input: "input:0"
  device: "/job:localhost/replica:0/task:0/device:GPU:0"
  attr {
    key: "T"
    value {
      placeholder: "T"
    }
  }
}
ret {
  key: "output"
  value: "_EagerConst:output:0"
}

2022-06-04 06:28:35.043237: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.043318: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.043350: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.043403: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" on default device "/job:localhost/replica:0/task:0/device:GPU:0"
2022-06-04 06:28:35.045064: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:191] None of the MLIR Optimization Passes are enabled (registered 3)
2022-06-04 06:28:35.045092: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.045103: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.045108: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.045117: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.045122: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.045128: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.045145: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045161: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045176: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.045188: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.045195: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.045208: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.045216: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.045221: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.045227: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.045235: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.045247: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.045252: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.045262: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045270: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.045313: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045324: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.045335: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045341: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.045347: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.045352: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.045390: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.045398: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.045404: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.045413: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045445: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 5 of 5 nodes in 5 visits
2022-06-04 06:28:35.045457: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.045467: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.045497: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input}}'Will fall back to a default kernel.

2022-06-04 06:28:35.045510: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::input takes 18.437us

2022-06-04 06:28:35.045518: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 0.956us

2022-06-04 06:28:35.045534: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 5.369us

2022-06-04 06:28:35.045540: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 0.655us

2022-06-04 06:28:35.045561: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.045569: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 21.386us

2022-06-04 06:28:35.045576: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 1.186us

2022-06-04 06:28:35.045592: I tensorflow/core/common_runtime/placer.cc:124] input(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.045605: I tensorflow/core/common_runtime/placer.cc:124] _EagerConst(_EagerConst) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.045613: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.045619: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.045625: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.045630: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.045637: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.045643: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.045648: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.045652: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.045668: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.045677: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.045684: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.045688: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.049630: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:CPU::_XlaLaunch-op takes 1.564us

2022-06-04 06:28:35.049650: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp
2022-06-04 06:28:35.049661: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:GPU::_XlaLaunch-op takes 0.645us

2022-06-04 06:28:35.049666: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp
2022-06-04 06:28:35.049695: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:XLA_GPU_JIT::_EagerConst takes 0.877us

2022-06-04 06:28:35.049789: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 13 us (cumulative: 13 us, max: 13 us, #called: 1)
2022-06-04 06:28:35.049841: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 441 us (cumulative: 441 us, max: 441 us, #called: 1)
2022-06-04 06:28:35.049855: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.049860: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.049878: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.049887: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.049894: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.049899: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.049920: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.049930: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.050026: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.050035: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.050040: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.050056: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.050134: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.050159: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.050185: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.050190: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.050198: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.050202: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.050206: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.050216: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.050227: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.050252: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 1.366us

2022-06-04 06:28:35.050270: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 4.899us

2022-06-04 06:28:35.050284: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.050294: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 14.93us

2022-06-04 06:28:35.050346: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from input (_Arg) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory for _EagerConst (_EagerConst) on /job:localhost/replica:0/task:0/device:GPU:0 in host memory
2022-06-04 06:28:35.050373: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=1
2022-06-04 06:28:35.050446: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.050456: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.050463: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.053238: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053258: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053262: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053268: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053273: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053277: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053282: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053287: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053291: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.053300: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.053320: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_108004928_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.053338: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_550073232_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.053420: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.053444: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.053563: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_0 with handle 0 status: OK
2022-06-04 06:28:35.053605: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_1' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.053620: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_1 on device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.053687: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_3090859302171296086_1 with handle 1 status: OK
2022-06-04 06:28:35.053754: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.053789: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0
2022-06-04 06:28:35.053832: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.053901: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node input/_1}} = _Send[T=DT_INT32, _dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_input", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input)
2022-06-04 06:28:35.053921: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::input/_1 takes 1.279us

2022-06-04 06:28:35.053929: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::input/_1 takes 0.305us

2022-06-04 06:28:35.053966: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node input/_1}} = _Send[T=DT_INT32, _dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_input", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) takes 71.781us

2022-06-04 06:28:35.054013: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.054026: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1
2022-06-04 06:28:35.054048: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 0.899us

2022-06-04 06:28:35.054054: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.054080: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054087: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 8.147us

2022-06-04 06:28:35.054094: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054100: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.665us

2022-06-04 06:28:35.054106: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054112: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.609us

2022-06-04 06:28:35.054125: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 4.609us

2022-06-04 06:28:35.054135: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054141: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 7.786us

2022-06-04 06:28:35.054148: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1
2022-06-04 06:28:35.054153: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 1 -> 1
2022-06-04 06:28:35.054160: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054165: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.455us

2022-06-04 06:28:35.054171: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054176: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.094us

2022-06-04 06:28:35.054182: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054187: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.512us

2022-06-04 06:28:35.054196: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.386us

2022-06-04 06:28:35.054204: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054209: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.922us

2022-06-04 06:28:35.054216: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1
2022-06-04 06:28:35.054220: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 1 -> 1
2022-06-04 06:28:35.054234: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node input/_2}} = _HostRecv[_dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_input", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()
2022-06-04 06:28:35.054245: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054250: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.603us

2022-06-04 06:28:35.054256: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054261: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 4.985us

2022-06-04 06:28:35.054280: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node input/_2}} = _HostRecv[_dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_input", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 45.789us

2022-06-04 06:28:35.054288: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _EagerConst}} = _EagerConst[T=DT_INT32, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](input/_2)
2022-06-04 06:28:35.054296: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.442us

2022-06-04 06:28:35.054302: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.286us

2022-06-04 06:28:35.054316: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _EagerConst}} = _EagerConst[T=DT_INT32, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](input/_2) takes 27.878us

2022-06-04 06:28:35.054324: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_INT32, index=0](_EagerConst)
2022-06-04 06:28:35.054332: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054337: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.681us

2022-06-04 06:28:35.054344: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.054349: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.202us

2022-06-04 06:28:35.054359: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_INT32, index=0](_EagerConst) takes 34.85us

2022-06-04 06:28:35.064074: I tensorflow/stream_executor/stream_executor_pimpl.cc:534] Called StreamExecutor::Allocate(size=10126688256, memory_space=0) returns 0x7f13ec000000
2022-06-04 06:28:35.064097: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 9.43GiB bytes for GPU_0_bfc.
2022-06-04 06:28:35.064103: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 9.43GiB
2022-06-04 06:28:35.064107: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x7f13ec000000 to 0x7f1647990000
2022-06-04 06:28:35.194793: I tensorflow/stream_executor/stream_executor_pimpl.cc:623] Called StreamExecutor::SynchronousMemZero(location=0x7fff2a36eff0, size=1028)
2022-06-04 06:28:35.195254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.195271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst#
2022-06-04 06:28:35.195290: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.195301: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.195323: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.195333: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.195442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.195459: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.195470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[2])#
2022-06-04 06:28:35.195478: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.195482: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.195507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.195516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.195522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[2])#
2022-06-04 06:28:35.195530: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.195535: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.195550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.195855: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 9.656us

2022-06-04 06:28:35.195875: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::RandomUniform takes 3.339us

2022-06-04 06:28:35.195889: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice RandomUniform: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.195894: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [RandomUniform] on device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.195908: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.195921: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [RandomUniform] already set to: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.196116: I tensorflow/core/common_runtime/eager/execute.cc:823] signature {
  name: "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0"
  input_arg {
    name: "shape"
    type_attr: "T"
  }
  output_arg {
    name: "output"
    type_attr: "dtype"
  }
  attr {
    name: "seed"
    type: "int"
    default_value {
      i: 0
    }
  }
  attr {
    name: "seed2"
    type: "int"
    default_value {
      i: 0
    }
  }
  attr {
    name: "dtype"
    type: "type"
    allowed_values {
      list {
        type: DT_HALF
        type: DT_BFLOAT16
        type: DT_FLOAT
        type: DT_DOUBLE
      }
    }
  }
  attr {
    name: "T"
    type: "type"
    allowed_values {
      list {
        type: DT_INT32
        type: DT_INT64
      }
    }
  }
  is_stateful: true
}
node_def {
  name: "RandomUniform"
  op: "RandomUniform"
  input: "shape:0"
  device: "/job:localhost/replica:0/task:0/device:GPU:0"
  attr {
    key: "T"
    value {
      placeholder: "T"
    }
  }
  attr {
    key: "dtype"
    value {
      placeholder: "dtype"
    }
  }
  attr {
    key: "seed"
    value {
      placeholder: "seed"
    }
  }
  attr {
    key: "seed2"
    value {
      placeholder: "seed2"
    }
  }
}
ret {
  key: "output"
  value: "RandomUniform:output:0"
}

2022-06-04 06:28:35.196152: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.196199: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.196219: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.196277: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0" on default device "/job:localhost/replica:0/task:0/device:GPU:0"
2022-06-04 06:28:35.196465: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.196476: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.196482: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.196488: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.196493: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.196499: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.196513: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196529: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196540: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.196545: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.196551: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.196562: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.196569: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.196573: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.196579: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.196586: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.196592: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.196597: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.196609: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196617: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.196654: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196664: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.196675: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196685: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.196692: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.196696: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.196718: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.196726: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.196733: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.196741: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196760: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 5 of 5 nodes in 5 visits
2022-06-04 06:28:35.196772: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.196782: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.196809: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape}}'Will fall back to a default kernel.

2022-06-04 06:28:35.196820: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::shape takes 16.598us

2022-06-04 06:28:35.196828: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::shape takes 1.279us

2022-06-04 06:28:35.196839: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.851us

2022-06-04 06:28:35.196846: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::RandomUniform takes 1.723us

2022-06-04 06:28:35.196858: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.196864: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 10.099us

2022-06-04 06:28:35.196870: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.662us

2022-06-04 06:28:35.196886: I tensorflow/core/common_runtime/placer.cc:124] shape(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.196897: I tensorflow/core/common_runtime/placer.cc:124] RandomUniform(RandomUniform) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.196902: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.196908: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.196913: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.196918: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.196925: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.196930: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.196935: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.196940: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.196947: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.196951: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.196956: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.196961: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.197218: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:XLA_GPU_JIT::RandomUniform takes 1.954us

2022-06-04 06:28:35.197260: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 285 us (cumulative: 726 us, max: 441 us, #called: 2)
2022-06-04 06:28:35.197272: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.197278: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.197289: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.197294: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.197299: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.197304: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.197327: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.197334: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.197511: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.197521: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.197526: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.197543: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.197621: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.197644: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.197666: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.197673: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.197681: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.197686: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.197690: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.197700: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.197712: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.197731: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::shape takes 1.302us

2022-06-04 06:28:35.197746: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.617us

2022-06-04 06:28:35.197760: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.197769: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 14.106us

2022-06-04 06:28:35.197813: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from shape (_Arg) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory for RandomUniform (RandomUniform) on /job:localhost/replica:0/task:0/device:GPU:0 in host memory
2022-06-04 06:28:35.197839: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=1
2022-06-04 06:28:35.197901: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.197909: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.197914: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.197937: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197944: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197949: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197954: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197959: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197963: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197968: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197972: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197977: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.197983: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.197995: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_550073232_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.198010: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_549162560_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.198070: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.198092: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.198168: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_0 with handle 3 status: OK
2022-06-04 06:28:35.198209: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_1' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.198226: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_1 on device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.198301: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_7735306709690940494_1 with handle 4 status: OK
2022-06-04 06:28:35.198351: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.198377: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3
2022-06-04 06:28:35.198418: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.198472: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node shape/_1}} = _Send[T=DT_INT32, _dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_shape", _device="/job:localhost/replica:0/task:0/device:CPU:0"](shape)
2022-06-04 06:28:35.198490: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::shape/_1 takes 1.569us

2022-06-04 06:28:35.198497: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::shape/_1 takes 0.291us

2022-06-04 06:28:35.198525: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node shape/_1}} = _Send[T=DT_INT32, _dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_shape", _device="/job:localhost/replica:0/task:0/device:CPU:0"](shape) takes 57.593us

2022-06-04 06:28:35.198546: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.198556: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4
2022-06-04 06:28:35.198575: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.198601: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198610: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 10.779us

2022-06-04 06:28:35.198617: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198622: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.256us

2022-06-04 06:28:35.198632: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198638: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 6.912us

2022-06-04 06:28:35.198649: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 2.348us

2022-06-04 06:28:35.198659: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198664: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 7.839us

2022-06-04 06:28:35.198672: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1
2022-06-04 06:28:35.198677: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 0 -> 0
2022-06-04 06:28:35.198683: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198689: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.396us

2022-06-04 06:28:35.198695: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198701: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.293us

2022-06-04 06:28:35.198709: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198714: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 5.687us

2022-06-04 06:28:35.198722: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 0.818us

2022-06-04 06:28:35.198730: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198735: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.396us

2022-06-04 06:28:35.198741: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1
2022-06-04 06:28:35.198746: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 0 -> 0
2022-06-04 06:28:35.198760: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node shape/_2}} = _HostRecv[_dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()
2022-06-04 06:28:35.198769: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198777: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 6.658us

2022-06-04 06:28:35.198783: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198788: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 5.057us

2022-06-04 06:28:35.198812: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node shape/_2}} = _HostRecv[_dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-5016134525313837416, tensor_name="edge_2_shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 51.355us

2022-06-04 06:28:35.198825: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](shape/_2)
2022-06-04 06:28:35.198838: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.297us

2022-06-04 06:28:35.198847: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 0.994us

2022-06-04 06:28:35.198872: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](shape/_2) takes 46.336us

2022-06-04 06:28:35.198885: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](RandomUniform)
2022-06-04 06:28:35.198897: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198905: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 9.676us

2022-06-04 06:28:35.198915: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.198922: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 9.074us

2022-06-04 06:28:35.198937: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](RandomUniform) takes 51.057us

2022-06-04 06:28:35.198956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.198964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform#
2022-06-04 06:28:35.198975: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.198981: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.198992: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.198999: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199040: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.199053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.199062: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[2])#
2022-06-04 06:28:35.199140: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199151: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199194: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.199208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.199218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[1024,128])#
2022-06-04 06:28:35.199227: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199234: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.199587: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.199615: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.199630: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0
2022-06-04 06:28:35.199643: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.199651: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1
2022-06-04 06:28:35.199660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.199666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst#
2022-06-04 06:28:35.199674: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.199679: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.199687: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199692: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199722: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.199731: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.199737: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[2])#
2022-06-04 06:28:35.199743: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199748: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.199767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.199772: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[2])#
2022-06-04 06:28:35.199777: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199782: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.199862: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.199877: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.199888: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3
2022-06-04 06:28:35.199896: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.199903: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4
2022-06-04 06:28:35.199910: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.199916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform#
2022-06-04 06:28:35.199922: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.199926: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.199933: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.199938: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.199962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.199970: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.199976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[2])#
2022-06-04 06:28:35.199999: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200006: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.200038: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200044: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[1024,128])#
2022-06-04 06:28:35.200049: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200054: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200067: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200181: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.200199: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.200210: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0
2022-06-04 06:28:35.200219: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200226: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1
2022-06-04 06:28:35.200234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst#
2022-06-04 06:28:35.200247: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200258: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200268: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200273: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.200315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])#
2022-06-04 06:28:35.200320: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200325: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.200345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])#
2022-06-04 06:28:35.200355: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200360: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200373: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200420: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.200434: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.200447: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3
2022-06-04 06:28:35.200458: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.200472: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4
2022-06-04 06:28:35.200480: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform#
2022-06-04 06:28:35.200491: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.200496: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.200503: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200507: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.200547: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])#
2022-06-04 06:28:35.200570: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200580: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200606: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.200618: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200628: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[4,128,128])#
2022-06-04 06:28:35.200636: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200643: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200670: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.200791: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.200816: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.200836: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0
2022-06-04 06:28:35.200851: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200862: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1
2022-06-04 06:28:35.200873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst#
2022-06-04 06:28:35.200891: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200899: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.200910: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200917: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.200948: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.200961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.200971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])#
2022-06-04 06:28:35.200980: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.200987: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.201007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.201019: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.201027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])#
2022-06-04 06:28:35.201035: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.201041: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.201060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.201130: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.201153: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.201171: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3
2022-06-04 06:28:35.201187: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.201194: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4
2022-06-04 06:28:35.201203: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.201211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform#
2022-06-04 06:28:35.201221: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.201229: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.201241: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.201247: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.201275: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.201287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.201296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])#
2022-06-04 06:28:35.201320: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.201327: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.201403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.201410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.201416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[16,128,128])#
2022-06-04 06:28:35.201421: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.201425: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.201438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.204677: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 3.352us

2022-06-04 06:28:35.204700: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.228us

2022-06-04 06:28:35.204711: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.204716: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.204725: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.204847: I tensorflow/core/common_runtime/eager/execute.cc:823] signature {
  name: "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0"
  output_arg {
    name: "output"
    type: DT_STRING
  }
  attr {
    name: "T"
    type: "list(type)"
    has_minimum: true
  }
  attr {
    name: "template"
    type: "string"
    default_value {
      s: "%s"
    }
  }
  attr {
    name: "placeholder"
    type: "string"
    default_value {
      s: "%s"
    }
  }
  attr {
    name: "summarize"
    type: "int"
    default_value {
      i: 3
    }
  }
}
node_def {
  name: "StringFormat"
  op: "StringFormat"
  device: "/job:localhost/replica:0/task:0/device:CPU:0"
  attr {
    key: "T"
    value {
      placeholder: "T"
    }
  }
  attr {
    key: "placeholder"
    value {
      placeholder: "placeholder"
    }
  }
  attr {
    key: "summarize"
    value {
      placeholder: "summarize"
    }
  }
  attr {
    key: "template"
    value {
      placeholder: "template"
    }
  }
}
ret {
  key: "output"
  value: "StringFormat:output:0"
}

2022-06-04 06:28:35.204867: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.204894: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.204913: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.204940: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.205047: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.205057: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.205062: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.205068: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.205073: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.205078: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.205089: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205102: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205111: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.205116: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.205122: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.205130: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.205136: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.205141: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.205146: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.205152: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.205157: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.205162: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.205171: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205178: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.205208: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205217: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.205227: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205233: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.205239: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.205244: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.205257: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.205262: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.205267: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.205275: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205291: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.205299: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205309: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.205326: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 1.704us

2022-06-04 06:28:35.205336: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.503us

2022-06-04 06:28:35.205346: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.205352: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 8.574us

2022-06-04 06:28:35.205358: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.763us

2022-06-04 06:28:35.205372: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.205380: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.205385: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.205391: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.205395: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.205402: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.205407: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.205411: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.205416: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.205422: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.205427: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.205432: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.205437: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.205659: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.711us

2022-06-04 06:28:35.205706: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 257 us (cumulative: 983 us, max: 441 us, #called: 3)
2022-06-04 06:28:35.205716: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.205721: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.205730: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.205735: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.205740: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.205745: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.205764: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.205771: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.205795: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.205802: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.205808: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.205820: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205872: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205892: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.205908: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.205915: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.205921: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.205926: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.205930: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.205940: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.205950: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.205965: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.977us

2022-06-04 06:28:35.205977: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.506us

2022-06-04 06:28:35.205997: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.206035: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.206043: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.206048: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.208746: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.208797: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_549294672_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.208878: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15142775333685145957_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.208905: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15142775333685145957_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.209015: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15142775333685145957_0 with handle 6 status: OK
2022-06-04 06:28:35.209066: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.209089: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 6
2022-06-04 06:28:35.209133: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.723us

2022-06-04 06:28:35.209163: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4
2022-06-04 06:28:35.209297: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.209312: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.209320: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 7.916us

2022-06-04 06:28:35.209326: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.209334: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 7.883us

2022-06-04 06:28:35.209353: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 56.735us

2022-06-04 06:28:35.209370: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.209384: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.61us

2022-06-04 06:28:35.209390: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.313us

2022-06-04 06:28:35.209418: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 50.949us

2022-06-04 06:28:35.209430: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7874946874492079074, tensor_name="StringFormat:0"](StringFormat)
2022-06-04 06:28:35.209439: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.633us

2022-06-04 06:28:35.209446: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.287us

2022-06-04 06:28:35.209466: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7874946874492079074, tensor_name="StringFormat:0"](StringFormat) takes 36.567us

2022-06-04 06:28:35.209512: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0
2022-06-04 06:28:35.209525: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0
2022-06-04 06:28:35.209549: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7874946874492079074, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0
2022-06-04 06:28:35.209620: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.253us

2022-06-04 06:28:35.209633: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant
2022-06-04 06:28:35.209691: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.209750: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 0 costs 0>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.209765: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.512us

2022-06-04 06:28:35.209771: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.252us

2022-06-04 06:28:35.209797: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 0 costs 0>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 57.233us

2022-06-04 06:28:35.209807: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat)
2022-06-04 06:28:35.209817: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.821us

2022-06-04 06:28:35.209824: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.271us

2022-06-04 06:28:35.209834: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 26.613us

2022-06-04 06:28:35.209936: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:GPU::PrintV2 takes 2.174us

2022-06-04 06:28:35.209951: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.818us

2022-06-04 06:28:35.209963: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice PrintV2: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.209970: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [PrintV2] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.209982: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.209995: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [PrintV2] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.210098: I tensorflow/core/common_runtime/eager/execute.cc:823] signature {
  name: "__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0"
  input_arg {
    name: "input"
    type: DT_STRING
  }
  attr {
    name: "output_stream"
    type: "string"
    default_value {
      s: "stderr"
    }
  }
  attr {
    name: "end"
    type: "string"
    default_value {
      s: "\n"
    }
  }
  is_stateful: true
}
node_def {
  name: "PrintV2"
  op: "PrintV2"
  input: "input:0"
  device: "/job:localhost/replica:0/task:0/device:CPU:0"
  attr {
    key: "end"
    value {
      placeholder: "end"
    }
  }
  attr {
    key: "output_stream"
    value {
      placeholder: "output_stream"
    }
  }
}

2022-06-04 06:28:35.210123: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.210154: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.210176: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.210204: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.210298: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.210311: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.210318: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.210324: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.210328: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.210334: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.210345: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210357: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210365: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.210370: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.210375: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.210384: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.210390: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.210394: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.210399: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.210407: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.210414: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.210420: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.210432: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210443: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.210474: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210487: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.210496: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210503: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.210509: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.210513: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.210529: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.210533: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.210539: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.210547: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210563: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.210572: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.210584: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.210607: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input}}'Will fall back to a default kernel.

2022-06-04 06:28:35.210621: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::input takes 17.05us

2022-06-04 06:28:35.210627: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 0.744us

2022-06-04 06:28:35.210637: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:GPU::PrintV2 takes 0.989us

2022-06-04 06:28:35.210643: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.49us

2022-06-04 06:28:35.210657: I tensorflow/core/common_runtime/placer.cc:124] input(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.210666: I tensorflow/core/common_runtime/placer.cc:124] PrintV2(PrintV2) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.210671: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.210676: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.210681: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.210687: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.210692: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.210696: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.210700: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.210707: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.210712: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.210716: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.210721: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.210954: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:XLA_CPU_JIT::PrintV2 takes 1.118us

2022-06-04 06:28:35.210999: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 265 us (cumulative: 1.25 ms, max: 441 us, #called: 4)
2022-06-04 06:28:35.211009: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.211015: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.211024: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.211029: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.211034: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.211039: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.211058: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.211065: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.211089: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.211095: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.211100: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.211113: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.211165: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.211184: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.211205: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.211210: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.211216: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.211221: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.211226: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.211234: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.211243: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.211260: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 1.284us

2022-06-04 06:28:35.211283: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.689us

2022-06-04 06:28:35.211308: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.211352: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.211362: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.211369: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.213212: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.213248: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_547552160_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.213309: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_13360769971688305637_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.213337: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_13360769971688305637_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.213416: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_13360769971688305637_0 with handle 8 status: OK
2022-06-04 06:28:35.213468: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.213489: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8
2022-06-04 06:28:35.213525: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.213573: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node PrintV2}} = PrintV2[_XlaHasReferenceVars=false, end="\n", output_stream="stderr", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input)
2022-06-04 06:28:35.213588: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 1.381us

2022-06-04 06:28:35.213595: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.325us

2022-06-04 06:28:35.213617: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node PrintV2}} = PrintV2[_XlaHasReferenceVars=false, end="\n", output_stream="stderr", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) takes 45.929us

run 0 costs 0
2022-06-04 06:28:35.252153: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253062: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253117: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253144: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253175: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:0 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.253192: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:1 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.253202: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:2 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.253260: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253287: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.253306: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice __inference_nn_18: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.253313: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [__inference_nn_18] on device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.253426: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__inference_nn_18" on default device "/job:localhost/replica:0/task:0/device:GPU:0"
2022-06-04 06:28:35.253789: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.253814: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.253823: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.253834: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.253843: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.253852: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.253884: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.253923: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.253945: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.253957: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.253967: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.253983: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.253994: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.254001: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.254010: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.254024: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.254035: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.254042: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.254065: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.254090: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.254180: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.254200: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.254224: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.254243: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.254253: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.254260: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.254296: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.254308: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.254317: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.254338: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.254380: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 9 of 9 nodes in 9 visits
2022-06-04 06:28:35.254406: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.254430: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.254478: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.254500: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 28.894us

2022-06-04 06:28:35.254512: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::w takes 1.692us

2022-06-04 06:28:35.254530: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.254540: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 10.713us

2022-06-04 06:28:35.254548: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::b takes 0.473us

2022-06-04 06:28:35.254561: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.254571: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 10.793us

2022-06-04 06:28:35.254580: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::x takes 0.429us

2022-06-04 06:28:35.254595: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 4.495us

2022-06-04 06:28:35.254611: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 5.949us

2022-06-04 06:28:35.254634: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 5.549us

2022-06-04 06:28:35.254651: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 7.271us

2022-06-04 06:28:35.254674: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 10.54us

2022-06-04 06:28:35.254687: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 1.037us

2022-06-04 06:28:35.254704: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.254717: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 19.594us

2022-06-04 06:28:35.254727: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::identity_RetVal takes 0.81us

2022-06-04 06:28:35.254755: I tensorflow/core/common_runtime/placer.cc:124] w(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254766: I tensorflow/core/common_runtime/placer.cc:124] b(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254775: I tensorflow/core/common_runtime/placer.cc:124] x(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254793: I tensorflow/core/common_runtime/placer.cc:124] MatMul(BatchMatMulV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254803: I tensorflow/core/common_runtime/placer.cc:124] Add(AddV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254812: I tensorflow/core/common_runtime/placer.cc:124] Identity(Identity) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254822: I tensorflow/core/common_runtime/placer.cc:124] identity_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.254833: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.254844: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.254854: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.254866: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.254891: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.254902: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.254910: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.254921: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.254929: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.254937: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.254945: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.255314: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:XLA_GPU_JIT::MatMul takes 3.51us

2022-06-04 06:28:35.255351: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:XLA_GPU_JIT::Add takes 2.479us

2022-06-04 06:28:35.255375: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:XLA_GPU_JIT::Identity takes 1.469us

2022-06-04 06:28:35.255475: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 19 us (cumulative: 32 us, max: 19 us, #called: 2)
2022-06-04 06:28:35.255557: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 593 us (cumulative: 1.84 ms, max: 593 us, #called: 5)
2022-06-04 06:28:35.255579: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.255586: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.255605: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.255613: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.255624: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.255632: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.255676: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.255687: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.255730: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.255741: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.255750: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.255792: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.255932: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.255978: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.256020: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.256031: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.256044: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.256052: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.256059: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.256088: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.256115: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.256156: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.256172: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 23.029us

2022-06-04 06:28:35.256193: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.256205: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 13.599us

2022-06-04 06:28:35.256218: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.256226: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.217us

2022-06-04 06:28:35.256240: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 3.938us

2022-06-04 06:28:35.256265: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 12.33us

2022-06-04 06:28:35.256287: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 10.104us

2022-06-04 06:28:35.256305: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.256313: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 14.87us

2022-06-04 06:28:35.256364: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.256463: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.256475: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.256483: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.256496: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256506: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256513: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256519: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256527: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256533: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256540: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256549: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256556: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256564: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256571: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256579: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256586: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256594: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256603: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256610: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256616: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256623: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256630: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256637: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256644: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.256652: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.256684: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_548685296_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.256814: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18_7895411652573620578_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.256857: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __inference_nn_18_7895411652573620578_0 on device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.257066: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __inference_nn_18_7895411652573620578_0 with handle 10 status: OK
2022-06-04 06:28:35.257157: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_18 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.257202: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_18 with handle 10
2022-06-04 06:28:35.257272: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 4.472us

2022-06-04 06:28:35.257288: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 4.619us

2022-06-04 06:28:35.257296: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 1.493us

2022-06-04 06:28:35.257304: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.257366: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257380: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 16.646us

2022-06-04 06:28:35.257393: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257401: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 8.112us

2022-06-04 06:28:35.257415: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257423: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 12.014us

2022-06-04 06:28:35.257436: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257444: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 9.557us

2022-06-04 06:28:35.257455: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257463: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.176us

2022-06-04 06:28:35.257475: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 3.167us

2022-06-04 06:28:35.257490: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 2.929us

2022-06-04 06:28:35.257512: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 6.604us

2022-06-04 06:28:35.257531: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257543: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 16.68us

2022-06-04 06:28:35.257555: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0
2022-06-04 06:28:35.257564: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0
2022-06-04 06:28:35.257571: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0
2022-06-04 06:28:35.257578: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0
2022-06-04 06:28:35.257585: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0
2022-06-04 06:28:35.257592: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0
2022-06-04 06:28:35.257602: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257610: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 8.009us

2022-06-04 06:28:35.257619: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257626: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 7.736us

2022-06-04 06:28:35.257638: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257647: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 10.21us

2022-06-04 06:28:35.257659: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257667: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 9.789us

2022-06-04 06:28:35.257678: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257686: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.164us

2022-06-04 06:28:35.257697: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.042us

2022-06-04 06:28:35.257709: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.827us

2022-06-04 06:28:35.257722: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 2.074us

2022-06-04 06:28:35.257733: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257742: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 9.6us

2022-06-04 06:28:35.257751: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0
2022-06-04 06:28:35.257758: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0
2022-06-04 06:28:35.257764: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0
2022-06-04 06:28:35.257771: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0
2022-06-04 06:28:35.257778: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0
2022-06-04 06:28:35.257784: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0
2022-06-04 06:28:35.257879: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.257894: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257903: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 9.617us

2022-06-04 06:28:35.257912: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257920: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 7.433us

2022-06-04 06:28:35.257935: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 58.239us

2022-06-04 06:28:35.257961: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]()
2022-06-04 06:28:35.257979: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.257987: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 10.347us

2022-06-04 06:28:35.257998: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258006: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 9.853us

2022-06-04 06:28:35.258024: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() takes 71.087us

2022-06-04 06:28:35.258037: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]()
2022-06-04 06:28:35.258054: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258063: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 9.91us

2022-06-04 06:28:35.258073: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258081: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 8.86us

2022-06-04 06:28:35.258096: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() takes 57.696us

2022-06-04 06:28:35.258109: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]()
2022-06-04 06:28:35.258120: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258128: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.526us

2022-06-04 06:28:35.258138: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258147: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 10.07us

2022-06-04 06:28:35.258161: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() takes 50.717us

2022-06-04 06:28:35.258173: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x)
2022-06-04 06:28:35.258185: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.291us

2022-06-04 06:28:35.258194: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.805us

2022-06-04 06:28:35.258217: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) takes 43.752us

2022-06-04 06:28:35.258230: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b)
2022-06-04 06:28:35.258242: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.718us

2022-06-04 06:28:35.258251: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.299us

2022-06-04 06:28:35.258271: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) takes 39.225us

2022-06-04 06:28:35.258282: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add)
2022-06-04 06:28:35.258296: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 3.305us

2022-06-04 06:28:35.258311: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.714us

2022-06-04 06:28:35.258326: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) takes 42.123us

2022-06-04 06:28:35.258342: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity)
2022-06-04 06:28:35.258355: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258364: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 11.269us

2022-06-04 06:28:35.258375: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.258383: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 9.689us

2022-06-04 06:28:35.258397: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) takes 54.533us

# run 1 compute start
2022-06-04 06:28:35.258502: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -5044021043955473963 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.258598: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -5044021043955473963 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.258639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.258658: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg
2022-06-04 06:28:35.258697: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.258716: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.258798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.258835: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -5044021043955473963 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.258857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.258871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg
2022-06-04 06:28:35.258890: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.258907: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.258955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.258986: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -5044021043955473963 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.259006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.259025: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg
2022-06-04 06:28:35.259040: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.259061: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.259111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.259144: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -5044021043955473963 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.259175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.259206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[4,128,128])#
2022-06-04 06:28:35.259411: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2022-06-04 06:28:35.765751: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2022-06-04 06:28:35.766323: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.766344: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.766408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.766446: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -5044021043955473963 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.766468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.766485: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[4,1024,128];float[1024,128])#
2022-06-04 06:28:35.766715: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.766726: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.766769: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.766790: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -5044021043955473963 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.766801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.766811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[4,1024,128])#
2022-06-04 06:28:35.766819: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.766825: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.766853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.766864: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -5044021043955473963 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.766871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.766878: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[4,1024,128])#
2022-06-04 06:28:35.766887: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.766893: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.766917: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished identity_retval_RetVal op _Retval on GPU 0 stream[0]
# run 1 compute end

2022-06-04 06:28:35.768376: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 10.372us

2022-06-04 06:28:35.768441: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 3.023us

2022-06-04 06:28:35.768472: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.768488: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.768530: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.768626: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.768690: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.768775: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.769161: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.769195: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.769220: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.769248: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.769271: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.769294: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.769341: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.769398: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.769436: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.769460: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.769490: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.769527: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.769552: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.769575: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.769600: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.769631: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.769652: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.769670: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.769702: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.769727: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.769822: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.769850: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.769881: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.769906: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.769925: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.769938: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.769988: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.770005: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.770020: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.770054: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.770110: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.770144: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.770186: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.770252: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 6.16us

2022-06-04 06:28:35.770281: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.137us

2022-06-04 06:28:35.770328: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.770352: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 35.769us

2022-06-04 06:28:35.770382: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 2.538us

2022-06-04 06:28:35.770435: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.770464: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.770491: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.770517: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.770542: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.770572: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.770596: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.770619: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.770638: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.770665: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.770688: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.770712: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.770735: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.771333: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 3.296us

2022-06-04 06:28:35.771481: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 702 us (cumulative: 2.54 ms, max: 702 us, #called: 6)
2022-06-04 06:28:35.771508: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.771533: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.771571: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.771590: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.771612: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.771635: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.771703: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.771722: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.771788: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.771808: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.771824: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.771868: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.772031: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.772095: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.772150: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.772169: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.772200: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.772223: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.772245: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.772304: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.772345: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.772405: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 3.717us

2022-06-04 06:28:35.772455: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 2.294us

2022-06-04 06:28:35.772523: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.772654: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.772679: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.772702: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.772776: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.772824: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_547808432_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.772952: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16522197517734559137_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.773000: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16522197517734559137_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.773200: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16522197517734559137_0 with handle 12 status: OK
2022-06-04 06:28:35.773318: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.773370: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 12
2022-06-04 06:28:35.773468: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 3.076us

2022-06-04 06:28:35.773523: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4
2022-06-04 06:28:35.773732: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.773765: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.773798: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 30.714us

2022-06-04 06:28:35.773824: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.773853: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 27.702us

2022-06-04 06:28:35.773897: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 159.814us

2022-06-04 06:28:35.773941: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 553.6422729492188", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.773980: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.725us

2022-06-04 06:28:35.774008: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.03us

2022-06-04 06:28:35.774051: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 553.6422729492188", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 115.634us

2022-06-04 06:28:35.774086: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1776908075695350266, tensor_name="StringFormat:0"](StringFormat)
2022-06-04 06:28:35.774128: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 2.678us

2022-06-04 06:28:35.774154: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.883us

2022-06-04 06:28:35.774222: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1776908075695350266, tensor_name="StringFormat:0"](StringFormat) takes 136.556us

2022-06-04 06:28:35.774275: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0
2022-06-04 06:28:35.774305: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 553.6422729492188", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0
2022-06-04 06:28:35.774358: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1776908075695350266, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0
2022-06-04 06:28:35.774494: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 2.547us

2022-06-04 06:28:35.774518: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant
2022-06-04 06:28:35.774651: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.774773: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 1 costs 553.6422729492188>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.774808: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.853us

2022-06-04 06:28:35.774830: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.738us

2022-06-04 06:28:35.774883: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 1 costs 553.6422729492188>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 118.075us

2022-06-04 06:28:35.774912: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat)
2022-06-04 06:28:35.774938: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 1.308us

2022-06-04 06:28:35.774958: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.737us

2022-06-04 06:28:35.774989: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 75.006us

2022-06-04 06:28:35.775143: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.775195: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.775230: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8
run 1 costs 553.6422729492188


2022-06-04 06:28:35.776281: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.776353: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.776394: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.776430: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:0 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776455: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:1 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776474: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:2 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776518: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_18 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776567: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_18 with handle 10

# run 2 compute start
2022-06-04 06:28:35.776669: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1452065388704015282 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776770: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1452065388704015282 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.776808: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.776834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg
2022-06-04 06:28:35.776875: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.776896: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.776980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.777028: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1452065388704015282 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.777051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.777074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg
2022-06-04 06:28:35.777100: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.777123: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.777178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.777216: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -1452065388704015282 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.777242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.777265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg
2022-06-04 06:28:35.777286: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.777308: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.777361: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.777402: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -1452065388704015282 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.777436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.777467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[4,128,128])#
2022-06-04 06:28:35.777634: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.777657: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.777705: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.777736: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -1452065388704015282 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.777762: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.777785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[4,1024,128];float[1024,128])#
2022-06-04 06:28:35.777853: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.777877: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.777948: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.777977: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -1452065388704015282 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.778004: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.778031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[4,1024,128])#
2022-06-04 06:28:35.778056: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.778079: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.778135: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.778164: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -1452065388704015282 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.778192: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.778222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[4,1024,128])#
2022-06-04 06:28:35.778246: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.778262: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.778304: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished identity_retval_RetVal op _Retval on GPU 0 stream[0]
# run 2 compute end

2022-06-04 06:28:35.778815: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 4.327us

2022-06-04 06:28:35.778848: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.341us

2022-06-04 06:28:35.778865: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.778876: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.778895: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.778942: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.778979: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.779022: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.779199: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.779218: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.779227: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.779237: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.779246: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.779256: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.779276: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779304: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779324: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.779338: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.779349: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.779364: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.779375: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.779384: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.779393: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.779407: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.779417: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.779426: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.779442: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779459: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.779512: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779531: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.779549: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779565: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.779576: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.779585: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.779608: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.779620: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.779630: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.779645: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779677: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.779695: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.779717: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.779751: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.848us

2022-06-04 06:28:35.779766: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.116us

2022-06-04 06:28:35.779787: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.779802: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 20.612us

2022-06-04 06:28:35.779814: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.837us

2022-06-04 06:28:35.779837: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.779855: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.779871: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.779881: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.779890: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.779902: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.779916: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.779926: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.779935: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.779947: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.779956: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.779967: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.779983: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.780388: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 2.305us

2022-06-04 06:28:35.780479: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 472 us (cumulative: 3.02 ms, max: 702 us, #called: 7)
2022-06-04 06:28:35.780496: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.780505: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.780522: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.780534: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.780544: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.780553: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.780587: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.780599: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.780634: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.780645: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.780655: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.780685: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.780786: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.780830: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.780866: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.780879: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.780891: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.780909: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.780919: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.780938: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.780961: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.780991: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.973us

2022-06-04 06:28:35.781015: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.953us

2022-06-04 06:28:35.781052: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.781122: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.781138: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.781147: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.781190: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.781215: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_548761344_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.781291: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_10641241622385230756_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.781324: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_10641241622385230756_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.781447: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_10641241622385230756_0 with handle 14 status: OK
2022-06-04 06:28:35.781510: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.781535: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 14
2022-06-04 06:28:35.781588: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.985us

2022-06-04 06:28:35.781617: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4
2022-06-04 06:28:35.781734: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.781754: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.781767: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 13.694us

2022-06-04 06:28:35.781782: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.781793: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 10.494us

2022-06-04 06:28:35.781811: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 76.149us

2022-06-04 06:28:35.781836: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 3.186941146850586", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.781857: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.864us

2022-06-04 06:28:35.781872: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.631us

2022-06-04 06:28:35.781899: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 3.186941146850586", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 61.942us

2022-06-04 06:28:35.781925: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=8219900642429361506, tensor_name="StringFormat:0"](StringFormat)
2022-06-04 06:28:35.781945: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 1.132us

2022-06-04 06:28:35.781960: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.548us

2022-06-04 06:28:35.781994: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=8219900642429361506, tensor_name="StringFormat:0"](StringFormat) takes 69.802us

2022-06-04 06:28:35.782021: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0
2022-06-04 06:28:35.782039: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 3.186941146850586", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0
2022-06-04 06:28:35.782069: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=8219900642429361506, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0
2022-06-04 06:28:35.782142: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.152us

2022-06-04 06:28:35.782156: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant
2022-06-04 06:28:35.782228: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.782298: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 2 costs 3.186941146850586>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.782321: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.199us

2022-06-04 06:28:35.782339: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.491us

2022-06-04 06:28:35.782371: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 2 costs 3.186941146850586>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 75.974us

2022-06-04 06:28:35.782389: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat)
2022-06-04 06:28:35.782404: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.914us

2022-06-04 06:28:35.782419: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.543us

2022-06-04 06:28:35.782437: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 46.726us

2022-06-04 06:28:35.782526: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.782559: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.782583: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8
run 2 costs 3.186941146850586
2022-06-04 06:28:35.782808: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.782841: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.782870: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0
2022-06-04 06:28:35.782896: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.782916: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1
2022-06-04 06:28:35.782942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.782961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst#
2022-06-04 06:28:35.782981: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.782996: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0
2022-06-04 06:28:35.783019: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783033: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished input/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.783097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.783111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])#
2022-06-04 06:28:35.783125: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783134: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783165: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished _EagerConst op _EagerConst on GPU 0 stream[0]
2022-06-04 06:28:35.783180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.783191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])#
2022-06-04 06:28:35.783200: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783209: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.783344: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.783373: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.783399: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3
2022-06-04 06:28:35.783419: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.783436: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4
2022-06-04 06:28:35.783453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.783468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform#
2022-06-04 06:28:35.783480: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x2084e7c0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.783489: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x2084e7e0 /job:localhost/replica:0/task:0/device:CPU:0;ba631c39d1189698;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0
2022-06-04 06:28:35.783504: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783518: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783556: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished shape/_2 op _HostRecv on GPU 0 stream[0]
2022-06-04 06:28:35.783577: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.783592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])#
2022-06-04 06:28:35.783657: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783675: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished RandomUniform op RandomUniform on GPU 0 stream[0]
2022-06-04 06:28:35.783730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.783744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[4,128,128])#
2022-06-04 06:28:35.783761: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.783770: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.783801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished output_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.791805: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792143: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792181: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792204: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792225: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:0 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.792241: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:1 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.792260: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:2 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.792310: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792331: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.792344: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice __inference_nn_32: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.792352: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [__inference_nn_32] on device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.792423: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__inference_nn_32" on default device "/job:localhost/replica:0/task:0/device:GPU:0"
2022-06-04 06:28:35.792648: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.792663: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.792670: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.792680: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.792689: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.792699: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.792728: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.792762: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.792780: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.792790: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.792800: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.792815: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.792828: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.792834: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.792841: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.792850: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.792857: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.792863: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.792883: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.792896: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.792953: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.792971: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.792990: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.793006: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.793013: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.793019: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.793038: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.793047: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.793053: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.793070: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.793102: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 9 of 9 nodes in 9 visits
2022-06-04 06:28:35.793124: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.793144: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.793177: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.793192: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 19.24us

2022-06-04 06:28:35.793203: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::w takes 1.507us

2022-06-04 06:28:35.793218: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.793232: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 14.744us

2022-06-04 06:28:35.793242: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::b takes 0.618us

2022-06-04 06:28:35.793255: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.793266: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 11.622us

2022-06-04 06:28:35.793276: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::x takes 0.63us

2022-06-04 06:28:35.793292: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 3.307us

2022-06-04 06:28:35.793306: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 3.971us

2022-06-04 06:28:35.793322: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.789us

2022-06-04 06:28:35.793338: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 5.269us

2022-06-04 06:28:35.793359: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 8.223us

2022-06-04 06:28:35.793371: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 1.238us

2022-06-04 06:28:35.793384: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.793395: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 13.767us

2022-06-04 06:28:35.793406: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::identity_RetVal takes 0.8us

2022-06-04 06:28:35.793426: I tensorflow/core/common_runtime/placer.cc:124] w(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793436: I tensorflow/core/common_runtime/placer.cc:124] b(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793442: I tensorflow/core/common_runtime/placer.cc:124] x(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793458: I tensorflow/core/common_runtime/placer.cc:124] MatMul(BatchMatMulV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793470: I tensorflow/core/common_runtime/placer.cc:124] Add(AddV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793481: I tensorflow/core/common_runtime/placer.cc:124] Identity(Identity) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793495: I tensorflow/core/common_runtime/placer.cc:124] identity_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.793505: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.793514: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.793523: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.793534: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.793550: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.793559: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.793567: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.793579: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.793587: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.793598: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.793606: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.793900: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:XLA_GPU_JIT::MatMul takes 1.858us

2022-06-04 06:28:35.793927: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:XLA_GPU_JIT::Add takes 1.675us

2022-06-04 06:28:35.793948: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:XLA_GPU_JIT::Identity takes 1.364us

2022-06-04 06:28:35.794030: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 17 us (cumulative: 49 us, max: 19 us, #called: 3)
2022-06-04 06:28:35.794099: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 476 us (cumulative: 3.49 ms, max: 702 us, #called: 8)
2022-06-04 06:28:35.794116: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.794125: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.794141: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.794151: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.794159: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.794168: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.794206: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.794215: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.794240: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.794249: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.794255: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.794284: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.794377: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.794414: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.794442: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.794451: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.794460: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.794466: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.794473: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.794501: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.794523: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.794548: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.794562: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 17.133us

2022-06-04 06:28:35.794577: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.794590: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 14.606us

2022-06-04 06:28:35.794601: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.794609: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 8.509us

2022-06-04 06:28:35.794620: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.957us

2022-06-04 06:28:35.794637: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.97us

2022-06-04 06:28:35.794660: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 5.899us

2022-06-04 06:28:35.794680: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.794692: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 17.575us

2022-06-04 06:28:35.794733: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.794806: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.794818: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.794825: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.794840: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794850: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794857: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794866: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794875: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794885: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794893: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794904: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794913: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794922: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794930: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794939: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794947: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794956: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794965: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794971: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794977: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794983: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794988: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.794996: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.795005: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU.
2022-06-04 06:28:35.795015: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.795039: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_548894256_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.795125: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32_14350845812029827636_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.795157: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __inference_nn_32_14350845812029827636_0 on device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.795305: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __inference_nn_32_14350845812029827636_0 with handle 16 status: OK
2022-06-04 06:28:35.795364: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_32 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.795389: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_32 with handle 16
2022-06-04 06:28:35.795445: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 4.1us

2022-06-04 06:28:35.795462: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 4.06us

2022-06-04 06:28:35.795471: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 0.843us

2022-06-04 06:28:35.795478: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.795528: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795541: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 13.514us

2022-06-04 06:28:35.795552: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795562: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 10.517us

2022-06-04 06:28:35.795577: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795588: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 12.854us

2022-06-04 06:28:35.795601: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795608: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 8.747us

2022-06-04 06:28:35.795619: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795626: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 8.583us

2022-06-04 06:28:35.795637: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.999us

2022-06-04 06:28:35.795653: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 2.356us

2022-06-04 06:28:35.795673: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 4.546us

2022-06-04 06:28:35.795687: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795697: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 12.7us

2022-06-04 06:28:35.795707: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0
2022-06-04 06:28:35.795714: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0
2022-06-04 06:28:35.795720: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0
2022-06-04 06:28:35.795726: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0
2022-06-04 06:28:35.795732: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0
2022-06-04 06:28:35.795738: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0
2022-06-04 06:28:35.795748: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795755: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 7.173us

2022-06-04 06:28:35.795763: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795771: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 7.327us

2022-06-04 06:28:35.795780: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795788: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 8.99us

2022-06-04 06:28:35.795798: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795805: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 8.681us

2022-06-04 06:28:35.795815: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795823: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 8.471us

2022-06-04 06:28:35.795835: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.349us

2022-06-04 06:28:35.795851: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 2.032us

2022-06-04 06:28:35.795865: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 2.266us

2022-06-04 06:28:35.795880: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.795893: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 14.625us

2022-06-04 06:28:35.795903: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0
2022-06-04 06:28:35.795913: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0
2022-06-04 06:28:35.795922: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0
2022-06-04 06:28:35.795931: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0
2022-06-04 06:28:35.795940: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0
2022-06-04 06:28:35.795949: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0
2022-06-04 06:28:35.796016: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.796028: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796036: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 7.819us

2022-06-04 06:28:35.796047: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796057: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 9.56us

2022-06-04 06:28:35.796071: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 53.356us

2022-06-04 06:28:35.796096: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]()
2022-06-04 06:28:35.796112: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796123: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 10.646us

2022-06-04 06:28:35.796134: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796148: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 15.175us

2022-06-04 06:28:35.796166: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() takes 72.1us

2022-06-04 06:28:35.796182: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]()
2022-06-04 06:28:35.796195: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796209: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 15.446us

2022-06-04 06:28:35.796220: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796231: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 11.76us

2022-06-04 06:28:35.796250: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() takes 64.651us

2022-06-04 06:28:35.796272: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]()
2022-06-04 06:28:35.796283: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796291: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.297us

2022-06-04 06:28:35.796299: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796306: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 8.211us

2022-06-04 06:28:35.796319: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() takes 45.756us

2022-06-04 06:28:35.796329: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x)
2022-06-04 06:28:35.796342: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.711us

2022-06-04 06:28:35.796351: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.865us

2022-06-04 06:28:35.796365: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) takes 35.078us

2022-06-04 06:28:35.796382: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b)
2022-06-04 06:28:35.796392: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.409us

2022-06-04 06:28:35.796401: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.137us

2022-06-04 06:28:35.796412: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) takes 29.612us

2022-06-04 06:28:35.796422: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add)
2022-06-04 06:28:35.796432: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.937us

2022-06-04 06:28:35.796441: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.568us

2022-06-04 06:28:35.796453: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) takes 30.143us

2022-06-04 06:28:35.796462: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity)
2022-06-04 06:28:35.796473: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796480: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 9.21us

2022-06-04 06:28:35.796489: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.796496: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 8.443us

2022-06-04 06:28:35.796507: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) takes 44.52us

# run 3 compute start
2022-06-04 06:28:35.796560: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -3121359650149651421 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.796608: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -3121359650149651421 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.796633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.796650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg
2022-06-04 06:28:35.796676: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.796693: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.796755: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.796786: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -3121359650149651421 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.796805: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.796818: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg
2022-06-04 06:28:35.796835: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.796852: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.796904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.796933: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -3121359650149651421 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.796953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.796966: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg
2022-06-04 06:28:35.796982: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.796999: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.797045: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.797073: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -3121359650149651421 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.797093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.797112: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[16,128,128])#
2022-06-04 06:28:35.797234: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.797258: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.797327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.797359: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -3121359650149651421 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.797384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.797410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[16,1024,128];float[1024,128])#
2022-06-04 06:28:35.797469: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.797488: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.797541: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.797572: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -3121359650149651421 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.797596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.797623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[16,1024,128])#
2022-06-04 06:28:35.797647: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.797669: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.797719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.797748: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -3121359650149651421 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.797774: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.797798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[16,1024,128])#
2022-06-04 06:28:35.797822: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.797844: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.797895: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished identity_retval_RetVal op _Retval on GPU 0 stream[0]
# run 3 compute end

2022-06-04 06:28:35.798235: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.062us

2022-06-04 06:28:35.798254: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.764us

2022-06-04 06:28:35.798264: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.798271: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.798282: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.798308: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.798328: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.798349: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.798446: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.798456: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.798461: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.798467: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.798471: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.798476: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.798485: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798500: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798512: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.798518: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.798525: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.798534: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.798542: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.798548: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.798555: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.798564: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.798571: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.798576: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.798585: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798592: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.798620: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798631: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.798640: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798647: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.798652: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.798657: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.798668: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.798673: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.798677: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.798685: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798699: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.798710: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.798725: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.798743: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 1.842us

2022-06-04 06:28:35.798752: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.624us

2022-06-04 06:28:35.798763: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.798771: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 10.116us

2022-06-04 06:28:35.798779: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.463us

2022-06-04 06:28:35.798794: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.798804: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.798811: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.798817: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.798822: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.798830: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.798837: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.798843: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.798849: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.798856: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.798861: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.798868: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.798874: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.799070: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.259us

2022-06-04 06:28:35.799110: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 225 us (cumulative: 3.72 ms, max: 702 us, #called: 9)
2022-06-04 06:28:35.799119: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.799124: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.799132: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.799136: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.799142: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.799146: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.799164: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.799172: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.799187: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.799192: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.799197: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.799208: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.799259: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.799278: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.799293: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.799303: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.799309: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.799314: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.799318: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.799328: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.799338: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.799352: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.107us

2022-06-04 06:28:35.799363: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.451us

2022-06-04 06:28:35.799379: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.799417: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.799425: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.799430: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.799450: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.799463: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_548916848_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.799496: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15487763508629335766_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.799512: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15487763508629335766_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.799572: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_15487763508629335766_0 with handle 18 status: OK
2022-06-04 06:28:35.799608: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.799622: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 18
2022-06-04 06:28:35.799649: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.121us

2022-06-04 06:28:35.799663: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4
2022-06-04 06:28:35.799723: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.799734: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.799740: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 6.637us

2022-06-04 06:28:35.799746: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.799751: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 5.337us

2022-06-04 06:28:35.799761: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 36.644us

2022-06-04 06:28:35.799774: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 14.202356338500977", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.799787: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.615us

2022-06-04 06:28:35.799796: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.421us

2022-06-04 06:28:35.799812: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 14.202356338500977", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 40.305us

2022-06-04 06:28:35.799827: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-3843511206377430069, tensor_name="StringFormat:0"](StringFormat)
2022-06-04 06:28:35.799838: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.817us

2022-06-04 06:28:35.799845: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.292us

2022-06-04 06:28:35.799866: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-3843511206377430069, tensor_name="StringFormat:0"](StringFormat) takes 38.311us

2022-06-04 06:28:35.799881: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0
2022-06-04 06:28:35.799891: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 14.202356338500977", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0
2022-06-04 06:28:35.799907: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-3843511206377430069, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0
2022-06-04 06:28:35.799949: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.705us

2022-06-04 06:28:35.799958: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant
2022-06-04 06:28:35.799998: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.800042: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 3 costs 14.202356338500977>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.800055: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.669us

2022-06-04 06:28:35.800062: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.29us

2022-06-04 06:28:35.800083: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 3 costs 14.202356338500977>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 42.681us

2022-06-04 06:28:35.800096: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat)
2022-06-04 06:28:35.800105: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.493us

2022-06-04 06:28:35.800113: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.324us

2022-06-04 06:28:35.800125: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 27.097us

2022-06-04 06:28:35.800178: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.800198: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.800212: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8
run 3 costs 14.202356338500977
2022-06-04 06:28:35.800562: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.800589: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.800602: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.800613: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:0 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800620: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:1 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800626: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:2 /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800641: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_32 in device /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800657: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_32 with handle 16
# run 4 compute start
2022-06-04 06:28:35.800701: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -8116964719383980009 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800754: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -8116964719383980009 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800783: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.800802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg
2022-06-04 06:28:35.800821: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.800835: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.800890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished w op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.800920: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -8116964719383980009 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.800942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.800959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg
2022-06-04 06:28:35.800972: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.800983: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801024: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished b op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.801049: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -8116964719383980009 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.801066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.801077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg
2022-06-04 06:28:35.801089: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.801104: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished x op _Arg on GPU 0 stream[0]
2022-06-04 06:28:35.801164: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -8116964719383980009 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.801182: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.801198: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[16,128,128])#
2022-06-04 06:28:35.801304: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.801329: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801399: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished MatMul op BatchMatMulV2 on GPU 0 stream[0]
2022-06-04 06:28:35.801427: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -8116964719383980009 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.801442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.801459: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[16,1024,128];float[1024,128])#
2022-06-04 06:28:35.801502: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.801518: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801564: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Add op AddV2 on GPU 0 stream[0]
2022-06-04 06:28:35.801587: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -8116964719383980009 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.801603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.801617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[16,1024,128])#
2022-06-04 06:28:35.801630: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.801643: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801679: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished Identity op Identity on GPU 0 stream[0]
2022-06-04 06:28:35.801702: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -8116964719383980009 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0
2022-06-04 06:28:35.801718: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0]
2022-06-04 06:28:35.801732: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[16,1024,128])#
2022-06-04 06:28:35.801749: I tensorflow/core/common_runtime/gpu/gpu_util.cc:402] GPUUtil::SyncAll
2022-06-04 06:28:35.801760: I tensorflow/stream_executor/stream_executor_pimpl.cc:610] Called StreamExecutor::SynchronizeAllActivity()
2022-06-04 06:28:35.801799: I tensorflow/core/common_runtime/gpu/gpu_device.cc:698] GpuDevice::ComputeHelper finished identity_retval_RetVal op _Retval on GPU 0 stream[0]
# run 4 compute end
2022-06-04 06:28:35.802113: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.333us

2022-06-04 06:28:35.802137: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.854us

2022-06-04 06:28:35.802148: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0
2022-06-04 06:28:35.802155: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.802166: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.802190: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.802210: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.802230: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0"
2022-06-04 06:28:35.802320: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0
2022-06-04 06:28:35.802331: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.802335: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass
2022-06-04 06:28:35.802340: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.802346: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass
2022-06-04 06:28:35.802350: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run
2022-06-04 06:28:35.802360: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802373: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802382: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.802387: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass
2022-06-04 06:28:35.802392: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass
2022-06-04 06:28:35.802399: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass
2022-06-04 06:28:35.802405: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35
2022-06-04 06:28:35.802409: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass
2022-06-04 06:28:35.802414: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run
2022-06-04 06:28:35.802421: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass
2022-06-04 06:28:35.802429: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36
2022-06-04 06:28:35.802435: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass
2022-06-04 06:28:35.802444: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802453: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.802483: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802494: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.802503: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802510: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified)
2022-06-04 06:28:35.802516: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37
2022-06-04 06:28:35.802521: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass
2022-06-04 06:28:35.802532: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999
2022-06-04 06:28:35.802537: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass
2022-06-04 06:28:35.802542: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run
2022-06-04 06:28:35.802550: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802563: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits
2022-06-04 06:28:35.802571: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.802580: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0
2022-06-04 06:28:35.802596: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 1.748us

2022-06-04 06:28:35.802606: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.571us

2022-06-04 06:28:35.802615: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel.

2022-06-04 06:28:35.802621: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 8.166us

2022-06-04 06:28:35.802627: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.44us

2022-06-04 06:28:35.802638: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.802645: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.802651: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1
2022-06-04 06:28:35.802656: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0
2022-06-04 06:28:35.802660: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass
2022-06-04 06:28:35.802666: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1
2022-06-04 06:28:35.802671: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2
2022-06-04 06:28:35.802676: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5
2022-06-04 06:28:35.802680: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass
2022-06-04 06:28:35.802685: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9
2022-06-04 06:28:35.802690: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass
2022-06-04 06:28:35.802695: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10
2022-06-04 06:28:35.802699: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass
2022-06-04 06:28:35.802923: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.429us

2022-06-04 06:28:35.802974: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 264 us (cumulative: 3.98 ms, max: 702 us, #called: 10)
2022-06-04 06:28:35.802984: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12
2022-06-04 06:28:35.802989: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass
2022-06-04 06:28:35.802998: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20
2022-06-04 06:28:35.803003: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass
2022-06-04 06:28:35.803008: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30
2022-06-04 06:28:35.803012: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass
2022-06-04 06:28:35.803030: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40
2022-06-04 06:28:35.803037: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass
2022-06-04 06:28:35.803052: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50
2022-06-04 06:28:35.803056: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass
2022-06-04 06:28:35.803061: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run
2022-06-04 06:28:35.803075: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.803124: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.803142: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes
2022-06-04 06:28:35.803157: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60
2022-06-04 06:28:35.803164: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass
2022-06-04 06:28:35.803169: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0
2022-06-04 06:28:35.803174: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0
2022-06-04 06:28:35.803178: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0
2022-06-04 06:28:35.803188: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.803198: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2
2022-06-04 06:28:35.803213: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.175us

2022-06-04 06:28:35.803224: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.505us

2022-06-04 06:28:35.803245: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0
2022-06-04 06:28:35.803286: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3
2022-06-04 06:28:35.803296: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1
2022-06-04 06:28:35.803301: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass
2022-06-04 06:28:35.803325: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3
2022-06-04 06:28:35.803338: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_548997408_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not  specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument.
2022-06-04 06:28:35.803378: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_4958233896198719099_0' in binary running on 7e5d8b5d3c06. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
2022-06-04 06:28:35.803395: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_4958233896198719099_0 on device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.803457: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_4958233896198719099_0 with handle 20 status: OK
2022-06-04 06:28:35.803491: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.803505: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 20
2022-06-04 06:28:35.803531: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.206us

2022-06-04 06:28:35.803547: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4
2022-06-04 06:28:35.803605: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]()
2022-06-04 06:28:35.803616: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.803623: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 6.986us

2022-06-04 06:28:35.803628: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel.

2022-06-04 06:28:35.803634: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 5.317us

2022-06-04 06:28:35.803643: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 36.581us

2022-06-04 06:28:35.803654: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.6753673553466797", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.803664: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.473us

2022-06-04 06:28:35.803670: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.316us

2022-06-04 06:28:35.803682: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.6753673553466797", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 29.082us

2022-06-04 06:28:35.803692: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=4124504909195762432, tensor_name="StringFormat:0"](StringFormat)
2022-06-04 06:28:35.803701: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.552us

2022-06-04 06:28:35.803707: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.289us

2022-06-04 06:28:35.803723: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=4124504909195762432, tensor_name="StringFormat:0"](StringFormat) takes 30.998us

2022-06-04 06:28:35.803735: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0
2022-06-04 06:28:35.803743: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.6753673553466797", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0
2022-06-04 06:28:35.803756: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=4124504909195762432, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0
2022-06-04 06:28:35.803791: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.608us

2022-06-04 06:28:35.803799: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant
2022-06-04 06:28:35.803836: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found
2022-06-04 06:28:35.803871: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 4 costs 1.6753673553466797>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()
2022-06-04 06:28:35.803882: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.605us

2022-06-04 06:28:35.803889: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.287us

2022-06-04 06:28:35.803905: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 4 costs 1.6753673553466797>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 36.325us

2022-06-04 06:28:35.803915: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat)
2022-06-04 06:28:35.803923: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.411us

2022-06-04 06:28:35.803928: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.285us

2022-06-04 06:28:35.803938: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 22.141us

2022-06-04 06:28:35.803987: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.804007: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0
2022-06-04 06:28:35.804022: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8
run 4 costs 1.6753673553466797