Last active June 16, 2022 03:47
Logs from eager execution and graph lazy execution
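Each timestamped entry in the log below appears to follow the shape `<date> <time>: <level letter> <source file>:<line>] <message>`, with some entries continuing on unprefixed lines (device properties, protobuf dumps). As a minimal sketch for sifting through the output, the following parses that header shape and tallies entries per source file. The regular expression is inferred from the lines in this log only, and `parse_line` and `count_by_source` are hypothetical helper names, not part of TensorFlow.

```python
import re
from collections import Counter

# Assumed header shape, inferred from the log lines in this file:
#   <date> <time>: <level letter> <source file>:<line>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) "
    r"(?P<file>\S+):(?P<line>\d+)\] "
    r"(?P<msg>.*)$"
)


def parse_line(text):
    """Return the header fields as a dict, or None for a continuation line."""
    match = LOG_LINE.match(text.strip())
    return match.groupdict() if match else None


def count_by_source(lines):
    """Count log entries per emitting source file (hypothetical helper)."""
    counts = Counter()
    for text in lines:
        record = parse_line(text)
        if record:
            counts[record["file"]] += 1
    return counts


# Three lines copied from the log: two headers and one continuation line.
sample = [
    "2022-06-13 13:40:39.838704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1",
    "pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5",
    "2022-06-13 13:40:38.214523: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled",
]

for source, n in count_by_source(sample).items():
    print(source, n)
```

Continuation lines return `None` and are skipped by the tally, so multi-line entries such as the `signature { ... }` dump count once, under the file that emitted their header line.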
2022-06-13 13:40:38.064084: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA | |
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2022-06-13 13:40:38.214452: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0 | |
2022-06-13 13:40:38.214523: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled | |
2022-06-13 13:40:38.214537: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set) | |
2022-06-13 13:40:38.214542: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set. | |
2022-06-13 13:40:38.215427: I tensorflow/core/util/util.cc:168] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. | |
2022-06-13 13:40:38.219971: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-13 13:40:38.256355: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0 | |
2022-06-13 13:40:38.256399: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled | |
2022-06-13 13:40:38.256422: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set) | |
2022-06-13 13:40:38.256427: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set. | |
2022-06-13 13:40:38.937792: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer.so.7 | |
2022-06-13 13:40:39.763279: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_XLA_FLAGS found arguments: | |
2022-06-13 13:40:39.763374: I tensorflow/compiler/xla/parse_flags_from_env.cc:199] argv[0] = <argv[0]> | |
2022-06-13 13:40:39.763414: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_JITRT_FLAGS found arguments: | |
2022-06-13 13:40:39.763443: I tensorflow/compiler/xla/parse_flags_from_env.cc:199] argv[0] = <argv[0]> | |
2022-06-13 13:40:39.763480: I tensorflow/compiler/jit/xla_cpu_device.cc:44] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA device creation not requested | |
2022-06-13 13:40:39.763572: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1 | |
2022-06-13 13:40:39.832088: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties: | |
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-13 13:40:39.832352: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties: | |
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-13 13:40:39.832381: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-13 13:40:39.832434: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11 | |
2022-06-13 13:40:39.832463: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11 | |
2022-06-13 13:40:39.835850: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10 | |
2022-06-13 13:40:39.836145: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10 | |
2022-06-13 13:40:39.837072: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11 | |
2022-06-13 13:40:39.837841: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11 | |
2022-06-13 13:40:39.837885: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8 | |
2022-06-13 13:40:39.838704: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1 | |
2022-06-13 13:40:39.838731: I tensorflow/compiler/jit/xla_gpu_device.cc:48] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA devices creation not required | |
2022-06-13 13:40:39.839797: I ./tensorflow/core/common_runtime/mkl_cpu_allocator.h:178] MklCPUAllocator: Setting max_mem_bytes: 134837268480 | |
2022-06-13 13:40:39.839826: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: mklcpu | |
2022-06-13 13:40:39.839835: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-13 13:40:39.839842: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-13 13:40:39.839854: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-13 13:40:39.839866: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-13 13:40:39.839873: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-13 13:40:39.839881: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-13 13:40:39.839889: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-13 13:40:39.839898: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-13 13:40:39.839906: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-13 13:40:39.839915: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-13 13:40:39.839922: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-13 13:40:39.839931: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-13 13:40:39.839938: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-13 13:40:39.839946: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-13 13:40:39.839953: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-13 13:40:39.839961: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-13 13:40:39.839970: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-13 13:40:39.839977: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-13 13:40:39.839985: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-13 13:40:39.839994: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-13 13:40:39.840002: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-13 13:40:39.840060: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA | |
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2022-06-13 13:40:39.843389: I tensorflow/compiler/jit/xla_cpu_device.cc:58] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-13 13:40:40.109274: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties: | |
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-13 13:40:40.109518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties: | |
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-13 13:40:40.110131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1 | |
2022-06-13 13:40:40.110166: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-13 13:40:40.570942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0 | |
2022-06-13 13:40:40.942157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0 | |
2022-06-13 13:40:40.942217: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1384] TensorFlow compiled with CUDA 11.2 and cuDNN 8.1.0 | |
2022-06-13 13:40:40.942256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1396] Device interconnect StreamExecutor with strength 1 edge matrix: | |
2022-06-13 13:40:40.942264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] 0 1 | |
2022-06-13 13:40:40.942269: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 0: N N | |
2022-06-13 13:40:40.942273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 1: N N | |
2022-06-13 13:40:40.943241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 0 TfDeviceId 0 on bus 1 numa: 0 pci: 0000:18:00.0 DeviceLocality: bus_id: 1 | |
links { | |
} | |
2022-06-13 13:40:40.943455: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 1 TfDeviceId 1 on bus 2 numa: 1 pci: 0000:86:00.0 DeviceLocality: bus_id: 2 | |
numa_node: 1 | |
links { | |
} | |
2022-06-13 13:40:40.943642: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_0_bfc | |
2022-06-13 13:40:40.943653: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-13 13:40:40.943657: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-13 13:40:40.943666: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-13 13:40:40.943671: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-13 13:40:40.943675: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-13 13:40:40.943680: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-13 13:40:40.943685: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-13 13:40:40.943690: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-13 13:40:40.943694: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-13 13:40:40.943699: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-13 13:40:40.943703: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-13 13:40:40.943708: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-13 13:40:40.943712: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-13 13:40:40.943717: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-13 13:40:40.943721: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-13 13:40:40.943725: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-13 13:40:40.943730: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-13 13:40:40.943734: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-13 13:40:40.943739: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-13 13:40:40.943743: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-13 13:40:40.943748: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-13 13:40:40.943781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9657 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5 | |
2022-06-13 13:40:40.943797: I tensorflow/stream_executor/stream.cc:261] [stream=0x21641e50,impl=0x212aa010] Called Stream::Stream(parent=0x34a8420) | |
2022-06-13 13:40:40.943803: I tensorflow/stream_executor/stream.cc:308] [stream=0x21641e50,impl=0x212aa010] Called Stream::Init() | |
2022-06-13 13:40:40.943855: I tensorflow/stream_executor/stream.cc:261] [stream=0x216502f0,impl=0x71a55a0] Called Stream::Stream(parent=0x34a8420) | |
2022-06-13 13:40:40.943863: I tensorflow/stream_executor/stream.cc:308] [stream=0x216502f0,impl=0x71a55a0] Called Stream::Init() | |
2022-06-13 13:40:40.943872: I tensorflow/stream_executor/stream.cc:261] [stream=0x2111ed90,impl=0x212a9910] Called Stream::Stream(parent=0x34a8420) | |
2022-06-13 13:40:40.943877: I tensorflow/stream_executor/stream.cc:308] [stream=0x2111ed90,impl=0x212a9910] Called Stream::Init() | |
2022-06-13 13:40:40.943885: I tensorflow/stream_executor/stream.cc:261] [stream=0x2111ece0,impl=0x212a9b00] Called Stream::Stream(parent=0x34a8420) | |
2022-06-13 13:40:40.943891: I tensorflow/stream_executor/stream.cc:308] [stream=0x2111ece0,impl=0x212a9b00] Called Stream::Init() | |
2022-06-13 13:40:40.943902: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: gpu_host_bfc | |
2022-06-13 13:40:40.943907: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-13 13:40:40.943911: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-13 13:40:40.943916: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-13 13:40:40.943920: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-13 13:40:40.943924: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-13 13:40:40.943928: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-13 13:40:40.943933: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-13 13:40:40.943937: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-13 13:40:40.943942: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-13 13:40:40.943946: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-13 13:40:40.943950: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-13 13:40:40.943955: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-13 13:40:40.943959: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-13 13:40:40.943963: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-13 13:40:40.943968: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-13 13:40:40.943972: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-13 13:40:40.943976: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-13 13:40:40.943981: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-13 13:40:40.943985: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-13 13:40:40.943990: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-13 13:40:40.943994: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-13 13:40:40.944642: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_1_bfc | |
2022-06-13 13:40:40.944656: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-13 13:40:40.944662: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-13 13:40:40.944667: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-13 13:40:40.944672: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-13 13:40:40.944676: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-13 13:40:40.944681: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-13 13:40:40.944686: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-13 13:40:40.944691: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-13 13:40:40.944696: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-13 13:40:40.944701: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-13 13:40:40.944705: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-13 13:40:40.944710: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-13 13:40:40.944715: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-13 13:40:40.944719: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-13 13:40:40.944724: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-13 13:40:40.944729: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-13 13:40:40.944734: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-13 13:40:40.944738: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-13 13:40:40.944743: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-13 13:40:40.944748: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-13 13:40:40.944753: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-13 13:40:40.944770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9657 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5 | |
2022-06-13 13:40:40.944781: I tensorflow/stream_executor/stream.cc:261] [stream=0x1b49bd30,impl=0x1b49b320] Called Stream::Stream(parent=0x359dce0) | |
2022-06-13 13:40:40.944788: I tensorflow/stream_executor/stream.cc:308] [stream=0x1b49bd30,impl=0x1b49b320] Called Stream::Init() | |
2022-06-13 13:40:40.944809: I tensorflow/stream_executor/stream.cc:261] [stream=0x1b49f2b0,impl=0x1b49b540] Called Stream::Stream(parent=0x359dce0) | |
2022-06-13 13:40:40.944816: I tensorflow/stream_executor/stream.cc:308] [stream=0x1b49f2b0,impl=0x1b49b540] Called Stream::Init() | |
2022-06-13 13:40:40.944825: I tensorflow/stream_executor/stream.cc:261] [stream=0x214b73d0,impl=0x1b49b2f0] Called Stream::Stream(parent=0x359dce0) | |
2022-06-13 13:40:40.944831: I tensorflow/stream_executor/stream.cc:308] [stream=0x214b73d0,impl=0x1b49b2f0] Called Stream::Init() | |
2022-06-13 13:40:40.944840: I tensorflow/stream_executor/stream.cc:261] [stream=0x214b76c0,impl=0x212a9cb0] Called Stream::Stream(parent=0x359dce0) | |
2022-06-13 13:40:40.944845: I tensorflow/stream_executor/stream.cc:308] [stream=0x214b76c0,impl=0x212a9cb0] Called Stream::Init() | |
2022-06-13 13:40:40.945193: I tensorflow/compiler/jit/xla_gpu_device.cc:79] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-13 13:40:40.945250: I tensorflow/core/common_runtime/process_util.cc:159] Session inter op parallelism threads: 32 | |
2022-06-13 13:40:40.949036: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op _EagerConst in device | |
2022-06-13 13:40:40.949093: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:40.949112: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute _EagerConst in device | |
2022-06-13 13:40:40.971113: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 21899.5us | |
2022-06-13 13:40:40.971168: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 8.856us | |
2022-06-13 13:40:40.971200: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice _EagerConst: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:40.971212: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [_EagerConst] on device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.971243: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:40.971266: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [_EagerConst] already set to: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.972475: I tensorflow/core/common_runtime/eager/execute.cc:823] signature { | |
name: "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" | |
input_arg { | |
name: "input" | |
type_attr: "T" | |
} | |
output_arg { | |
name: "output" | |
type_attr: "T" | |
} | |
attr { | |
name: "T" | |
type: "type" | |
} | |
} | |
node_def { | |
name: "_EagerConst" | |
op: "_EagerConst" | |
input: "input:0" | |
device: "/job:localhost/replica:0/task:0/device:GPU:0" | |
attr { | |
key: "T" | |
value { | |
placeholder: "T" | |
} | |
} | |
} | |
ret { | |
key: "output" | |
value: "_EagerConst:output:0" | |
} | |
2022-06-13 13:40:40.981225: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:40.981303: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:40.981336: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:40.981380: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0" on default device "/job:localhost/replica:0/task:0/device:GPU:0" | |
2022-06-13 13:40:40.982891: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:191] None of the MLIR Optimization Passes are enabled (registered 3) | |
2022-06-13 13:40:40.982917: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:40.982930: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:40.982936: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:40.982947: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:40.982952: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:40.982959: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:40.982975: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.982995: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983009: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:40.983016: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:40.983022: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:40.983034: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:40.983041: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:40.983046: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:40.983052: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:40.983060: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:40.983076: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:40.983081: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:40.983092: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983101: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:40.983145: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983156: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:40.983166: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983174: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:40.983186: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:40.983191: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:40.983227: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:40.983234: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:40.983241: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:40.983251: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983275: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 5 of 5 nodes in 5 visits | |
2022-06-13 13:40:40.983287: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.983298: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:40.983327: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.983348: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::input takes 17.743us | |
2022-06-13 13:40:40.983359: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 1.013us | |
2022-06-13 13:40:40.983375: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 3.999us | |
2022-06-13 13:40:40.983382: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 0.623us | |
2022-06-13 13:40:40.983405: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.983414: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 23.784us | |
2022-06-13 13:40:40.983421: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.658us | |
2022-06-13 13:40:40.983437: I tensorflow/core/common_runtime/placer.cc:124] input(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:40.983446: I tensorflow/core/common_runtime/placer.cc:124] _EagerConst(_EagerConst) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.983452: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.983458: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:40.983464: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:40.983469: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:40.983481: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:40.983487: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:40.983492: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:40.983497: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:40.983505: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:40.983515: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:40.983521: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:40.983526: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:40.987619: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:CPU::_XlaLaunch-op takes 1.715us | |
2022-06-13 13:40:40.987640: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp | |
2022-06-13 13:40:40.987650: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:GPU::_XlaLaunch-op takes 0.553us | |
2022-06-13 13:40:40.987655: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp | |
2022-06-13 13:40:40.987684: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:XLA_GPU_JIT::_EagerConst takes 1.192us | |
2022-06-13 13:40:40.987769: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 12 us (cumulative: 12 us, max: 12 us, #called: 1) | |
2022-06-13 13:40:40.987819: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 442 us (cumulative: 442 us, max: 442 us, #called: 1) | |
2022-06-13 13:40:40.987833: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:40.987838: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:40.987850: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:40.987854: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:40.987861: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:40.987866: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:40.987886: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:40.987891: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:40.987973: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:40.987980: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:40.987986: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:40.988003: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.988081: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.988105: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:40.988126: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:40.988131: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:40.988139: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:40.988143: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:40.988147: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:40.988157: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.988169: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:40.988193: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 1.889us | |
2022-06-13 13:40:40.988211: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 4.199us | |
2022-06-13 13:40:40.988225: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.988234: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 12.735us | |
2022-06-13 13:40:40.988283: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from input (_Arg) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory for _EagerConst (_EagerConst) on /job:localhost/replica:0/task:0/device:GPU:0 in host memory | |
2022-06-13 13:40:40.988308: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=1 | |
2022-06-13 13:40:40.988374: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:40.988386: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:40.988392: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:40.991060: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991079: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991084: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991089: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991094: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991098: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991103: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991108: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _EagerConst, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991112: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:40.991121: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:40.991140: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_118900896_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.991158: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_565427744_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:40.991237: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:40.991259: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:40.991361: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_0 with handle 0 status: OK | |
2022-06-13 13:40:40.991400: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_1' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:40.991416: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_1 on device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.991482: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0_18117741797234826063_1 with handle 1 status: OK | |
2022-06-13 13:40:40.991543: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:40.991577: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0 | |
2022-06-13 13:40:40.991618: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:40.991682: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node input/_1}} = _Send[T=DT_INT32, _dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_input", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) | |
2022-06-13 13:40:40.991703: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::input/_1 takes 1.332us | |
2022-06-13 13:40:40.991711: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::input/_1 takes 0.372us | |
2022-06-13 13:40:40.991746: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node input/_1}} = _Send[T=DT_INT32, _dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_input", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) takes 69.497us | |
2022-06-13 13:40:40.991779: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:40.991792: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1 | |
2022-06-13 13:40:40.991814: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:CPU::_EagerConst takes 0.866us | |
2022-06-13 13:40:40.991823: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:40.991849: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991858: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 10.091us | |
2022-06-13 13:40:40.991865: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991870: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.329us | |
2022-06-13 13:40:40.991877: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991882: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.778us | |
2022-06-13 13:40:40.991896: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 3.694us | |
2022-06-13 13:40:40.991906: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991911: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 7.897us | |
2022-06-13 13:40:40.991919: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-13 13:40:40.991924: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 1 -> 1 | |
2022-06-13 13:40:40.991930: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991935: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.643us | |
2022-06-13 13:40:40.991942: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991947: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.234us | |
2022-06-13 13:40:40.991953: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991958: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.652us | |
2022-06-13 13:40:40.991966: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.355us | |
2022-06-13 13:40:40.991975: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.991980: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.289us | |
2022-06-13 13:40:40.991986: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-13 13:40:40.991991: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 1 -> 1 | |
2022-06-13 13:40:40.992006: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node input/_2}} = _HostRecv[_dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_input", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() | |
2022-06-13 13:40:40.992015: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.992021: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.797us | |
2022-06-13 13:40:40.992026: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.992031: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::input/_2 takes 5.33us | |
2022-06-13 13:40:40.992050: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node input/_2}} = _HostRecv[_dst="_EagerConst", _src="input", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_input", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 45.084us | |
2022-06-13 13:40:40.992061: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _EagerConst}} = _EagerConst[T=DT_INT32, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](input/_2) | |
2022-06-13 13:40:40.992069: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.424us | |
2022-06-13 13:40:40.992076: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _EagerConst:GPU::_EagerConst takes 1.224us | |
2022-06-13 13:40:40.992091: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _EagerConst}} = _EagerConst[T=DT_INT32, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](input/_2) takes 29.632us | |
2022-06-13 13:40:40.992099: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_INT32, index=0](_EagerConst) | |
2022-06-13 13:40:40.992107: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.992113: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.668us | |
2022-06-13 13:40:40.992119: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:40.992124: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.333us | |
2022-06-13 13:40:40.992134: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_INT32, index=0](_EagerConst) takes 33.958us | |
2022-06-13 13:40:41.002415: I tensorflow/stream_executor/stream_executor_pimpl.cc:534] Called StreamExecutor::Allocate(size=10126688256, memory_space=0) returns 0x7f990c000000 | |
2022-06-13 13:40:41.002440: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 9.43GiB bytes for GPU_0_bfc. | |
2022-06-13 13:40:41.002446: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 9.43GiB | |
2022-06-13 13:40:41.002451: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x7f990c000000 to 0x7f9b67990000 | |
2022-06-13 13:40:41.146670: I tensorflow/stream_executor/stream_executor_pimpl.cc:623] Called StreamExecutor::SynchronousMemZero(location=0x7ffd501c04c0, size=1028) | |
2022-06-13 13:40:41.147132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.147148: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst# | |
2022-06-13 13:40:41.147164: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.147170: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.147188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.147197: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.147209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[2])# | |
2022-06-13 13:40:41.147223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.147229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.147234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[2])# | |
2022-06-13 13:40:41.147240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.147462: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op RandomUniform in device | |
2022-06-13 13:40:41.147475: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.147483: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute RandomUniform in device | |
2022-06-13 13:40:41.147538: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 7.387us | |
2022-06-13 13:40:41.147550: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::RandomUniform takes 2.496us | |
2022-06-13 13:40:41.147564: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice RandomUniform: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.147569: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [RandomUniform] on device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.147581: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.147597: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [RandomUniform] already set to: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.147763: I tensorflow/core/common_runtime/eager/execute.cc:823] signature { | |
  name: "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0"
  input_arg {
    name: "shape"
    type_attr: "T"
  }
  output_arg {
    name: "output"
    type_attr: "dtype"
  }
  attr {
    name: "seed"
    type: "int"
    default_value {
      i: 0
    }
  }
  attr {
    name: "seed2"
    type: "int"
    default_value {
      i: 0
    }
  }
  attr {
    name: "dtype"
    type: "type"
    allowed_values {
      list {
        type: DT_HALF
        type: DT_BFLOAT16
        type: DT_FLOAT
        type: DT_DOUBLE
      }
    }
  }
  attr {
    name: "T"
    type: "type"
    allowed_values {
      list {
        type: DT_INT32
        type: DT_INT64
      }
    }
  }
  is_stateful: true
}
node_def {
  name: "RandomUniform"
  op: "RandomUniform"
  input: "shape:0"
  device: "/job:localhost/replica:0/task:0/device:GPU:0"
  attr {
    key: "T"
    value {
      placeholder: "T"
    }
  }
  attr {
    key: "dtype"
    value {
      placeholder: "dtype"
    }
  }
  attr {
    key: "seed"
    value {
      placeholder: "seed"
    }
  }
  attr {
    key: "seed2"
    value {
      placeholder: "seed2"
    }
  }
}
ret {
  key: "output"
  value: "RandomUniform:output:0"
}
2022-06-13 13:40:41.147791: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.147832: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.147851: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.147890: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0" on default device "/job:localhost/replica:0/task:0/device:GPU:0" | |
2022-06-13 13:40:41.148064: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.148076: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.148080: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.148086: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.148091: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.148096: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.148109: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148124: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148132: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.148137: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.148143: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.148154: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.148161: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.148166: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.148171: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.148179: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.148185: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.148190: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.148199: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148207: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.148246: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148266: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.148276: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148283: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.148289: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.148293: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.148312: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.148317: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.148323: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.148331: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148349: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 5 of 5 nodes in 5 visits | |
2022-06-13 13:40:41.148358: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.148367: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.148393: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.148405: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::shape takes 17.953us | |
2022-06-13 13:40:41.148412: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::shape takes 1.052us | |
2022-06-13 13:40:41.148424: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.937us | |
2022-06-13 13:40:41.148431: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::RandomUniform takes 1.878us | |
2022-06-13 13:40:41.148444: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.148450: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 11.325us | |
2022-06-13 13:40:41.148456: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.608us | |
2022-06-13 13:40:41.148470: I tensorflow/core/common_runtime/placer.cc:124] shape(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.148478: I tensorflow/core/common_runtime/placer.cc:124] RandomUniform(RandomUniform) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.148484: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.148489: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.148495: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.148500: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.148506: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.148511: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.148515: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.148519: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.148526: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.148531: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.148537: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.148542: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.148795: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:XLA_GPU_JIT::RandomUniform takes 1.786us | |
2022-06-13 13:40:41.148837: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 279 us (cumulative: 721 us, max: 442 us, #called: 2) | |
2022-06-13 13:40:41.148847: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.148852: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.148862: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.148867: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.148873: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.148878: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.148898: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.148906: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.149069: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.149078: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.149084: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.149098: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.149167: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.149191: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.149213: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.149220: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.149227: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.149232: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.149236: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.149246: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.149257: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.149278: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::shape takes 1.331us | |
2022-06-13 13:40:41.149293: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.931us | |
2022-06-13 13:40:41.149307: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.149316: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 14.39us | |
2022-06-13 13:40:41.149356: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from shape (_Arg) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory for RandomUniform (RandomUniform) on /job:localhost/replica:0/task:0/device:GPU:0 in host memory | |
2022-06-13 13:40:41.149384: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=1 | |
2022-06-13 13:40:41.149447: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.149456: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.149461: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.149485: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149493: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149498: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149503: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149508: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149512: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149517: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _HostRecv, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149521: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149526: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.149532: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.149544: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_565427744_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.149560: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_564047568_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.149617: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.149638: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.149715: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_0 with handle 3 status: OK | |
2022-06-13 13:40:41.149754: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_1' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.149769: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_1 on device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.149838: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0_10249392314444985097_1 with handle 4 status: OK | |
2022-06-13 13:40:41.149887: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.149910: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3 | |
2022-06-13 13:40:41.149945: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.149996: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node shape/_1}} = _Send[T=DT_INT32, _dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_shape", _device="/job:localhost/replica:0/task:0/device:CPU:0"](shape) | |
2022-06-13 13:40:41.150015: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::shape/_1 takes 2.469us | |
2022-06-13 13:40:41.150022: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::shape/_1 takes 0.327us | |
2022-06-13 13:40:41.150047: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node shape/_1}} = _Send[T=DT_INT32, _dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_shape", _device="/job:localhost/replica:0/task:0/device:CPU:0"](shape) takes 54.851us | |
2022-06-13 13:40:41.150064: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.150073: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4 | |
2022-06-13 13:40:41.150091: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.150116: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150125: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 10.856us | |
2022-06-13 13:40:41.150133: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150138: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.369us | |
2022-06-13 13:40:41.150145: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150150: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 6.163us | |
2022-06-13 13:40:41.150160: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 1.331us | |
2022-06-13 13:40:41.150171: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150176: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 8.151us | |
2022-06-13 13:40:41.150183: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-13 13:40:41.150189: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 0 -> 0 | |
2022-06-13 13:40:41.150195: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150200: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.211us | |
2022-06-13 13:40:41.150206: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150211: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.284us | |
2022-06-13 13:40:41.150217: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150222: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 5.17us | |
2022-06-13 13:40:41.150230: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 0.76us | |
2022-06-13 13:40:41.150239: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150244: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.632us | |
2022-06-13 13:40:41.150250: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-13 13:40:41.150255: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 4:0: 0 -> 0 | |
2022-06-13 13:40:41.150269: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node shape/_2}} = _HostRecv[_dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() | |
2022-06-13 13:40:41.150278: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150284: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 5.573us | |
2022-06-13 13:40:41.150289: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node shape/_2}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150294: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _HostRecv:GPU::shape/_2 takes 5.068us | |
2022-06-13 13:40:41.150312: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node shape/_2}} = _HostRecv[_dst="RandomUniform", _src="shape", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=-7643437611148729878, tensor_name="edge_2_shape", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 44.064us | |
2022-06-13 13:40:41.150321: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](shape/_2) | |
2022-06-13 13:40:41.150330: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 0.865us | |
2022-06-13 13:40:41.150337: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::RandomUniform takes 0.656us | |
2022-06-13 13:40:41.150355: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](shape/_2) takes 33.465us | |
2022-06-13 13:40:41.150363: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](RandomUniform) | |
2022-06-13 13:40:41.150371: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150377: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 6.667us | |
2022-06-13 13:40:41.150383: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.150388: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_retval_RetVal takes 5.841us | |
2022-06-13 13:40:41.150398: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](RandomUniform) takes 34.377us | |
2022-06-13 13:40:41.150412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.150417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform# | |
2022-06-13 13:40:41.150429: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.150434: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.150443: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.150449: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.150456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-13 13:40:41.150518: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.150528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.150534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[1024,128])# | |
2022-06-13 13:40:41.150540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.150799: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op _EagerConst in device | |
2022-06-13 13:40:41.150812: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.150817: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute _EagerConst in device | |
2022-06-13 13:40:41.150829: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.150842: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.150855: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0 | |
2022-06-13 13:40:41.150866: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.150874: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1 | |
2022-06-13 13:40:41.150883: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.150889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst# | |
2022-06-13 13:40:41.150895: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.150900: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.150908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.150914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.150920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[2])# | |
2022-06-13 13:40:41.150926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.150932: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.150936: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[2])# | |
2022-06-13 13:40:41.150941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.150994: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op RandomUniform in device | |
2022-06-13 13:40:41.151003: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.151008: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute RandomUniform in device | |
2022-06-13 13:40:41.151015: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.151025: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.151035: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3 | |
2022-06-13 13:40:41.151042: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151048: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4 | |
2022-06-13 13:40:41.151055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform# | |
2022-06-13 13:40:41.151066: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151071: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151083: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151088: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-13 13:40:41.151111: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151126: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[1024,128])# | |
2022-06-13 13:40:41.151131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151221: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op _EagerConst in device | |
2022-06-13 13:40:41.151231: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.151236: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute _EagerConst in device | |
2022-06-13 13:40:41.151244: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.151255: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.151265: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0 | |
2022-06-13 13:40:41.151273: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151280: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1 | |
2022-06-13 13:40:41.151287: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst# | |
2022-06-13 13:40:41.151298: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151303: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151311: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.151322: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])# | |
2022-06-13 13:40:41.151327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.151332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151337: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])# | |
2022-06-13 13:40:41.151342: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151377: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op RandomUniform in device | |
2022-06-13 13:40:41.151385: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.151390: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute RandomUniform in device | |
2022-06-13 13:40:41.151397: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.151407: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.151417: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3 | |
2022-06-13 13:40:41.151424: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151430: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4 | |
2022-06-13 13:40:41.151437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform# | |
2022-06-13 13:40:41.151447: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151452: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151464: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])# | |
2022-06-13 13:40:41.151488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151495: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[4,128,128])# | |
2022-06-13 13:40:41.151505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151577: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op _EagerConst in device | |
2022-06-13 13:40:41.151584: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.151589: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute _EagerConst in device | |
2022-06-13 13:40:41.151596: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.151606: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.151616: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0 | |
2022-06-13 13:40:41.151624: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151630: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1 | |
2022-06-13 13:40:41.151637: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst# | |
2022-06-13 13:40:41.151648: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151653: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.151660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151665: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.151670: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])# | |
2022-06-13 13:40:41.151675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.151681: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])# | |
2022-06-13 13:40:41.151690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151723: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op RandomUniform in device | |
2022-06-13 13:40:41.151729: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.151734: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute RandomUniform in device | |
2022-06-13 13:40:41.151740: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.151750: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.151758: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3 | |
2022-06-13 13:40:41.151765: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151771: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4 | |
2022-06-13 13:40:41.151777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151782: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform# | |
2022-06-13 13:40:41.151788: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151792: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.151798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.151804: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])# | |
2022-06-13 13:40:41.151829: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.151835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.151840: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[16,128,128])# | |
2022-06-13 13:40:41.151845: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.153338: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op StringFormat in device | |
2022-06-13 13:40:41.153362: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.153367: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute StringFormat in device | |
2022-06-13 13:40:41.153396: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.685us | |
2022-06-13 13:40:41.153406: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.161us | |
2022-06-13 13:40:41.153415: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.153419: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.153429: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.153539: I tensorflow/core/common_runtime/eager/execute.cc:823] signature { | |
name: "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" | |
output_arg { | |
name: "output" | |
type: DT_STRING | |
} | |
attr { | |
name: "T" | |
type: "list(type)" | |
has_minimum: true | |
} | |
attr { | |
name: "template" | |
type: "string" | |
default_value { | |
s: "%s" | |
} | |
} | |
attr { | |
name: "placeholder" | |
type: "string" | |
default_value { | |
s: "%s" | |
} | |
} | |
attr { | |
name: "summarize" | |
type: "int" | |
default_value { | |
i: 3 | |
} | |
} | |
} | |
node_def { | |
name: "StringFormat" | |
op: "StringFormat" | |
device: "/job:localhost/replica:0/task:0/device:CPU:0" | |
attr { | |
key: "T" | |
value { | |
placeholder: "T" | |
} | |
} | |
attr { | |
key: "placeholder" | |
value { | |
placeholder: "placeholder" | |
} | |
} | |
attr { | |
key: "summarize" | |
value { | |
placeholder: "summarize" | |
} | |
} | |
attr { | |
key: "template" | |
value { | |
placeholder: "template" | |
} | |
} | |
} | |
ret { | |
key: "output" | |
value: "StringFormat:output:0" | |
} | |
2022-06-13 13:40:41.153561: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.153586: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.153604: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.153630: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.153719: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.153728: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.153732: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.153737: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.153742: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.153746: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.153756: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153769: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153778: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.153783: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.153788: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.153797: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.153802: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.153806: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.153811: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.153817: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.153823: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.153827: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.153836: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153843: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.153871: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153882: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.153891: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153898: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.153904: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.153908: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.153920: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.153925: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.153930: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.153938: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153952: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.153961: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.153970: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.153986: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 1.612us | |
2022-06-13 13:40:41.153998: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.514us | |
2022-06-13 13:40:41.154008: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.154014: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 8.229us | |
2022-06-13 13:40:41.154020: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.458us | |
2022-06-13 13:40:41.154032: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.154039: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.154045: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.154050: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.154054: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.154061: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.154066: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.154071: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.154075: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.154080: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.154085: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.154090: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.154094: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.154301: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.309us | |
2022-06-13 13:40:41.154348: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 243 us (cumulative: 964 us, max: 442 us, #called: 3) | |
2022-06-13 13:40:41.154358: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.154362: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.154371: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.154376: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.154381: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.154386: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.154402: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.154407: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.154424: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.154429: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.154434: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.154445: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.154493: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.154512: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.154531: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.154540: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.154546: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.154550: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.154554: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.154563: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.154574: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.154587: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.997us | |
2022-06-13 13:40:41.154597: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.468us | |
2022-06-13 13:40:41.154614: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.154653: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.154661: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.154665: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.156972: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.157000: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_564176736_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.157047: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16642374198413653398_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.157064: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16642374198413653398_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.157131: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_16642374198413653398_0 with handle 6 status: OK | |
2022-06-13 13:40:41.157167: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.157181: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 6 | |
2022-06-13 13:40:41.157212: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.265us | |
2022-06-13 13:40:41.157231: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4 | |
2022-06-13 13:40:41.157324: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.157335: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.157342: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 7.162us | |
2022-06-13 13:40:41.157348: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.157353: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 5.475us | |
2022-06-13 13:40:41.157364: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 39.915us | |
2022-06-13 13:40:41.157376: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.157386: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.415us | |
2022-06-13 13:40:41.157393: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.362us | |
2022-06-13 13:40:41.157417: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 41.146us | |
2022-06-13 13:40:41.157430: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1268297528044505333, tensor_name="StringFormat:0"](StringFormat) | |
2022-06-13 13:40:41.157439: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.468us | |
2022-06-13 13:40:41.157445: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.299us | |
2022-06-13 13:40:41.157462: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1268297528044505333, tensor_name="StringFormat:0"](StringFormat) takes 32.714us | |
2022-06-13 13:40:41.157493: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0 | |
2022-06-13 13:40:41.157505: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 0 costs 0", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0 | |
2022-06-13 13:40:41.157526: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1268297528044505333, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0 | |
2022-06-13 13:40:41.157572: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.748us | |
2022-06-13 13:40:41.157581: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant | |
2022-06-13 13:40:41.157625: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.157663: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 0 costs 0>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.157675: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.493us | |
2022-06-13 13:40:41.157681: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.276us | |
2022-06-13 13:40:41.157700: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 0 costs 0>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 40.282us | |
2022-06-13 13:40:41.157710: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) | |
2022-06-13 13:40:41.157718: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.422us | |
2022-06-13 13:40:41.157723: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.255us | |
2022-06-13 13:40:41.157733: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 22.593us | |
2022-06-13 13:40:41.157791: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op PrintV2 in device | |
2022-06-13 13:40:41.157799: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 0 | |
2022-06-13 13:40:41.157804: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute PrintV2 in device | |
2022-06-13 13:40:41.157823: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:GPU::PrintV2 takes 1.877us | |
2022-06-13 13:40:41.157833: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.538us | |
2022-06-13 13:40:41.157841: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice PrintV2: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.157846: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [PrintV2] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.157853: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.157862: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [PrintV2] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.157935: I tensorflow/core/common_runtime/eager/execute.cc:823] signature { | |
  name: "__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0" | |
  input_arg { | |
    name: "input" | |
    type: DT_STRING | |
  } | |
  attr { | |
    name: "output_stream" | |
    type: "string" | |
    default_value { | |
      s: "stderr" | |
    } | |
  } | |
  attr { | |
    name: "end" | |
    type: "string" | |
    default_value { | |
      s: "\n" | |
    } | |
  } | |
  is_stateful: true | |
} | |
node_def { | |
  name: "PrintV2" | |
  op: "PrintV2" | |
  input: "input:0" | |
  device: "/job:localhost/replica:0/task:0/device:CPU:0" | |
  attr { | |
    key: "end" | |
    value { | |
      placeholder: "end" | |
    } | |
  } | |
  attr { | |
    key: "output_stream" | |
    value { | |
      placeholder: "output_stream" | |
    } | |
  } | |
} | |
2022-06-13 13:40:41.157953: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.157976: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.157993: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.158013: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.158088: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.158098: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.158103: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.158109: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.158114: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.158119: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.158127: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158139: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158147: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.158152: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.158157: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.158164: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.158169: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.158173: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.158178: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.158184: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.158189: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.158193: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.158201: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158207: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.158234: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158242: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.158250: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158257: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.158262: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.158267: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.158279: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.158283: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.158288: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.158295: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158309: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.158316: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158325: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.158342: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node input}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.158353: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::input takes 14.032us | |
2022-06-13 13:40:41.158359: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 0.517us | |
2022-06-13 13:40:41.158368: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:GPU::PrintV2 takes 1.028us | |
2022-06-13 13:40:41.158374: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.547us | |
2022-06-13 13:40:41.158384: I tensorflow/core/common_runtime/placer.cc:124] input(_Arg) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.158391: I tensorflow/core/common_runtime/placer.cc:124] PrintV2(PrintV2) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.158397: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.158402: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.158406: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.158412: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.158417: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.158421: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.158426: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.158432: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.158437: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.158442: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.158446: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.158646: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:XLA_CPU_JIT::PrintV2 takes 0.921us | |
2022-06-13 13:40:41.158686: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 229 us (cumulative: 1.19 ms, max: 442 us, #called: 4) | |
2022-06-13 13:40:41.158694: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.158699: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.158707: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.158712: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.158718: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.158723: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.158739: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.158748: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.158764: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.158768: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.158773: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.158783: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158829: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158847: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.158860: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.158867: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.158873: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.158877: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.158882: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.158890: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.158900: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.158914: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::input takes 0.923us | |
2022-06-13 13:40:41.158923: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.432us | |
2022-06-13 13:40:41.158939: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.158974: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.158982: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.158987: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.159858: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.159883: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_564230080_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.159929: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_15747355229267941188_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.159946: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_15747355229267941188_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.160009: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0_15747355229267941188_0 with handle 8 status: OK | |
2022-06-13 13:40:41.160041: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.160056: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8 | |
2022-06-13 13:40:41.160080: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.160117: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node PrintV2}} = PrintV2[_XlaHasReferenceVars=false, end="\n", output_stream="stderr", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) | |
2022-06-13 13:40:41.160132: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 1.325us | |
2022-06-13 13:40:41.160138: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: PrintV2:CPU::PrintV2 takes 0.316us | |
2022-06-13 13:40:41.160158: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node PrintV2}} = PrintV2[_XlaHasReferenceVars=false, end="\n", output_stream="stderr", _device="/job:localhost/replica:0/task:0/device:CPU:0"](input) takes 41.182us | |
run 0 costs 0 | |
# run 1 schedule start | |
2022-06-13 13:40:41.200081: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200694: I tensorflow/python/eager/pywrap_tfe_src.cc:885] Eager executes cancelable __inference_nn_18 on the number of inputs is 3 the number of output is 1 | |
2022-06-13 13:40:41.200729: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200753: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200771: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200779: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op __inference_nn_18 in device | |
2022-06-13 13:40:41.200785: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.200790: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute __inference_nn_18 in device | |
2022-06-13 13:40:41.200806: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:0 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.200814: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:1 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.200820: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:2 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.200855: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200870: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.200881: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice __inference_nn_18: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.200886: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [__inference_nn_18] on device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.200960: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__inference_nn_18" on default device "/job:localhost/replica:0/task:0/device:GPU:0" | |
2022-06-13 13:40:41.201156: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.201167: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.201172: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.201178: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.201183: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.201189: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.201206: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201228: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201242: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.201247: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.201252: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.201262: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.201267: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.201272: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.201277: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.201284: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.201290: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.201295: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.201309: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201319: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.201366: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201378: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.201393: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201403: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.201409: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.201413: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.201433: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.201438: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.201444: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.201456: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201481: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 9 of 9 nodes in 9 visits | |
2022-06-13 13:40:41.201496: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.201508: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.201535: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.201549: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 17.377us | |
2022-06-13 13:40:41.201556: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::w takes 0.867us | |
2022-06-13 13:40:41.201566: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.201572: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.809us | |
2022-06-13 13:40:41.201578: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::b takes 0.326us | |
2022-06-13 13:40:41.201585: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.201590: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 6.148us | |
2022-06-13 13:40:41.201596: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::x takes 0.299us | |
2022-06-13 13:40:41.201606: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 3.231us | |
2022-06-13 13:40:41.201616: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 5.182us | |
2022-06-13 13:40:41.201625: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.112us | |
2022-06-13 13:40:41.201634: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 3.364us | |
2022-06-13 13:40:41.201645: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 5.064us | |
2022-06-13 13:40:41.201651: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 0.77us | |
2022-06-13 13:40:41.201661: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.201667: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 9.881us | |
2022-06-13 13:40:41.201673: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::identity_RetVal takes 0.54us | |
2022-06-13 13:40:41.201687: I tensorflow/core/common_runtime/placer.cc:124] w(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201696: I tensorflow/core/common_runtime/placer.cc:124] b(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201701: I tensorflow/core/common_runtime/placer.cc:124] x(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201710: I tensorflow/core/common_runtime/placer.cc:124] MatMul(BatchMatMulV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201716: I tensorflow/core/common_runtime/placer.cc:124] Add(AddV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201722: I tensorflow/core/common_runtime/placer.cc:124] Identity(Identity) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201728: I tensorflow/core/common_runtime/placer.cc:124] identity_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.201734: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.201739: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.201744: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.201751: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.201766: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.201773: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.201778: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.201785: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.201790: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.201795: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.201799: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.202047: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:XLA_GPU_JIT::MatMul takes 1.526us | |
2022-06-13 13:40:41.202066: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:XLA_GPU_JIT::Add takes 1.043us | |
2022-06-13 13:40:41.202082: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:XLA_GPU_JIT::Identity takes 0.905us | |
2022-06-13 13:40:41.202144: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 14 us (cumulative: 26 us, max: 14 us, #called: 2) | |
2022-06-13 13:40:41.202192: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 380 us (cumulative: 1.57 ms, max: 442 us, #called: 5) | |
2022-06-13 13:40:41.202204: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.202209: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.202219: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.202224: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.202230: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.202234: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.202258: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.202265: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.202294: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.202302: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.202307: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.202328: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.202408: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.202436: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.202462: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.202471: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.202480: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.202487: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.202493: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.202510: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.202523: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.202545: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.202555: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 13.423us | |
2022-06-13 13:40:41.202567: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.202572: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.645us | |
2022-06-13 13:40:41.202580: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.202586: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 6.696us | |
2022-06-13 13:40:41.202596: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 2.676us | |
2022-06-13 13:40:41.202608: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.833us | |
2022-06-13 13:40:41.202621: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 5.903us | |
2022-06-13 13:40:41.202633: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.202643: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 14.103us | |
2022-06-13 13:40:41.202674: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.202732: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.202740: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.202745: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.202754: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202758: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202763: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202767: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202772: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202776: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202780: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202786: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202790: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202794: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202799: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202803: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202807: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202811: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202817: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202821: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202825: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202829: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202834: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202838: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202842: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.202848: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.202865: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562417792_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.202931: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18_18275566768249955521_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.202955: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __inference_nn_18_18275566768249955521_0 on device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.203070: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __inference_nn_18_18275566768249955521_0 with handle 10 status: OK | |
2022-06-13 13:40:41.203125: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_18 in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.203153: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_18 with handle 10 | |
2022-06-13 13:40:41.203201: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 4.042us | |
2022-06-13 13:40:41.203214: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 3.091us | |
2022-06-13 13:40:41.203221: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 0.628us | |
2022-06-13 13:40:41.203227: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.203267: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203276: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 9.894us | |
2022-06-13 13:40:41.203283: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203288: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.317us | |
2022-06-13 13:40:41.203297: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203302: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 7.803us | |
2022-06-13 13:40:41.203311: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203316: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.303us | |
2022-06-13 13:40:41.203324: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203329: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 6.888us | |
2022-06-13 13:40:41.203337: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.336us | |
2022-06-13 13:40:41.203347: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.303us | |
2022-06-13 13:40:41.203356: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 2.688us | |
2022-06-13 13:40:41.203366: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203372: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 7.858us | |
2022-06-13 13:40:41.203380: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0 | |
2022-06-13 13:40:41.203385: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0 | |
2022-06-13 13:40:41.203390: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0 | |
2022-06-13 13:40:41.203395: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0 | |
2022-06-13 13:40:41.203399: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-13 13:40:41.203404: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-13 13:40:41.203410: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203416: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.629us | |
2022-06-13 13:40:41.203422: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203427: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.187us | |
2022-06-13 13:40:41.203434: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203440: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 6.715us | |
2022-06-13 13:40:41.203447: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203452: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.077us | |
2022-06-13 13:40:41.203459: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203465: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 6.509us | |
2022-06-13 13:40:41.203472: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.649us | |
2022-06-13 13:40:41.203480: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 0.896us | |
2022-06-13 13:40:41.203488: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.143us | |
2022-06-13 13:40:41.203496: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203502: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 6.389us | |
2022-06-13 13:40:41.203508: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0 | |
2022-06-13 13:40:41.203513: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0 | |
2022-06-13 13:40:41.203517: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0 | |
2022-06-13 13:40:41.203522: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0 | |
2022-06-13 13:40:41.203526: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-13 13:40:41.203530: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-13 13:40:41.203582: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.203591: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203596: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.417us | |
2022-06-13 13:40:41.203602: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203607: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.224us | |
2022-06-13 13:40:41.203618: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 34.862us | |
2022-06-13 13:40:41.203632: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() | |
2022-06-13 13:40:41.203643: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203648: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 6.909us | |
2022-06-13 13:40:41.203655: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203660: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 6.162us | |
2022-06-13 13:40:41.203671: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() takes 41.881us | |
2022-06-13 13:40:41.203680: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() | |
2022-06-13 13:40:41.203688: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203693: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.309us | |
2022-06-13 13:40:41.203700: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203705: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 6.059us | |
2022-06-13 13:40:41.203714: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() takes 33.348us | |
2022-06-13 13:40:41.203723: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() | |
2022-06-13 13:40:41.203731: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203736: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 6.638us | |
2022-06-13 13:40:41.203742: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203747: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 5.848us | |
2022-06-13 13:40:41.203757: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() takes 33.291us | |
2022-06-13 13:40:41.203765: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) | |
2022-06-13 13:40:41.203773: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.825us | |
2022-06-13 13:40:41.203780: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.629us | |
2022-06-13 13:40:41.203799: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) takes 33.324us | |
2022-06-13 13:40:41.203807: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) | |
2022-06-13 13:40:41.203815: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 0.96us | |
2022-06-13 13:40:41.203822: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 0.825us | |
2022-06-13 13:40:41.203835: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) takes 26.139us | |
2022-06-13 13:40:41.203842: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) | |
2022-06-13 13:40:41.203850: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.321us | |
2022-06-13 13:40:41.203857: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.14us | |
2022-06-13 13:40:41.203865: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) takes 22.379us | |
2022-06-13 13:40:41.203873: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) | |
2022-06-13 13:40:41.203880: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203886: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 6.908us | |
2022-06-13 13:40:41.203892: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.203897: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 6.018us | |
2022-06-13 13:40:41.203906: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) takes 32.932us | |
# run 1 schedule end | |
# run 1 compute start | |
2022-06-13 13:40:41.203989: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -4266793230068582322 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.204086: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -4266793230068582322 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.204123: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg | |
2022-06-13 13:40:41.204180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204214: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -4266793230068582322 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.204234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204271: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg | |
2022-06-13 13:40:41.204289: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204317: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -4266793230068582322 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.204336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204349: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg | |
2022-06-13 13:40:41.204363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.204389: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -4266793230068582322 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.204418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.204438: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[4,128,128])# | |
2022-06-13 13:40:41.204625: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11 | |
2022-06-13 13:40:41.702486: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11 | |
2022-06-13 13:40:41.703054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.703097: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -4266793230068582322 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.703115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.703127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[4,1024,128];float[1024,128])# | |
2022-06-13 13:40:41.703340: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.703358: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -4266793230068582322 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.703367: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.703374: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[4,1024,128])# | |
2022-06-13 13:40:41.703381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.703390: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -4266793230068582322 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.703395: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.703401: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[4,1024,128])# | |
2022-06-13 13:40:41.703407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
# run 1 compute end | |
# below is output of run 1 | |
2022-06-13 13:40:41.704761: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op StringFormat in device | |
2022-06-13 13:40:41.704824: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.704849: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute StringFormat in device | |
2022-06-13 13:40:41.704957: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 10.906us | |
2022-06-13 13:40:41.704984: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.563us | |
2022-06-13 13:40:41.705014: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.705032: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.705064: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.705146: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.705205: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.705291: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.705682: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.705716: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.705736: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.705754: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.705769: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.705783: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.705818: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.705863: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.705893: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.705910: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.705941: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.705966: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.705995: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.706019: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.706047: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.706073: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.706092: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.706105: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.706134: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.706161: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.706260: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.706288: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.706316: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.706342: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.706359: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.706372: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.706412: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.706429: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.706444: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.706467: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.706514: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.706542: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.706572: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.706625: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 6.128us | |
2022-06-13 13:40:41.706649: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.128us | |
2022-06-13 13:40:41.706686: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.706707: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 30.352us | |
2022-06-13 13:40:41.706728: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 1.323us | |
2022-06-13 13:40:41.706769: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.706796: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.706816: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.706835: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.706853: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.706877: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.706899: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.706912: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.706931: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.706949: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.706962: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.706981: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.706999: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.707578: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 3.548us | |
2022-06-13 13:40:41.707710: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 676 us (cumulative: 2.25 ms, max: 676 us, #called: 6) | |
2022-06-13 13:40:41.707733: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.707748: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.707774: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.707790: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.707806: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.707823: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.707872: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.707894: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.707947: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.707966: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.707981: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.708015: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.708162: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.708218: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.708282: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.708301: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.708320: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.708334: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.708346: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.708373: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.708407: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.708455: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.891us | |
2022-06-13 13:40:41.708490: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 1.428us | |
2022-06-13 13:40:41.708547: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.708657: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.708680: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.708693: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.708750: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.708783: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562384224_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.708894: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_9847480399019665821_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.708935: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_9847480399019665821_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.709112: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_9847480399019665821_0 with handle 12 status: OK | |
2022-06-13 13:40:41.709218: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.709263: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 12 | |
2022-06-13 13:40:41.709346: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.816us | |
2022-06-13 13:40:41.709387: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4 | |
2022-06-13 13:40:41.709568: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.709598: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.709623: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 24.807us | |
2022-06-13 13:40:41.709644: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.709665: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 19.599us | |
2022-06-13 13:40:41.709695: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 124.611us | |
2022-06-13 13:40:41.709732: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 543.6217784881592", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.709761: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.432us | |
2022-06-13 13:40:41.709783: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.894us | |
2022-06-13 13:40:41.709821: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 543.6217784881592", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 92.588us | |
2022-06-13 13:40:41.709853: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-8764232170173109502, tensor_name="StringFormat:0"](StringFormat) | |
2022-06-13 13:40:41.709881: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 1.588us | |
2022-06-13 13:40:41.709902: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.901us | |
2022-06-13 13:40:41.709954: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-8764232170173109502, tensor_name="StringFormat:0"](StringFormat) takes 100.759us | |
2022-06-13 13:40:41.709997: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0 | |
2022-06-13 13:40:41.710024: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 1 costs 543.6217784881592", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0 | |
2022-06-13 13:40:41.710070: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-8764232170173109502, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0 | |
2022-06-13 13:40:41.710179: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 2.086us | |
2022-06-13 13:40:41.710200: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant | |
2022-06-13 13:40:41.710308: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.710413: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 1 costs 543.6217784881592>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.710445: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.563us | |
2022-06-13 13:40:41.710466: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.879us | |
2022-06-13 13:40:41.710516: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 1 costs 543.6217784881592>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 108.155us | |
2022-06-13 13:40:41.710543: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) | |
2022-06-13 13:40:41.710568: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 1.131us | |
2022-06-13 13:40:41.710588: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.889us | |
2022-06-13 13:40:41.710616: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 70.259us | |
2022-06-13 13:40:41.710751: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op PrintV2 in device | |
2022-06-13 13:40:41.710774: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 0 | |
2022-06-13 13:40:41.710789: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute PrintV2 in device | |
2022-06-13 13:40:41.710817: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.710864: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.710899: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8 | |
# above is output of run 1
run 1 costs 543.6217784881592 | |
# run 2 schedule start | |
2022-06-13 13:40:41.711880: I tensorflow/python/eager/pywrap_tfe_src.cc:885] Eager executes cancelable __inference_nn_18 on the number of inputs is 3 the number of output is 1 | |
2022-06-13 13:40:41.711948: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.711994: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.712033: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_18' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.712057: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op __inference_nn_18 in device | |
2022-06-13 13:40:41.712075: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.712089: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute __inference_nn_18 in device | |
2022-06-13 13:40:41.712113: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:0 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712139: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:1 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712160: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_18:input:2 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712202: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_18 in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712248: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_18 with handle 10 | |
# run 2 schedule end | |
# run 2 compute start | |
2022-06-13 13:40:41.712374: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -6484413189335459115 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712493: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -6484413189335459115 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712541: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg | |
2022-06-13 13:40:41.712616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712665: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -6484413189335459115 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712697: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg | |
2022-06-13 13:40:41.712751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712793: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -6484413189335459115 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[4,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712850: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg | |
2022-06-13 13:40:41.712873: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.712913: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -6484413189335459115 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.712953: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.712988: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[4,128,128])# | |
2022-06-13 13:40:41.713191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.713242: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -6484413189335459115 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.713276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.713309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[4,1024,128];float[1024,128])# | |
2022-06-13 13:40:41.713394: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.713436: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -6484413189335459115 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.713469: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.713499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[4,1024,128])# | |
2022-06-13 13:40:41.713529: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.713565: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -6484413189335459115 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.713594: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.713625: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[4,1024,128])# | |
2022-06-13 13:40:41.713652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
# run 2 compute end | |
# below is output of run 2 | |
2022-06-13 13:40:41.714335: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op StringFormat in device | |
2022-06-13 13:40:41.714380: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.714397: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute StringFormat in device | |
2022-06-13 13:40:41.714470: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 6.227us | |
2022-06-13 13:40:41.714493: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.748us | |
2022-06-13 13:40:41.714520: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.714542: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.714584: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.714678: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.714766: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.714836: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.715107: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.715137: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.715151: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.715170: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.715183: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.715196: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.715230: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.715271: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.715302: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.715319: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.715335: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.715367: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.715398: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.715424: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.715450: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.715474: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.715494: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.715517: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.715557: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.715586: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.715699: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.715735: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.715779: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.715815: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.715836: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.715849: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.715892: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.715913: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.715927: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.715952: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.716005: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.716049: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.716091: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.716146: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 4.792us | |
2022-06-13 13:40:41.716179: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 2.345us | |
2022-06-13 13:40:41.716225: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.716279: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 40.945us | |
2022-06-13 13:40:41.716314: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 2.908us | |
2022-06-13 13:40:41.716365: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.716396: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.716422: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.716450: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.716473: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.716501: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.716530: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.716555: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.716580: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.716610: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.716635: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.716661: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.716678: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.717268: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 3.93us | |
2022-06-13 13:40:41.717401: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 691 us (cumulative: 2.94 ms, max: 691 us, #called: 7) | |
2022-06-13 13:40:41.717426: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.717452: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.717492: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.717508: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.717524: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.717547: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.717625: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.717648: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.717709: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.717738: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.717760: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.717812: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.717990: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.718060: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.718108: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.718125: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.718142: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.718156: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.718169: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.718197: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.718230: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.718276: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 3.356us | |
2022-06-13 13:40:41.718311: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 1.865us | |
2022-06-13 13:40:41.718371: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.718491: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.718523: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.718542: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.718595: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.718631: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562491216_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.718753: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_3352841610890993264_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.718809: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_3352841610890993264_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.719009: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_3352841610890993264_0 with handle 14 status: OK | |
2022-06-13 13:40:41.719108: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.719144: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 14 | |
2022-06-13 13:40:41.719232: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 3.641us | |
2022-06-13 13:40:41.719284: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4 | |
2022-06-13 13:40:41.719498: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.719534: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.719558: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 24.345us | |
2022-06-13 13:40:41.719578: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.719598: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 18.973us | |
2022-06-13 13:40:41.719630: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 129.879us | |
2022-06-13 13:40:41.719666: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 2.9511451721191406", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.719696: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.331us | |
2022-06-13 13:40:41.719725: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.452us | |
2022-06-13 13:40:41.719784: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 2.9511451721191406", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 113.674us | |
2022-06-13 13:40:41.719820: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1704941691781989391, tensor_name="StringFormat:0"](StringFormat) | |
2022-06-13 13:40:41.719852: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 1.619us | |
2022-06-13 13:40:41.719881: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 1.853us | |
2022-06-13 13:40:41.719961: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1704941691781989391, tensor_name="StringFormat:0"](StringFormat) takes 138.823us | |
2022-06-13 13:40:41.720001: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0 | |
2022-06-13 13:40:41.720024: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 2 costs 2.9511451721191406", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0 | |
2022-06-13 13:40:41.720067: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-1704941691781989391, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0 | |
2022-06-13 13:40:41.720161: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.843us | |
2022-06-13 13:40:41.720179: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant | |
2022-06-13 13:40:41.720280: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.720389: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 2 costs 2.9511451721191406>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.720421: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.853us | |
2022-06-13 13:40:41.720445: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 1.146us | |
2022-06-13 13:40:41.720496: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 2 costs 2.9511451721191406>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 112.88us | |
2022-06-13 13:40:41.720519: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) | |
2022-06-13 13:40:41.720544: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.96us | |
2022-06-13 13:40:41.720560: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.673us | |
2022-06-13 13:40:41.720589: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 64.337us | |
2022-06-13 13:40:41.720691: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op PrintV2 in device | |
2022-06-13 13:40:41.720716: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 0 | |
2022-06-13 13:40:41.720733: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute PrintV2 in device | |
2022-06-13 13:40:41.720763: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.720804: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.720831: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8 | |
# the log lines above are the output of run 2
run 2 costs 2.9511451721191406 | |
2022-06-13 13:40:41.721107: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op _EagerConst in device | |
2022-06-13 13:40:41.721133: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.721144: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute _EagerConst in device | |
2022-06-13 13:40:41.721162: I tensorflow/core/common_runtime/eager/execute.cc:982] _EagerConst:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.721189: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.721217: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 0 | |
2022-06-13 13:40:41.721242: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.721265: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped____EagerConst_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 1 | |
2022-06-13 13:40:41.721296: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.721316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] input/_2:_HostRecv#from=input,to=_EagerConst# | |
2022-06-13 13:40:41.721335: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.721354: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_input;0:0 | |
2022-06-13 13:40:41.721385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled input/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.721410: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.721427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] _EagerConst:_EagerConst#shape=(int32[3])# | |
2022-06-13 13:40:41.721449: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled _EagerConst op _EagerConst on GPU 0 stream[0] | |
2022-06-13 13:40:41.721462: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.721473: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(int32[3])# | |
2022-06-13 13:40:41.721486: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.721618: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op RandomUniform in device | |
2022-06-13 13:40:41.721638: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.721649: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute RandomUniform in device | |
2022-06-13 13:40:41.721668: I tensorflow/core/common_runtime/eager/execute.cc:982] RandomUniform:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.721702: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op RandomUniform in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.721730: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 3 | |
2022-06-13 13:40:41.721751: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.721767: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __wrapped__RandomUniform_device_/job:localhost/replica:0/task:0/device:GPU:0 with handle 4 | |
2022-06-13 13:40:41.721790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.721811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] shape/_2:_HostRecv#from=shape,to=RandomUniform# | |
2022-06-13 13:40:41.721826: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x214df9f0 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.721837: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x214dfa10 /job:localhost/replica:0/task:0/device:CPU:0;95ed0dfd449ea1ea;/job:localhost/replica:0/task:0/device:GPU:0;edge_2_shape;0:0 | |
2022-06-13 13:40:41.721853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled shape/_2 op _HostRecv on GPU 0 stream[0] | |
2022-06-13 13:40:41.721872: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.721893: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] RandomUniform:RandomUniform#shape=(int32[3])# | |
2022-06-13 13:40:41.721955: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-13 13:40:41.721975: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper output_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.721998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] output_retval_RetVal:_Retval#shape=(float[4,128,128])# | |
2022-06-13 13:40:41.722015: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled output_retval_RetVal op _Retval on GPU 0 stream[0] | |
# run 3 scheduling starts here
2022-06-13 13:40:41.731652: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.731981: I tensorflow/python/eager/pywrap_tfe_src.cc:885] Eager executes cancelable __inference_nn_32 on the number of inputs is 3 the number of output is 1 | |
2022-06-13 13:40:41.732019: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.732044: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.732066: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.732077: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op __inference_nn_32 in device | |
2022-06-13 13:40:41.732084: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.732092: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute __inference_nn_32 in device | |
2022-06-13 13:40:41.732104: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:0 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.732115: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:1 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.732123: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:2 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.732166: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.732187: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.732198: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice __inference_nn_32: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.732205: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [__inference_nn_32] on device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.732284: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__inference_nn_32" on default device "/job:localhost/replica:0/task:0/device:GPU:0" | |
2022-06-13 13:40:41.732516: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.732533: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.732541: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.732548: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.732555: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.732562: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.732586: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.732627: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.732655: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.732671: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.732685: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.732708: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.732727: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.732737: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.732748: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.732758: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.732767: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.732773: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.732799: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.732825: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.732909: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.732930: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.732951: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.732969: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.732978: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.732984: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.733006: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.733015: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.733023: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.733048: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.733095: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 9 of 9 nodes in 9 visits | |
2022-06-13 13:40:41.733121: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.733142: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.733170: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.733185: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 19.439us | |
2022-06-13 13:40:41.733196: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::w takes 1.335us | |
2022-06-13 13:40:41.733214: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.733231: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 18.093us | |
2022-06-13 13:40:41.733245: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::b takes 0.787us | |
2022-06-13 13:40:41.733263: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.733272: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 10.215us | |
2022-06-13 13:40:41.733280: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::x takes 0.401us | |
2022-06-13 13:40:41.733293: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 3.308us | |
2022-06-13 13:40:41.733305: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 3.302us | |
2022-06-13 13:40:41.733317: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.428us | |
2022-06-13 13:40:41.733333: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 6.716us | |
2022-06-13 13:40:41.733361: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 10.569us | |
2022-06-13 13:40:41.733379: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 2.335us | |
2022-06-13 13:40:41.733392: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.733400: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 11.22us | |
2022-06-13 13:40:41.733408: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::identity_RetVal takes 0.521us | |
2022-06-13 13:40:41.733427: I tensorflow/core/common_runtime/placer.cc:124] w(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733437: I tensorflow/core/common_runtime/placer.cc:124] b(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733445: I tensorflow/core/common_runtime/placer.cc:124] x(_Arg) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733458: I tensorflow/core/common_runtime/placer.cc:124] MatMul(BatchMatMulV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733468: I tensorflow/core/common_runtime/placer.cc:124] Add(AddV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733479: I tensorflow/core/common_runtime/placer.cc:124] Identity(Identity) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733492: I tensorflow/core/common_runtime/placer.cc:124] identity_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.733505: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.733517: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.733527: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.733536: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.733546: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.733553: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.733559: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.733568: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.733576: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.733583: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.733590: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.733912: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:XLA_GPU_JIT::MatMul takes 2.173us | |
2022-06-13 13:40:41.733940: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:XLA_GPU_JIT::Add takes 1.524us | |
2022-06-13 13:40:41.733966: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:XLA_GPU_JIT::Identity takes 1.596us | |
2022-06-13 13:40:41.734058: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 16 us (cumulative: 42 us, max: 16 us, #called: 3) | |
2022-06-13 13:40:41.734142: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 536 us (cumulative: 3.48 ms, max: 691 us, #called: 8) | |
2022-06-13 13:40:41.734162: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.734169: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.734183: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.734190: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.734198: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.734205: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.734248: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.734258: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.734285: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.734295: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.734303: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.734341: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.734448: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.734491: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.734521: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.734530: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.734540: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.734547: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.734554: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.734582: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.734612: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.734648: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.734666: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 23.788us | |
2022-06-13 13:40:41.734683: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.734696: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 14.336us | |
2022-06-13 13:40:41.734708: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.734717: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.883us | |
2022-06-13 13:40:41.734730: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 2.828us | |
2022-06-13 13:40:41.734746: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.585us | |
2022-06-13 13:40:41.734766: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 6.2us | |
2022-06-13 13:40:41.734783: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.734796: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_RetVal takes 15.778us | |
2022-06-13 13:40:41.734835: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.734911: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.734923: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.734930: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.734941: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734948: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734955: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734961: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734968: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734974: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734981: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734989: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.734995: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735002: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735008: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735014: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735021: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735027: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735035: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735042: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735048: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Arg, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735055: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735061: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735068: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Identity, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735074: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Retval, reason: User has assigned a device that is not CPU. | |
2022-06-13 13:40:41.735083: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.735109: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562627152_/job:localhost/replica:0/task:0/device:GPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.735200: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32_17288556091578612755_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.735235: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __inference_nn_32_17288556091578612755_0 on device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.735397: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __inference_nn_32_17288556091578612755_0 with handle 16 status: OK | |
2022-06-13 13:40:41.735460: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_32 in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.735486: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_32 with handle 16 | |
2022-06-13 13:40:41.735548: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 4.991us | |
2022-06-13 13:40:41.735564: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 3.424us | |
2022-06-13 13:40:41.735574: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:CPU::Identity takes 0.827us | |
2022-06-13 13:40:41.735581: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.735632: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735645: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 13.716us | |
2022-06-13 13:40:41.735656: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735664: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 7.984us | |
2022-06-13 13:40:41.735693: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735701: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 26.554us | |
2022-06-13 13:40:41.735714: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735722: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 9.173us | |
2022-06-13 13:40:41.735733: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735741: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.408us | |
2022-06-13 13:40:41.735754: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 2.721us | |
2022-06-13 13:40:41.735768: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 2.979us | |
2022-06-13 13:40:41.735783: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 3.805us | |
2022-06-13 13:40:41.735799: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735808: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 11.18us | |
2022-06-13 13:40:41.735819: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0 | |
2022-06-13 13:40:41.735827: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0 | |
2022-06-13 13:40:41.735834: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0 | |
2022-06-13 13:40:41.735841: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0 | |
2022-06-13 13:40:41.735848: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-13 13:40:41.735855: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-13 13:40:41.735864: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735872: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 7.945us | |
2022-06-13 13:40:41.735881: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735889: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 8.154us | |
2022-06-13 13:40:41.735900: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735907: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 9.065us | |
2022-06-13 13:40:41.735918: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735926: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 9.695us | |
2022-06-13 13:40:41.735937: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.735945: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.508us | |
2022-06-13 13:40:41.735957: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.951us | |
2022-06-13 13:40:41.735969: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.434us | |
2022-06-13 13:40:41.735987: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.598us | |
2022-06-13 13:40:41.735999: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736007: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 9.818us | |
2022-06-13 13:40:41.736016: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 5:0: 0 -> 0 | |
2022-06-13 13:40:41.736024: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 5:1: 0 -> 0 | |
2022-06-13 13:40:41.736030: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:0: 0 -> 0 | |
2022-06-13 13:40:41.736037: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:1: 0 -> 0 | |
2022-06-13 13:40:41.736044: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-13 13:40:41.736051: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-13 13:40:41.736120: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.736131: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736140: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 8.559us | |
2022-06-13 13:40:41.736148: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736155: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 7.198us | |
2022-06-13 13:40:41.736167: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 47.122us | |
2022-06-13 13:40:41.736185: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() | |
2022-06-13 13:40:41.736199: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736207: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 9.724us | |
2022-06-13 13:40:41.736216: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node w}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736224: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::w takes 9.432us | |
2022-06-13 13:40:41.736240: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() takes 57.002us | |
2022-06-13 13:40:41.736263: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() | |
2022-06-13 13:40:41.736276: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736285: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 10.04us | |
2022-06-13 13:40:41.736294: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node b}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736302: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::b takes 8.877us | |
2022-06-13 13:40:41.736315: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() takes 61.009us | |
2022-06-13 13:40:41.736327: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() | |
2022-06-13 13:40:41.736339: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736347: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.361us | |
2022-06-13 13:40:41.736356: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node x}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736364: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:GPU::x takes 9.342us | |
2022-06-13 13:40:41.736377: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() takes 48.513us | |
2022-06-13 13:40:41.736389: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) | |
2022-06-13 13:40:41.736402: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.281us | |
2022-06-13 13:40:41.736412: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.908us | |
2022-06-13 13:40:41.736427: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) takes 36.878us | |
2022-06-13 13:40:41.736444: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) | |
2022-06-13 13:40:41.736456: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.602us | |
2022-06-13 13:40:41.736466: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.325us | |
2022-06-13 13:40:41.736478: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) takes 33.284us | |
2022-06-13 13:40:41.736489: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) | |
2022-06-13 13:40:41.736501: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.841us | |
2022-06-13 13:40:41.736511: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Identity:GPU::Identity takes 1.566us | |
2022-06-13 13:40:41.736524: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) takes 33.463us | |
2022-06-13 13:40:41.736534: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) | |
2022-06-13 13:40:41.736545: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736554: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 10.032us | |
2022-06-13 13:40:41.736564: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node identity_retval_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.736572: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::identity_retval_RetVal takes 8.917us | |
2022-06-13 13:40:41.736584: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) takes 49.302us | |
# run 3 schedule end | |
# run 3 compute start | |
2022-06-13 13:40:41.736648: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -8749612558746935679 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.736721: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -8749612558746935679 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.736759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.736787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg | |
2022-06-13 13:40:41.736819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.736860: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -8749612558746935679 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.736888: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.736913: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg | |
2022-06-13 13:40:41.736941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.736978: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -8749612558746935679 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.737006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.737031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg | |
2022-06-13 13:40:41.737058: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.737094: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -8749612558746935679 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.737124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.737157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[16,128,128])# | |
2022-06-13 13:40:41.737309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.737354: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -8749612558746935679 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.737385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.737416: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[16,1024,128];float[1024,128])# | |
2022-06-13 13:40:41.737480: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.737517: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -8749612558746935679 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.737546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.737576: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[16,1024,128])# | |
2022-06-13 13:40:41.737603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.737636: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -8749612558746935679 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.737664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.737693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[16,1024,128])# | |
2022-06-13 13:40:41.737720: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
# run 3 compute end | |
# below is output of run 3 | |
2022-06-13 13:40:41.738117: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op StringFormat in device | |
2022-06-13 13:40:41.738140: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.738148: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute StringFormat in device | |
2022-06-13 13:40:41.738182: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.83us | |
2022-06-13 13:40:41.738194: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.066us | |
2022-06-13 13:40:41.738206: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.738213: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.738226: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.738265: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.738294: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.738325: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.738456: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.738470: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.738477: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.738485: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.738491: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.738498: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.738512: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738533: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738548: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.738555: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.738563: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.738574: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.738581: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.738588: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.738595: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.738604: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.738612: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.738619: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.738631: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738642: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.738682: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738697: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.738710: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738721: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.738730: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.738736: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.738753: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.738760: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.738768: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.738779: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738799: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.738814: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.738827: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.738849: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.105us | |
2022-06-13 13:40:41.738860: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.797us | |
2022-06-13 13:40:41.738875: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.738884: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 11.916us | |
2022-06-13 13:40:41.738893: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.656us | |
2022-06-13 13:40:41.738908: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.738922: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.738931: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.738938: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.738945: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.738953: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.738960: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.738968: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.738974: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.738983: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.738990: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.738997: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.739004: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.739284: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.601us | |
2022-06-13 13:40:41.739348: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 328 us (cumulative: 3.8 ms, max: 691 us, #called: 9) | |
2022-06-13 13:40:41.739360: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.739367: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.739380: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.739387: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.739395: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.739401: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.739426: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.739435: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.739457: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.739466: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.739473: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.739491: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.739556: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.739584: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.739609: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.739620: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.739629: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.739636: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.739642: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.739667: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.739681: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.739696: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.102us | |
2022-06-13 13:40:41.739711: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.559us | |
2022-06-13 13:40:41.739728: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.739768: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.739776: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.739781: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.739802: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.739815: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562649184_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.739851: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_7245493243845350481_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.739868: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_7245493243845350481_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.739930: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_7245493243845350481_0 with handle 18 status: OK | |
2022-06-13 13:40:41.739964: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.739979: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 18 | |
2022-06-13 13:40:41.740007: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.171us | |
2022-06-13 13:40:41.740023: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4 | |
2022-06-13 13:40:41.740083: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.740094: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.740101: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 7.149us | |
2022-06-13 13:40:41.740107: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.740113: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 5.558us | |
2022-06-13 13:40:41.740123: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 38.715us | |
2022-06-13 13:40:41.740135: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 15.825033187866211", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.740144: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.444us | |
2022-06-13 13:40:41.740150: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.34us | |
2022-06-13 13:40:41.740164: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 15.825033187866211", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 30.203us | |
2022-06-13 13:40:41.740175: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7433993840239241574, tensor_name="StringFormat:0"](StringFormat) | |
2022-06-13 13:40:41.740185: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 1.591us | |
2022-06-13 13:40:41.740192: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.316us | |
2022-06-13 13:40:41.740210: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7433993840239241574, tensor_name="StringFormat:0"](StringFormat) takes 35.212us | |
2022-06-13 13:40:41.740228: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0 | |
2022-06-13 13:40:41.740237: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 3 costs 15.825033187866211", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0 | |
2022-06-13 13:40:41.740250: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=-7433993840239241574, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0 | |
2022-06-13 13:40:41.740301: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.88us | |
2022-06-13 13:40:41.740310: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant | |
2022-06-13 13:40:41.740349: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.740389: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 3 costs 15.825033187866211>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.740402: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.642us | |
2022-06-13 13:40:41.740409: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.3us | |
2022-06-13 13:40:41.740426: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 3 costs 15.825033187866211>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 40.095us | |
2022-06-13 13:40:41.740437: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) | |
2022-06-13 13:40:41.740445: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.466us | |
2022-06-13 13:40:41.740451: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.289us | |
2022-06-13 13:40:41.740461: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 24.051us | |
2022-06-13 13:40:41.740508: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op PrintV2 in device | |
2022-06-13 13:40:41.740518: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 0 | |
2022-06-13 13:40:41.740523: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute PrintV2 in device | |
2022-06-13 13:40:41.740533: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.740548: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.740561: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8 | |
# above is output of run 3
run 3 costs 15.825033187866211 | |
# run 4 schedule start | |
2022-06-13 13:40:41.740898: I tensorflow/python/eager/pywrap_tfe_src.cc:885] Eager executes cancelable __inference_nn_32 on the number of inputs is 3 the number of output is 1 | |
2022-06-13 13:40:41.740923: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.740943: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.740955: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__inference_nn_32' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.740963: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op __inference_nn_32 in device | |
2022-06-13 13:40:41.740968: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.740974: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute __inference_nn_32 in device | |
2022-06-13 13:40:41.740982: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:0 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.740989: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:1 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.740996: I tensorflow/core/common_runtime/eager/execute.cc:982] __inference_nn_32:input:2 /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741010: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op __inference_nn_32 in device /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741026: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1437] Running component function on device /job:localhost/replica:0/task:0/device:GPU:0 from __inference_nn_32 with handle 16 | |
# run 4 schedule end | |
# run 4 compute start | |
2022-06-13 13:40:41.741080: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -776110010382447230 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741143: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -776110010382447230 {{node w}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="W", index=0]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741170: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] w:_Arg | |
2022-06-13 13:40:41.741216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled w op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741245: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -776110010382447230 {{node b}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[1024,128]], _user_specified_name="b", index=1]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] b:_Arg | |
2022-06-13 13:40:41.741295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled b op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741320: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step -776110010382447230 {{node x}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, _output_shapes=[[16,128,128]], _user_specified_name="x", index=2]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] x:_Arg | |
2022-06-13 13:40:41.741371: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled x op _Arg on GPU 0 stream[0] | |
2022-06-13 13:40:41.741398: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step -776110010382447230 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](w, x) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741417: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.741435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[16,128,128])# | |
2022-06-13 13:40:41.741544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.741576: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step -776110010382447230 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, b) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.741614: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[16,1024,128];float[1024,128])# | |
2022-06-13 13:40:41.741674: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-13 13:40:41.741712: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step -776110010382447230 {{node Identity}} = Identity[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741731: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.741748: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Identity:Identity#shape=(float[16,1024,128])# | |
2022-06-13 13:40:41.741768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Identity op Identity on GPU 0 stream[0] | |
2022-06-13 13:40:41.741792: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step -776110010382447230 {{node identity_retval_RetVal}} = _Retval[T=DT_FLOAT, index=0](Identity) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-13 13:40:41.741811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
2022-06-13 13:40:41.741826: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] identity_retval_RetVal:_Retval#shape=(float[16,1024,128])# | |
2022-06-13 13:40:41.741844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled identity_retval_RetVal op _Retval on GPU 0 stream[0] | |
# run 4 compute end | |
# below is output of run 4 | |
2022-06-13 13:40:41.742103: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op StringFormat in device | |
2022-06-13 13:40:41.742120: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 1 | |
2022-06-13 13:40:41.742125: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute StringFormat in device | |
2022-06-13 13:40:41.742152: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 2.428us | |
2022-06-13 13:40:41.742162: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.659us | |
2022-06-13 13:40:41.742171: I tensorflow/core/common_runtime/eager/execute.cc:923] PreferredDevice StringFormat: /job:localhost/replica:0/task:0 | |
2022-06-13 13:40:41.742176: I tensorflow/core/common_runtime/eager/execute.cc:924] Placer place op [StringFormat] on device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.742186: I tensorflow/core/common_runtime/eager/execute.cc:1062] Device for [StringFormat] already set to: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.742209: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.742230: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.742252: I tensorflow/core/common_runtime/process_function_library_runtime.cc:772] Instantiating MultiDevice function "__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0" on default device "/job:localhost/replica:0/task:0/device:CPU:0" | |
2022-06-13 13:40:41.742348: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-13 13:40:41.742359: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.742364: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-13 13:40:41.742369: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.742374: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-13 13:40:41.742379: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-13 13:40:41.742389: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742403: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742412: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.742417: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-13 13:40:41.742423: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-13 13:40:41.742431: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-13 13:40:41.742436: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-13 13:40:41.742441: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-13 13:40:41.742446: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-13 13:40:41.742452: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-13 13:40:41.742458: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-13 13:40:41.742463: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-13 13:40:41.742471: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742479: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.742508: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742519: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.742528: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742536: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-13 13:40:41.742542: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-13 13:40:41.742547: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-13 13:40:41.742559: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-13 13:40:41.742564: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-13 13:40:41.742569: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-13 13:40:41.742577: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742591: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 4 of 4 nodes in 4 visits | |
2022-06-13 13:40:41.742599: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.742609: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-13 13:40:41.742624: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:GPU::StringFormat takes 1.533us | |
2022-06-13 13:40:41.742631: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.582us | |
2022-06-13 13:40:41.742642: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node output_RetVal}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.742648: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:GPU::output_RetVal takes 8.704us | |
2022-06-13 13:40:41.742654: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.433us | |
2022-06-13 13:40:41.742665: I tensorflow/core/common_runtime/placer.cc:124] output_RetVal(_Retval) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.742673: I tensorflow/core/common_runtime/placer.cc:124] StringFormat(StringFormat) placed on: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.742679: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-13 13:40:41.742684: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-13 13:40:41.742689: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-13 13:40:41.742695: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-13 13:40:41.742701: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-13 13:40:41.742706: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-13 13:40:41.742711: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-13 13:40:41.742716: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-13 13:40:41.742721: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-13 13:40:41.742726: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-13 13:40:41.742730: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-13 13:40:41.742931: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:XLA_CPU_JIT::StringFormat takes 1.089us | |
2022-06-13 13:40:41.742978: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 237 us (cumulative: 4.04 ms, max: 691 us, #called: 10) | |
2022-06-13 13:40:41.742988: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-13 13:40:41.742993: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-13 13:40:41.743001: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-13 13:40:41.743006: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-13 13:40:41.743012: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-13 13:40:41.743017: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-13 13:40:41.743035: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-13 13:40:41.743043: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-13 13:40:41.743060: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-13 13:40:41.743064: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-13 13:40:41.743070: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-13 13:40:41.743082: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.743131: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.743151: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-13 13:40:41.743166: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-13 13:40:41.743173: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-13 13:40:41.743180: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-13 13:40:41.743184: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-13 13:40:41.743189: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-13 13:40:41.743199: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.743210: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-13 13:40:41.743224: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.077us | |
2022-06-13 13:40:41.743235: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_RetVal takes 0.447us | |
2022-06-13 13:40:41.743253: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=0 | |
2022-06-13 13:40:41.743293: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-13 13:40:41.743302: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-13 13:40:41.743306: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-13 13:40:41.743325: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-13 13:40:41.743339: W tensorflow/core/util/dump_graph.cc:134] Failed to dump pflr_after_all_optimization_passes_562728544_/job:localhost/replica:0/task:0/device:CPU:0 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-13 13:40:41.743375: I tensorflow/core/framework/op.cc:80] NOT_FOUND: Op type not registered '__wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_5078087082401372147_0' in binary running on 90e62df95daa. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed. | |
2022-06-13 13:40:41.743392: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1105] Start instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_5078087082401372147_0 on device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.743452: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1114] Finished instantiating component function __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0_5078087082401372147_0 with handle 20 status: OK | |
2022-06-13 13:40:41.743487: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op StringFormat in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.743502: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__StringFormat_T_0_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 20 | |
2022-06-13 13:40:41.743529: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 1.186us | |
2022-06-13 13:40:41.743545: I tensorflow/core/common_runtime/constant_folding.cc:631] Constant foldable 3 : 4 | |
2022-06-13 13:40:41.743604: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-13 13:40:41.743616: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.743623: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 7.611us | |
2022-06-13 13:40:41.743629: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-13 13:40:41.743634: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 5.455us | |
2022-06-13 13:40:41.743644: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 38.971us | |
2022-06-13 13:40:41.743656: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.3682842254638672", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.743666: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.509us | |
2022-06-13 13:40:41.743673: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: StringFormat:CPU::StringFormat takes 0.348us | |
2022-06-13 13:40:41.743686: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.3682842254638672", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 30.797us | |
2022-06-13 13:40:41.743697: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=6617487762276893854, tensor_name="StringFormat:0"](StringFormat) | |
2022-06-13 13:40:41.743707: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.559us | |
2022-06-13 13:40:41.743713: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_send_StringFormat_0 takes 0.316us | |
2022-06-13 13:40:41.743731: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=6617487762276893854, tensor_name="StringFormat:0"](StringFormat) takes 33.315us | |
2022-06-13 13:40:41.743749: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step -1 {{node _SOURCE}} = NoOp[]() device: /device:CPU:0 | |
2022-06-13 13:40:41.743758: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step -1 {{node StringFormat}} = StringFormat[T=[], _XlaHasReferenceVars=false, placeholder="{}", summarize=3, template="run 4 costs 1.3682842254638672", _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /device:CPU:0 | |
2022-06-13 13:40:41.743771: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step -1 {{node _send_StringFormat_0}} = _Send[T=DT_STRING, client_terminated=true, recv_device="/device:CPU:0", send_device="/device:CPU:0", send_device_incarnation=6617487762276893854, tensor_name="StringFormat:0"](StringFormat) device: /device:CPU:0 | |
2022-06-13 13:40:41.743808: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.745us | |
2022-06-13 13:40:41.743817: I tensorflow/core/common_runtime/constant_folding.cc:562] Replacing StringFormat :: 0 with a constant | |
2022-06-13 13:40:41.743856: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-13 13:40:41.743893: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 4 costs 1.3682842254638672>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-13 13:40:41.743905: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.502us | |
2022-06-13 13:40:41.743912: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::StringFormat/_0__cf__0 takes 0.31us | |
2022-06-13 13:40:41.743930: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node StringFormat/_0__cf__0}} = Const[dtype=DT_STRING, value=Tensor<type: string shape: [] values: run 4 costs 1.3682842254638672>, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 38.841us | |
2022-06-13 13:40:41.743940: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) | |
2022-06-13 13:40:41.743948: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.435us | |
2022-06-13 13:40:41.743955: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::output_retval_RetVal takes 0.31us | |
2022-06-13 13:40:41.743965: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node output_retval_RetVal}} = _Retval[T=DT_STRING, index=0](StringFormat) takes 24.343us | |
2022-06-13 13:40:41.744009: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:57] Process op PrintV2 in device | |
2022-06-13 13:40:41.744018: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:58] Number of return vals is 0 | |
2022-06-13 13:40:41.744023: I tensorflow/core/common_runtime/eager/custom_device_op_handler.cc:95] Execute PrintV2 in device | |
2022-06-13 13:40:41.744033: I tensorflow/core/common_runtime/eager/execute.cc:982] PrintV2:input:0 /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.744048: I tensorflow/core/common_runtime/eager/execute.cc:1353] Executing op PrintV2 in device /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-13 13:40:41.744062: I tensorflow/core/common_runtime/process_function_library_runtime.cc:1302] Running component function on device /job:localhost/replica:0/task:0/device:CPU:0 from __wrapped__PrintV2_device_/job:localhost/replica:0/task:0/device:CPU:0 with handle 8 | |
# The log above is the output of run 4
run 4 costs 1.3682842254638672 |
2022-06-16 02:29:11.215225: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA | |
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2022-06-16 02:29:11.372311: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0 | |
2022-06-16 02:29:11.372372: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled | |
2022-06-16 02:29:11.372388: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set) | |
2022-06-16 02:29:11.372393: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set. | |
2022-06-16 02:29:11.373307: I tensorflow/core/util/util.cc:168] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`. | |
2022-06-16 02:29:11.378043: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-16 02:29:11.417732: I tensorflow/core/platform/cloud/gcs_file_system.cc:806] GCS cache max size = 0 ; block size = 67108864 ; max staleness = 0 | |
2022-06-16 02:29:11.417772: I ./tensorflow/core/platform/cloud/ram_file_block_cache.h:64] GCS file block cache is disabled | |
2022-06-16 02:29:11.417778: I tensorflow/core/platform/cloud/gcs_file_system.cc:846] GCS DNS cache is disabled, because GCS_RESOLVE_REFRESH_SECS = 0 (or is not set) | |
2022-06-16 02:29:11.417799: I tensorflow/core/platform/cloud/gcs_file_system.cc:876] GCS additional header DISABLED. No environment variable set. | |
2022-06-16 02:29:12.083804: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libnvinfer.so.7 | |
2022-06-16 02:29:12.907647: I ./tensorflow/core/common_runtime/mkl_cpu_allocator.h:178] MklCPUAllocator: Setting max_mem_bytes: 134837268480 | |
2022-06-16 02:29:12.907765: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: mklcpu | |
2022-06-16 02:29:12.907802: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-16 02:29:12.907834: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-16 02:29:12.907872: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-16 02:29:12.907904: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-16 02:29:12.907935: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-16 02:29:12.907968: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-16 02:29:12.907998: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-16 02:29:12.908025: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-16 02:29:12.908052: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-16 02:29:12.908122: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-16 02:29:12.908156: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-16 02:29:12.908257: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-16 02:29:12.908283: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-16 02:29:12.908310: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-16 02:29:12.908338: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-16 02:29:12.908392: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-16 02:29:12.908434: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-16 02:29:12.908467: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-16 02:29:12.908529: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-16 02:29:12.908582: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-16 02:29:12.908643: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-16 02:29:12.923406: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_XLA_FLAGS found arguments: | |
2022-06-16 02:29:12.923498: I tensorflow/compiler/xla/parse_flags_from_env.cc:199] argv[0] = <argv[0]> | |
2022-06-16 02:29:12.923586: I tensorflow/compiler/xla/parse_flags_from_env.cc:197] For env var TF_JITRT_FLAGS found arguments: | |
2022-06-16 02:29:12.923638: I tensorflow/compiler/xla/parse_flags_from_env.cc:199] argv[0] = <argv[0]> | |
2022-06-16 02:29:12.923675: I tensorflow/compiler/jit/xla_cpu_device.cc:44] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA device creation not requested | |
2022-06-16 02:29:12.923810: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1 | |
2022-06-16 02:29:12.979252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties: | |
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:12.979500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties: | |
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:12.979518: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-16 02:29:12.979554: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11 | |
2022-06-16 02:29:12.979575: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11 | |
2022-06-16 02:29:12.980734: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10 | |
2022-06-16 02:29:12.980967: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10 | |
2022-06-16 02:29:12.981848: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11 | |
2022-06-16 02:29:12.982606: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11 | |
2022-06-16 02:29:12.982648: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8 | |
2022-06-16 02:29:12.983379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1 | |
2022-06-16 02:29:12.983405: I tensorflow/compiler/jit/xla_gpu_device.cc:48] Not creating XLA devices, tf_xla_enable_xla_devices not set and XLA devices creation not required | |
2022-06-16 02:29:12.983775: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE3 SSE4.1 SSE4.2 AVX AVX2 AVX512F AVX512_VNNI FMA | |
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags. | |
2022-06-16 02:29:12.986811: I tensorflow/compiler/jit/xla_cpu_device.cc:58] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-16 02:29:13.228591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties: | |
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:13.228896: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties: | |
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:13.229531: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1 | |
2022-06-16 02:29:13.229569: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0 | |
2022-06-16 02:29:13.794299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0 | |
2022-06-16 02:29:14.211081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(0): -5,0 | |
2022-06-16 02:29:14.211141: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1384] TensorFlow compiled with CUDA 11.2 and cuDNN 8.1.0 | |
2022-06-16 02:29:14.211189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1396] Device interconnect StreamExecutor with strength 1 edge matrix: | |
2022-06-16 02:29:14.211199: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] 0 1 | |
2022-06-16 02:29:14.211204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 0: N N | |
2022-06-16 02:29:14.211208: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 1: N N | |
2022-06-16 02:29:14.212286: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 0 TfDeviceId 0 on bus 1 numa: 0 pci: 0000:18:00.0 DeviceLocality: bus_id: 1 | |
links { | |
} | |
2022-06-16 02:29:14.212522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 1 TfDeviceId 1 on bus 2 numa: 1 pci: 0000:86:00.0 DeviceLocality: bus_id: 2 | |
numa_node: 1 | |
links { | |
} | |
2022-06-16 02:29:14.212720: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_0_bfc | |
2022-06-16 02:29:14.212733: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-16 02:29:14.212738: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-16 02:29:14.212745: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-16 02:29:14.212750: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-16 02:29:14.212754: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-16 02:29:14.212760: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-16 02:29:14.212765: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-16 02:29:14.212770: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-16 02:29:14.212775: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-16 02:29:14.212780: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-16 02:29:14.212785: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-16 02:29:14.212789: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-16 02:29:14.212794: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-16 02:29:14.212798: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-16 02:29:14.212803: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-16 02:29:14.212808: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-16 02:29:14.212813: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-16 02:29:14.212818: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-16 02:29:14.212823: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-16 02:29:14.212827: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-16 02:29:14.212832: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-16 02:29:14.212869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9657 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5 | |
2022-06-16 02:29:14.212895: I tensorflow/stream_executor/stream.cc:261] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::Stream(parent=0x437abe0) | |
2022-06-16 02:29:14.212906: I tensorflow/stream_executor/stream.cc:308] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::Init() | |
2022-06-16 02:29:14.212971: I tensorflow/stream_executor/stream.cc:261] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::Stream(parent=0x437abe0) | |
2022-06-16 02:29:14.212979: I tensorflow/stream_executor/stream.cc:308] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::Init() | |
2022-06-16 02:29:14.212989: I tensorflow/stream_executor/stream.cc:261] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::Stream(parent=0x437abe0) | |
2022-06-16 02:29:14.212995: I tensorflow/stream_executor/stream.cc:308] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::Init() | |
2022-06-16 02:29:14.213003: I tensorflow/stream_executor/stream.cc:261] [stream=0x21bc92f0,impl=0x21d0d570] Called Stream::Stream(parent=0x437abe0) | |
2022-06-16 02:29:14.213009: I tensorflow/stream_executor/stream.cc:308] [stream=0x21bc92f0,impl=0x21d0d570] Called Stream::Init() | |
2022-06-16 02:29:14.213022: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: gpu_host_bfc | |
2022-06-16 02:29:14.213032: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-16 02:29:14.213038: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-16 02:29:14.213042: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-16 02:29:14.213046: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-16 02:29:14.213051: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-16 02:29:14.213056: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-16 02:29:14.213061: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-16 02:29:14.213065: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-16 02:29:14.213070: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-16 02:29:14.213075: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-16 02:29:14.213079: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-16 02:29:14.213084: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-16 02:29:14.213088: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-16 02:29:14.213093: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-16 02:29:14.213098: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-16 02:29:14.213102: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-16 02:29:14.213107: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-16 02:29:14.213111: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-16 02:29:14.213116: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-16 02:29:14.213121: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-16 02:29:14.213126: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-16 02:29:14.213699: I tensorflow/core/common_runtime/bfc_allocator.cc:70] Creating new BFCAllocator named: GPU_1_bfc | |
2022-06-16 02:29:14.213713: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256B | |
2022-06-16 02:29:14.213717: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512B | |
2022-06-16 02:29:14.213723: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.0KiB | |
2022-06-16 02:29:14.213727: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.0KiB | |
2022-06-16 02:29:14.213732: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.0KiB | |
2022-06-16 02:29:14.213737: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.0KiB | |
2022-06-16 02:29:14.213741: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.0KiB | |
2022-06-16 02:29:14.213746: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.0KiB | |
2022-06-16 02:29:14.213751: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.0KiB | |
2022-06-16 02:29:14.213756: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.0KiB | |
2022-06-16 02:29:14.213760: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.0KiB | |
2022-06-16 02:29:14.213765: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 512.0KiB | |
2022-06-16 02:29:14.213770: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 1.00MiB | |
2022-06-16 02:29:14.213775: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 2.00MiB | |
2022-06-16 02:29:14.213779: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 4.00MiB | |
2022-06-16 02:29:14.213784: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 8.00MiB | |
2022-06-16 02:29:14.213789: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 16.00MiB | |
2022-06-16 02:29:14.213793: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 32.00MiB | |
2022-06-16 02:29:14.213798: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 64.00MiB | |
2022-06-16 02:29:14.213803: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 128.00MiB | |
2022-06-16 02:29:14.213808: I tensorflow/core/common_runtime/bfc_allocator.cc:73] Creating bin of max chunk size 256.00MiB | |
2022-06-16 02:29:14.213827: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9657 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5 | |
2022-06-16 02:29:14.213838: I tensorflow/stream_executor/stream.cc:261] [stream=0x21bd4450,impl=0x21bd3ac0] Called Stream::Stream(parent=0x55dace0) | |
2022-06-16 02:29:14.213844: I tensorflow/stream_executor/stream.cc:308] [stream=0x21bd4450,impl=0x21bd3ac0] Called Stream::Init() | |
2022-06-16 02:29:14.213867: I tensorflow/stream_executor/stream.cc:261] [stream=0x21b085d0,impl=0x21bd3ce0] Called Stream::Stream(parent=0x55dace0) | |
2022-06-16 02:29:14.213874: I tensorflow/stream_executor/stream.cc:308] [stream=0x21b085d0,impl=0x21bd3ce0] Called Stream::Init() | |
2022-06-16 02:29:14.213883: I tensorflow/stream_executor/stream.cc:261] [stream=0x21b088c0,impl=0x21d0d290] Called Stream::Stream(parent=0x55dace0) | |
2022-06-16 02:29:14.213889: I tensorflow/stream_executor/stream.cc:308] [stream=0x21b088c0,impl=0x21d0d290] Called Stream::Init() | |
2022-06-16 02:29:14.213898: I tensorflow/stream_executor/stream.cc:261] [stream=0x21b08bb0,impl=0x21d0d260] Called Stream::Stream(parent=0x55dace0) | |
2022-06-16 02:29:14.213904: I tensorflow/stream_executor/stream.cc:308] [stream=0x21b08bb0,impl=0x21d0d260] Called Stream::Init() | |
2022-06-16 02:29:14.214228: I tensorflow/compiler/jit/xla_gpu_device.cc:79] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-16 02:29:14.214287: I tensorflow/core/common_runtime/process_util.cc:159] Session inter op parallelism threads: 32 | |
2022-06-16 02:29:14.219599: I tensorflow/compiler/jit/xla_cpu_device.cc:58] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-16 02:29:14.220169: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 0 with properties: | |
pciBusID: 0000:18:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:14.220612: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1836] Found device 1 with properties: | |
pciBusID: 0000:86:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5 | |
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.76GiB deviceMemoryBandwidth: 573.69GiB/s | |
2022-06-16 02:29:14.222014: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1975] Adding visible gpu devices: 0, 1 | |
2022-06-16 02:29:14.222057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(1): -5,0 | |
2022-06-16 02:29:14.222078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1333] Cuda stream priority range on GPU(1): -5,0 | |
2022-06-16 02:29:14.222097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1384] TensorFlow compiled with CUDA 11.2 and cuDNN 8.1.0 | |
2022-06-16 02:29:14.222139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1396] Device interconnect StreamExecutor with strength 1 edge matrix: | |
2022-06-16 02:29:14.222157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] 0 1 | |
2022-06-16 02:29:14.222171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 0: N N | |
2022-06-16 02:29:14.222187: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1415] 1: N N | |
2022-06-16 02:29:14.222673: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 0 TfDeviceId 0 on bus 1 numa: 0 pci: 0000:18:00.0 DeviceLocality: bus_id: 1 | |
links { | |
} | |
2022-06-16 02:29:14.223063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1677] GPUDevice PlatformDeviceId 1 TfDeviceId 1 on bus 2 numa: 1 pci: 0000:86:00.0 DeviceLocality: bus_id: 2 | |
numa_node: 1 | |
links { | |
} | |
2022-06-16 02:29:14.223470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9657 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5 | |
2022-06-16 02:29:14.223876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1550] Created device /job:localhost/replica:0/task:0/device:GPU:1 with 9657 MB memory: -> device: 1, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5 | |
2022-06-16 02:29:14.223915: I tensorflow/compiler/jit/xla_gpu_device.cc:79] Not creating XLA devices, tf_xla_enable_xla_devices not set | |
2022-06-16 02:29:14.228686: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 0 | |
2022-06-16 02:29:14.228809: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-16 02:29:14.228824: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MlirV1CompatGraphOptimizationPass | |
2022-06-16 02:29:14.228836: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled | |
2022-06-16 02:29:14.228854: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-16 02:29:14.228873: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ControlFlowDepsToChainsPass | |
2022-06-16 02:29:14.228884: I tensorflow/core/common_runtime/control_flow_deps_to_chains.cc:37] ControlFlowDepsToChainsPass::Run | |
2022-06-16 02:29:14.228945: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229000: W tensorflow/core/util/dump_graph.cc:134] Failed to dump control_flow_deps_to_chains_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229024: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-16 02:29:14.229042: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: AccumulateNV2RemovePass | |
2022-06-16 02:29:14.229064: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: LowerFunctionalOpsPass | |
2022-06-16 02:29:14.229093: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ParallelConcatRemovePass | |
2022-06-16 02:29:14.229117: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 35 | |
2022-06-16 02:29:14.229126: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IsolatePlacerInspectionRequiredOpsPass | |
2022-06-16 02:29:14.229136: I tensorflow/core/common_runtime/isolate_placer_inspection_required_ops_pass.cc:34] IsolatePlacerInspectionRequiredOpsPass::Run | |
2022-06-16 02:29:14.229151: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IntroduceFloatingPointJitterPass | |
2022-06-16 02:29:14.229169: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 36 | |
2022-06-16 02:29:14.229184: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateXlaComputationsPass | |
2022-06-16 02:29:14.229216: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229238: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:353] EncapsulateXlaComputations(): (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-16 02:29:14.229328: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_halfway because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229350: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:364] EncapsulateXlaComputations() half-way: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-16 02:29:14.229387: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_xla_computations_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229407: I tensorflow/compiler/jit/encapsulate_xla_computations_pass.cc:370] EncapsulateXlaComputations() finished: (failed to create writable file: INVALID_ARGUMENT: TF_DUMP_GRAPH_PREFIX not specified) | |
2022-06-16 02:29:14.229429: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 37 | |
2022-06-16 02:29:14.229440: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: FunctionalizeControlFlowForXlaPass | |
2022-06-16 02:29:14.229519: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 99999 | |
2022-06-16 02:29:14.229537: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: WeakForwardTypeInferencePass | |
2022-06-16 02:29:14.229549: I tensorflow/core/common_runtime/forward_type_inference.cc:130] ForwardTypeInferencePass::Run | |
2022-06-16 02:29:14.229573: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229628: I tensorflow/core/common_runtime/forward_type_inference.cc:311] Finished after 1 iterations; done 9 of 9 nodes in 9 visits | |
2022-06-16 02:29:14.229654: W tensorflow/core/util/dump_graph.cc:134] Failed to dump forward_type_inference_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.229678: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 0 | |
2022-06-16 02:29:14.245353: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.245392: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 15673.1us | |
2022-06-16 02:29:14.245404: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::random_uniform/shape takes 2.157us | |
2022-06-16 02:29:14.245427: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 6.933us | |
2022-06-16 02:29:14.245439: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::random_uniform/RandomUniform takes 1.49us | |
2022-06-16 02:29:14.245451: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform_1/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.245458: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform_1/shape takes 10.165us | |
2022-06-16 02:29:14.245466: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:CPU::random_uniform_1/shape takes 0.407us | |
2022-06-16 02:29:14.245475: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 1.045us | |
2022-06-16 02:29:14.245482: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:CPU::random_uniform_1/RandomUniform takes 0.749us | |
2022-06-16 02:29:14.245491: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Placeholder}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.245498: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Placeholder:GPU::Placeholder takes 7.154us | |
2022-06-16 02:29:14.245505: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Placeholder:CPU::Placeholder takes 1.031us | |
2022-06-16 02:29:14.245522: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 8.5us | |
2022-06-16 02:29:14.245532: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 2.909us | |
2022-06-16 02:29:14.245543: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.099us | |
2022-06-16 02:29:14.245557: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 2.481us | |
2022-06-16 02:29:14.245600: I tensorflow/core/common_runtime/placer.cc:124] random_uniform/RandomUniform(RandomUniform) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245613: I tensorflow/core/common_runtime/placer.cc:124] random_uniform_1/RandomUniform(RandomUniform) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245621: I tensorflow/core/common_runtime/placer.cc:124] MatMul(BatchMatMulV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245629: I tensorflow/core/common_runtime/placer.cc:124] Add(AddV2) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245638: I tensorflow/core/common_runtime/placer.cc:124] random_uniform/shape(Const) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245646: I tensorflow/core/common_runtime/placer.cc:124] random_uniform_1/shape(Const) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245653: I tensorflow/core/common_runtime/placer.cc:124] Placeholder(Placeholder) placed on: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.245664: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 1 | |
2022-06-16 02:29:14.245672: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 0 | |
2022-06-16 02:29:14.245678: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: NcclReplacePass | |
2022-06-16 02:29:14.245699: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 1 | |
2022-06-16 02:29:14.245905: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 2.00MiB bytes for mklcpu. | |
2022-06-16 02:29:14.245918: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 2.00MiB | |
2022-06-16 02:29:14.245925: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x211a3840 to 0x213a3840 | |
2022-06-16 02:29:14.246040: I tensorflow/core/common_runtime/graph_execution_state.cc:854] BuildGraph | |
2022-06-16 02:29:14.264466: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2500000000 Hz | |
2022-06-16 02:29:14.264905: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1013] Starting optimization for grappler item: tf_graph | |
2022-06-16 02:29:14.264932: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1034] Deleted 0 unreachable functions from the graph (library size = 0) | |
2022-06-16 02:29:14.265262: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.265277: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.265401: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] model_pruner: Graph size after: 7 nodes (0), 6 edges (0), time = 0.131ms. | |
2022-06-16 02:29:14.266088: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] tfg_optimizer{tfg-consolidate-attrs,tfg-prepare-attrs-export}: Graph size after: 7 nodes (0), 6 edges (0), time = 0.638ms. | |
2022-06-16 02:29:14.266184: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] implementation_selector: Graph size after: 7 nodes (0), 6 edges (0), time = 0.057ms. | |
2022-06-16 02:29:14.266217: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.266227: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.266352: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] common_subgraph_elimination: Graph size after: 6 nodes (-1), 6 edges (0), time = 0.13ms. | |
2022-06-16 02:29:14.266373: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.266384: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.266448: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.266558: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.266575: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.266733: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] constant_folding: Graph size after: 6 nodes (0), 6 edges (0), time = 0.357ms. | |
2022-06-16 02:29:14.266765: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.266772: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.266812: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.266873: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.266888: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.266966: I tensorflow/core/grappler/optimizers/arithmetic_optimizer.cc:4372] Run 31 arithmetic optimizer stages: AddOpsRewrite, FoldConjugateIntoTranspose, FoldMultiplyIntoConv, FoldTransposeIntoMatMul, MinimizeBroadcasts, RemoveIdentityTranspose, RemoveInvolution, RemoveRedundantBitcast, RemoveRedundantCast, ReplacePackWithTileReshape, ReplaceMulWithBroadcastByTile, ReduceUpsamplingDims, RemoveRedundantReshapeOrBroadcastTo, RemoveNegation, ReplaceMulWithSquare, RemoveLogicalNot, ReorderCastLikeAndValuePreserving, SimplifyAggregation, , SqrtDivToRsqrtMul, RemoveIdempotent, ConvertPow, ConvertLog1p, LogSoftmaxStage, OptimizeMaxOrMinOfMonotonicStage, ConvertExpm1, UnaryOpsComposition, RemoveStackStridedSliceSameAxis, SimplifyEmbeddingLookupStage, RemoveCastIntoSegmentReductionStage, FuseSquaredDiffStage | |
2022-06-16 02:29:14.267051: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] arithmetic_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 0.28ms. | |
2022-06-16 02:29:14.267108: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.267165: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.267181: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.267236: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.267247: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.267343: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.267358: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 22.455us | |
2022-06-16 02:29:14.267379: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Placeholder}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.267390: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Placeholder:GPU::Placeholder takes 10.951us | |
2022-06-16 02:29:14.267406: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 2.338us | |
2022-06-16 02:29:14.267421: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 1.171us | |
2022-06-16 02:29:14.267439: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 2.827us | |
2022-06-16 02:29:14.267460: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.702us | |
2022-06-16 02:29:14.267497: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] layout: Graph size after: 6 nodes (0), 6 edges (0), time = 0.421ms. | |
2022-06-16 02:29:14.267526: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.267535: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.267751: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] remapper: Graph size after: 6 nodes (0), 6 edges (0), time = 0.229ms. | |
2022-06-16 02:29:14.267781: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.267790: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.267808: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.267814: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.267845: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] loop_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 0.073ms. | |
2022-06-16 02:29:14.267868: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.267876: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.267921: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:626] Removed 0 out of 0 control dependencies | |
2022-06-16 02:29:14.267940: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:494] Deleted 0 out of 6 nodes. | |
2022-06-16 02:29:14.267960: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=0 | |
2022-06-16 02:29:14.267972: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=1 | |
2022-06-16 02:29:14.267998: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:626] Removed 0 out of 0 control dependencies | |
2022-06-16 02:29:14.268011: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:494] Deleted 0 out of 6 nodes. | |
2022-06-16 02:29:14.268025: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=0 | |
2022-06-16 02:29:14.268034: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=1 | |
2022-06-16 02:29:14.268050: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] dependency_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 0.184ms. | |
2022-06-16 02:29:14.268233: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.268346: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.268372: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.268519: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1923] Op:Placeholder Minimum cost for Identity | |
2022-06-16 02:29:14.268540: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1634] Output Size: 65536 Total Output Size:65536 | |
2022-06-16 02:29:14.268553: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation Placeholder takes 1 ns. | |
2022-06-16 02:29:14.268610: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1942] Op:Const Minimum cost for Variable | |
2022-06-16 02:29:14.268623: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1634] Output Size: 8 Total Output Size:8 | |
2022-06-16 02:29:14.268630: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation Const takes 1 ns. | |
2022-06-16 02:29:14.268666: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:714] Missing accurate estimator for op: RandomUniform | |
2022-06-16 02:29:14.268691: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:778] Device: GPU gflops: 13447.7 gb_per_sec: 616 | |
2022-06-16 02:29:14.268704: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:853] Op:RandomUniform GOps:0 Compute Time (ns):0 | |
2022-06-16 02:29:14.268714: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:858] Op:RandomUniform Size (KB):524.296 Memory Time (ns):852 | |
2022-06-16 02:29:14.268723: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:878] Op:RandomUniform Size (KB):524.296 Intermediate Memory Time (ns):0 | |
2022-06-16 02:29:14.268731: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation RandomUniform takes 852 ns. | |
2022-06-16 02:29:14.268770: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:714] Missing accurate estimator for op: RandomUniform | |
2022-06-16 02:29:14.268786: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:778] Device: GPU gflops: 13447.7 gb_per_sec: 616 | |
2022-06-16 02:29:14.268795: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:853] Op:RandomUniform GOps:0 Compute Time (ns):0 | |
2022-06-16 02:29:14.268803: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:858] Op:RandomUniform Size (KB):524.296 Memory Time (ns):852 | |
2022-06-16 02:29:14.268812: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:878] Op:RandomUniform Size (KB):524.296 Intermediate Memory Time (ns):0 | |
2022-06-16 02:29:14.268820: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation RandomUniform takes 852 ns. | |
2022-06-16 02:29:14.268863: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1072] Key:transpose_a Value:false | |
2022-06-16 02:29:14.268875: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1072] Key:transpose_b Value:false | |
2022-06-16 02:29:14.268883: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1079] transpose_a:0 | |
2022-06-16 02:29:14.268890: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1080] transpose_b:0 | |
2022-06-16 02:29:14.268902: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1100] M, N, K: 1024,128,128 | |
2022-06-16 02:29:14.268911: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1112] Operations for Matmul: 3.35544e+07 | |
2022-06-16 02:29:14.268926: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:778] Device: GPU gflops: 13447.7 gb_per_sec: 616 | |
2022-06-16 02:29:14.268940: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:853] Op:BatchMatMulV2 GOps:0.0335544 Compute Time (ns):2496 | |
2022-06-16 02:29:14.268949: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:858] Op:BatchMatMulV2 Size (KB):1114.11 Memory Time (ns):1809 | |
2022-06-16 02:29:14.268957: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:878] Op:BatchMatMulV2 Size (KB):1114.11 Intermediate Memory Time (ns):0 | |
2022-06-16 02:29:14.268965: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation BatchMatMulV2 takes 4305 ns. | |
2022-06-16 02:29:14.269004: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1606] Input Count: 131072 Largest Input Count:131072 | |
2022-06-16 02:29:14.269015: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:1606] Input Count: 131072 Largest Input Count:131072 | |
2022-06-16 02:29:14.269029: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:778] Device: GPU gflops: 13447.7 gb_per_sec: 616 | |
2022-06-16 02:29:14.269038: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:853] Op:AddV2 GOps:0.000131072 Compute Time (ns):10 | |
2022-06-16 02:29:14.269047: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:858] Op:AddV2 Size (KB):1572.86 Memory Time (ns):2554 | |
2022-06-16 02:29:14.269056: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:878] Op:AddV2 Size (KB):1572.86 Intermediate Memory Time (ns):0 | |
2022-06-16 02:29:14.269063: I tensorflow/core/grappler/costs/op_level_cost_estimator.cc:677] Operation AddV2 takes 2564 ns. | |
2022-06-16 02:29:14.269091: I tensorflow/core/grappler/costs/analytical_cost_estimator.cc:221] 5 out of 6 nodes have inaccurate time estimation | |
2022-06-16 02:29:14.269178: I tensorflow/core/grappler/costs/analytical_cost_estimator.cc:239] | |
Aggregated per device / channel type tensor size histogram:
Device: /localhost/GPU
Count: 6, Average: 352.0KiB, Min: 8B, Max: 512.0KiB
------------------------------------------------------
[        8B,       16B)  1  16.667%  16.667% #######
[   64.0KiB,  128.0KiB)  1  16.667%  33.333% #######
[  512.0KiB,   1.00MiB)  4  66.667% 100.000% ###########################
2022-06-16 02:29:14.269232: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 0 allocated 524288 for tensor random_uniform_1/RandomUniform:0 | |
2022-06-16 02:29:14.269244: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 0 allocated 524288 for tensor random_uniform/RandomUniform:0 | |
2022-06-16 02:29:14.269251: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 0 allocated 8 for tensor random_uniform/shape:0 | |
2022-06-16 02:29:14.269259: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 0 allocated 65536 for tensor Placeholder:0 | |
2022-06-16 02:29:14.269268: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 1000 allocated 524288 for tensor MatMul:0 | |
2022-06-16 02:29:14.269276: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 1001 deallocated 8 for tensor random_uniform/shape:0 | |
2022-06-16 02:29:14.269283: I tensorflow/core/grappler/costs/graph_memory.cc:263] At time 6000 allocated 524288 for tensor Add:0 | |
2022-06-16 02:29:14.269291: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 6001 deallocated 524288 for tensor random_uniform/RandomUniform:0 | |
2022-06-16 02:29:14.269298: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 6001 deallocated 65536 for tensor Placeholder:0 | |
2022-06-16 02:29:14.269306: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 8001 deallocated 524288 for tensor Add:0 | |
2022-06-16 02:29:14.269313: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 8001 deallocated 524288 for tensor MatMul:0 | |
2022-06-16 02:29:14.269320: I tensorflow/core/grappler/costs/graph_memory.cc:269] At time 8001 deallocated 524288 for tensor random_uniform_1/RandomUniform:0 | |
2022-06-16 02:29:14.269415: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] memory_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 1.337ms. | |
2022-06-16 02:29:14.269440: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.269451: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.269510: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] model_pruner: Graph size after: 6 nodes (0), 6 edges (0), time = 0.066ms. | |
2022-06-16 02:29:14.269775: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] tfg_optimizer{tfg-consolidate-attrs,tfg-prepare-attrs-export}: Graph size after: 6 nodes (0), 6 edges (0), time = 0.233ms. | |
2022-06-16 02:29:14.269836: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] implementation_selector: Graph size after: 6 nodes (0), 6 edges (0), time = 0.027ms. | |
2022-06-16 02:29:14.269869: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.269880: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.269955: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] common_subgraph_elimination: Graph size after: 6 nodes (0), 6 edges (0), time = 0.081ms. | |
2022-06-16 02:29:14.269979: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.269991: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.270038: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.270126: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.270148: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.270275: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] constant_folding: Graph size after: 6 nodes (0), 6 edges (0), time = 0.292ms. | |
2022-06-16 02:29:14.270306: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.270317: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.270375: I tensorflow/core/grappler/costs/graph_properties.cc:2377] Propagating 2 new shapes through 0 loops and 0 resources | |
2022-06-16 02:29:14.270450: I tensorflow/core/grappler/costs/graph_properties.cc:2145] Checking any conflics in shapes and dimensions ... | |
2022-06-16 02:29:14.270470: I tensorflow/core/grappler/costs/graph_properties.cc:2180] **** No incompatible shape found from SymbolicShapeManager. | |
2022-06-16 02:29:14.270568: I tensorflow/core/grappler/optimizers/arithmetic_optimizer.cc:4372] Run 31 arithmetic optimizer stages: AddOpsRewrite, FoldConjugateIntoTranspose, FoldMultiplyIntoConv, FoldTransposeIntoMatMul, MinimizeBroadcasts, RemoveIdentityTranspose, RemoveInvolution, RemoveRedundantBitcast, RemoveRedundantCast, ReplacePackWithTileReshape, ReplaceMulWithBroadcastByTile, ReduceUpsamplingDims, RemoveRedundantReshapeOrBroadcastTo, RemoveNegation, ReplaceMulWithSquare, RemoveLogicalNot, ReorderCastLikeAndValuePreserving, SimplifyAggregation, , SqrtDivToRsqrtMul, RemoveIdempotent, ConvertPow, ConvertLog1p, LogSoftmaxStage, OptimizeMaxOrMinOfMonotonicStage, ConvertExpm1, UnaryOpsComposition, RemoveStackStridedSliceSameAxis, SimplifyEmbeddingLookupStage, RemoveCastIntoSegmentReductionStage, FuseSquaredDiffStage | |
2022-06-16 02:29:14.270636: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] arithmetic_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 0.325ms. | |
2022-06-16 02:29:14.270670: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.270681: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.270948: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] remapper: Graph size after: 6 nodes (0), 6 edges (0), time = 0.284ms. | |
2022-06-16 02:29:14.270982: I tensorflow/core/grappler/grappler_item.cc:109] Add fetch Add:0 | |
2022-06-16 02:29:14.270993: I tensorflow/core/grappler/grappler_item.cc:113] Add feed Placeholder | |
2022-06-16 02:29:14.271026: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:626] Removed 0 out of 0 control dependencies | |
2022-06-16 02:29:14.271043: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:494] Deleted 0 out of 6 nodes. | |
2022-06-16 02:29:14.271058: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=0 | |
2022-06-16 02:29:14.271069: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=1 | |
2022-06-16 02:29:14.271094: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:626] Removed 0 out of 0 control dependencies | |
2022-06-16 02:29:14.271107: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:494] Deleted 0 out of 6 nodes. | |
2022-06-16 02:29:14.271120: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=0 | |
2022-06-16 02:29:14.271129: I tensorflow/core/grappler/optimizers/dependency_optimizer.cc:648] DependencyOptimizer::GroupCrossDeviceControlEdges host_granularity=1 | |
2022-06-16 02:29:14.271144: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] dependency_optimizer: Graph size after: 6 nodes (0), 6 edges (0), time = 0.17ms. | |
2022-06-16 02:29:14.271252: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1046] Optimized main graph. | |
2022-06-16 02:29:14.271824: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] tfg_optimizer{tfg-consolidate-attrs,tfg-functional-to-region,tfg.func(tfg-cf-sink),tfg-region-to-functional{force-control-capture=true},tfg-prepare-attrs-export}: Graph size after: 6 nodes (0), 6 edges (0), time = 0.417ms. | |
2022-06-16 02:29:14.272067: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:933] tfg_optimizer{tfg-consolidate-attrs,tfg-functional-to-region,tfg.func(tfg-cf-sink),tfg-region-to-functional{force-control-capture=true},tfg-prepare-attrs-export}: Graph size after: 6 nodes (0), 6 edges (0), time = 0.193ms. | |
2022-06-16 02:29:14.272175: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:1267] Optimized 0 functions: | |
2022-06-16 02:29:14.272192: W tensorflow/core/util/dump_graph.cc:134] Failed to dump after_MetaOptimizer_140722650481008 because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.272497: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 2 | |
2022-06-16 02:29:14.272517: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 5 | |
2022-06-16 02:29:14.272525: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: CloneConstantsForBetterClusteringPass | |
2022-06-16 02:29:14.272538: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 9 | |
2022-06-16 02:29:14.272552: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ClusterScopingPass | |
2022-06-16 02:29:14.272561: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 10 | |
2022-06-16 02:29:14.272568: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MarkForCompilationPass | |
2022-06-16 02:29:14.280846: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:CPU::_XlaLaunch-op takes 2.005us | |
2022-06-16 02:29:14.280871: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp | |
2022-06-16 02:29:14.280883: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: XlaLaunch:GPU::_XlaLaunch-op takes 0.986us | |
2022-06-16 02:29:14.280889: I tensorflow/compiler/tf2xla/xla_op_registry.cc:51] LaunchOpHasKernelForDevice kernel_class_name: XlaLocalLaunchOp | |
2022-06-16 02:29:14.280914: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:XLA_GPU_JIT::random_uniform/shape takes 1.888us | |
2022-06-16 02:29:14.280951: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:XLA_GPU_JIT::random_uniform/RandomUniform takes 2.493us | |
2022-06-16 02:29:14.280966: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:XLA_GPU_JIT::random_uniform_1/RandomUniform takes 0.527us | |
2022-06-16 02:29:14.280979: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:XLA_GPU_JIT::MatMul takes 0.676us | |
2022-06-16 02:29:14.280992: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:XLA_GPU_JIT::Add takes 1.28us | |
2022-06-16 02:29:14.281079: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:650] DeadnessAnalysis time: 18 us (cumulative: 18 us, max: 18 us, #called: 1) | |
2022-06-16 02:29:14.281151: I tensorflow/compiler/jit/mark_for_compilation_pass.cc:1523] MarkForCompilationPassImpl::Run time: 579 us (cumulative: 579 us, max: 579 us, #called: 1) | |
2022-06-16 02:29:14.281168: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 12 | |
2022-06-16 02:29:14.281174: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ForceXlaConstantsOnHostPass | |
2022-06-16 02:29:14.281192: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 20 | |
2022-06-16 02:29:14.281199: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: IncreaseDynamismForAutoJitPass | |
2022-06-16 02:29:14.281207: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 30 | |
2022-06-16 02:29:14.281212: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: PartiallyDeclusterPass | |
2022-06-16 02:29:14.281244: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 40 | |
2022-06-16 02:29:14.281251: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: ReportClusteringInfoPass | |
2022-06-16 02:29:14.281375: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 50 | |
2022-06-16 02:29:14.281384: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: EncapsulateSubgraphsPass | |
2022-06-16 02:29:14.281390: I tensorflow/compiler/jit/encapsulate_subgraphs_pass.cc:1139] EncapsulateSubgraphsPass::Run | |
2022-06-16 02:29:14.281420: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_before because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.281541: W tensorflow/core/util/dump_graph.cc:134] Failed to dump encapsulate_subgraphs_after because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.281576: I tensorflow/compiler/jit/xla_cluster_util.cc:590] GetNodesRelatedToRefVariables() found 0 nodes | |
2022-06-16 02:29:14.281607: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 60 | |
2022-06-16 02:29:14.281614: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: BuildXlaOpsPass | |
2022-06-16 02:29:14.281628: I tensorflow/compiler/jit/build_xla_ops_pass.cc:603] print_outputs = 0 | |
2022-06-16 02:29:14.281633: I tensorflow/compiler/jit/build_xla_ops_pass.cc:604] check_input_numerics = 0 | |
2022-06-16 02:29:14.281637: I tensorflow/compiler/jit/build_xla_ops_pass.cc:605] check_output_numerics = 0 | |
2022-06-16 02:29:14.281656: W tensorflow/core/util/dump_graph.cc:134] Failed to dump build_xla_ops because dump location is not specified through either TF_DUMP_GRAPH_PREFIX environment variable or function argument. | |
2022-06-16 02:29:14.281673: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 2 | |
2022-06-16 02:29:14.281731: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.281742: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 17.308us | |
2022-06-16 02:29:14.281758: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 1.953us | |
2022-06-16 02:29:14.281768: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 0.808us | |
2022-06-16 02:29:14.281777: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.621us | |
2022-06-16 02:29:14.281788: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 2.219us | |
2022-06-16 02:29:14.281796: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::_arg_Placeholder_0_0 takes 0.58us | |
2022-06-16 02:29:14.281805: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::_retval_Add_0_0 takes 0.602us | |
2022-06-16 02:29:14.281886: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from _arg_Placeholder_0_0 (_Arg) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory for MatMul (BatchMatMulV2) on /job:localhost/replica:0/task:0/device:GPU:0 in device memory | |
2022-06-16 02:29:14.281925: I tensorflow/core/graph/graph_partition.cc:281] Receiving data from Add (AddV2) on /job:localhost/replica:0/task:0/device:GPU:0 in device memory for _retval_Add_0_0 (_Retval) on /job:localhost/replica:0/task:0/device:CPU:0 in device memory | |
2022-06-16 02:29:14.281948: I tensorflow/core/graph/graph_partition.cc:1251] Added send/recv: controls=0, data=2 | |
2022-06-16 02:29:14.282035: I tensorflow/core/common_runtime/optimization_registry.cc:54] Starting optimization of a group 3 | |
2022-06-16 02:29:14.282050: I tensorflow/core/common_runtime/optimization_registry.cc:66] Running optimization phase 1 | |
2022-06-16 02:29:14.282060: I tensorflow/core/common_runtime/optimization_registry.cc:68] Running optimization pass: MklLayoutRewritePass | |
2022-06-16 02:29:14.282070: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Const, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282076: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282080: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282085: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Recv, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282089: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282094: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282099: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Send, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282104: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Const, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282109: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282113: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282118: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Recv, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282122: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282127: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282131: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Send, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282137: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node Const, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282141: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282146: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node RandomUniform, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282150: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Recv, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282155: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node BatchMatMulV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282159: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node AddV2, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.282163: I tensorflow/core/common_runtime/mkl_layout_pass.cc:1040] MklLayoutRewritePass: Skipping rewriting of the node _Send, reason: User has assigned a device that is not CPU. | |
2022-06-16 02:29:14.286237: I tensorflow/core/common_runtime/optimization_registry.cc:87] Finished optimization of a group 3 | |
2022-06-16 02:29:14.286346: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:CPU::MatMul takes 5.777us | |
2022-06-16 02:29:14.286361: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:CPU::Add takes 2.502us | |
2022-06-16 02:29:14.286369: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-16 02:29:14.286438: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286449: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 11.318us | |
2022-06-16 02:29:14.286458: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286465: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 6.905us | |
2022-06-16 02:29:14.286476: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286482: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 10.405us | |
2022-06-16 02:29:14.286493: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 1.416us | |
2022-06-16 02:29:14.286503: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 0.86us | |
2022-06-16 02:29:14.286511: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _arg_Placeholder_0_0/_1}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286517: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:GPU::_arg_Placeholder_0_0/_1 takes 6.053us | |
2022-06-16 02:29:14.286525: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.442us | |
2022-06-16 02:29:14.286537: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 3.399us | |
2022-06-16 02:29:14.286545: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Add/_2}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286550: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:GPU::Add/_2 takes 5.581us | |
2022-06-16 02:29:14.286558: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-16 02:29:14.286564: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 4:0: 1 -> 1 | |
2022-06-16 02:29:14.286569: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:0: 0 -> 0 | |
2022-06-16 02:29:14.286574: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:1: 0 -> 0 | |
2022-06-16 02:29:14.286578: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-16 02:29:14.286583: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 7:1: 0 -> 0 | |
2022-06-16 02:29:14.286589: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-16 02:29:14.286596: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286601: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.44us | |
2022-06-16 02:29:14.286607: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SINK}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286613: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SINK takes 5.229us | |
2022-06-16 02:29:14.286621: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286627: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 7.897us | |
2022-06-16 02:29:14.286635: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 0.774us | |
2022-06-16 02:29:14.286643: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 0.826us | |
2022-06-16 02:29:14.286650: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _arg_Placeholder_0_0/_1}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286656: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:GPU::_arg_Placeholder_0_0/_1 takes 5.725us | |
2022-06-16 02:29:14.286664: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.789us | |
2022-06-16 02:29:14.286672: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.056us | |
2022-06-16 02:29:14.286679: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Add/_2}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286684: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:GPU::Add/_2 takes 5.257us | |
2022-06-16 02:29:14.286691: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 3:0: 1 -> 1 | |
2022-06-16 02:29:14.286696: I tensorflow/core/common_runtime/memory_types.cc:87] 2:0 -> 4:0: 1 -> 1 | |
2022-06-16 02:29:14.286701: I tensorflow/core/common_runtime/memory_types.cc:87] 3:0 -> 6:0: 0 -> 0 | |
2022-06-16 02:29:14.286706: I tensorflow/core/common_runtime/memory_types.cc:87] 5:0 -> 6:1: 0 -> 0 | |
2022-06-16 02:29:14.286711: I tensorflow/core/common_runtime/memory_types.cc:87] 6:0 -> 7:0: 0 -> 0 | |
2022-06-16 02:29:14.286716: I tensorflow/core/common_runtime/memory_types.cc:87] 4:0 -> 7:1: 0 -> 0 | |
2022-06-16 02:29:14.286720: I tensorflow/core/common_runtime/memory_types.cc:87] 7:0 -> 8:0: 0 -> 0 | |
2022-06-16 02:29:14.286825: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-16 02:29:14.286837: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286843: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 6.343us | |
2022-06-16 02:29:14.286849: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286854: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:GPU::_SOURCE takes 5.019us | |
2022-06-16 02:29:14.286870: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 46.275us | |
2022-06-16 02:29:14.286905: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() | |
2022-06-16 02:29:14.286919: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286925: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 8.587us | |
2022-06-16 02:29:14.286934: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node random_uniform/shape}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.286939: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: Const:GPU::random_uniform/shape takes 7.849us | |
2022-06-16 02:29:14.288588: I tensorflow/stream_executor/stream_executor_pimpl.cc:581] Called StreamExecutor::HostMemoryAllocate(size=2097152) returns 0x7ff8b1400000 | |
2022-06-16 02:29:14.288613: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 2.00MiB bytes for gpu_host_bfc. | |
2022-06-16 02:29:14.288619: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 2.00MiB | |
2022-06-16 02:29:14.288624: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x7ff8b1400000 to 0x7ff8b1600000 | |
2022-06-16 02:29:14.288675: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 1773.54us | |
2022-06-16 02:29:14.288700: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) | |
2022-06-16 02:29:14.288717: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 2.157us | |
2022-06-16 02:29:14.288724: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform/RandomUniform takes 0.717us | |
2022-06-16 02:29:14.288758: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) takes 59.679us | |
2022-06-16 02:29:14.288773: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) | |
2022-06-16 02:29:14.288782: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 0.863us | |
2022-06-16 02:29:14.288788: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: RandomUniform:GPU::random_uniform_1/RandomUniform takes 0.569us | |
2022-06-16 02:29:14.288797: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) takes 24.786us | |
2022-06-16 02:29:14.288814: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() | |
2022-06-16 02:29:14.288830: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _arg_Placeholder_0_0/_1}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.288835: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:GPU::_arg_Placeholder_0_0/_1 takes 7.041us | |
2022-06-16 02:29:14.288841: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _arg_Placeholder_0_0/_1}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.288846: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:GPU::_arg_Placeholder_0_0/_1 takes 5.254us | |
2022-06-16 02:29:14.288879: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() takes 65.945us | |
2022-06-16 02:29:14.288892: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) | |
2022-06-16 02:29:14.288901: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 1.541us | |
2022-06-16 02:29:14.288908: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: BatchMatMulV2:GPU::MatMul takes 0.633us | |
2022-06-16 02:29:14.288926: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) takes 34.362us | |
2022-06-16 02:29:14.288939: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) | |
2022-06-16 02:29:14.288948: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 1.775us | |
2022-06-16 02:29:14.288955: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: AddV2:GPU::Add takes 0.909us | |
2022-06-16 02:29:14.288969: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) takes 29.579us | |
2022-06-16 02:29:14.288982: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) | |
2022-06-16 02:29:14.288996: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Add/_2}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.289001: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:GPU::Add/_2 takes 5.9us | |
2022-06-16 02:29:14.289006: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node Add/_2}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.289012: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:GPU::Add/_2 takes 5.055us | |
2022-06-16 02:29:14.289028: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) takes 47.598us | |
2022-06-16 02:29:14.289067: I tensorflow/core/common_runtime/constant_folding.cc:613] No constant foldable nodes found | |
2022-06-16 02:29:14.289151: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() | |
2022-06-16 02:29:14.289161: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.289167: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 6.069us | |
2022-06-16 02:29:14.289172: I tensorflow/core/framework/op_kernel.cc:1360] No device-specific kernels found for NodeDef '{{node _SOURCE}}'Will fall back to a default kernel. | |
2022-06-16 02:29:14.289177: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: NoOp:CPU::_SOURCE takes 4.854us | |
2022-06-16 02:29:14.289186: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _SOURCE}} = NoOp[]() takes 34.209us | |
2022-06-16 02:29:14.289195: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-16 02:29:14.289203: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::_arg_Placeholder_0_0 takes 0.538us | |
2022-06-16 02:29:14.289209: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Arg:CPU::_arg_Placeholder_0_0 takes 0.302us | |
2022-06-16 02:29:14.289220: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 24.914us | |
2022-06-16 02:29:14.289233: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) | |
2022-06-16 02:29:14.289242: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_arg_Placeholder_0_0/_0 takes 0.426us | |
2022-06-16 02:29:14.289248: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Send:CPU::_arg_Placeholder_0_0/_0 takes 0.259us | |
2022-06-16 02:29:14.289264: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) takes 32.493us | |
2022-06-16 02:29:14.289276: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() | |
2022-06-16 02:29:14.289286: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:CPU::Add/_3 takes 0.541us | |
2022-06-16 02:29:14.289291: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Recv:CPU::Add/_3 takes 0.274us | |
2022-06-16 02:29:14.289307: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() takes 30.472us | |
2022-06-16 02:29:14.289319: I tensorflow/core/framework/op_kernel.cc:1616] Instantiating kernel for node: {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) | |
2022-06-16 02:29:14.289328: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::_retval_Add_0_0 takes 0.383us | |
2022-06-16 02:29:14.289333: I tensorflow/core/framework/op_kernel.cc:1370] Find Kernel Registration for node: _Retval:CPU::_retval_Add_0_0 takes 0.268us | |
2022-06-16 02:29:14.289343: I tensorflow/core/framework/op_kernel.cc:1665] Instantiating kernel for node: {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) takes 23.498us | |
# run 1 compute start | |
2022-06-16 02:29:14.289511: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 1 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:14.289612: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 1 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.289701: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 1 {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:14.289737: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 1 {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:14.289818: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 1 {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.289852: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 1 {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:14.289877: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 1 {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.289903: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:14.289945: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:14.289975: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:14.314493: I tensorflow/stream_executor/stream_executor_pimpl.cc:534] Called StreamExecutor::Allocate(size=10126688256, memory_space=0) returns 0x7ff5a4000000 | |
2022-06-16 02:29:14.314531: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 9.43GiB bytes for GPU_0_bfc. | |
2022-06-16 02:29:14.314542: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 9.43GiB | |
2022-06-16 02:29:14.314551: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x7ff5a4000000 to 0x7ff7ff990000 | |
2022-06-16 02:29:14.507828: I tensorflow/stream_executor/stream_executor_pimpl.cc:623] Called StreamExecutor::SynchronousMemZero(location=0x7ff87e7fb060, size=1028) | |
2022-06-16 02:29:14.508299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:753] GpuDevice::ComputeAsync _arg_Placeholder_0_0/_1 op _Recv on GPU0 stream[0] | |
2022-06-16 02:29:14.508321: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:14.508327: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:14.508353: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_11__arg_Placeholder_0_0 | |
2022-06-16 02:29:14.508367: I tensorflow/core/common_runtime/gpu/gpu_util.cc:315] CopyCPUTensorToGPU | |
2022-06-16 02:29:14.508384: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:14.508413: I tensorflow/stream_executor/stream.cc:3887] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenMemcpy(gpu_dst=0x7ff5a4000500, host_src=0x211a3840, size=65536) | |
2022-06-16 02:29:14.508503: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:14.508573: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 1 {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.508591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:14.508605: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:14.508680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:14.508702: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 1 {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.508711: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:14.508716: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform_1/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:14.508730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:14.508741: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step 1 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:14.508747: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:14.508754: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[1,128,128])# | |
2022-06-16 02:29:14.508893: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11 | |
2022-06-16 02:29:15.100994: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11 | |
2022-06-16 02:29:15.101097: I tensorflow/stream_executor/cuda/cuda_blas.cc:1821] doing cuBLAS SGEMM: at=0 bt=0 m=128 n=1024 k=128 alpha=0x7ff87e7fad00 a=0x7ff5a4000500 lda=128 b=0x7ff5a4010500 ldb=128 beta=0x7ff87e7fad10 c=0x7ff5a4110500 ldc=128 | |
2022-06-16 02:29:15.101832: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.101902: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step 1 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.101935: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.101950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[1,1024,128];float[1024,128])# | |
2022-06-16 02:29:15.102173: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.102204: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step 1 {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.102215: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.102223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add/_2:_Send#from=Add,to=_retval_Add_0_0# | |
2022-06-16 02:29:15.102233: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.102261: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_12_Add | |
2022-06-16 02:29:15.102272: I tensorflow/core/common_runtime/gpu/gpu_util.cc:270] CopyGPUTensorToCPU | |
2022-06-16 02:29:15.102285: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.102306: I tensorflow/stream_executor/stream.cc:3879] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenMemcpy(host_dst=0x7ff8b1400100, gpu_src=0x7ff5a4110500, size=524288) | |
2022-06-16 02:29:15.102339: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.102369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.102387: I tensorflow/stream_executor/stream.cc:4366] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::BlockHostUntilDone() | |
2022-06-16 02:29:15.102398: I tensorflow/stream_executor/temporary_memory_manager.cc:64] deallocated 0 finalized temporaries | |
2022-06-16 02:29:15.102596: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 1 {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
# run 1 compute end | |
# run 2 compute start | |
2022-06-16 02:29:15.104302: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 2 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.104366: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 2 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.104416: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 2 {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.104426: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 2 {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.104452: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 2 {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.104463: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.104483: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 2 {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.104528: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 2 {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.104592: I tensorflow/core/common_runtime/gpu/gpu_device.cc:753] GpuDevice::ComputeAsync _arg_Placeholder_0_0/_1 op _Recv on GPU0 stream[0] | |
2022-06-16 02:29:15.104628: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.104672: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.104709: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.104735: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.104771: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_11__arg_Placeholder_0_0 | |
2022-06-16 02:29:15.104803: I tensorflow/core/common_runtime/gpu/gpu_util.cc:315] CopyCPUTensorToGPU | |
2022-06-16 02:29:15.104831: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.104885: I tensorflow/stream_executor/stream.cc:3887] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenMemcpy(gpu_dst=0x7ff5a4000500, host_src=0x211a3840, size=65536) | |
2022-06-16 02:29:15.104984: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.105071: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 2 {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.105106: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.105139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.105226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.105274: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 2 {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.105295: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.105316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform_1/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.105360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.105398: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step 2 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.105419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.105451: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[1,128,128])# | |
2022-06-16 02:29:15.105508: I tensorflow/stream_executor/cuda/cuda_blas.cc:1821] doing cuBLAS SGEMM: at=0 bt=0 m=128 n=1024 k=128 alpha=0x7ff87effbd00 a=0x7ff5a4000500 lda=128 b=0x7ff5a4010500 ldb=128 beta=0x7ff87effbd10 c=0x7ff5a4110500 ldc=128 | |
2022-06-16 02:29:15.105661: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.105709: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step 2 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.105731: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.105756: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[1,1024,128];float[1024,128])# | |
2022-06-16 02:29:15.105813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.105878: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step 2 {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.105906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.105933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add/_2:_Send#from=Add,to=_retval_Add_0_0# | |
2022-06-16 02:29:15.105960: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.105999: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_12_Add | |
2022-06-16 02:29:15.106021: I tensorflow/core/common_runtime/gpu/gpu_util.cc:270] CopyGPUTensorToCPU | |
2022-06-16 02:29:15.106047: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.106092: I tensorflow/stream_executor/stream.cc:3879] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenMemcpy(host_dst=0x7ff8b1400100, gpu_src=0x7ff5a4110500, size=524288) | |
2022-06-16 02:29:15.106140: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.106178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.106218: I tensorflow/stream_executor/stream.cc:4366] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::BlockHostUntilDone() | |
2022-06-16 02:29:15.106242: I tensorflow/stream_executor/temporary_memory_manager.cc:64] deallocated 0 finalized temporaries | |
2022-06-16 02:29:15.106303: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 2 {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
# run 2 compute end | |
# run 3 compute start | |
2022-06-16 02:29:15.112003: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 3 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.112041: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 3 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.112069: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 3 {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.112093: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 3 {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.112112: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 3 {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.112132: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 3 {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.112140: I tensorflow/core/common_runtime/gpu/gpu_device.cc:753] GpuDevice::ComputeAsync _arg_Placeholder_0_0/_1 op _Recv on GPU0 stream[0] | |
2022-06-16 02:29:15.112164: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 3 {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.112204: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.112213: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.112243: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.112293: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.112315: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.112353: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_11__arg_Placeholder_0_0 | |
2022-06-16 02:29:15.112386: I tensorflow/core/common_runtime/gpu/gpu_util.cc:315] CopyCPUTensorToGPU | |
# run 3 H2D end | |
2022-06-16 02:29:15.112415: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.112448: I tensorflow/stream_executor/stream.cc:3887] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenMemcpy(gpu_dst=0x7ff5a4000500, host_src=0x21d46040, size=2097152) | |
2022-06-16 02:29:15.112950: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.113027: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 3 {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.113056: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.113084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.113151: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.113184: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 3 {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.113204: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.113230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform_1/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.113276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.113308: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step 3 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.113328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.113345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[32,128,128])# | |
2022-06-16 02:29:15.113489: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.113530: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step 3 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.113551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.113578: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[32,1024,128];float[1024,128])# | |
2022-06-16 02:29:15.113957: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.114015: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step 3 {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.114038: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.114059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add/_2:_Send#from=Add,to=_retval_Add_0_0# | |
2022-06-16 02:29:15.114081: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.129182: I tensorflow/stream_executor/stream_executor_pimpl.cc:581] Called StreamExecutor::HostMemoryAllocate(size=16777216) returns 0x7ff56e000000 | |
2022-06-16 02:29:15.129240: I tensorflow/core/common_runtime/bfc_allocator.cc:157] Extending allocation by 16.00MiB bytes for gpu_host_bfc. | |
2022-06-16 02:29:15.129252: I tensorflow/core/common_runtime/bfc_allocator.cc:162] Total allocated bytes: 18.00MiB | |
2022-06-16 02:29:15.129262: I tensorflow/core/common_runtime/bfc_allocator.cc:165] Allocated memory at 0x7ff56e000000 to 0x7ff56f000000 | |
2022-06-16 02:29:15.129780: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_12_Add | |
2022-06-16 02:29:15.129801: I tensorflow/core/common_runtime/gpu/gpu_util.cc:270] CopyGPUTensorToCPU | |
# run 3 D2H end | |
2022-06-16 02:29:15.129819: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.129848: I tensorflow/stream_executor/stream.cc:3879] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenMemcpy(host_dst=0x7ff56e000000, gpu_src=0x7ff5a4300500, size=16777216) | |
2022-06-16 02:29:15.129888: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.129925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.129956: I tensorflow/stream_executor/stream.cc:4366] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::BlockHostUntilDone() | |
2022-06-16 02:29:15.129976: I tensorflow/stream_executor/temporary_memory_manager.cc:64] deallocated 0 finalized temporaries | |
2022-06-16 02:29:15.131372: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 3 {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
# run 3 compute end | |
# run 4 compute start | |
2022-06-16 02:29:15.133385: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 4 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.133448: I tensorflow/core/common_runtime/executor.cc:783] Process node: 0 step 4 {{node _SOURCE}} = NoOp[]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.133501: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 4 {{node _arg_Placeholder_0_0}} = _Arg[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.133522: I tensorflow/core/common_runtime/executor.cc:783] Process node: 2 step 4 {{node random_uniform/shape}} = Const[_XlaHasReferenceVars=false, dtype=DT_INT32, value=Tensor<type: int32 shape: [2] values: 1024 128>, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.133554: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 4 {{node _arg_Placeholder_0_0/_0}} = _Send[T=DT_FLOAT, _dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_Placeholder_0_0) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.133571: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 4 {{node Add/_3}} = _Recv[_dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]() device: /job:localhost/replica:0/task:0/device:CPU:0 | |
2022-06-16 02:29:15.133612: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.133630: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 4 {{node _arg_Placeholder_0_0/_1}} = _Recv[_dst="MatMul", _src="_arg_Placeholder_0_0", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_11__arg_Placeholder_0_0", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]() device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.133661: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.133682: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.133698: I tensorflow/core/common_runtime/gpu/gpu_device.cc:753] GpuDevice::ComputeAsync _arg_Placeholder_0_0/_1 op _Recv on GPU0 stream[0] | |
2022-06-16 02:29:15.133740: I tensorflow/core/common_runtime/rendezvous_mgr.cc:174] IntraProcessRendezvous Recv 0x1dbf8b0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.133759: I tensorflow/core/common_runtime/rendezvous_mgr.cc:125] IntraProcessRendezvous Recv 0x1dbf8d0 /job:localhost/replica:0/task:0/device:CPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:GPU:0;edge_11__arg_Placeholder_0_0;0:0 | |
2022-06-16 02:29:15.133791: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_11__arg_Placeholder_0_0 | |
2022-06-16 02:29:15.133820: I tensorflow/core/common_runtime/gpu/gpu_util.cc:315] CopyCPUTensorToGPU | |
# run 4 H2D end | |
2022-06-16 02:29:15.133846: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.133891: I tensorflow/stream_executor/stream.cc:3887] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenMemcpy(gpu_dst=0x7ff5a4000500, host_src=0x21d46040, size=2097152) | |
2022-06-16 02:29:15.134408: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc8d30,impl=0x5d08860] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.134490: I tensorflow/core/common_runtime/executor.cc:783] Process node: 3 step 4 {{node random_uniform/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.134521: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.134545: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.134640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.134677: I tensorflow/core/common_runtime/executor.cc:783] Process node: 4 step 4 {{node random_uniform_1/RandomUniform}} = RandomUniform[T=DT_INT32, _XlaHasReferenceVars=false, dtype=DT_FLOAT, seed=0, seed2=0, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/shape) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.134697: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.134713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] random_uniform_1/RandomUniform:RandomUniform#shape=(int32[2])# | |
2022-06-16 02:29:15.134749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled random_uniform_1/RandomUniform op RandomUniform on GPU 0 stream[0] | |
2022-06-16 02:29:15.134778: I tensorflow/core/common_runtime/executor.cc:783] Process node: 6 step 4 {{node MatMul}} = BatchMatMulV2[T=DT_FLOAT, _XlaHasReferenceVars=false, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](random_uniform/RandomUniform, _arg_Placeholder_0_0/_1) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.134798: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.134816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] MatMul:BatchMatMulV2#shape=(float[1024,128];float[32,128,128])# | |
2022-06-16 02:29:15.134962: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled MatMul op BatchMatMulV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.135002: I tensorflow/core/common_runtime/executor.cc:783] Process node: 7 step 4 {{node Add}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MatMul, random_uniform_1/RandomUniform) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.135022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.135041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add:AddV2#shape=(float[32,1024,128];float[1024,128])# | |
2022-06-16 02:29:15.135097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add op AddV2 on GPU 0 stream[0] | |
2022-06-16 02:29:15.135142: I tensorflow/core/common_runtime/executor.cc:783] Process node: 8 step 4 {{node Add/_2}} = _Send[T=DT_FLOAT, _dst="_retval_Add_0_0", _src="Add", client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_12_Add", _device="/job:localhost/replica:0/task:0/device:GPU:0"](Add) device: /job:localhost/replica:0/task:0/device:GPU:0 | |
2022-06-16 02:29:15.135164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:664] GpuDevice::ComputeHelper Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.135180: I tensorflow/core/common_runtime/gpu/gpu_device.cc:666] Add/_2:_Send#from=Add,to=_retval_Add_0_0# | |
2022-06-16 02:29:15.135200: I tensorflow/core/common_runtime/rendezvous_mgr.cc:167] IntraProcessRendezvous Send 0x1dbf8b0 /job:localhost/replica:0/task:0/device:GPU:0;0000000000000001;/job:localhost/replica:0/task:0/device:CPU:0;edge_12_Add;0:0 | |
2022-06-16 02:29:15.135228: I tensorflow/core/common_runtime/copy_tensor.cc:211] Copy edge_12_Add | |
2022-06-16 02:29:15.135249: I tensorflow/core/common_runtime/gpu/gpu_util.cc:270] CopyGPUTensorToCPU | |
# run 4 D2H end | |
2022-06-16 02:29:15.135273: I tensorflow/stream_executor/stream.cc:1052] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenWaitFor(other=0x21d0ed00) | |
2022-06-16 02:29:15.135306: I tensorflow/stream_executor/stream.cc:3879] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenMemcpy(host_dst=0x7ff56e000000, gpu_src=0x7ff5a4300500, size=16777216) | |
2022-06-16 02:29:15.135348: I tensorflow/stream_executor/stream.cc:340] [stream=0x21bc9000,impl=0x21d0d1d0] Called Stream::ThenRecordEvent(event=0x7ff80c00a770) | |
2022-06-16 02:29:15.135386: I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] GpuDevice::ComputeHelper scheduled Add/_2 op _Send on GPU 0 stream[0] | |
2022-06-16 02:29:15.135416: I tensorflow/stream_executor/stream.cc:4366] [stream=0x21d0ed00,impl=0x21d0d8d0] Called Stream::BlockHostUntilDone() | |
2022-06-16 02:29:15.135434: I tensorflow/stream_executor/temporary_memory_manager.cc:64] deallocated 0 finalized temporaries | |
2022-06-16 02:29:15.136709: I tensorflow/core/common_runtime/executor.cc:783] Process node: 5 step 4 {{node _retval_Add_0_0}} = _Retval[T=DT_FLOAT, _XlaHasReferenceVars=false, index=0, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Add/_3) device: /job:localhost/replica:0/task:0/device:CPU:0 | |
# run 4 compute end | |
run 1 costs 878.5572052001953 | |
run 2 costs 3.3550262451171875 | |
run 3 costs 25.066852569580078 | |
run 4 costs 5.223751068115234 |