Created September 13, 2021 14:16
TRT 8 ONNX model output
[MemUsageChange] Init CUDA: CPU +331, GPU +0, now: CPU 353, GPU 397 (MiB) | |
---------------------------------------------------------------- | |
Input filename: /tmp/tmpz078yzbz.onnx | |
ONNX IR version: 0.0.6 | |
Opset version: 8 | |
Producer name: pytorch | |
Producer version: 1.8 | |
Domain: | |
Model version: 0 | |
Doc string: | |
---------------------------------------------------------------- | |
Registered plugin creator - ::GridAnchor_TRT version 1 | |
Registered plugin creator - ::GridAnchorRect_TRT version 1 | |
Registered plugin creator - ::NMS_TRT version 1 | |
Registered plugin creator - ::Reorg_TRT version 1 | |
Registered plugin creator - ::Region_TRT version 1 | |
Registered plugin creator - ::Clip_TRT version 1 | |
Registered plugin creator - ::LReLU_TRT version 1 | |
Registered plugin creator - ::PriorBox_TRT version 1 | |
Registered plugin creator - ::Normalize_TRT version 1 | |
Registered plugin creator - ::ScatterND version 1 | |
Registered plugin creator - ::RPROI_TRT version 1 | |
Registered plugin creator - ::BatchedNMS_TRT version 1 | |
Registered plugin creator - ::BatchedNMSDynamic_TRT version 1 | |
Registered plugin creator - ::FlattenConcat_TRT version 1 | |
Registered plugin creator - ::CropAndResize version 1 | |
Registered plugin creator - ::DetectionLayer_TRT version 1 | |
Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1 | |
Registered plugin creator - ::EfficientNMS_TRT version 1 | |
Registered plugin creator - ::Proposal version 1 | |
Registered plugin creator - ::ProposalLayer_TRT version 1 | |
Registered plugin creator - ::PyramidROIAlign_TRT version 1 | |
Registered plugin creator - ::ResizeNearest_TRT version 1 | |
Registered plugin creator - ::Split version 1 | |
Registered plugin creator - ::SpecialSlice_TRT version 1 | |
Registered plugin creator - ::InstanceNormalization_TRT version 1 | |
Adding network input: input with dtype: float32, dimensions: (1, 1, 180, 240) | |
Registering tensor: input for ONNX tensor: input | |
Importing initializer: _layers.deconv1.0.weight | |
Importing initializer: _layers.deconv1.0.bias | |
Importing initializer: _layers.deconv1.1.weight | |
Importing initializer: _layers.deconv1.1.bias | |
Importing initializer: _layers.deconv1.1.running_mean | |
Importing initializer: _layers.deconv1.1.running_var | |
Importing initializer: _layers.deconv2.0.weight | |
Importing initializer: _layers.deconv2.0.bias | |
Importing initializer: _layers.deconv2.2.weight | |
Importing initializer: _layers.deconv2.2.bias | |
Importing initializer: _layers.deconv2.2.running_mean | |
Importing initializer: _layers.deconv2.2.running_var | |
Importing initializer: _layers.deconv3.0.weight | |
Importing initializer: _layers.deconv3.0.bias | |
Importing initializer: _layers.deconv3.2.weight | |
Importing initializer: _layers.deconv3.2.bias | |
Importing initializer: _layers.deconv3.2.running_mean | |
Importing initializer: _layers.deconv3.2.running_var | |
Importing initializer: 199 | |
Importing initializer: 200 | |
Importing initializer: 202 | |
Importing initializer: 203 | |
Importing initializer: 205 | |
Importing initializer: 206 | |
Importing initializer: 208 | |
Importing initializer: 209 | |
Importing initializer: 211 | |
Importing initializer: 212 | |
Importing initializer: 214 | |
Importing initializer: 215 | |
Importing initializer: 217 | |
Importing initializer: 218 | |
Importing initializer: 220 | |
Importing initializer: 221 | |
Importing initializer: 223 | |
Importing initializer: 224 | |
Importing initializer: 226 | |
Importing initializer: 227 | |
Importing initializer: 229 | |
Importing initializer: 230 | |
Importing initializer: 232 | |
Importing initializer: 233 | |
Importing initializer: 235 | |
Importing initializer: 236 | |
Importing initializer: 238 | |
Importing initializer: 239 | |
Importing initializer: 241 | |
Importing initializer: 242 | |
Importing initializer: 244 | |
Importing initializer: 245 | |
Parsing node: Conv_0 [Conv] | |
Searching for input: input | |
Searching for input: 199 | |
Searching for input: 200 | |
Conv_0 [Conv] inputs: [input -> (1, 1, 180, 240)[FLOAT]], [199 -> (8, 1, 5, 5)[FLOAT]], [200 -> (8)[FLOAT]], | |
Convolution input dimensions: (1, 1, 180, 240) | |
Registering layer: Conv_0 for ONNX node: Conv_0 | |
Using kernel: (5, 5), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 8 | |
Convolution output dimensions: (1, 8, 180, 240) | |
Registering tensor: 198 for ONNX tensor: 198 | |
Conv_0 [Conv] outputs: [198 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Relu_1 [Relu] | |
Searching for input: 198 | |
Relu_1 [Relu] inputs: [198 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Relu_1 for ONNX node: Relu_1 | |
Registering tensor: 136 for ONNX tensor: 136 | |
Relu_1 [Relu] outputs: [136 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Conv_2 [Conv] | |
Searching for input: 136 | |
Searching for input: 202 | |
Searching for input: 203 | |
Conv_2 [Conv] inputs: [136 -> (1, 8, 180, 240)[FLOAT]], [202 -> (8, 8, 5, 5)[FLOAT]], [203 -> (8)[FLOAT]], | |
Convolution input dimensions: (1, 8, 180, 240) | |
Registering layer: Conv_2 for ONNX node: Conv_2 | |
Using kernel: (5, 5), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 8 | |
Convolution output dimensions: (1, 8, 180, 240) | |
Registering tensor: 201 for ONNX tensor: 201 | |
Conv_2 [Conv] outputs: [201 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Relu_3 [Relu] | |
Searching for input: 201 | |
Relu_3 [Relu] inputs: [201 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Relu_3 for ONNX node: Relu_3 | |
Registering tensor: 139 for ONNX tensor: 139 | |
Relu_3 [Relu] outputs: [139 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: MaxPool_4 [MaxPool] | |
Searching for input: 139 | |
MaxPool_4 [MaxPool] inputs: [139 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: MaxPool_4 for ONNX node: MaxPool_4 | |
Registering tensor: 140 for ONNX tensor: 140 | |
MaxPool_4 [MaxPool] outputs: [140 -> (1, 8, 90, 120)[FLOAT]], | |
Parsing node: Conv_5 [Conv] | |
Searching for input: 140 | |
Searching for input: 205 | |
Searching for input: 206 | |
Conv_5 [Conv] inputs: [140 -> (1, 8, 90, 120)[FLOAT]], [205 -> (16, 8, 3, 3)[FLOAT]], [206 -> (16)[FLOAT]], | |
Convolution input dimensions: (1, 8, 90, 120) | |
Registering layer: Conv_5 for ONNX node: Conv_5 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16 | |
Convolution output dimensions: (1, 16, 90, 120) | |
Registering tensor: 204 for ONNX tensor: 204 | |
Conv_5 [Conv] outputs: [204 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Relu_6 [Relu] | |
Searching for input: 204 | |
Relu_6 [Relu] inputs: [204 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Relu_6 for ONNX node: Relu_6 | |
Registering tensor: 143 for ONNX tensor: 143 | |
Relu_6 [Relu] outputs: [143 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Conv_7 [Conv] | |
Searching for input: 143 | |
Searching for input: 208 | |
Searching for input: 209 | |
Conv_7 [Conv] inputs: [143 -> (1, 16, 90, 120)[FLOAT]], [208 -> (16, 16, 3, 3)[FLOAT]], [209 -> (16)[FLOAT]], | |
Convolution input dimensions: (1, 16, 90, 120) | |
Registering layer: Conv_7 for ONNX node: Conv_7 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16 | |
Convolution output dimensions: (1, 16, 90, 120) | |
Registering tensor: 207 for ONNX tensor: 207 | |
Conv_7 [Conv] outputs: [207 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Relu_8 [Relu] | |
Searching for input: 207 | |
Relu_8 [Relu] inputs: [207 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Relu_8 for ONNX node: Relu_8 | |
Registering tensor: 146 for ONNX tensor: 146 | |
Relu_8 [Relu] outputs: [146 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: MaxPool_9 [MaxPool] | |
Searching for input: 146 | |
MaxPool_9 [MaxPool] inputs: [146 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: MaxPool_9 for ONNX node: MaxPool_9 | |
Registering tensor: 147 for ONNX tensor: 147 | |
MaxPool_9 [MaxPool] outputs: [147 -> (1, 16, 45, 60)[FLOAT]], | |
Parsing node: Conv_10 [Conv] | |
Searching for input: 147 | |
Searching for input: 211 | |
Searching for input: 212 | |
Conv_10 [Conv] inputs: [147 -> (1, 16, 45, 60)[FLOAT]], [211 -> (32, 16, 3, 3)[FLOAT]], [212 -> (32)[FLOAT]], | |
Convolution input dimensions: (1, 16, 45, 60) | |
Registering layer: Conv_10 for ONNX node: Conv_10 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32 | |
Convolution output dimensions: (1, 32, 45, 60) | |
Registering tensor: 210 for ONNX tensor: 210 | |
Conv_10 [Conv] outputs: [210 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Relu_11 [Relu] | |
Searching for input: 210 | |
Relu_11 [Relu] inputs: [210 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: Relu_11 for ONNX node: Relu_11 | |
Registering tensor: 150 for ONNX tensor: 150 | |
Relu_11 [Relu] outputs: [150 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Conv_12 [Conv] | |
Searching for input: 150 | |
Searching for input: 214 | |
Searching for input: 215 | |
Conv_12 [Conv] inputs: [150 -> (1, 32, 45, 60)[FLOAT]], [214 -> (32, 32, 3, 3)[FLOAT]], [215 -> (32)[FLOAT]], | |
Convolution input dimensions: (1, 32, 45, 60) | |
Registering layer: Conv_12 for ONNX node: Conv_12 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32 | |
Convolution output dimensions: (1, 32, 45, 60) | |
Registering tensor: 213 for ONNX tensor: 213 | |
Conv_12 [Conv] outputs: [213 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Relu_13 [Relu] | |
Searching for input: 213 | |
Relu_13 [Relu] inputs: [213 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: Relu_13 for ONNX node: Relu_13 | |
Registering tensor: 153 for ONNX tensor: 153 | |
Relu_13 [Relu] outputs: [153 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: MaxPool_14 [MaxPool] | |
Searching for input: 153 | |
MaxPool_14 [MaxPool] inputs: [153 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: MaxPool_14 for ONNX node: MaxPool_14 | |
Registering tensor: 154 for ONNX tensor: 154 | |
MaxPool_14 [MaxPool] outputs: [154 -> (1, 32, 15, 20)[FLOAT]], | |
Parsing node: Conv_15 [Conv] | |
Searching for input: 154 | |
Searching for input: 217 | |
Searching for input: 218 | |
Conv_15 [Conv] inputs: [154 -> (1, 32, 15, 20)[FLOAT]], [217 -> (64, 32, 3, 3)[FLOAT]], [218 -> (64)[FLOAT]], | |
Convolution input dimensions: (1, 32, 15, 20) | |
Registering layer: Conv_15 for ONNX node: Conv_15 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64 | |
Convolution output dimensions: (1, 64, 15, 20) | |
Registering tensor: 216 for ONNX tensor: 216 | |
Conv_15 [Conv] outputs: [216 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: Relu_16 [Relu] | |
Searching for input: 216 | |
Relu_16 [Relu] inputs: [216 -> (1, 64, 15, 20)[FLOAT]], | |
Registering layer: Relu_16 for ONNX node: Relu_16 | |
Registering tensor: 157 for ONNX tensor: 157 | |
Relu_16 [Relu] outputs: [157 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: Conv_17 [Conv] | |
Searching for input: 157 | |
Searching for input: 220 | |
Searching for input: 221 | |
Conv_17 [Conv] inputs: [157 -> (1, 64, 15, 20)[FLOAT]], [220 -> (64, 64, 3, 3)[FLOAT]], [221 -> (64)[FLOAT]], | |
Convolution input dimensions: (1, 64, 15, 20) | |
Registering layer: Conv_17 for ONNX node: Conv_17 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64 | |
Convolution output dimensions: (1, 64, 15, 20) | |
Registering tensor: 219 for ONNX tensor: 219 | |
Conv_17 [Conv] outputs: [219 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: Relu_18 [Relu] | |
Searching for input: 219 | |
Relu_18 [Relu] inputs: [219 -> (1, 64, 15, 20)[FLOAT]], | |
Registering layer: Relu_18 for ONNX node: Relu_18 | |
Registering tensor: 160 for ONNX tensor: 160 | |
Relu_18 [Relu] outputs: [160 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: Conv_19 [Conv] | |
Searching for input: 160 | |
Searching for input: 223 | |
Searching for input: 224 | |
Conv_19 [Conv] inputs: [160 -> (1, 64, 15, 20)[FLOAT]], [223 -> (64, 64, 3, 3)[FLOAT]], [224 -> (64)[FLOAT]], | |
Convolution input dimensions: (1, 64, 15, 20) | |
Registering layer: Conv_19 for ONNX node: Conv_19 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64 | |
Convolution output dimensions: (1, 64, 15, 20) | |
Registering tensor: 222 for ONNX tensor: 222 | |
Conv_19 [Conv] outputs: [222 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: Relu_20 [Relu] | |
Searching for input: 222 | |
Relu_20 [Relu] inputs: [222 -> (1, 64, 15, 20)[FLOAT]], | |
Registering layer: Relu_20 for ONNX node: Relu_20 | |
Registering tensor: 163 for ONNX tensor: 163 | |
Relu_20 [Relu] outputs: [163 -> (1, 64, 15, 20)[FLOAT]], | |
Parsing node: ConvTranspose_21 [ConvTranspose] | |
Searching for input: 163 | |
Searching for input: _layers.deconv1.0.weight | |
Searching for input: _layers.deconv1.0.bias | |
ConvTranspose_21 [ConvTranspose] inputs: [163 -> (1, 64, 15, 20)[FLOAT]], [_layers.deconv1.0.weight -> (64, 64, 3, 3)[FLOAT]], [_layers.deconv1.0.bias -> (64)[FLOAT]], | |
Running deconvolution with: | |
Padding mode: NOTSET | |
Pre-padding: (0, 0) | |
Post-padding: (0, 0) | |
Registering layer: ConvTranspose_21 for ONNX node: ConvTranspose_21 | |
Registering tensor: 164 for ONNX tensor: 164 | |
ConvTranspose_21 [ConvTranspose] outputs: [164 -> (1, 64, 45, 60)[FLOAT]], | |
Parsing node: BatchNormalization_22 [BatchNormalization] | |
Searching for input: 164 | |
Searching for input: _layers.deconv1.1.weight | |
Searching for input: _layers.deconv1.1.bias | |
Searching for input: _layers.deconv1.1.running_mean | |
Searching for input: _layers.deconv1.1.running_var | |
BatchNormalization_22 [BatchNormalization] inputs: [164 -> (1, 64, 45, 60)[FLOAT]], [_layers.deconv1.1.weight -> (64)[FLOAT]], [_layers.deconv1.1.bias -> (64)[FLOAT]], [_layers.deconv1.1.running_mean -> (64)[FLOAT]], [_layers.deconv1.1.running_var -> (64)[FLOAT]], | |
Registering layer: BatchNormalization_22 for ONNX node: BatchNormalization_22 | |
Registering tensor: 165 for ONNX tensor: 165 | |
BatchNormalization_22 [BatchNormalization] outputs: [165 -> (1, 64, 45, 60)[FLOAT]], | |
Parsing node: Relu_23 [Relu] | |
Searching for input: 165 | |
Relu_23 [Relu] inputs: [165 -> (1, 64, 45, 60)[FLOAT]], | |
Registering layer: Relu_23 for ONNX node: Relu_23 | |
Registering tensor: 166 for ONNX tensor: 166 | |
Relu_23 [Relu] outputs: [166 -> (1, 64, 45, 60)[FLOAT]], | |
Parsing node: Concat_24 [Concat] | |
Searching for input: 166 | |
Searching for input: 153 | |
Concat_24 [Concat] inputs: [166 -> (1, 64, 45, 60)[FLOAT]], [153 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: Concat_24 for ONNX node: Concat_24 | |
Registering tensor: 167 for ONNX tensor: 167 | |
Concat_24 [Concat] outputs: [167 -> (1, 96, 45, 60)[FLOAT]], | |
Parsing node: Conv_25 [Conv] | |
Searching for input: 167 | |
Searching for input: 226 | |
Searching for input: 227 | |
Conv_25 [Conv] inputs: [167 -> (1, 96, 45, 60)[FLOAT]], [226 -> (32, 96, 3, 3)[FLOAT]], [227 -> (32)[FLOAT]], | |
Convolution input dimensions: (1, 96, 45, 60) | |
Registering layer: Conv_25 for ONNX node: Conv_25 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32 | |
Convolution output dimensions: (1, 32, 45, 60) | |
Registering tensor: 225 for ONNX tensor: 225 | |
Conv_25 [Conv] outputs: [225 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Relu_26 [Relu] | |
Searching for input: 225 | |
Relu_26 [Relu] inputs: [225 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: Relu_26 for ONNX node: Relu_26 | |
Registering tensor: 170 for ONNX tensor: 170 | |
Relu_26 [Relu] outputs: [170 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Conv_27 [Conv] | |
Searching for input: 170 | |
Searching for input: 229 | |
Searching for input: 230 | |
Conv_27 [Conv] inputs: [170 -> (1, 32, 45, 60)[FLOAT]], [229 -> (32, 32, 3, 3)[FLOAT]], [230 -> (32)[FLOAT]], | |
Convolution input dimensions: (1, 32, 45, 60) | |
Registering layer: Conv_27 for ONNX node: Conv_27 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32 | |
Convolution output dimensions: (1, 32, 45, 60) | |
Registering tensor: 228 for ONNX tensor: 228 | |
Conv_27 [Conv] outputs: [228 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: Relu_28 [Relu] | |
Searching for input: 228 | |
Relu_28 [Relu] inputs: [228 -> (1, 32, 45, 60)[FLOAT]], | |
Registering layer: Relu_28 for ONNX node: Relu_28 | |
Registering tensor: 173 for ONNX tensor: 173 | |
Relu_28 [Relu] outputs: [173 -> (1, 32, 45, 60)[FLOAT]], | |
Parsing node: ConvTranspose_29 [ConvTranspose] | |
Searching for input: 173 | |
Searching for input: _layers.deconv2.0.weight | |
Searching for input: _layers.deconv2.0.bias | |
ConvTranspose_29 [ConvTranspose] inputs: [173 -> (1, 32, 45, 60)[FLOAT]], [_layers.deconv2.0.weight -> (32, 16, 3, 3)[FLOAT]], [_layers.deconv2.0.bias -> (16)[FLOAT]], | |
Running deconvolution with: | |
Padding mode: NOTSET | |
Pre-padding: (0, 0) | |
Post-padding: (0, 0) | |
Registering layer: ConvTranspose_29 for ONNX node: ConvTranspose_29 | |
Registering tensor: 174 for ONNX tensor: 174 | |
ConvTranspose_29 [ConvTranspose] outputs: [174 -> (1, 16, 91, 121)[FLOAT]], | |
Parsing node: Pad_30 [Pad] | |
Searching for input: 174 | |
Pad_30 [Pad] inputs: [174 -> (1, 16, 91, 121)[FLOAT]], | |
Registering layer: Pad_30 for ONNX node: Pad_30 | |
Registering tensor: 175 for ONNX tensor: 175 | |
Pad_30 [Pad] outputs: [175 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: BatchNormalization_31 [BatchNormalization] | |
Searching for input: 175 | |
Searching for input: _layers.deconv2.2.weight | |
Searching for input: _layers.deconv2.2.bias | |
Searching for input: _layers.deconv2.2.running_mean | |
Searching for input: _layers.deconv2.2.running_var | |
BatchNormalization_31 [BatchNormalization] inputs: [175 -> (1, 16, 90, 120)[FLOAT]], [_layers.deconv2.2.weight -> (16)[FLOAT]], [_layers.deconv2.2.bias -> (16)[FLOAT]], [_layers.deconv2.2.running_mean -> (16)[FLOAT]], [_layers.deconv2.2.running_var -> (16)[FLOAT]], | |
Registering layer: BatchNormalization_31 for ONNX node: BatchNormalization_31 | |
Registering tensor: 176 for ONNX tensor: 176 | |
BatchNormalization_31 [BatchNormalization] outputs: [176 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Relu_32 [Relu] | |
Searching for input: 176 | |
Relu_32 [Relu] inputs: [176 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Relu_32 for ONNX node: Relu_32 | |
Registering tensor: 177 for ONNX tensor: 177 | |
Relu_32 [Relu] outputs: [177 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Concat_33 [Concat] | |
Searching for input: 177 | |
Searching for input: 146 | |
Concat_33 [Concat] inputs: [177 -> (1, 16, 90, 120)[FLOAT]], [146 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Concat_33 for ONNX node: Concat_33 | |
Registering tensor: 178 for ONNX tensor: 178 | |
Concat_33 [Concat] outputs: [178 -> (1, 32, 90, 120)[FLOAT]], | |
Parsing node: Conv_34 [Conv] | |
Searching for input: 178 | |
Searching for input: 232 | |
Searching for input: 233 | |
Conv_34 [Conv] inputs: [178 -> (1, 32, 90, 120)[FLOAT]], [232 -> (16, 32, 3, 3)[FLOAT]], [233 -> (16)[FLOAT]], | |
Convolution input dimensions: (1, 32, 90, 120) | |
Registering layer: Conv_34 for ONNX node: Conv_34 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16 | |
Convolution output dimensions: (1, 16, 90, 120) | |
Registering tensor: 231 for ONNX tensor: 231 | |
Conv_34 [Conv] outputs: [231 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Relu_35 [Relu] | |
Searching for input: 231 | |
Relu_35 [Relu] inputs: [231 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Relu_35 for ONNX node: Relu_35 | |
Registering tensor: 181 for ONNX tensor: 181 | |
Relu_35 [Relu] outputs: [181 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Conv_36 [Conv] | |
Searching for input: 181 | |
Searching for input: 235 | |
Searching for input: 236 | |
Conv_36 [Conv] inputs: [181 -> (1, 16, 90, 120)[FLOAT]], [235 -> (16, 16, 3, 3)[FLOAT]], [236 -> (16)[FLOAT]], | |
Convolution input dimensions: (1, 16, 90, 120) | |
Registering layer: Conv_36 for ONNX node: Conv_36 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16 | |
Convolution output dimensions: (1, 16, 90, 120) | |
Registering tensor: 234 for ONNX tensor: 234 | |
Conv_36 [Conv] outputs: [234 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: Relu_37 [Relu] | |
Searching for input: 234 | |
Relu_37 [Relu] inputs: [234 -> (1, 16, 90, 120)[FLOAT]], | |
Registering layer: Relu_37 for ONNX node: Relu_37 | |
Registering tensor: 184 for ONNX tensor: 184 | |
Relu_37 [Relu] outputs: [184 -> (1, 16, 90, 120)[FLOAT]], | |
Parsing node: ConvTranspose_38 [ConvTranspose] | |
Searching for input: 184 | |
Searching for input: _layers.deconv3.0.weight | |
Searching for input: _layers.deconv3.0.bias | |
ConvTranspose_38 [ConvTranspose] inputs: [184 -> (1, 16, 90, 120)[FLOAT]], [_layers.deconv3.0.weight -> (16, 8, 3, 3)[FLOAT]], [_layers.deconv3.0.bias -> (8)[FLOAT]], | |
Running deconvolution with: | |
Padding mode: NOTSET | |
Pre-padding: (0, 0) | |
Post-padding: (0, 0) | |
Registering layer: ConvTranspose_38 for ONNX node: ConvTranspose_38 | |
Registering tensor: 185 for ONNX tensor: 185 | |
ConvTranspose_38 [ConvTranspose] outputs: [185 -> (1, 8, 181, 241)[FLOAT]], | |
Parsing node: Pad_39 [Pad] | |
Searching for input: 185 | |
Pad_39 [Pad] inputs: [185 -> (1, 8, 181, 241)[FLOAT]], | |
Registering layer: Pad_39 for ONNX node: Pad_39 | |
Registering tensor: 186 for ONNX tensor: 186 | |
Pad_39 [Pad] outputs: [186 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: BatchNormalization_40 [BatchNormalization] | |
Searching for input: 186 | |
Searching for input: _layers.deconv3.2.weight | |
Searching for input: _layers.deconv3.2.bias | |
Searching for input: _layers.deconv3.2.running_mean | |
Searching for input: _layers.deconv3.2.running_var | |
BatchNormalization_40 [BatchNormalization] inputs: [186 -> (1, 8, 180, 240)[FLOAT]], [_layers.deconv3.2.weight -> (8)[FLOAT]], [_layers.deconv3.2.bias -> (8)[FLOAT]], [_layers.deconv3.2.running_mean -> (8)[FLOAT]], [_layers.deconv3.2.running_var -> (8)[FLOAT]], | |
Registering layer: BatchNormalization_40 for ONNX node: BatchNormalization_40 | |
Registering tensor: 187 for ONNX tensor: 187 | |
BatchNormalization_40 [BatchNormalization] outputs: [187 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Relu_41 [Relu] | |
Searching for input: 187 | |
Relu_41 [Relu] inputs: [187 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Relu_41 for ONNX node: Relu_41 | |
Registering tensor: 188 for ONNX tensor: 188 | |
Relu_41 [Relu] outputs: [188 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Concat_42 [Concat] | |
Searching for input: 188 | |
Searching for input: 139 | |
Concat_42 [Concat] inputs: [188 -> (1, 8, 180, 240)[FLOAT]], [139 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Concat_42 for ONNX node: Concat_42 | |
Registering tensor: 189 for ONNX tensor: 189 | |
Concat_42 [Concat] outputs: [189 -> (1, 16, 180, 240)[FLOAT]], | |
Parsing node: Conv_43 [Conv] | |
Searching for input: 189 | |
Searching for input: 238 | |
Searching for input: 239 | |
Conv_43 [Conv] inputs: [189 -> (1, 16, 180, 240)[FLOAT]], [238 -> (8, 16, 3, 3)[FLOAT]], [239 -> (8)[FLOAT]], | |
Convolution input dimensions: (1, 16, 180, 240) | |
Registering layer: Conv_43 for ONNX node: Conv_43 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 8 | |
Convolution output dimensions: (1, 8, 180, 240) | |
Registering tensor: 237 for ONNX tensor: 237 | |
Conv_43 [Conv] outputs: [237 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Relu_44 [Relu] | |
Searching for input: 237 | |
Relu_44 [Relu] inputs: [237 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Relu_44 for ONNX node: Relu_44 | |
Registering tensor: 192 for ONNX tensor: 192 | |
Relu_44 [Relu] outputs: [192 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Conv_45 [Conv] | |
Searching for input: 192 | |
Searching for input: 241 | |
Searching for input: 242 | |
Conv_45 [Conv] inputs: [192 -> (1, 8, 180, 240)[FLOAT]], [241 -> (8, 8, 3, 3)[FLOAT]], [242 -> (8)[FLOAT]], | |
Convolution input dimensions: (1, 8, 180, 240) | |
Registering layer: Conv_45 for ONNX node: Conv_45 | |
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 8 | |
Convolution output dimensions: (1, 8, 180, 240) | |
Registering tensor: 240 for ONNX tensor: 240 | |
Conv_45 [Conv] outputs: [240 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Relu_46 [Relu] | |
Searching for input: 240 | |
Relu_46 [Relu] inputs: [240 -> (1, 8, 180, 240)[FLOAT]], | |
Registering layer: Relu_46 for ONNX node: Relu_46 | |
Registering tensor: 195 for ONNX tensor: 195 | |
Relu_46 [Relu] outputs: [195 -> (1, 8, 180, 240)[FLOAT]], | |
Parsing node: Conv_47 [Conv] | |
Searching for input: 195 | |
Searching for input: 244 | |
Searching for input: 245 | |
Conv_47 [Conv] inputs: [195 -> (1, 8, 180, 240)[FLOAT]], [244 -> (3, 8, 1, 1)[FLOAT]], [245 -> (3)[FLOAT]], | |
Convolution input dimensions: (1, 8, 180, 240) | |
Registering layer: Conv_47 for ONNX node: Conv_47 | |
Using kernel: (1, 1), strides: (1, 1), prepadding: (0, 0), postpadding: (0, 0), dilations: (1, 1), numOutputs: 3 | |
Convolution output dimensions: (1, 3, 180, 240) | |
Registering tensor: raw_conv_out_50 for ONNX tensor: raw_conv_out | |
Conv_47 [Conv] outputs: [raw_conv_out -> (1, 3, 180, 240)[FLOAT]], | |
Marking raw_conv_out_50 as output: raw_conv_out | |
Tensor DataType is determined at build time for tensors not marked as input or output. | |
[MemUsageSnapshot] Builder begin: CPU 352 MiB, GPU 397 MiB | |
Applying generic optimizations to the graph for inference. | |
Original: 49 layers | |
After dead-layer removal: 49 layers | |
After Myelin optimization: 49 layers | |
After scale fusion: 49 layers | |
ConvReluFusion: Fusing Conv_0 with Relu_1 | |
ConvReluFusion: Fusing Conv_2 with Relu_3 | |
ConvReluFusion: Fusing Conv_5 with Relu_6 | |
ConvReluFusion: Fusing Conv_7 with Relu_8 | |
ConvReluFusion: Fusing Conv_10 with Relu_11 | |
ConvReluFusion: Fusing Conv_12 with Relu_13 | |
ConvReluFusion: Fusing Conv_15 with Relu_16 | |
ConvReluFusion: Fusing Conv_17 with Relu_18 | |
ConvReluFusion: Fusing Conv_19 with Relu_20 | |
DeconvScaleFusion: Fusing ConvTranspose_21 with BatchNormalization_22 | |
DeconvReluFusion: Fusing ConvTranspose_21 + BatchNormalization_22 with Relu_23 | |
ConvReluFusion: Fusing Conv_25 with Relu_26 | |
ConvReluFusion: Fusing Conv_27 with Relu_28 | |
DeconvolutionPaddingFusion: Fusing ConvTranspose_29 with Pad_30 | |
DeconvScaleFusion: Fusing ConvTranspose_29 + Pad_30 with BatchNormalization_31 | |
DeconvReluFusion: Fusing ConvTranspose_29 + Pad_30 + BatchNormalization_31 with Relu_32 | |
ConvReluFusion: Fusing Conv_34 with Relu_35 | |
ConvReluFusion: Fusing Conv_36 with Relu_37 | |
DeconvolutionPaddingFusion: Fusing ConvTranspose_38 with Pad_39 | |
DeconvScaleFusion: Fusing ConvTranspose_38 + Pad_39 with BatchNormalization_40 | |
DeconvReluFusion: Fusing ConvTranspose_38 + Pad_39 + BatchNormalization_40 with Relu_41 | |
ConvReluFusion: Fusing Conv_43 with Relu_44 | |
ConvReluFusion: Fusing Conv_45 with Relu_46 | |
After vertical fusions: 26 layers | |
After dupe layer removal: 26 layers | |
After final dead-layer removal: 26 layers | |
After tensor merging: 26 layers | |
Eliminating concatenation Concat_24 | |
Generating copy for 166 to 167 because input does not support striding. | |
Retargeting 153 to 167 | |
Eliminating concatenation Concat_33 | |
Generating copy for 177 to 178 because input does not support striding. | |
Retargeting 146 to 178 | |
Eliminating concatenation Concat_42 | |
Generating copy for 188 to 189 because input does not support striding. | |
Retargeting 139 to 189 | |
After concat removal: 26 layers | |
Graph construction and optimization completed in 0.0164805 seconds. | |
Using cublasLt a tactic source | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +483, GPU +206, now: CPU 836, GPU 603 (MiB) | |
Using cuDNN as a tactic source | |
[MemUsageChange] Init cuDNN: CPU +469, GPU +204, now: CPU 1305, GPU 807 (MiB) | |
Detected invalid timing cache, setup a local cache instead | |
Constructing optimization profile number 0 [1/1]. | |
*************** Autotuning Reformat:Float(43200,43200,240,1) -> Float(43200,1,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.006048 | |
Tactic: 0 Time: 0.00416 | |
Fastest Tactic: 0 Time: 0.00416 | |
*************** Autotuning format combination: Float(43200,43200,240,1) -> Float(345600,43200,240,1) *************** | |
--------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution) | |
Tactic: 393215 Time: 0.02896 | |
Tactic: 1245183 Time: 0.030816 | |
Tactic: 1572863 Time: 0.028672 | |
Tactic: 4784127 Time: 0.045056 | |
Tactic: 4849663 Time: 0.032768 | |
Tactic: 5111807 Time: 0.032864 | |
Tactic: 6553599 Time: 0.029856 | |
Tactic: 6619135 Time: 0.030816 | |
Tactic: 9306111 Time: 0.034496 | |
Tactic: 9371647 Time: 0.034912 | |
Tactic: 9633791 Time: 0.0328 | |
Fastest Tactic: 1572863 Time: 0.028672 | |
--------------- Timing Runner: Conv_0 + Relu_1 (CudnnConvolution) | |
Tactic: 0 Time: 0.036256 | |
Tactic: 1 Time: 0.034976 | |
Tactic: 2 Time: 0.044896 | |
Tactic: 4 Time: 0.223616 | |
Tactic: 5 Time: 0.559072 | |
Tactic: 56 Time: 0.035072 | |
Tactic: 57 Time: 0.035072 | |
Tactic: 58 Time: 0.044032 | |
Tactic: 60 Time: 0.2208 | |
Tactic: 61 Time: 0.528384 | |
Fastest Tactic: 1 Time: 0.034976 | |
--------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution) | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.068544 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.041152 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.068832 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.093824 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.037888 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.038688 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.024416 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.022272 | |
Fastest Tactic: -3946921629105938337 Time: 0.022272 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337 | |
*************** Autotuning format combination: Float(43200,1,240,1) -> Float(345600,1,1920,8) *************** | |
--------------- Timing Runner: Conv_0 + Relu_1 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution) | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.269248 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.269536 | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.270432 | |
Fastest Tactic: 861694390046228376 Time: 0.269248 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 861694390046228376 | |
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.034912 | |
Tactic: 0 Time: 0.009504 | |
Fastest Tactic: 0 Time: 0.009504 | |
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.03584 | |
Tactic: 0 Time: 0.012288 | |
Fastest Tactic: 0 Time: 0.012288 | |
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: Conv_2 + Relu_3 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_2 + Relu_3 (FusedConvActConvolution) | |
Tactic: 393215 Time: 0.0792 | |
Tactic: 917503 Time: 0.075776 | |
Tactic: 1114111 Time: 0.130528 | |
Tactic: 1245183 Time: 0.128928 | |
Tactic: 1572863 Time: 0.073632 | |
Tactic: 2490367 Time: 0.07728 | |
Tactic: 2555903 Time: 0.075776 | |
Tactic: 2949119 Time: 0.149472 | |
Tactic: 3211263 Time: 0.267648 | |
Tactic: 3801087 Time: 0.077728 | |
Tactic: 3866623 Time: 0.0768 | |
Tactic: 4128767 Time: 0.132192 | |
Tactic: 4456447 Time: 0.074752 | |
Tactic: 4718591 Time: 0.13312 | |
Tactic: 4784127 Time: 0.253952 | |
Tactic: 4849663 Time: 0.134304 | |
Tactic: 5111807 Time: 0.13312 | |
Tactic: 5308415 Time: 0.139456 | |
Tactic: 5505023 Time: 0.262144 | |
Tactic: 6094847 Time: 0.078848 | |
Tactic: 6356991 Time: 0.085184 | |
Tactic: 6553599 Time: 0.079168 | |
Tactic: 6619135 Time: 0.099488 | |
Tactic: 6684671 Time: 0.268384 | |
Tactic: 7471103 Time: 0.074912 | |
Tactic: 7667711 Time: 0.133184 | |
Tactic: 7929855 Time: 0.13824 | |
Tactic: 8060927 Time: 0.082048 | |
Tactic: 8126463 Time: 0.142784 | |
Tactic: 8388607 Time: 0.14544 | |
Tactic: 8519679 Time: 0.094112 | |
Tactic: 8781823 Time: 0.16736 | |
Tactic: 8912895 Time: 0.151456 | |
Tactic: 9240575 Time: 0.136768 | |
Tactic: 9306111 Time: 0.132352 | |
Tactic: 9371647 Time: 0.135264 | |
Tactic: 9437183 Time: 0.1496 | |
Tactic: 9633791 Time: 0.134528 | |
Tactic: 9699327 Time: 0.076704 | |
Tactic: 9764863 Time: 0.075264 | |
Tactic: 10158079 Time: 0.077152 | |
Tactic: 10420223 Time: 0.153696 | |
Tactic: 10616831 Time: 0.077632 | |
Tactic: 10878975 Time: 0.074688 | |
Fastest Tactic: 1572863 Time: 0.073632 | |
--------------- Timing Runner: Conv_2 + Relu_3 (CudnnConvolution) | |
Tactic: 0 Time: 0.141216 | |
Tactic: 1 Time: 0.09792 | |
Tactic: 2 Time: 0.481376 | |
Tactic: 4 Time: 0.484928 | |
Tactic: 5 Time: 0.593632 | |
Tactic: 56 Time: 0.139584 | |
Tactic: 57 Time: 0.096672 | |
Tactic: 58 Time: 0.484736 | |
Tactic: 60 Time: 0.483424 | |
Tactic: 61 Time: 0.592768 | |
Fastest Tactic: 57 Time: 0.096672 | |
--------------- Timing Runner: Conv_2 + Relu_3 (CaskConvolution) | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.274272 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.145312 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.274272 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.305248 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.14656 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.147456 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.086112 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.082016 | |
Fastest Tactic: -3946921629105938337 Time: 0.082016 | |
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 1572863 | |
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(691200,1,3840,16) *************** | |
--------------- Timing Runner: Conv_2 + Relu_3 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_2 + Relu_3 (CaskConvolution) | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.268352 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.2664 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.280224 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.268384 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.323296 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.26624 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.278688 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.267424 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.270336 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.276384 | |
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.325728 | |
Fastest Tactic: -9147980667639709536 Time: 0.26624 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536 | |
*************** Autotuning Reformat:Float(691200,43200,240,1) -> Float(691200,1,3840,16) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.034976 | |
Tactic: 0 Time: 0.010016 | |
Fastest Tactic: 0 Time: 0.010016 | |
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.036288 | |
Tactic: 0 Time: 0.012352 | |
Fastest Tactic: 0 Time: 0.012352 | |
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.03616 | |
Tactic: 0 Time: 0.01216 | |
Fastest Tactic: 0 Time: 0.01216 | |
*************** Autotuning format combination: Float(691200,43200,240,1) -> Float(86400,10800,120,1) *************** | |
--------------- Timing Runner: MaxPool_4 (TiledPooling) | |
Tactic: 5505281 Time: 0.008288 | |
Tactic: 5570817 Time: 0.00624 | |
Tactic: 5636353 Time: 0.006208 | |
Tactic: 5701889 Time: 0.006176 | |
Tactic: 5767425 Time: 0.006048 | |
Tactic: 5832961 Time: 0.00624 | |
Tactic: 5898497 Time: 0.006144 | |
Tactic: 5964033 Time: 0.006048 | |
Tactic: 6029569 Time: 0.008288 | |
Tactic: 6095105 Time: 0.006208 | |
Tactic: 6160641 Time: 0.00624 | |
Tactic: 6226177 Time: 0.005984 | |
Tactic: 6291713 Time: 0.006016 | |
Tactic: 6357249 Time: 0.006176 | |
Tactic: 6422785 Time: 0.00624 | |
Tactic: 6488321 Time: 0.005984 | |
Fastest Tactic: 6226177 Time: 0.005984 | |
--------------- Timing Runner: MaxPool_4 (CudnnPooling) | |
Tactic: -1 Time: 0.006048 | |
Fastest Tactic: -1 Time: 0.006048 | |
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 6226177 | |
*************** Autotuning Reformat:Float(86400,10800,120,1) -> Float(86400,1,960,8) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.01424 | |
Tactic: 0 Time: 0.00576 | |
Fastest Tactic: 0 Time: 0.00576 | |
*************** Autotuning Reformat:Float(86400,10800,120,1) -> Float(86400,1,960,8) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.013152 | |
Tactic: 0 Time: 0.005824 | |
Fastest Tactic: 0 Time: 0.005824 | |
*************** Autotuning Reformat:Float(86400,1,960,8) -> Float(86400,10800,120,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.013984 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
*************** Autotuning format combination: Float(86400,10800,120,1) -> Float(172800,10800,120,1) *************** | |
--------------- Timing Runner: Conv_5 + Relu_6 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_5 + Relu_6 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.020352 | |
Tactic: 720895 Time: 0.020576 | |
Tactic: 983039 Time: 0.020544 | |
Tactic: 1048575 Time: 0.018272 | |
Tactic: 1703935 Time: 0.01648 | |
Tactic: 1769471 Time: 0.032928 | |
Tactic: 1966079 Time: 0.023936 | |
Tactic: 2031615 Time: 0.024672 | |
Tactic: 2228223 Time: 0.018304 | |
Tactic: 2621439 Time: 0.018432 | |
Tactic: 2752511 Time: 0.022528 | |
Tactic: 2818047 Time: 0.034176 | |
Tactic: 2883583 Time: 0.032864 | |
Tactic: 3014655 Time: 0.01792 | |
Tactic: 3145727 Time: 0.020576 | |
Tactic: 3473407 Time: 0.03488 | |
Tactic: 3604479 Time: 0.017568 | |
Tactic: 3735551 Time: 0.030816 | |
Tactic: 4390911 Time: 0.02464 | |
Tactic: 5046271 Time: 0.01648 | |
Tactic: 5963775 Time: 0.02672 | |
Tactic: 6160383 Time: 0.020544 | |
Tactic: 6488063 Time: 0.018336 | |
Tactic: 6881279 Time: 0.022592 | |
Tactic: 7274495 Time: 0.025824 | |
Tactic: 7864319 Time: 0.018496 | |
Tactic: 7995391 Time: 0.02048 | |
Tactic: 8585215 Time: 0.01984 | |
Tactic: 8847359 Time: 0.019424 | |
Tactic: 8978431 Time: 0.026688 | |
Tactic: 9043967 Time: 0.016448 | |
Tactic: 9175039 Time: 0.017408 | |
Tactic: 9502719 Time: 0.02464 | |
Tactic: 9830399 Time: 0.032672 | |
Tactic: 10027007 Time: 0.017664 | |
Tactic: 10092543 Time: 0.024736 | |
Tactic: 10289151 Time: 0.023936 | |
Tactic: 10485759 Time: 0.016192 | |
Tactic: 10682367 Time: 0.018336 | |
Tactic: 10813439 Time: 0.020384 | |
Fastest Tactic: 10485759 Time: 0.016192 | |
--------------- Timing Runner: Conv_5 + Relu_6 (CudnnConvolution) | |
Tactic: 0 Time: 0.024672 | |
Tactic: 1 Time: 0.02352 | |
Tactic: 2 Time: 0.061504 | |
Tactic: 4 Time: 0.254528 | |
Tactic: 5 Time: 0.15216 | |
Tactic: 6 Time: 0.02464 | |
Tactic: 56 Time: 0.025888 | |
Tactic: 57 Time: 0.02432 | |
Tactic: 58 Time: 0.062592 | |
Tactic: 60 Time: 0.251744 | |
Tactic: 61 Time: 0.153216 | |
Tactic: 62 Time: 0.025728 | |
Fastest Tactic: 1 Time: 0.02352 | |
--------------- Timing Runner: Conv_5 + Relu_6 (CaskConvolution) | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.034912 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.016128 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.022624 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.034848 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.039904 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.02256 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.022752 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.015904 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.0224 | |
Fastest Tactic: -4420849921117327522 Time: 0.015904 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -4420849921117327522 | |
*************** Autotuning format combination: Float(86400,1,960,8) -> Float(172800,1,1920,16) *************** | |
--------------- Timing Runner: Conv_5 + Relu_6 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_5 + Relu_6 (CaskConvolution) | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.032832 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.0328 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.034272 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.032832 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.039008 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.032768 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.03424 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.032832 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.032832 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.03408 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.038912 | |
Fastest Tactic: -9147980667639709536 Time: 0.032768 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536 | |
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.013984 | |
Tactic: 0 Time: 0.007296 | |
Fastest Tactic: 0 Time: 0.007296 | |
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.014336 | |
Tactic: 0 Time: 0.008256 | |
Fastest Tactic: 0 Time: 0.008256 | |
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: Conv_7 + Relu_8 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_7 + Relu_8 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.02816 | |
Tactic: 720895 Time: 0.032768 | |
Tactic: 983039 Time: 0.03232 | |
Tactic: 1048575 Time: 0.026112 | |
Tactic: 1703935 Time: 0.020576 | |
Tactic: 1769471 Time: 0.044992 | |
Tactic: 1966079 Time: 0.03664 | |
Tactic: 2031615 Time: 0.040128 | |
Tactic: 2228223 Time: 0.024128 | |
Tactic: 2424831 Time: 0.030784 | |
Tactic: 2621439 Time: 0.023968 | |
Tactic: 2752511 Time: 0.035968 | |
Tactic: 2818047 Time: 0.057408 | |
Tactic: 2883583 Time: 0.055296 | |
Tactic: 3014655 Time: 0.02256 | |
Tactic: 3145727 Time: 0.032672 | |
Tactic: 3473407 Time: 0.060416 | |
Tactic: 3604479 Time: 0.022528 | |
Tactic: 3735551 Time: 0.054304 | |
Tactic: 4390911 Time: 0.038816 | |
Tactic: 5046271 Time: 0.02256 | |
Tactic: 5963775 Time: 0.043104 | |
Tactic: 6160383 Time: 0.027648 | |
Tactic: 6488063 Time: 0.024864 | |
Tactic: 6881279 Time: 0.034912 | |
Tactic: 7274495 Time: 0.03712 | |
Tactic: 7864319 Time: 0.02352 | |
Tactic: 7995391 Time: 0.032768 | |
Tactic: 8585215 Time: 0.026176 | |
Tactic: 8847359 Time: 0.024576 | |
Tactic: 8978431 Time: 0.043072 | |
Tactic: 9043967 Time: 0.02144 | |
Tactic: 9175039 Time: 0.022528 | |
Tactic: 9502719 Time: 0.038816 | |
Tactic: 9830399 Time: 0.056416 | |
Tactic: 9961471 Time: 0.032768 | |
Tactic: 10027007 Time: 0.022624 | |
Tactic: 10092543 Time: 0.038816 | |
Tactic: 10289151 Time: 0.036 | |
Tactic: 10485759 Time: 0.020544 | |
Tactic: 10682367 Time: 0.022528 | |
Tactic: 10813439 Time: 0.031744 | |
Fastest Tactic: 10485759 Time: 0.020544 | |
--------------- Timing Runner: Conv_7 + Relu_8 (CudnnConvolution) | |
Tactic: 0 Time: 0.038816 | |
Tactic: 1 Time: 0.032576 | |
Tactic: 2 Time: 0.114656 | |
Tactic: 4 Time: 0.372416 | |
Tactic: 5 Time: 0.171648 | |
Tactic: 6 Time: 0.028768 | |
Tactic: 56 Time: 0.038592 | |
Tactic: 57 Time: 0.032512 | |
Tactic: 58 Time: 0.114752 | |
Tactic: 60 Time: 0.374464 | |
Tactic: 61 Time: 0.17264 | |
Tactic: 62 Time: 0.028704 | |
Fastest Tactic: 62 Time: 0.028704 | |
--------------- Timing Runner: Conv_7 + Relu_8 (CaskConvolution) | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.058528 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.020288 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.034656 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.057344 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.063104 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.034912 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.036256 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.024032 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.034912 | |
Fastest Tactic: 2775507031594384867 Time: 0.020288 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(345600,1,3840,32) *************** | |
--------------- Timing Runner: Conv_7 + Relu_8 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_7 + Relu_8 (CaskConvolution) | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.056416 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.056992 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.0336 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.055392 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.039008 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.055392 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.0344 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.055392 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.0568 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.034368 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.039072 | |
Fastest Tactic: 5258189349241541167 Time: 0.0336 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5258189349241541167 | |
*************** Autotuning Reformat:Float(345600,10800,120,1) -> Float(345600,1,3840,32) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.013952 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.014432 | |
Tactic: 0 Time: 0.008288 | |
Fastest Tactic: 0 Time: 0.008288 | |
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.014432 | |
Tactic: 0 Time: 0.009952 | |
Fastest Tactic: 0 Time: 0.009952 | |
*************** Autotuning format combination: Float(345600,10800,120,1) -> Float(43200,2700,60,1) *************** | |
--------------- Timing Runner: MaxPool_9 (TiledPooling) | |
Tactic: 5505281 Time: 0.00624 | |
Tactic: 5570817 Time: 0.005312 | |
Tactic: 5636353 Time: 0.005088 | |
Tactic: 5701889 Time: 0.00512 | |
Tactic: 5767425 Time: 0.00512 | |
Tactic: 5832961 Time: 0.005344 | |
Tactic: 5898497 Time: 0.005472 | |
Tactic: 5964033 Time: 0.00512 | |
Tactic: 6029569 Time: 0.00624 | |
Tactic: 6095105 Time: 0.00512 | |
Tactic: 6160641 Time: 0.005792 | |
Tactic: 6226177 Time: 0.005344 | |
Tactic: 6291713 Time: 0.005088 | |
Tactic: 6357249 Time: 0.005344 | |
Tactic: 6422785 Time: 0.005504 | |
Tactic: 6488321 Time: 0.005312 | |
Fastest Tactic: 5636353 Time: 0.005088 | |
--------------- Timing Runner: MaxPool_9 (CudnnPooling) | |
Tactic: -1 Time: 0.005504 | |
Fastest Tactic: -1 Time: 0.005504 | |
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5636353 | |
*************** Autotuning Reformat:Float(43200,2700,60,1) -> Float(43200,1,960,16) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008288 | |
Tactic: 0 Time: 0.00416 | |
Fastest Tactic: 0 Time: 0.00416 | |
*************** Autotuning Reformat:Float(43200,2700,60,1) -> Float(43200,1,960,16) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008256 | |
Tactic: 0 Time: 0.004224 | |
Fastest Tactic: 0 Time: 0.004224 | |
*************** Autotuning Reformat:Float(43200,1,960,16) -> Float(43200,2700,60,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008224 | |
Tactic: 0 Time: 0.00512 | |
Fastest Tactic: 0 Time: 0.00512 | |
*************** Autotuning format combination: Float(43200,2700,60,1) -> Float(86400,2700,60,1) *************** | |
--------------- Timing Runner: Conv_10 + Relu_11 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_10 + Relu_11 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.018496 | |
Tactic: 720895 Time: 0.020224 | |
Tactic: 983039 Time: 0.015232 | |
Tactic: 1048575 Time: 0.01824 | |
Tactic: 1703935 Time: 0.013408 | |
Tactic: 1769471 Time: 0.017376 | |
Tactic: 1966079 Time: 0.02464 | |
Tactic: 2031615 Time: 0.022528 | |
Tactic: 2228223 Time: 0.018528 | |
Tactic: 2424831 Time: 0.01808 | |
Tactic: 2621439 Time: 0.0136 | |
Tactic: 2752511 Time: 0.01664 | |
Tactic: 2818047 Time: 0.021856 | |
Tactic: 2883583 Time: 0.027744 | |
Tactic: 3014655 Time: 0.014496 | |
Tactic: 3145727 Time: 0.014432 | |
Tactic: 3473407 Time: 0.023552 | |
Tactic: 3604479 Time: 0.015616 | |
Tactic: 3735551 Time: 0.022624 | |
Tactic: 4390911 Time: 0.0256 | |
Tactic: 5046271 Time: 0.016 | |
Tactic: 5963775 Time: 0.023808 | |
Tactic: 6160383 Time: 0.01808 | |
Tactic: 6488063 Time: 0.02048 | |
Tactic: 6881279 Time: 0.02048 | |
Tactic: 7274495 Time: 0.0144 | |
Tactic: 7864319 Time: 0.01408 | |
Tactic: 7995391 Time: 0.020608 | |
Tactic: 8585215 Time: 0.019776 | |
Tactic: 8847359 Time: 0.014368 | |
Tactic: 8978431 Time: 0.02448 | |
Tactic: 9043967 Time: 0.014432 | |
Tactic: 9175039 Time: 0.016288 | |
Tactic: 9502719 Time: 0.026528 | |
Tactic: 9830399 Time: 0.024192 | |
Tactic: 9961471 Time: 0.01808 | |
Tactic: 10027007 Time: 0.018336 | |
Tactic: 10092543 Time: 0.026272 | |
Tactic: 10289151 Time: 0.024608 | |
Tactic: 10485759 Time: 0.013888 | |
Tactic: 10682367 Time: 0.013408 | |
Tactic: 10813439 Time: 0.01392 | |
Fastest Tactic: 1703935 Time: 0.013408 | |
--------------- Timing Runner: Conv_10 + Relu_11 (CudnnConvolution) | |
Tactic: 0 Time: 0.022528 | |
Tactic: 1 Time: 0.022528 | |
Tactic: 2 Time: 0.051296 | |
Tactic: 4 Time: 0.137216 | |
Tactic: 5 Time: 0.070432 | |
Tactic: 6 Time: 0.017408 | |
Tactic: 56 Time: 0.022528 | |
Tactic: 57 Time: 0.022624 | |
Tactic: 58 Time: 0.0512 | |
Tactic: 60 Time: 0.136672 | |
Tactic: 61 Time: 0.072928 | |
Tactic: 62 Time: 0.018336 | |
Fastest Tactic: 6 Time: 0.017408 | |
--------------- Timing Runner: Conv_10 + Relu_11 (CaskConvolution) | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.032832 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.010304 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.024416 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.03264 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.032832 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.02256 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.023456 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.02048 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.024672 | |
Fastest Tactic: 2775507031594384867 Time: 0.010304 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(43200,1,960,16) -> Float(86400,1,1920,32) *************** | |
--------------- Timing Runner: Conv_10 + Relu_11 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_10 + Relu_11 (CaskConvolution) | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.03072 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.030816 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.02032 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.030816 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.020576 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.032128 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.02032 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.030816 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.031872 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.020384 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.02048 | |
Fastest Tactic: 5258189349241541167 Time: 0.02032 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5258189349241541167 | |
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008288 | |
Tactic: 0 Time: 0.005984 | |
Fastest Tactic: 0 Time: 0.005984 | |
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.009056 | |
Tactic: 0 Time: 0.00768 | |
Fastest Tactic: 0 Time: 0.00768 | |
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: Conv_12 + Relu_13 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_12 + Relu_13 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.02672 | |
Tactic: 720895 Time: 0.032352 | |
Tactic: 983039 Time: 0.024064 | |
Tactic: 1048575 Time: 0.026272 | |
Tactic: 1703935 Time: 0.017664 | |
Tactic: 1769471 Time: 0.02432 | |
Tactic: 1966079 Time: 0.04 | |
Tactic: 2031615 Time: 0.03472 | |
Tactic: 2228223 Time: 0.024128 | |
Tactic: 2424831 Time: 0.022592 | |
Tactic: 2621439 Time: 0.018336 | |
Tactic: 2752511 Time: 0.036864 | |
Tactic: 2818047 Time: 0.034816 | |
Tactic: 2883583 Time: 0.045056 | |
Tactic: 3014655 Time: 0.020576 | |
Tactic: 3145727 Time: 0.021984 | |
Tactic: 3473407 Time: 0.03904 | |
Tactic: 3604479 Time: 0.021888 | |
Tactic: 3735551 Time: 0.039008 | |
Tactic: 4390911 Time: 0.041024 | |
Tactic: 5046271 Time: 0.022464 | |
Tactic: 5963775 Time: 0.03696 | |
Tactic: 6160383 Time: 0.024672 | |
Tactic: 6488063 Time: 0.032096 | |
Tactic: 6881279 Time: 0.03168 | |
Tactic: 7274495 Time: 0.020544 | |
Tactic: 7864319 Time: 0.018496 | |
Tactic: 7995391 Time: 0.032832 | |
Tactic: 8585215 Time: 0.028704 | |
Tactic: 8847359 Time: 0.019648 | |
Tactic: 8978431 Time: 0.03696 | |
Tactic: 9043967 Time: 0.019808 | |
Tactic: 9175039 Time: 0.02064 | |
Tactic: 9502719 Time: 0.042816 | |
Tactic: 9830399 Time: 0.040672 | |
Tactic: 9961471 Time: 0.023936 | |
Tactic: 10027007 Time: 0.025824 | |
Tactic: 10092543 Time: 0.042208 | |
Tactic: 10289151 Time: 0.040032 | |
Tactic: 10485759 Time: 0.01744 | |
Tactic: 10682367 Time: 0.017696 | |
Tactic: 10813439 Time: 0.020384 | |
Fastest Tactic: 10485759 Time: 0.01744 | |
--------------- Timing Runner: Conv_12 + Relu_13 (CudnnConvolution) | |
Tactic: 0 Time: 0.034016 | |
Tactic: 1 Time: 0.033952 | |
Tactic: 2 Time: 0.085088 | |
Tactic: 4 Time: 0.212992 | |
Tactic: 5 Time: 0.114688 | |
Tactic: 6 Time: 0.021568 | |
Tactic: 56 Time: 0.0328 | |
Tactic: 57 Time: 0.032864 | |
Tactic: 58 Time: 0.083968 | |
Tactic: 60 Time: 0.212992 | |
Tactic: 61 Time: 0.114688 | |
Tactic: 62 Time: 0.020672 | |
Fastest Tactic: 62 Time: 0.020672 | |
--------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution) | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.056704 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.013824 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.037888 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.05504 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.05632 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.034912 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.038976 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.034272 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.041056 | |
Fastest Tactic: 2775507031594384867 Time: 0.013824 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(259200,1,5760,96) *************** | |
--------------- Timing Runner: Conv_12 + Relu_13 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution) | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.05504 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.054528 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.032064 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.053248 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.020576 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.05456 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.032608 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.053344 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.055072 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.03216 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.020544 | |
Fastest Tactic: -423878181466897819 Time: 0.020544 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819 | |
*************** Autotuning Reformat:Float(259200,2700,60,1) -> Float(259200,1,5760,96) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.009408 | |
Tactic: 0 Time: 0.005952 | |
Fastest Tactic: 0 Time: 0.005952 | |
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.009408 | |
Tactic: 0 Time: 0.007392 | |
Fastest Tactic: 0 Time: 0.007392 | |
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.00944 | |
Tactic: 0 Time: 0.007296 | |
Fastest Tactic: 0 Time: 0.007296 | |
*************** Autotuning format combination: Float(259200,2700,60,1) -> Float(9600,300,20,1) *************** | |
--------------- Timing Runner: MaxPool_14 (TiledPooling) | |
Tactic: 7536897 Time: 0.016448 | |
Tactic: 7536898 Time: 0.01024 | |
Tactic: 7536900 Time: 0.00624 | |
Tactic: 7536903 Time: 0.006112 | |
Tactic: 7536904 Time: 0.006144 | |
Tactic: 7536911 Time: 0.00624 | |
Tactic: 7536912 Time: 0.006144 | |
Tactic: 7536916 Time: 0.005632 | |
Tactic: 7537153 Time: 0.010304 | |
Tactic: 7537154 Time: 0.007488 | |
Tactic: 7537156 Time: 0.007296 | |
Tactic: 7537159 Time: 0.005856 | |
Tactic: 7537160 Time: 0.00624 | |
Tactic: 7537167 Time: 0.006432 | |
Tactic: 7537168 Time: 0.006176 | |
Tactic: 7537172 Time: 0.00624 | |
Tactic: 7537409 Time: 0.008288 | |
Tactic: 7537410 Time: 0.00624 | |
Tactic: 7537412 Time: 0.006048 | |
Tactic: 7537415 Time: 0.00496 | |
Tactic: 7537416 Time: 0.00624 | |
Tactic: 7537423 Time: 0.007392 | |
Tactic: 7537424 Time: 0.007264 | |
Tactic: 7537428 Time: 0.007232 | |
Tactic: 7537665 Time: 0.007712 | |
Tactic: 7537666 Time: 0.00624 | |
Tactic: 7537668 Time: 0.006048 | |
Tactic: 7537671 Time: 0.005824 | |
Tactic: 7537672 Time: 0.00624 | |
Tactic: 7537679 Time: 0.008224 | |
Tactic: 7537680 Time: 0.008224 | |
Tactic: 7537684 Time: 0.006144 | |
Tactic: 7537921 Time: 0.00624 | |
Tactic: 7537922 Time: 0.004928 | |
Tactic: 7537924 Time: 0.005824 | |
Tactic: 7537927 Time: 0.005024 | |
Tactic: 7537928 Time: 0.00624 | |
Tactic: 7537935 Time: 0.007744 | |
Tactic: 7537936 Time: 0.007712 | |
Tactic: 7537940 Time: 0.005888 | |
Tactic: 7538177 Time: 0.006208 | |
Tactic: 7538178 Time: 0.006208 | |
Tactic: 7538180 Time: 0.006144 | |
Tactic: 7538183 Time: 0.00624 | |
Tactic: 7538184 Time: 0.00624 | |
Tactic: 7538191 Time: 0.008 | |
Tactic: 7538192 Time: 0.007168 | |
Tactic: 7538433 Time: 0.006208 | |
Tactic: 7538434 Time: 0.006144 | |
Tactic: 7538436 Time: 0.00624 | |
Tactic: 7538439 Time: 0.00624 | |
Tactic: 7538440 Time: 0.00624 | |
Tactic: 7538447 Time: 0.00736 | |
Tactic: 7538448 Time: 0.00816 | |
Tactic: 7538689 Time: 0.00624 | |
Tactic: 7538690 Time: 0.006048 | |
Tactic: 7538692 Time: 0.00624 | |
Tactic: 7538695 Time: 0.006048 | |
Tactic: 7538696 Time: 0.00624 | |
Tactic: 7538945 Time: 0.00624 | |
Tactic: 7538946 Time: 0.006144 | |
Tactic: 7538948 Time: 0.00624 | |
Tactic: 7538951 Time: 0.007392 | |
Tactic: 7538952 Time: 0.00752 | |
Tactic: 7539201 Time: 0.00624 | |
Tactic: 7539202 Time: 0.00624 | |
Tactic: 7539204 Time: 0.006208 | |
Tactic: 7539207 Time: 0.00752 | |
Tactic: 7539208 Time: 0.007872 | |
Tactic: 7539457 Time: 0.00624 | |
Tactic: 7539458 Time: 0.006208 | |
Tactic: 7539460 Time: 0.00624 | |
Tactic: 7539463 Time: 0.007712 | |
Tactic: 7539464 Time: 0.007712 | |
Tactic: 7539713 Time: 0.006208 | |
Tactic: 7539714 Time: 0.006208 | |
Tactic: 7539716 Time: 0.006208 | |
Tactic: 7539719 Time: 0.008096 | |
Tactic: 7539720 Time: 0.007872 | |
Fastest Tactic: 7537922 Time: 0.004928 | |
--------------- Timing Runner: MaxPool_14 (CudnnPooling) | |
Tactic: -1 Time: 0.005472 | |
Fastest Tactic: -1 Time: 0.005472 | |
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 7537922 | |
*************** Autotuning Reformat:Float(9600,300,20,1) -> Float(9600,1,640,32) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008256 | |
Tactic: 0 Time: 0.004096 | |
Fastest Tactic: 0 Time: 0.004096 | |
*************** Autotuning Reformat:Float(9600,300,20,1) -> Float(9600,1,640,32) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008192 | |
Tactic: 0 Time: 0.004192 | |
Fastest Tactic: 0 Time: 0.004192 | |
*************** Autotuning Reformat:Float(9600,1,640,32) -> Float(9600,300,20,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008192 | |
Tactic: 0 Time: 0.004192 | |
Fastest Tactic: 0 Time: 0.004192 | |
*************** Autotuning format combination: Float(9600,300,20,1) -> Float(19200,300,20,1) *************** | |
--------------- Timing Runner: Conv_15 + Relu_16 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_15 + Relu_16 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.024416 | |
Tactic: 720895 Time: 0.020608 | |
Tactic: 983039 Time: 0.01408 | |
Tactic: 1048575 Time: 0.017408 | |
Tactic: 1703935 Time: 0.012096 | |
Tactic: 1769471 Time: 0.015424 | |
Tactic: 1966079 Time: 0.03824 | |
Tactic: 2031615 Time: 0.032096 | |
Tactic: 2228223 Time: 0.019936 | |
Tactic: 2424831 Time: 0.012384 | |
Tactic: 2621439 Time: 0.0112 | |
Tactic: 2752511 Time: 0.023808 | |
Tactic: 2818047 Time: 0.022624 | |
Tactic: 2883583 Time: 0.042656 | |
Tactic: 3014655 Time: 0.014048 | |
Tactic: 3145727 Time: 0.01536 | |
Tactic: 3473407 Time: 0.024608 | |
Tactic: 3604479 Time: 0.014112 | |
Tactic: 3735551 Time: 0.018496 | |
Tactic: 4390911 Time: 0.040384 | |
Tactic: 5046271 Time: 0.015936 | |
Tactic: 5963775 Time: 0.03472 | |
Tactic: 6160383 Time: 0.02224 | |
Tactic: 6488063 Time: 0.020384 | |
Tactic: 6881279 Time: 0.028736 | |
Tactic: 7274495 Time: 0.011616 | |
Tactic: 7864319 Time: 0.012256 | |
Tactic: 7995391 Time: 0.022432 | |
Tactic: 8585215 Time: 0.02672 | |
Tactic: 8847359 Time: 0.012384 | |
Tactic: 8978431 Time: 0.03472 | |
Tactic: 9043967 Time: 0.012384 | |
Tactic: 9175039 Time: 0.01424 | |
Tactic: 9502719 Time: 0.040448 | |
Tactic: 9830399 Time: 0.019648 | |
Tactic: 9961471 Time: 0.013312 | |
Tactic: 10027007 Time: 0.01648 | |
Tactic: 10092543 Time: 0.040544 | |
Tactic: 10289151 Time: 0.038304 | |
Tactic: 10485759 Time: 0.010336 | |
Tactic: 10682367 Time: 0.011616 | |
Tactic: 10813439 Time: 0.014464 | |
Fastest Tactic: 10485759 Time: 0.010336 | |
--------------- Timing Runner: Conv_15 + Relu_16 (CudnnConvolution) | |
Tactic: 0 Time: 0.023648 | |
Tactic: 1 Time: 0.022624 | |
Tactic: 2 Time: 0.06768 | |
Tactic: 4 Time: 0.082816 | |
Tactic: 5 Time: 0.082688 | |
Tactic: 6 Time: 0.019712 | |
Tactic: 56 Time: 0.022624 | |
Tactic: 57 Time: 0.02272 | |
Tactic: 58 Time: 0.06768 | |
Tactic: 60 Time: 0.082752 | |
Tactic: 61 Time: 0.082016 | |
Tactic: 62 Time: 0.019552 | |
Fastest Tactic: 62 Time: 0.019552 | |
--------------- Timing Runner: Conv_15 + Relu_16 (CaskConvolution) | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.05632 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.013536 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.038816 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.05504 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.057152 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.034848 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.04064 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.03488 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.04096 | |
Fastest Tactic: 2775507031594384867 Time: 0.013536 | |
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10485759 | |
*************** Autotuning format combination: Float(9600,1,640,32) -> Float(19200,1,1280,64) *************** | |
--------------- Timing Runner: Conv_15 + Relu_16 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_15 + Relu_16 (CaskConvolution) | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.053344 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.055104 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.032 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.053344 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.020576 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.053312 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.032416 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.053344 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.055104 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.032352 | |
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.02048 | |
Fastest Tactic: -423878181466897819 Time: 0.02048 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819 | |
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008096 | |
Tactic: 0 Time: 0.005664 | |
Fastest Tactic: 0 Time: 0.005664 | |
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.008288 | |
Tactic: 0 Time: 0.004192 | |
Fastest Tactic: 0 Time: 0.004192 | |
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(19200,300,20,1) *************** | |
--------------- Timing Runner: Conv_17 + Relu_18 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_17 + Relu_18 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.041056 | |
Tactic: 720895 Time: 0.036288 | |
Tactic: 983039 Time: 0.020576 | |
Tactic: 1048575 Time: 0.028768 | |
Tactic: 1703935 Time: 0.017664 | |
Tactic: 1769471 Time: 0.022432 | |
Tactic: 1966079 Time: 0.06768 | |
Tactic: 2031615 Time: 0.056768 | |
Tactic: 2228223 Time: 0.03216 | |
Tactic: 2424831 Time: 0.01648 | |
Tactic: 2621439 Time: 0.015456 | |
Tactic: 2752511 Time: 0.039008 | |
Tactic: 2818047 Time: 0.036864 | |
Tactic: 2883583 Time: 0.077824 | |
Tactic: 3014655 Time: 0.020512 | |
Tactic: 3145727 Time: 0.024 | |
Tactic: 3473407 Time: 0.042656 | |
Tactic: 3604479 Time: 0.020576 | |
Tactic: 3735551 Time: 0.030816 | |
Tactic: 4390911 Time: 0.071776 | |
Tactic: 5046271 Time: 0.026464 | |
Tactic: 5963775 Time: 0.061536 | |
Tactic: 6160383 Time: 0.0368 | |
Tactic: 6488063 Time: 0.03248 | |
Tactic: 6881279 Time: 0.051008 | |
Tactic: 7274495 Time: 0.016096 | |
Tactic: 7864319 Time: 0.01648 | |
Tactic: 7995391 Time: 0.03696 | |
Tactic: 8585215 Time: 0.0472 | |
Tactic: 8847359 Time: 0.016544 | |
Tactic: 8978431 Time: 0.061536 | |
Tactic: 9043967 Time: 0.01856 | |
Tactic: 9175039 Time: 0.020576 | |
Tactic: 9502719 Time: 0.071776 | |
Tactic: 9830399 Time: 0.032864 | |
Tactic: 9961471 Time: 0.018528 | |
Tactic: 10027007 Time: 0.02672 | |
Tactic: 10092543 Time: 0.071776 | |
Tactic: 10289151 Time: 0.067648 | |
Tactic: 10485759 Time: 0.016288 | |
Tactic: 10682367 Time: 0.014432 | |
Tactic: 10813439 Time: 0.022592 | |
Fastest Tactic: 10682367 Time: 0.014432 | |
--------------- Timing Runner: Conv_17 + Relu_18 (CudnnConvolution) | |
Tactic: 0 Time: 0.036768 | |
Tactic: 1 Time: 0.03696 | |
Tactic: 2 Time: 0.097504 | |
Tactic: 4 Time: 0.134496 | |
Tactic: 5 Time: 0.137856 | |
Tactic: 6 Time: 0.02608 | |
Tactic: 56 Time: 0.036768 | |
Tactic: 57 Time: 0.036832 | |
Tactic: 58 Time: 0.096352 | |
Tactic: 60 Time: 0.134464 | |
Tactic: 61 Time: 0.137504 | |
Tactic: 62 Time: 0.026336 | |
Fastest Tactic: 6 Time: 0.02608 | |
--------------- Timing Runner: Conv_17 + Relu_18 (CaskConvolution) | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.102496 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.019648 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.067328 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.100032 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.103808 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.061536 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.071552 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.063264 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.075328 | |
Fastest Tactic: 2775507031594384867 Time: 0.019648 | |
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10682367 | |
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(19200,1,1280,64) *************** | |
--------------- Timing Runner: Conv_17 + Relu_18 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_17 + Relu_18 (CaskConvolution) | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.099904 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.099648 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.055296 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.099008 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.032864 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.099552 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.056416 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.098368 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.100448 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.055392 | |
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.032864 | |
Fastest Tactic: 5863767799113001648 Time: 0.032864 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648 | |
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) *************** | |
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) *************** | |
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(19200,300,20,1) *************** | |
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(19200,1,1280,64) *************** | |
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) *************** | |
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) *************** | |
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(172800,2700,60,1) *************** | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CudnnDeconvolution) | |
Tactic: 0 Time: 0.030656 | |
Tactic: 1 Time: 0.251808 | |
Fastest Tactic: 0 Time: 0.030656 | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (GemmDeconvolution) | |
Tactic: 0 Time: 0.018464 | |
Fastest Tactic: 0 Time: 0.018464 | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0 | |
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(172800,1,3840,64) *************** | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CudnnDeconvolution) | |
CudnnDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (GemmDeconvolution) | |
GemmDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
*************** Autotuning Reformat:Float(172800,2700,60,1) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: 166 copy (Reformat) | |
Tactic: 1002 Time: 0.008192 | |
Tactic: 0 Time: 0.005344 | |
Fastest Tactic: 0 Time: 0.005344 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(172800,2700,60,1) -> Float(259200,1,5760,96) *************** | |
--------------- Timing Runner: 166 copy (Reformat) | |
Tactic: 1002 Time: 0.008224 | |
Tactic: 0 Time: 0.008096 | |
Fastest Tactic: 0 Time: 0.008096 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(172800,1,3840,64) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: 166 copy (Reformat) | |
Tactic: 1002 Time: 0.010112 | |
Tactic: 0 Time: 0.010144 | |
Fastest Tactic: 1002 Time: 0.010112 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 1002 | |
*************** Autotuning Reformat:Float(172800,1,3840,64) -> Float(259200,1,5760,96) *************** | |
--------------- Timing Runner: 166 copy (Reformat) | |
Tactic: 1002 Time: 0.008288 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(259200,2700,60,1) -> Float(259200,1,5760,96) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.009856 | |
Tactic: 0 Time: 0.010368 | |
Fastest Tactic: 1002 Time: 0.009856 | |
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.012288 | |
Tactic: 0 Time: 0.012288 | |
Fastest Tactic: 1002 Time: 0.012288 | |
*************** Autotuning format combination: Float(259200,2700,60,1) -> Float(86400,2700,60,1) *************** | |
--------------- Timing Runner: Conv_25 + Relu_26 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_25 + Relu_26 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.06096 | |
Tactic: 720895 Time: 0.081024 | |
Tactic: 983039 Time: 0.057216 | |
Tactic: 1048575 Time: 0.059296 | |
Tactic: 1703935 Time: 0.034144 | |
Tactic: 1769471 Time: 0.051104 | |
Tactic: 1966079 Time: 0.099904 | |
Tactic: 2031615 Time: 0.084064 | |
Tactic: 2228223 Time: 0.049088 | |
Tactic: 2424831 Time: 0.04064 | |
Tactic: 2621439 Time: 0.032832 | |
Tactic: 2752511 Time: 0.092256 | |
Tactic: 2818047 Time: 0.08912 | |
Tactic: 2883583 Time: 0.116544 | |
Tactic: 3014655 Time: 0.043104 | |
Tactic: 3145727 Time: 0.050304 | |
Tactic: 3473407 Time: 0.103712 | |
Tactic: 3604479 Time: 0.043104 | |
Tactic: 3735551 Time: 0.105728 | |
Tactic: 4390911 Time: 0.104544 | |
Tactic: 5046271 Time: 0.049216 | |
Tactic: 5963775 Time: 0.090272 | |
Tactic: 6160383 Time: 0.054784 | |
Tactic: 6488063 Time: 0.071776 | |
Tactic: 6881279 Time: 0.07568 | |
Tactic: 7274495 Time: 0.04512 | |
Tactic: 7864319 Time: 0.034912 | |
Tactic: 7995391 Time: 0.08528 | |
Tactic: 8585215 Time: 0.06768 | |
Tactic: 8847359 Time: 0.03696 | |
Tactic: 8978431 Time: 0.091744 | |
Tactic: 9043967 Time: 0.040672 | |
Tactic: 9175039 Time: 0.043104 | |
Tactic: 9502719 Time: 0.106464 | |
Tactic: 9830399 Time: 0.108064 | |
Tactic: 9961471 Time: 0.043264 | |
Tactic: 10027007 Time: 0.0552 | |
Tactic: 10092543 Time: 0.105632 | |
Tactic: 10289151 Time: 0.09968 | |
Tactic: 10485759 Time: 0.032672 | |
Tactic: 10682367 Time: 0.032864 | |
Tactic: 10813439 Time: 0.045888 | |
Fastest Tactic: 10485759 Time: 0.032672 | |
--------------- Timing Runner: Conv_25 + Relu_26 (CudnnConvolution) | |
Tactic: 0 Time: 0.13088 | |
Tactic: 1 Time: 0.061536 | |
Tactic: 2 Time: 0.171936 | |
Tactic: 4 Time: 0.570912 | |
Tactic: 5 Time: 0.2448 | |
Tactic: 6 Time: 0.033056 | |
Tactic: 56 Time: 0.130912 | |
Tactic: 57 Time: 0.062432 | |
Tactic: 58 Time: 0.171872 | |
Tactic: 60 Time: 0.56704 | |
Tactic: 61 Time: 0.241792 | |
Tactic: 62 Time: 0.033952 | |
Fastest Tactic: 6 Time: 0.033056 | |
--------------- Timing Runner: Conv_25 + Relu_26 (CaskConvolution) | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.147552 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.026368 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.095488 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.145184 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.148864 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.08816 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.0984 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.088128 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.106592 | |
Fastest Tactic: 2775507031594384867 Time: 0.026368 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(259200,1,5760,96) -> Float(86400,1,1920,32) *************** | |
--------------- Timing Runner: Conv_25 + Relu_26 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_25 + Relu_26 (CaskConvolution) | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.145248 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.14544 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.079776 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.14416 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.045152 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.145536 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.079968 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.144672 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.147392 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.079744 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.046272 | |
Fastest Tactic: 5863767799113001648 Time: 0.045152 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648 | |
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) *************** | |
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) *************** | |
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(86400,2700,60,1) *************** | |
--------------- Timing Runner: Conv_27 + Relu_28 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_27 + Relu_28 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.02672 | |
Tactic: 720895 Time: 0.031584 | |
Tactic: 983039 Time: 0.023552 | |
Tactic: 1048575 Time: 0.026464 | |
Tactic: 1703935 Time: 0.017664 | |
Tactic: 1769471 Time: 0.024128 | |
Tactic: 1966079 Time: 0.04016 | |
Tactic: 2031615 Time: 0.034912 | |
Tactic: 2228223 Time: 0.024672 | |
Tactic: 2424831 Time: 0.022624 | |
Tactic: 2621439 Time: 0.018016 | |
Tactic: 2752511 Time: 0.03696 | |
Tactic: 2818047 Time: 0.034816 | |
Tactic: 2883583 Time: 0.04512 | |
Tactic: 3014655 Time: 0.020576 | |
Tactic: 3145727 Time: 0.022144 | |
Tactic: 3473407 Time: 0.038912 | |
Tactic: 3604479 Time: 0.02048 | |
Tactic: 3735551 Time: 0.039008 | |
Tactic: 4390911 Time: 0.04192 | |
Tactic: 5046271 Time: 0.022592 | |
Tactic: 5963775 Time: 0.036896 | |
Tactic: 6160383 Time: 0.024672 | |
Tactic: 6488063 Time: 0.030816 | |
Tactic: 6881279 Time: 0.03088 | |
Tactic: 7274495 Time: 0.020576 | |
Tactic: 7864319 Time: 0.018432 | |
Tactic: 7995391 Time: 0.0328 | |
Tactic: 8585215 Time: 0.030304 | |
Tactic: 8847359 Time: 0.019328 | |
Tactic: 8978431 Time: 0.036928 | |
Tactic: 9043967 Time: 0.019968 | |
Tactic: 9175039 Time: 0.02048 | |
Tactic: 9502719 Time: 0.042656 | |
Tactic: 9830399 Time: 0.039904 | |
Tactic: 9961471 Time: 0.023904 | |
Tactic: 10027007 Time: 0.02608 | |
Tactic: 10092543 Time: 0.041056 | |
Tactic: 10289151 Time: 0.040224 | |
Tactic: 10485759 Time: 0.01648 | |
Tactic: 10682367 Time: 0.017984 | |
Tactic: 10813439 Time: 0.02048 | |
Fastest Tactic: 10485759 Time: 0.01648 | |
--------------- Timing Runner: Conv_27 + Relu_28 (CudnnConvolution) | |
Tactic: 0 Time: 0.033056 | |
Tactic: 1 Time: 0.032864 | |
Tactic: 2 Time: 0.083936 | |
Tactic: 4 Time: 0.212864 | |
Tactic: 5 Time: 0.113216 | |
Tactic: 6 Time: 0.020576 | |
Tactic: 56 Time: 0.033024 | |
Tactic: 57 Time: 0.032864 | |
Tactic: 58 Time: 0.08368 | |
Tactic: 60 Time: 0.214688 | |
Tactic: 61 Time: 0.11376 | |
Tactic: 62 Time: 0.020576 | |
Fastest Tactic: 6 Time: 0.020576 | |
--------------- Timing Runner: Conv_27 + Relu_28 (CaskConvolution) | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.056576 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.014016 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.0384 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.055168 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.055392 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.034912 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.039008 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.034592 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.04112 | |
Fastest Tactic: 2775507031594384867 Time: 0.014016 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(86400,1,1920,32) *************** | |
--------------- Timing Runner: Conv_27 + Relu_28 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_27 + Relu_28 (CaskConvolution) | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.053344 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.054496 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.032128 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.053344 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.020576 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.05328 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.032544 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.05328 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.055168 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.032192 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.020576 | |
Fastest Tactic: 5863767799113001648 Time: 0.020576 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648 | |
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) *************** | |
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) *************** | |
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(172800,10800,120,1) *************** | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CudnnDeconvolution) | |
CudnnDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (GemmDeconvolution) | |
Tactic: 0 Time: 0.026464 | |
Fastest Tactic: 0 Time: 0.026464 | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0 | |
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(172800,1,1920,16) *************** | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CudnnDeconvolution) | |
CudnnDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (GemmDeconvolution) | |
GemmDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: 177 copy (Reformat) | |
Tactic: 1002 Time: 0.008096 | |
Tactic: 0 Time: 0.005344 | |
Fastest Tactic: 0 Time: 0.005344 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(345600,1,3840,32) *************** | |
--------------- Timing Runner: 177 copy (Reformat) | |
Tactic: 1002 Time: 0.013984 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: 177 copy (Reformat) | |
Tactic: 1002 Time: 0.0144 | |
Tactic: 0 Time: 0.008224 | |
Fastest Tactic: 0 Time: 0.008224 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(345600,1,3840,32) *************** | |
--------------- Timing Runner: 177 copy (Reformat) | |
Tactic: 1002 Time: 0.012288 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(345600,10800,120,1) -> Float(345600,1,3840,32) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.014432 | |
Tactic: 0 Time: 0.011264 | |
Fastest Tactic: 0 Time: 0.011264 | |
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.016288 | |
Tactic: 0 Time: 0.014432 | |
Fastest Tactic: 0 Time: 0.014432 | |
*************** Autotuning format combination: Float(345600,10800,120,1) -> Float(172800,10800,120,1) *************** | |
--------------- Timing Runner: Conv_34 + Relu_35 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_34 + Relu_35 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.045152 | |
Tactic: 720895 Time: 0.055392 | |
Tactic: 983039 Time: 0.055392 | |
Tactic: 1048575 Time: 0.041056 | |
Tactic: 1703935 Time: 0.033152 | |
Tactic: 1769471 Time: 0.06768 | |
Tactic: 1966079 Time: 0.061472 | |
Tactic: 2031615 Time: 0.069312 | |
Tactic: 2228223 Time: 0.03872 | |
Tactic: 2424831 Time: 0.043104 | |
Tactic: 2621439 Time: 0.03488 | |
Tactic: 2752511 Time: 0.063008 | |
Tactic: 2818047 Time: 0.102304 | |
Tactic: 2883583 Time: 0.101504 | |
Tactic: 3014655 Time: 0.036864 | |
Tactic: 3145727 Time: 0.055296 | |
Tactic: 3473407 Time: 0.112352 | |
Tactic: 3604479 Time: 0.036544 | |
Tactic: 3735551 Time: 0.100256 | |
Tactic: 4390911 Time: 0.06688 | |
Tactic: 5046271 Time: 0.03696 | |
Tactic: 5963775 Time: 0.076896 | |
Tactic: 6160383 Time: 0.04496 | |
Tactic: 6488063 Time: 0.038976 | |
Tactic: 6881279 Time: 0.059488 | |
Tactic: 7274495 Time: 0.059488 | |
Tactic: 7864319 Time: 0.036704 | |
Tactic: 7995391 Time: 0.05744 | |
Tactic: 8585215 Time: 0.041024 | |
Tactic: 8847359 Time: 0.035968 | |
Tactic: 8978431 Time: 0.075872 | |
Tactic: 9043967 Time: 0.033024 | |
Tactic: 9175039 Time: 0.036704 | |
Tactic: 9502719 Time: 0.066912 | |
Tactic: 9830399 Time: 0.103904 | |
Tactic: 9961471 Time: 0.047008 | |
Tactic: 10027007 Time: 0.036928 | |
Tactic: 10092543 Time: 0.066944 | |
Tactic: 10289151 Time: 0.061536 | |
Tactic: 10485759 Time: 0.0328 | |
Tactic: 10682367 Time: 0.034912 | |
Tactic: 10813439 Time: 0.055296 | |
Fastest Tactic: 10485759 Time: 0.0328 | |
--------------- Timing Runner: Conv_34 + Relu_35 (CudnnConvolution) | |
Tactic: 0 Time: 0.153504 | |
Tactic: 1 Time: 0.048736 | |
Tactic: 2 Time: 0.203904 | |
Tactic: 4 Time: 0.64512 | |
Tactic: 5 Time: 0.178976 | |
Tactic: 6 Time: 0.03632 | |
Tactic: 56 Time: 0.153696 | |
Tactic: 57 Time: 0.049024 | |
Tactic: 58 Time: 0.204384 | |
Tactic: 60 Time: 0.650464 | |
Tactic: 61 Time: 0.182496 | |
Tactic: 62 Time: 0.03616 | |
Fastest Tactic: 62 Time: 0.03616 | |
--------------- Timing Runner: Conv_34 + Relu_35 (CaskConvolution) | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.103904 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.028064 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.059296 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.102432 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.10864 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.059488 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.061536 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.04048 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.061536 | |
Fastest Tactic: 2775507031594384867 Time: 0.028064 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(345600,1,3840,32) -> Float(172800,1,1920,16) *************** | |
--------------- Timing Runner: Conv_34 + Relu_35 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_34 + Relu_35 (CaskConvolution) | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.102496 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.102304 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.05744 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.102496 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.039008 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.102144 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.058528 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.101696 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.102496 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.05744 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.038912 | |
Fastest Tactic: -423878181466897819 Time: 0.038912 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819 | |
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) *************** | |
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) *************** | |
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(172800,10800,120,1) *************** | |
--------------- Timing Runner: Conv_36 + Relu_37 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_36 + Relu_37 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.02784 | |
Tactic: 720895 Time: 0.032672 | |
Tactic: 983039 Time: 0.032032 | |
Tactic: 1048575 Time: 0.025856 | |
Tactic: 1703935 Time: 0.021792 | |
Tactic: 1769471 Time: 0.044992 | |
Tactic: 1966079 Time: 0.036736 | |
Tactic: 2031615 Time: 0.039104 | |
Tactic: 2228223 Time: 0.024224 | |
Tactic: 2424831 Time: 0.031936 | |
Tactic: 2621439 Time: 0.02448 | |
Tactic: 2752511 Time: 0.03584 | |
Tactic: 2818047 Time: 0.058624 | |
Tactic: 2883583 Time: 0.055392 | |
Tactic: 3014655 Time: 0.022624 | |
Tactic: 3145727 Time: 0.032672 | |
Tactic: 3473407 Time: 0.061344 | |
Tactic: 3604479 Time: 0.022624 | |
Tactic: 3735551 Time: 0.053472 | |
Tactic: 4390911 Time: 0.038816 | |
Tactic: 5046271 Time: 0.022624 | |
Tactic: 5963775 Time: 0.043104 | |
Tactic: 6160383 Time: 0.027648 | |
Tactic: 6488063 Time: 0.024576 | |
Tactic: 6881279 Time: 0.034912 | |
Tactic: 7274495 Time: 0.036928 | |
Tactic: 7864319 Time: 0.023584 | |
Tactic: 7995391 Time: 0.032864 | |
Tactic: 8585215 Time: 0.0256 | |
Tactic: 8847359 Time: 0.024576 | |
Tactic: 8978431 Time: 0.043008 | |
Tactic: 9043967 Time: 0.02144 | |
Tactic: 9175039 Time: 0.022624 | |
Tactic: 9502719 Time: 0.038912 | |
Tactic: 9830399 Time: 0.056448 | |
Tactic: 9961471 Time: 0.0328 | |
Tactic: 10027007 Time: 0.022528 | |
Tactic: 10092543 Time: 0.039008 | |
Tactic: 10289151 Time: 0.03648 | |
Tactic: 10485759 Time: 0.020576 | |
Tactic: 10682367 Time: 0.022624 | |
Tactic: 10813439 Time: 0.032224 | |
Fastest Tactic: 10485759 Time: 0.020576 | |
--------------- Timing Runner: Conv_36 + Relu_37 (CudnnConvolution) | |
Tactic: 0 Time: 0.038752 | |
Tactic: 1 Time: 0.032416 | |
Tactic: 2 Time: 0.113664 | |
Tactic: 4 Time: 0.376224 | |
Tactic: 5 Time: 0.172736 | |
Tactic: 6 Time: 0.028704 | |
Tactic: 56 Time: 0.038752 | |
Tactic: 57 Time: 0.032416 | |
Tactic: 58 Time: 0.114688 | |
Tactic: 60 Time: 0.374304 | |
Tactic: 61 Time: 0.170656 | |
Tactic: 62 Time: 0.029824 | |
Fastest Tactic: 6 Time: 0.028704 | |
--------------- Timing Runner: Conv_36 + Relu_37 (CaskConvolution) | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.05744 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.020032 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.034688 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.05744 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.063008 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.034912 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.03632 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.024032 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.034912 | |
Fastest Tactic: 2775507031594384867 Time: 0.020032 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867 | |
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(172800,1,1920,16) *************** | |
--------------- Timing Runner: Conv_36 + Relu_37 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_36 + Relu_37 (CaskConvolution) | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.055392 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.055392 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.034272 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.05648 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.039008 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.056896 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.03472 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.05536 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.056864 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.034208 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.039008 | |
Fastest Tactic: -3263369460438823196 Time: 0.034208 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3263369460438823196 | |
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) *************** | |
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) *************** | |
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(345600,43200,240,1) *************** | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CudnnDeconvolution) | |
CudnnDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (GemmDeconvolution) | |
Tactic: 0 Time: 0.03664 | |
Fastest Tactic: 0 Time: 0.03664 | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0 | |
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(345600,1,1920,8) *************** | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CudnnDeconvolution) | |
CudnnDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (GemmDeconvolution) | |
GemmDeconvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CaskDeconvolution) | |
CaskDeconvolution has no valid tactics for this config, skipping | |
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: 188 copy (Reformat) | |
Tactic: 1002 Time: 0.010304 | |
Tactic: 0 Time: 0.007296 | |
Fastest Tactic: 0 Time: 0.007296 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(691200,1,3840,16) *************** | |
--------------- Timing Runner: 188 copy (Reformat) | |
Tactic: 1002 Time: 0.034912 | |
Tactic: 0 Time: 0.010144 | |
Fastest Tactic: 0 Time: 0.010144 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: 188 copy (Reformat) | |
Tactic: 1002 Time: 0.036416 | |
Tactic: 0 Time: 0.012192 | |
Fastest Tactic: 0 Time: 0.012192 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(691200,1,3840,16) *************** | |
--------------- Timing Runner: 188 copy (Reformat) | |
Tactic: 1002 Time: 0.032096 | |
Tactic: 0 Time: 0.009664 | |
Fastest Tactic: 0 Time: 0.009664 | |
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0 | |
*************** Autotuning Reformat:Float(691200,43200,240,1) -> Float(691200,1,3840,16) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.036544 | |
Tactic: 0 Time: 0.020224 | |
Fastest Tactic: 0 Time: 0.020224 | |
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.03696 | |
Tactic: 0 Time: 0.018464 | |
Fastest Tactic: 0 Time: 0.018464 | |
*************** Autotuning format combination: Float(691200,43200,240,1) -> Float(345600,43200,240,1) *************** | |
--------------- Timing Runner: Conv_43 + Relu_44 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_43 + Relu_44 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.061536 | |
Tactic: 720895 Time: 0.102496 | |
Tactic: 983039 Time: 0.099776 | |
Tactic: 1048575 Time: 0.060672 | |
Tactic: 1703935 Time: 0.059232 | |
Tactic: 1769471 Time: 0.149504 | |
Tactic: 1966079 Time: 0.106592 | |
Tactic: 2031615 Time: 0.096352 | |
Tactic: 2228223 Time: 0.061536 | |
Tactic: 2424831 Time: 0.102496 | |
Tactic: 2621439 Time: 0.067648 | |
Tactic: 2752511 Time: 0.105632 | |
Tactic: 2818047 Time: 0.199552 | |
Tactic: 2883583 Time: 0.190272 | |
Tactic: 3014655 Time: 0.061216 | |
Tactic: 3145727 Time: 0.104544 | |
Tactic: 3473407 Time: 0.185472 | |
Tactic: 3604479 Time: 0.060384 | |
Tactic: 3735551 Time: 0.186272 | |
Tactic: 4390911 Time: 0.106304 | |
Tactic: 5046271 Time: 0.05904 | |
Tactic: 5963775 Time: 0.106592 | |
Tactic: 6160383 Time: 0.0632 | |
Tactic: 6488063 Time: 0.0608 | |
Tactic: 6881279 Time: 0.099328 | |
Tactic: 7274495 Time: 0.118368 | |
Tactic: 7864319 Time: 0.068992 | |
Tactic: 7995391 Time: 0.100352 | |
Tactic: 8585215 Time: 0.06128 | |
Tactic: 8847359 Time: 0.071584 | |
Tactic: 8978431 Time: 0.106464 | |
Tactic: 9043967 Time: 0.059104 | |
Tactic: 9175039 Time: 0.06064 | |
Tactic: 9502719 Time: 0.106176 | |
Tactic: 9830399 Time: 0.19056 | |
Tactic: 9961471 Time: 0.100512 | |
Tactic: 10027007 Time: 0.059392 | |
Tactic: 10092543 Time: 0.105792 | |
Tactic: 10289151 Time: 0.107904 | |
Tactic: 10485759 Time: 0.058848 | |
Tactic: 10682367 Time: 0.06672 | |
Tactic: 10813439 Time: 0.09808 | |
Fastest Tactic: 10485759 Time: 0.058848 | |
--------------- Timing Runner: Conv_43 + Relu_44 (CudnnConvolution) | |
Tactic: 0 Time: 0.109856 | |
Tactic: 1 Time: 0.078688 | |
Tactic: 2 Time: 0.202848 | |
Tactic: 4 Time: 0.757632 | |
Tactic: 5 Time: 0.561088 | |
Tactic: 6 Time: 0.075424 | |
Tactic: 56 Time: 0.108544 | |
Tactic: 57 Time: 0.07792 | |
Tactic: 58 Time: 0.202848 | |
Tactic: 60 Time: 0.759904 | |
Tactic: 61 Time: 0.565248 | |
Tactic: 62 Time: 0.075584 | |
Fastest Tactic: 6 Time: 0.075424 | |
--------------- Timing Runner: Conv_43 + Relu_44 (CaskConvolution) | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.204864 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.063392 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.10864 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.204608 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.235424 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.109888 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.110688 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.06768 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.063584 | |
Fastest Tactic: 2775507031594384867 Time: 0.063392 | |
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10485759 | |
*************** Autotuning format combination: Float(691200,1,3840,16) -> Float(345600,1,1920,8) *************** | |
--------------- Timing Runner: Conv_43 + Relu_44 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_43 + Relu_44 (CaskConvolution) | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.199808 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.198336 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.11264 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.200608 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.141344 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.197952 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.112736 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.197728 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.201728 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.112416 | |
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.14288 | |
Fastest Tactic: -3263369460438823196 Time: 0.112416 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3263369460438823196 | |
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) *************** | |
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) *************** | |
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(345600,43200,240,1) *************** | |
--------------- Timing Runner: Conv_45 + Relu_46 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_45 + Relu_46 (FusedConvActConvolution) | |
Tactic: 524287 Time: 0.041056 | |
Tactic: 720895 Time: 0.058144 | |
Tactic: 983039 Time: 0.057344 | |
Tactic: 1048575 Time: 0.038944 | |
Tactic: 1703935 Time: 0.041056 | |
Tactic: 1769471 Time: 0.105824 | |
Tactic: 1966079 Time: 0.059392 | |
Tactic: 2031615 Time: 0.055392 | |
Tactic: 2228223 Time: 0.041952 | |
Tactic: 2621439 Time: 0.049184 | |
Tactic: 2752511 Time: 0.060864 | |
Tactic: 2818047 Time: 0.10944 | |
Tactic: 2883583 Time: 0.102496 | |
Tactic: 3014655 Time: 0.042144 | |
Tactic: 3145727 Time: 0.061344 | |
Tactic: 3473407 Time: 0.104512 | |
Tactic: 3604479 Time: 0.041024 | |
Tactic: 3735551 Time: 0.100416 | |
Tactic: 4390911 Time: 0.060192 | |
Tactic: 5046271 Time: 0.037056 | |
Tactic: 5963775 Time: 0.060224 | |
Tactic: 6160383 Time: 0.04096 | |
Tactic: 6488063 Time: 0.038816 | |
Tactic: 6881279 Time: 0.056832 | |
Tactic: 7274495 Time: 0.075776 | |
Tactic: 7864319 Time: 0.050368 | |
Tactic: 7995391 Time: 0.05632 | |
Tactic: 8585215 Time: 0.03984 | |
Tactic: 8847359 Time: 0.053248 | |
Tactic: 8978431 Time: 0.059424 | |
Tactic: 9043967 Time: 0.040096 | |
Tactic: 9175039 Time: 0.041056 | |
Tactic: 9502719 Time: 0.060512 | |
Tactic: 9830399 Time: 0.103776 | |
Tactic: 10027007 Time: 0.036928 | |
Tactic: 10092543 Time: 0.059488 | |
Tactic: 10289151 Time: 0.060448 | |
Tactic: 10485759 Time: 0.038912 | |
Tactic: 10682367 Time: 0.049088 | |
Tactic: 10813439 Time: 0.055296 | |
Fastest Tactic: 10027007 Time: 0.036928 | |
--------------- Timing Runner: Conv_45 + Relu_46 (CudnnConvolution) | |
Tactic: 0 Time: 0.061344 | |
Tactic: 1 Time: 0.051072 | |
Tactic: 2 Time: 0.108448 | |
Tactic: 4 Time: 0.485376 | |
Tactic: 5 Time: 0.459936 | |
Tactic: 6 Time: 0.060992 | |
Tactic: 56 Time: 0.060544 | |
Tactic: 57 Time: 0.051008 | |
Tactic: 58 Time: 0.106688 | |
Tactic: 60 Time: 0.482208 | |
Tactic: 61 Time: 0.457024 | |
Tactic: 62 Time: 0.060704 | |
Fastest Tactic: 57 Time: 0.051008 | |
--------------- Timing Runner: Conv_45 + Relu_46 (CaskConvolution) | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.114528 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Tactic: 2775507031594384867 Time: 0.048928 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458 | |
Tactic: 2842488832350522458 Time: 0.065024 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.114624 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203 | |
Tactic: 6448355332020552203 Time: 0.145312 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.062464 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.063552 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.040288 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.036896 | |
Fastest Tactic: -3946921629105938337 Time: 0.036896 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337 | |
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(345600,1,1920,8) *************** | |
--------------- Timing Runner: Conv_45 + Relu_46 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_45 + Relu_46 (CaskConvolution) | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376 | |
Tactic: 861694390046228376 Time: 0.108544 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567 | |
Tactic: 1017870653102653567 Time: 0.108224 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167 | |
Tactic: 5258189349241541167 Time: 0.112704 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316 | |
Tactic: 5821621277990374316 Time: 0.10864 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648 | |
Tactic: 5863767799113001648 Time: 0.140832 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536 | |
Tactic: -9147980667639709536 Time: 0.107648 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857 | |
Tactic: -8850904373104590857 Time: 0.113888 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660 | |
Tactic: -7751035352149795660 Time: 0.108384 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465 | |
Tactic: -3853827649136781465 Time: 0.109568 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196 | |
Tactic: -3263369460438823196 Time: 0.112384 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819 | |
Tactic: -423878181466897819 Time: 0.141408 | |
Fastest Tactic: -9147980667639709536 Time: 0.107648 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536 | |
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) *************** | |
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) *************** | |
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(129600,43200,240,1) *************** | |
--------------- Timing Runner: Conv_47 (CudaDepthwiseConvolution) | |
CudaDepthwiseConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_47 (FusedConvActConvolution) | |
Tactic: 589823 Time: 0.026144 | |
Tactic: 786431 Time: 0.023424 | |
Tactic: 1310719 Time: 0.02432 | |
Tactic: 1638399 Time: 0.034048 | |
Tactic: 1835007 Time: 0.023872 | |
Tactic: 4194303 Time: 0.022624 | |
Tactic: 4325375 Time: 0.024672 | |
Tactic: 4521983 Time: 0.023552 | |
Tactic: 4980735 Time: 0.022624 | |
Tactic: 5439487 Time: 0.028672 | |
Tactic: 5767167 Time: 0.053344 | |
Tactic: 6946815 Time: 0.028128 | |
Tactic: 7143423 Time: 0.028768 | |
Tactic: 7602175 Time: 0.022624 | |
Tactic: 7798783 Time: 0.023744 | |
Tactic: 8191999 Time: 0.0248 | |
Tactic: 8323071 Time: 0.024416 | |
Tactic: 8650751 Time: 0.02464 | |
Tactic: 9895935 Time: 0.022624 | |
Tactic: 10551295 Time: 0.02608 | |
Tactic: 10944511 Time: 0.02256 | |
Fastest Tactic: 10944511 Time: 0.02256 | |
--------------- Timing Runner: Conv_47 (CudnnConvolution) | |
Tactic: 0 Time: 0.016384 | |
Tactic: 1 Time: 0.01648 | |
Tactic: 2 Time: 0.018176 | |
Tactic: 4 Time: 0.284576 | |
Tactic: 5 Time: 0.048064 | |
Tactic: 56 Time: 0.016384 | |
Tactic: 57 Time: 0.016384 | |
Tactic: 58 Time: 0.017376 | |
Tactic: 60 Time: 0.282624 | |
Tactic: 61 Time: 0.047104 | |
Fastest Tactic: 0 Time: 0.016384 | |
--------------- Timing Runner: Conv_47 (CublasConvolution) | |
CublasConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_47 (CaskConvolution) | |
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_interior_nn_v1 Tactic: 1754569683116234317 | |
Tactic: 1754569683116234317 Time: 0.044352 | |
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384 | |
Tactic: 1825138533642645384 Time: 0.04448 | |
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_interior_nn_v1 Tactic: 2733356012094739613 | |
Tactic: 2733356012094739613 Time: 0.01552 | |
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238 | |
Tactic: 3915320020053085238 Time: 0.044864 | |
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604 | |
Tactic: 6808617066150061604 Time: 0.02352 | |
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_interior_nn_v1 Tactic: 9091006216302412844 | |
Tactic: 9091006216302412844 Time: 0.024416 | |
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864 | |
Tactic: -8060443123034038864 Time: 0.024576 | |
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Tactic: -4420849921117327522 Time: 0.016384 | |
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Tactic: -3946921629105938337 Time: 0.015488 | |
Fastest Tactic: -3946921629105938337 Time: 0.015488 | |
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337 | |
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(129600,1,720,3) *************** | |
--------------- Timing Runner: Conv_47 (CudnnConvolution) | |
CudnnConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_47 (CublasConvolution) | |
CublasConvolution has no valid tactics for this config, skipping | |
--------------- Timing Runner: Conv_47 (CaskConvolution) | |
CaskConvolution has no valid tactics for this config, skipping | |
*************** Autotuning Reformat:Float(129600,1,720,3) -> Float(129600,43200,240,1) *************** | |
--------------- Timing Runner: Optimizer Reformat (Reformat) | |
Tactic: 1002 Time: 0.035008 | |
Tactic: 0 Time: 0.00624 | |
Fastest Tactic: 0 Time: 0.00624 | |
*************** Autotuning format combination: Float(129600,43200,240,1) -> Float(43200,43200,240,1), Int32(43200,43200,240,1) *************** | |
--------------- Timing Runner: (Unnamed Layer* 48) [TopK] (TopK) | |
Tactic: 0 Time: 0.503712 | |
Tactic: 1 Time: 2.4536 | |
Tactic: 3 Time: 0.011872 | |
Tactic: 2 Time: 12.5644 | |
Fastest Tactic: 3 Time: 0.011872 | |
>>>>>>>>>>>>>>> Chose Runner Type: TopK Tactic: 3 | |
Formats and tactics selection completed in 2.53455 seconds. | |
After reformat layers: 26 layers | |
Block size 268435456 | |
Block size 2764800 | |
Block size 1382400 | |
Block size 1036800 | |
Block size 691200 | |
Block size 345600 | |
Block size 76800 | |
Total Activation Memory: 274733056 | |
Detected 1 inputs and 2 output network tensors. | |
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522 | |
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867 | |
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337 | |
Layer: Conv_0 + Relu_1 HostPersistent: 1664 DevicePersistent: 260608 | |
Layer: Conv_2 + Relu_3 HostPersistent: 2192 DevicePersistent: 0 | |
Layer: MaxPool_4 HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_5 + Relu_6 HostPersistent: 2176 DevicePersistent: 69632 | |
Layer: Conv_7 + Relu_8 HostPersistent: 512 DevicePersistent: 42496 | |
Layer: MaxPool_9 HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_10 + Relu_11 HostPersistent: 512 DevicePersistent: 51712 | |
Layer: Conv_12 + Relu_13 HostPersistent: 512 DevicePersistent: 102912 | |
Layer: MaxPool_14 HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_15 + Relu_16 HostPersistent: 2192 DevicePersistent: 0 | |
Layer: Conv_17 + Relu_18 HostPersistent: 2192 DevicePersistent: 0 | |
Layer: Conv_19 + Relu_20 HostPersistent: 2192 DevicePersistent: 0 | |
Layer: ConvTranspose_21 + BatchNormalization_22 + Relu_23 HostPersistent: 0 DevicePersistent: 0 | |
Layer: 166 copy HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_25 + Relu_26 HostPersistent: 512 DevicePersistent: 307712 | |
Layer: Conv_27 + Relu_28 HostPersistent: 512 DevicePersistent: 102912 | |
Layer: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 HostPersistent: 0 DevicePersistent: 0 | |
Layer: 177 copy HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_34 + Relu_35 HostPersistent: 512 DevicePersistent: 84480 | |
Layer: Conv_36 + Relu_37 HostPersistent: 512 DevicePersistent: 42496 | |
Layer: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 HostPersistent: 0 DevicePersistent: 0 | |
Layer: 188 copy HostPersistent: 0 DevicePersistent: 0 | |
Layer: Conv_43 + Relu_44 HostPersistent: 2192 DevicePersistent: 0 | |
Layer: Conv_45 + Relu_46 HostPersistent: 1664 DevicePersistent: 262144 | |
Layer: Conv_47 HostPersistent: 1664 DevicePersistent: 259584 | |
Layer: (Unnamed Layer* 48) [TopK] HostPersistent: 0 DevicePersistent: 0 | |
Total Host Persistent Memory: 21712 | |
Total Device Persistent Memory: 1586688 | |
Total Scratch Memory: 4147200 | |
[MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB | |
Using cublasLt a tactic source | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1704, GPU 979 (MiB) | |
Using cuDNN as a tactic source | |
[MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1704, GPU 987 (MiB) | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1704, GPU 971 (MiB) | |
Engine generation completed in 3.18454 seconds. | |
Deleting timing cache: 66 entries, 18 hits | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1704, GPU 953 (MiB) | |
Engine Layer Information: | |
Layer(CaskConvolution): Conv_0 + Relu_1, Tactic: -3946921629105938337, input[Float(1,1,180,240)] -> 136[Float(1,8,180,240)] | |
Layer(FusedConvActConvolution): Conv_2 + Relu_3, Tactic: 1572863, 136[Float(1,8,180,240)] -> 189[Float(1,8,180,240)] | |
Layer(TiledPooling): MaxPool_4, Tactic: 6226177, 189[Float(1,8,180,240)] -> 140[Float(1,8,90,120)] | |
Layer(CaskConvolution): Conv_5 + Relu_6, Tactic: -4420849921117327522, 140[Float(1,8,90,120)] -> 143[Float(1,16,90,120)] | |
Layer(CaskConvolution): Conv_7 + Relu_8, Tactic: 2775507031594384867, 143[Float(1,16,90,120)] -> 178[Float(1,16,90,120)] | |
Layer(TiledPooling): MaxPool_9, Tactic: 5636353, 178[Float(1,16,90,120)] -> 147[Float(1,16,45,60)] | |
Layer(CaskConvolution): Conv_10 + Relu_11, Tactic: 2775507031594384867, 147[Float(1,16,45,60)] -> 150[Float(1,32,45,60)] | |
Layer(CaskConvolution): Conv_12 + Relu_13, Tactic: 2775507031594384867, 150[Float(1,32,45,60)] -> 167[Float(1,32,45,60)] | |
Layer(TiledPooling): MaxPool_14, Tactic: 7537922, 167[Float(1,32,45,60)] -> 154[Float(1,32,15,20)] | |
Layer(FusedConvActConvolution): Conv_15 + Relu_16, Tactic: 10485759, 154[Float(1,32,15,20)] -> 157[Float(1,64,15,20)] | |
Layer(FusedConvActConvolution): Conv_17 + Relu_18, Tactic: 10682367, 157[Float(1,64,15,20)] -> 160[Float(1,64,15,20)] | |
Layer(FusedConvActConvolution): Conv_19 + Relu_20, Tactic: 10682367, 160[Float(1,64,15,20)] -> 163[Float(1,64,15,20)] | |
Layer(GemmDeconvolution): ConvTranspose_21 + BatchNormalization_22 + Relu_23, Tactic: 0, 163[Float(1,64,15,20)] -> 166[Float(1,64,45,60)] | |
Layer(Reformat): 166 copy, Tactic: 0, 166[Float(1,64,45,60)] -> 167[Float(1,64,45,60)] | |
Layer(CaskConvolution): Conv_25 + Relu_26, Tactic: 2775507031594384867, 167[Float(1,96,45,60)] -> 170[Float(1,32,45,60)] | |
Layer(CaskConvolution): Conv_27 + Relu_28, Tactic: 2775507031594384867, 170[Float(1,32,45,60)] -> 173[Float(1,32,45,60)] | |
Layer(GemmDeconvolution): ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32, Tactic: 0, 173[Float(1,32,45,60)] -> 177[Float(1,16,90,120)] | |
Layer(Reformat): 177 copy, Tactic: 0, 177[Float(1,16,90,120)] -> 178[Float(1,16,90,120)] | |
Layer(CaskConvolution): Conv_34 + Relu_35, Tactic: 2775507031594384867, 178[Float(1,32,90,120)] -> 181[Float(1,16,90,120)] | |
Layer(CaskConvolution): Conv_36 + Relu_37, Tactic: 2775507031594384867, 181[Float(1,16,90,120)] -> 184[Float(1,16,90,120)] | |
Layer(GemmDeconvolution): ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41, Tactic: 0, 184[Float(1,16,90,120)] -> 188[Float(1,8,180,240)] | |
Layer(Reformat): 188 copy, Tactic: 0, 188[Float(1,8,180,240)] -> 189[Float(1,8,180,240)] | |
Layer(FusedConvActConvolution): Conv_43 + Relu_44, Tactic: 10485759, 189[Float(1,16,180,240)] -> 192[Float(1,8,180,240)] | |
Layer(CaskConvolution): Conv_45 + Relu_46, Tactic: -3946921629105938337, 192[Float(1,8,180,240)] -> 195[Float(1,8,180,240)] | |
Layer(CaskConvolution): Conv_47, Tactic: -3946921629105938337, 195[Float(1,8,180,240)] -> raw_conv_out[Float(1,3,180,240)] | |
Layer(TopK): (Unnamed Layer* 48) [TopK], Tactic: 3, raw_conv_out[Float(1,3,180,240)] -> (Unnamed Layer* 48) [TopK]_output_1[Float(1,1,180,240)], dynamic_final_out[Int32(1,1,180,240)] | |
[MemUsageSnapshot] Builder end: CPU 1704 MiB, GPU 953 MiB | |
[MemUsageSnapshot] ExecutionContext creation begin: CPU 1703 MiB, GPU 953 MiB | |
Using cublasLt a tactic source | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1703, GPU 963 (MiB) | |
Using cuDNN as a tactic source | |
[MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1703, GPU 971 (MiB) | |
Total per-runner device memory is 1586688 | |
Total per-runner host memory is 21712 | |
Allocated activation device memory of size 9830400 | |
[MemUsageSnapshot] ExecutionContext creation end: CPU 1703 MiB, GPU 981 MiB | |
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1702, GPU 963 (MiB) |