@jlucier
Created Sep 13, 2021
TRT 8 ONNX model output
[MemUsageChange] Init CUDA: CPU +331, GPU +0, now: CPU 353, GPU 397 (MiB)
----------------------------------------------------------------
Input filename: /tmp/tmpz078yzbz.onnx
ONNX IR version: 0.0.6
Opset version: 8
Producer name: pytorch
Producer version: 1.8
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
Registered plugin creator - ::GridAnchor_TRT version 1
Registered plugin creator - ::GridAnchorRect_TRT version 1
Registered plugin creator - ::NMS_TRT version 1
Registered plugin creator - ::Reorg_TRT version 1
Registered plugin creator - ::Region_TRT version 1
Registered plugin creator - ::Clip_TRT version 1
Registered plugin creator - ::LReLU_TRT version 1
Registered plugin creator - ::PriorBox_TRT version 1
Registered plugin creator - ::Normalize_TRT version 1
Registered plugin creator - ::ScatterND version 1
Registered plugin creator - ::RPROI_TRT version 1
Registered plugin creator - ::BatchedNMS_TRT version 1
Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
Registered plugin creator - ::FlattenConcat_TRT version 1
Registered plugin creator - ::CropAndResize version 1
Registered plugin creator - ::DetectionLayer_TRT version 1
Registered plugin creator - ::EfficientNMS_ONNX_TRT version 1
Registered plugin creator - ::EfficientNMS_TRT version 1
Registered plugin creator - ::Proposal version 1
Registered plugin creator - ::ProposalLayer_TRT version 1
Registered plugin creator - ::PyramidROIAlign_TRT version 1
Registered plugin creator - ::ResizeNearest_TRT version 1
Registered plugin creator - ::Split version 1
Registered plugin creator - ::SpecialSlice_TRT version 1
Registered plugin creator - ::InstanceNormalization_TRT version 1
Adding network input: input with dtype: float32, dimensions: (1, 1, 180, 240)
Registering tensor: input for ONNX tensor: input
Importing initializer: _layers.deconv1.0.weight
Importing initializer: _layers.deconv1.0.bias
Importing initializer: _layers.deconv1.1.weight
Importing initializer: _layers.deconv1.1.bias
Importing initializer: _layers.deconv1.1.running_mean
Importing initializer: _layers.deconv1.1.running_var
Importing initializer: _layers.deconv2.0.weight
Importing initializer: _layers.deconv2.0.bias
Importing initializer: _layers.deconv2.2.weight
Importing initializer: _layers.deconv2.2.bias
Importing initializer: _layers.deconv2.2.running_mean
Importing initializer: _layers.deconv2.2.running_var
Importing initializer: _layers.deconv3.0.weight
Importing initializer: _layers.deconv3.0.bias
Importing initializer: _layers.deconv3.2.weight
Importing initializer: _layers.deconv3.2.bias
Importing initializer: _layers.deconv3.2.running_mean
Importing initializer: _layers.deconv3.2.running_var
Importing initializer: 199
Importing initializer: 200
Importing initializer: 202
Importing initializer: 203
Importing initializer: 205
Importing initializer: 206
Importing initializer: 208
Importing initializer: 209
Importing initializer: 211
Importing initializer: 212
Importing initializer: 214
Importing initializer: 215
Importing initializer: 217
Importing initializer: 218
Importing initializer: 220
Importing initializer: 221
Importing initializer: 223
Importing initializer: 224
Importing initializer: 226
Importing initializer: 227
Importing initializer: 229
Importing initializer: 230
Importing initializer: 232
Importing initializer: 233
Importing initializer: 235
Importing initializer: 236
Importing initializer: 238
Importing initializer: 239
Importing initializer: 241
Importing initializer: 242
Importing initializer: 244
Importing initializer: 245
Parsing node: Conv_0 [Conv]
Searching for input: input
Searching for input: 199
Searching for input: 200
Conv_0 [Conv] inputs: [input -> (1, 1, 180, 240)[FLOAT]], [199 -> (8, 1, 5, 5)[FLOAT]], [200 -> (8)[FLOAT]],
Convolution input dimensions: (1, 1, 180, 240)
Registering layer: Conv_0 for ONNX node: Conv_0
Using kernel: (5, 5), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 8
Convolution output dimensions: (1, 8, 180, 240)
Registering tensor: 198 for ONNX tensor: 198
Conv_0 [Conv] outputs: [198 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Relu_1 [Relu]
Searching for input: 198
Relu_1 [Relu] inputs: [198 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Relu_1 for ONNX node: Relu_1
Registering tensor: 136 for ONNX tensor: 136
Relu_1 [Relu] outputs: [136 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Conv_2 [Conv]
Searching for input: 136
Searching for input: 202
Searching for input: 203
Conv_2 [Conv] inputs: [136 -> (1, 8, 180, 240)[FLOAT]], [202 -> (8, 8, 5, 5)[FLOAT]], [203 -> (8)[FLOAT]],
Convolution input dimensions: (1, 8, 180, 240)
Registering layer: Conv_2 for ONNX node: Conv_2
Using kernel: (5, 5), strides: (1, 1), prepadding: (2, 2), postpadding: (2, 2), dilations: (1, 1), numOutputs: 8
Convolution output dimensions: (1, 8, 180, 240)
Registering tensor: 201 for ONNX tensor: 201
Conv_2 [Conv] outputs: [201 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Relu_3 [Relu]
Searching for input: 201
Relu_3 [Relu] inputs: [201 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Relu_3 for ONNX node: Relu_3
Registering tensor: 139 for ONNX tensor: 139
Relu_3 [Relu] outputs: [139 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: MaxPool_4 [MaxPool]
Searching for input: 139
MaxPool_4 [MaxPool] inputs: [139 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: MaxPool_4 for ONNX node: MaxPool_4
Registering tensor: 140 for ONNX tensor: 140
MaxPool_4 [MaxPool] outputs: [140 -> (1, 8, 90, 120)[FLOAT]],
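The shape bookkeeping in these log lines follows the standard convolution output-size formula. A small sketch (pure Python, not TensorRT; the pooling window and stride are inferred from the 180x240 -> 90x120 shapes, since the log does not print MaxPool parameters) that reproduces the Conv_0 and MaxPool_4 outputs:

```python
def conv_out(size, kernel, stride=1, pre=0, post=0, dilation=1):
    # Standard convolution/pooling output-size formula (floor division).
    eff_k = dilation * (kernel - 1) + 1
    return (size + pre + post - eff_k) // stride + 1

# Conv_0: 5x5 kernel, stride 1, padding 2 -> spatial size preserved
assert conv_out(180, 5, 1, 2, 2) == 180
assert conv_out(240, 5, 1, 2, 2) == 240

# MaxPool_4: 2x2 window, stride 2 (inferred from 180x240 -> 90x120)
assert conv_out(180, 2, 2) == 90
assert conv_out(240, 2, 2) == 120
```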
Parsing node: Conv_5 [Conv]
Searching for input: 140
Searching for input: 205
Searching for input: 206
Conv_5 [Conv] inputs: [140 -> (1, 8, 90, 120)[FLOAT]], [205 -> (16, 8, 3, 3)[FLOAT]], [206 -> (16)[FLOAT]],
Convolution input dimensions: (1, 8, 90, 120)
Registering layer: Conv_5 for ONNX node: Conv_5
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16
Convolution output dimensions: (1, 16, 90, 120)
Registering tensor: 204 for ONNX tensor: 204
Conv_5 [Conv] outputs: [204 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Relu_6 [Relu]
Searching for input: 204
Relu_6 [Relu] inputs: [204 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Relu_6 for ONNX node: Relu_6
Registering tensor: 143 for ONNX tensor: 143
Relu_6 [Relu] outputs: [143 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Conv_7 [Conv]
Searching for input: 143
Searching for input: 208
Searching for input: 209
Conv_7 [Conv] inputs: [143 -> (1, 16, 90, 120)[FLOAT]], [208 -> (16, 16, 3, 3)[FLOAT]], [209 -> (16)[FLOAT]],
Convolution input dimensions: (1, 16, 90, 120)
Registering layer: Conv_7 for ONNX node: Conv_7
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16
Convolution output dimensions: (1, 16, 90, 120)
Registering tensor: 207 for ONNX tensor: 207
Conv_7 [Conv] outputs: [207 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Relu_8 [Relu]
Searching for input: 207
Relu_8 [Relu] inputs: [207 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Relu_8 for ONNX node: Relu_8
Registering tensor: 146 for ONNX tensor: 146
Relu_8 [Relu] outputs: [146 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: MaxPool_9 [MaxPool]
Searching for input: 146
MaxPool_9 [MaxPool] inputs: [146 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: MaxPool_9 for ONNX node: MaxPool_9
Registering tensor: 147 for ONNX tensor: 147
MaxPool_9 [MaxPool] outputs: [147 -> (1, 16, 45, 60)[FLOAT]],
Parsing node: Conv_10 [Conv]
Searching for input: 147
Searching for input: 211
Searching for input: 212
Conv_10 [Conv] inputs: [147 -> (1, 16, 45, 60)[FLOAT]], [211 -> (32, 16, 3, 3)[FLOAT]], [212 -> (32)[FLOAT]],
Convolution input dimensions: (1, 16, 45, 60)
Registering layer: Conv_10 for ONNX node: Conv_10
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32
Convolution output dimensions: (1, 32, 45, 60)
Registering tensor: 210 for ONNX tensor: 210
Conv_10 [Conv] outputs: [210 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Relu_11 [Relu]
Searching for input: 210
Relu_11 [Relu] inputs: [210 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: Relu_11 for ONNX node: Relu_11
Registering tensor: 150 for ONNX tensor: 150
Relu_11 [Relu] outputs: [150 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Conv_12 [Conv]
Searching for input: 150
Searching for input: 214
Searching for input: 215
Conv_12 [Conv] inputs: [150 -> (1, 32, 45, 60)[FLOAT]], [214 -> (32, 32, 3, 3)[FLOAT]], [215 -> (32)[FLOAT]],
Convolution input dimensions: (1, 32, 45, 60)
Registering layer: Conv_12 for ONNX node: Conv_12
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32
Convolution output dimensions: (1, 32, 45, 60)
Registering tensor: 213 for ONNX tensor: 213
Conv_12 [Conv] outputs: [213 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Relu_13 [Relu]
Searching for input: 213
Relu_13 [Relu] inputs: [213 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: Relu_13 for ONNX node: Relu_13
Registering tensor: 153 for ONNX tensor: 153
Relu_13 [Relu] outputs: [153 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: MaxPool_14 [MaxPool]
Searching for input: 153
MaxPool_14 [MaxPool] inputs: [153 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: MaxPool_14 for ONNX node: MaxPool_14
Registering tensor: 154 for ONNX tensor: 154
MaxPool_14 [MaxPool] outputs: [154 -> (1, 32, 15, 20)[FLOAT]],
Parsing node: Conv_15 [Conv]
Searching for input: 154
Searching for input: 217
Searching for input: 218
Conv_15 [Conv] inputs: [154 -> (1, 32, 15, 20)[FLOAT]], [217 -> (64, 32, 3, 3)[FLOAT]], [218 -> (64)[FLOAT]],
Convolution input dimensions: (1, 32, 15, 20)
Registering layer: Conv_15 for ONNX node: Conv_15
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64
Convolution output dimensions: (1, 64, 15, 20)
Registering tensor: 216 for ONNX tensor: 216
Conv_15 [Conv] outputs: [216 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: Relu_16 [Relu]
Searching for input: 216
Relu_16 [Relu] inputs: [216 -> (1, 64, 15, 20)[FLOAT]],
Registering layer: Relu_16 for ONNX node: Relu_16
Registering tensor: 157 for ONNX tensor: 157
Relu_16 [Relu] outputs: [157 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: Conv_17 [Conv]
Searching for input: 157
Searching for input: 220
Searching for input: 221
Conv_17 [Conv] inputs: [157 -> (1, 64, 15, 20)[FLOAT]], [220 -> (64, 64, 3, 3)[FLOAT]], [221 -> (64)[FLOAT]],
Convolution input dimensions: (1, 64, 15, 20)
Registering layer: Conv_17 for ONNX node: Conv_17
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64
Convolution output dimensions: (1, 64, 15, 20)
Registering tensor: 219 for ONNX tensor: 219
Conv_17 [Conv] outputs: [219 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: Relu_18 [Relu]
Searching for input: 219
Relu_18 [Relu] inputs: [219 -> (1, 64, 15, 20)[FLOAT]],
Registering layer: Relu_18 for ONNX node: Relu_18
Registering tensor: 160 for ONNX tensor: 160
Relu_18 [Relu] outputs: [160 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: Conv_19 [Conv]
Searching for input: 160
Searching for input: 223
Searching for input: 224
Conv_19 [Conv] inputs: [160 -> (1, 64, 15, 20)[FLOAT]], [223 -> (64, 64, 3, 3)[FLOAT]], [224 -> (64)[FLOAT]],
Convolution input dimensions: (1, 64, 15, 20)
Registering layer: Conv_19 for ONNX node: Conv_19
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 64
Convolution output dimensions: (1, 64, 15, 20)
Registering tensor: 222 for ONNX tensor: 222
Conv_19 [Conv] outputs: [222 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: Relu_20 [Relu]
Searching for input: 222
Relu_20 [Relu] inputs: [222 -> (1, 64, 15, 20)[FLOAT]],
Registering layer: Relu_20 for ONNX node: Relu_20
Registering tensor: 163 for ONNX tensor: 163
Relu_20 [Relu] outputs: [163 -> (1, 64, 15, 20)[FLOAT]],
Parsing node: ConvTranspose_21 [ConvTranspose]
Searching for input: 163
Searching for input: _layers.deconv1.0.weight
Searching for input: _layers.deconv1.0.bias
ConvTranspose_21 [ConvTranspose] inputs: [163 -> (1, 64, 15, 20)[FLOAT]], [_layers.deconv1.0.weight -> (64, 64, 3, 3)[FLOAT]], [_layers.deconv1.0.bias -> (64)[FLOAT]],
Running deconvolution with:
Padding mode: NOTSET
Pre-padding: (0, 0)
Post-padding: (0, 0)
Registering layer: ConvTranspose_21 for ONNX node: ConvTranspose_21
Registering tensor: 164 for ONNX tensor: 164
ConvTranspose_21 [ConvTranspose] outputs: [164 -> (1, 64, 45, 60)[FLOAT]],
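The 15x20 -> 45x60 jump follows the standard transposed-convolution output-size formula. A sketch (the stride is inferred from the shapes, since the log does not print it for ConvTranspose):

```python
def deconv_out(size, kernel, stride, pre=0, post=0, dilation=1, out_pad=0):
    # Standard transposed-convolution output-size formula.
    eff_k = dilation * (kernel - 1) + 1
    return (size - 1) * stride + eff_k - pre - post + out_pad

# ConvTranspose_21: 3x3 kernel, zero padding; stride 3 is inferred
# from 15x20 -> 45x60, giving an exact 3x upsample.
assert deconv_out(15, 3, 3) == 45
assert deconv_out(20, 3, 3) == 60
```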
Parsing node: BatchNormalization_22 [BatchNormalization]
Searching for input: 164
Searching for input: _layers.deconv1.1.weight
Searching for input: _layers.deconv1.1.bias
Searching for input: _layers.deconv1.1.running_mean
Searching for input: _layers.deconv1.1.running_var
BatchNormalization_22 [BatchNormalization] inputs: [164 -> (1, 64, 45, 60)[FLOAT]], [_layers.deconv1.1.weight -> (64)[FLOAT]], [_layers.deconv1.1.bias -> (64)[FLOAT]], [_layers.deconv1.1.running_mean -> (64)[FLOAT]], [_layers.deconv1.1.running_var -> (64)[FLOAT]],
Registering layer: BatchNormalization_22 for ONNX node: BatchNormalization_22
Registering tensor: 165 for ONNX tensor: 165
BatchNormalization_22 [BatchNormalization] outputs: [165 -> (1, 64, 45, 60)[FLOAT]],
Parsing node: Relu_23 [Relu]
Searching for input: 165
Relu_23 [Relu] inputs: [165 -> (1, 64, 45, 60)[FLOAT]],
Registering layer: Relu_23 for ONNX node: Relu_23
Registering tensor: 166 for ONNX tensor: 166
Relu_23 [Relu] outputs: [166 -> (1, 64, 45, 60)[FLOAT]],
Parsing node: Concat_24 [Concat]
Searching for input: 166
Searching for input: 153
Concat_24 [Concat] inputs: [166 -> (1, 64, 45, 60)[FLOAT]], [153 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: Concat_24 for ONNX node: Concat_24
Registering tensor: 167 for ONNX tensor: 167
Concat_24 [Concat] outputs: [167 -> (1, 96, 45, 60)[FLOAT]],
Parsing node: Conv_25 [Conv]
Searching for input: 167
Searching for input: 226
Searching for input: 227
Conv_25 [Conv] inputs: [167 -> (1, 96, 45, 60)[FLOAT]], [226 -> (32, 96, 3, 3)[FLOAT]], [227 -> (32)[FLOAT]],
Convolution input dimensions: (1, 96, 45, 60)
Registering layer: Conv_25 for ONNX node: Conv_25
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32
Convolution output dimensions: (1, 32, 45, 60)
Registering tensor: 225 for ONNX tensor: 225
Conv_25 [Conv] outputs: [225 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Relu_26 [Relu]
Searching for input: 225
Relu_26 [Relu] inputs: [225 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: Relu_26 for ONNX node: Relu_26
Registering tensor: 170 for ONNX tensor: 170
Relu_26 [Relu] outputs: [170 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Conv_27 [Conv]
Searching for input: 170
Searching for input: 229
Searching for input: 230
Conv_27 [Conv] inputs: [170 -> (1, 32, 45, 60)[FLOAT]], [229 -> (32, 32, 3, 3)[FLOAT]], [230 -> (32)[FLOAT]],
Convolution input dimensions: (1, 32, 45, 60)
Registering layer: Conv_27 for ONNX node: Conv_27
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 32
Convolution output dimensions: (1, 32, 45, 60)
Registering tensor: 228 for ONNX tensor: 228
Conv_27 [Conv] outputs: [228 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: Relu_28 [Relu]
Searching for input: 228
Relu_28 [Relu] inputs: [228 -> (1, 32, 45, 60)[FLOAT]],
Registering layer: Relu_28 for ONNX node: Relu_28
Registering tensor: 173 for ONNX tensor: 173
Relu_28 [Relu] outputs: [173 -> (1, 32, 45, 60)[FLOAT]],
Parsing node: ConvTranspose_29 [ConvTranspose]
Searching for input: 173
Searching for input: _layers.deconv2.0.weight
Searching for input: _layers.deconv2.0.bias
ConvTranspose_29 [ConvTranspose] inputs: [173 -> (1, 32, 45, 60)[FLOAT]], [_layers.deconv2.0.weight -> (32, 16, 3, 3)[FLOAT]], [_layers.deconv2.0.bias -> (16)[FLOAT]],
Running deconvolution with:
Padding mode: NOTSET
Pre-padding: (0, 0)
Post-padding: (0, 0)
Registering layer: ConvTranspose_29 for ONNX node: ConvTranspose_29
Registering tensor: 174 for ONNX tensor: 174
ConvTranspose_29 [ConvTranspose] outputs: [174 -> (1, 16, 91, 121)[FLOAT]],
Parsing node: Pad_30 [Pad]
Searching for input: 174
Pad_30 [Pad] inputs: [174 -> (1, 16, 91, 121)[FLOAT]],
Registering layer: Pad_30 for ONNX node: Pad_30
Registering tensor: 175 for ONNX tensor: 175
Pad_30 [Pad] outputs: [175 -> (1, 16, 90, 120)[FLOAT]],
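The Pad node right after the transposed convolution exists to undo a one-pixel overshoot. The arithmetic (a sketch; stride 2 is inferred from the shapes, and the Pad appears to act as a crop, which an ONNX Pad with negative pad values does):

```python
# ConvTranspose_29 upsamples 45x60 with a 3x3 kernel, zero padding and
# (inferred) stride 2: out = (in - 1) * stride + kernel.
h = (45 - 1) * 2 + 3
w = (60 - 1) * 2 + 3
assert (h, w) == (91, 121)  # one pixel of overshoot per axis

# Pad_30 then trims that extra row and column, restoring the exact
# 2x upsampling of the 90x120 skip-connection resolution.
assert (h - 1, w - 1) == (90, 120)
```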
Parsing node: BatchNormalization_31 [BatchNormalization]
Searching for input: 175
Searching for input: _layers.deconv2.2.weight
Searching for input: _layers.deconv2.2.bias
Searching for input: _layers.deconv2.2.running_mean
Searching for input: _layers.deconv2.2.running_var
BatchNormalization_31 [BatchNormalization] inputs: [175 -> (1, 16, 90, 120)[FLOAT]], [_layers.deconv2.2.weight -> (16)[FLOAT]], [_layers.deconv2.2.bias -> (16)[FLOAT]], [_layers.deconv2.2.running_mean -> (16)[FLOAT]], [_layers.deconv2.2.running_var -> (16)[FLOAT]],
Registering layer: BatchNormalization_31 for ONNX node: BatchNormalization_31
Registering tensor: 176 for ONNX tensor: 176
BatchNormalization_31 [BatchNormalization] outputs: [176 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Relu_32 [Relu]
Searching for input: 176
Relu_32 [Relu] inputs: [176 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Relu_32 for ONNX node: Relu_32
Registering tensor: 177 for ONNX tensor: 177
Relu_32 [Relu] outputs: [177 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Concat_33 [Concat]
Searching for input: 177
Searching for input: 146
Concat_33 [Concat] inputs: [177 -> (1, 16, 90, 120)[FLOAT]], [146 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Concat_33 for ONNX node: Concat_33
Registering tensor: 178 for ONNX tensor: 178
Concat_33 [Concat] outputs: [178 -> (1, 32, 90, 120)[FLOAT]],
Parsing node: Conv_34 [Conv]
Searching for input: 178
Searching for input: 232
Searching for input: 233
Conv_34 [Conv] inputs: [178 -> (1, 32, 90, 120)[FLOAT]], [232 -> (16, 32, 3, 3)[FLOAT]], [233 -> (16)[FLOAT]],
Convolution input dimensions: (1, 32, 90, 120)
Registering layer: Conv_34 for ONNX node: Conv_34
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16
Convolution output dimensions: (1, 16, 90, 120)
Registering tensor: 231 for ONNX tensor: 231
Conv_34 [Conv] outputs: [231 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Relu_35 [Relu]
Searching for input: 231
Relu_35 [Relu] inputs: [231 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Relu_35 for ONNX node: Relu_35
Registering tensor: 181 for ONNX tensor: 181
Relu_35 [Relu] outputs: [181 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Conv_36 [Conv]
Searching for input: 181
Searching for input: 235
Searching for input: 236
Conv_36 [Conv] inputs: [181 -> (1, 16, 90, 120)[FLOAT]], [235 -> (16, 16, 3, 3)[FLOAT]], [236 -> (16)[FLOAT]],
Convolution input dimensions: (1, 16, 90, 120)
Registering layer: Conv_36 for ONNX node: Conv_36
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 16
Convolution output dimensions: (1, 16, 90, 120)
Registering tensor: 234 for ONNX tensor: 234
Conv_36 [Conv] outputs: [234 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: Relu_37 [Relu]
Searching for input: 234
Relu_37 [Relu] inputs: [234 -> (1, 16, 90, 120)[FLOAT]],
Registering layer: Relu_37 for ONNX node: Relu_37
Registering tensor: 184 for ONNX tensor: 184
Relu_37 [Relu] outputs: [184 -> (1, 16, 90, 120)[FLOAT]],
Parsing node: ConvTranspose_38 [ConvTranspose]
Searching for input: 184
Searching for input: _layers.deconv3.0.weight
Searching for input: _layers.deconv3.0.bias
ConvTranspose_38 [ConvTranspose] inputs: [184 -> (1, 16, 90, 120)[FLOAT]], [_layers.deconv3.0.weight -> (16, 8, 3, 3)[FLOAT]], [_layers.deconv3.0.bias -> (8)[FLOAT]],
Running deconvolution with:
Padding mode: NOTSET
Pre-padding: (0, 0)
Post-padding: (0, 0)
Registering layer: ConvTranspose_38 for ONNX node: ConvTranspose_38
Registering tensor: 185 for ONNX tensor: 185
ConvTranspose_38 [ConvTranspose] outputs: [185 -> (1, 8, 181, 241)[FLOAT]],
Parsing node: Pad_39 [Pad]
Searching for input: 185
Pad_39 [Pad] inputs: [185 -> (1, 8, 181, 241)[FLOAT]],
Registering layer: Pad_39 for ONNX node: Pad_39
Registering tensor: 186 for ONNX tensor: 186
Pad_39 [Pad] outputs: [186 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: BatchNormalization_40 [BatchNormalization]
Searching for input: 186
Searching for input: _layers.deconv3.2.weight
Searching for input: _layers.deconv3.2.bias
Searching for input: _layers.deconv3.2.running_mean
Searching for input: _layers.deconv3.2.running_var
BatchNormalization_40 [BatchNormalization] inputs: [186 -> (1, 8, 180, 240)[FLOAT]], [_layers.deconv3.2.weight -> (8)[FLOAT]], [_layers.deconv3.2.bias -> (8)[FLOAT]], [_layers.deconv3.2.running_mean -> (8)[FLOAT]], [_layers.deconv3.2.running_var -> (8)[FLOAT]],
Registering layer: BatchNormalization_40 for ONNX node: BatchNormalization_40
Registering tensor: 187 for ONNX tensor: 187
BatchNormalization_40 [BatchNormalization] outputs: [187 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Relu_41 [Relu]
Searching for input: 187
Relu_41 [Relu] inputs: [187 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Relu_41 for ONNX node: Relu_41
Registering tensor: 188 for ONNX tensor: 188
Relu_41 [Relu] outputs: [188 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Concat_42 [Concat]
Searching for input: 188
Searching for input: 139
Concat_42 [Concat] inputs: [188 -> (1, 8, 180, 240)[FLOAT]], [139 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Concat_42 for ONNX node: Concat_42
Registering tensor: 189 for ONNX tensor: 189
Concat_42 [Concat] outputs: [189 -> (1, 16, 180, 240)[FLOAT]],
Parsing node: Conv_43 [Conv]
Searching for input: 189
Searching for input: 238
Searching for input: 239
Conv_43 [Conv] inputs: [189 -> (1, 16, 180, 240)[FLOAT]], [238 -> (8, 16, 3, 3)[FLOAT]], [239 -> (8)[FLOAT]],
Convolution input dimensions: (1, 16, 180, 240)
Registering layer: Conv_43 for ONNX node: Conv_43
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 8
Convolution output dimensions: (1, 8, 180, 240)
Registering tensor: 237 for ONNX tensor: 237
Conv_43 [Conv] outputs: [237 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Relu_44 [Relu]
Searching for input: 237
Relu_44 [Relu] inputs: [237 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Relu_44 for ONNX node: Relu_44
Registering tensor: 192 for ONNX tensor: 192
Relu_44 [Relu] outputs: [192 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Conv_45 [Conv]
Searching for input: 192
Searching for input: 241
Searching for input: 242
Conv_45 [Conv] inputs: [192 -> (1, 8, 180, 240)[FLOAT]], [241 -> (8, 8, 3, 3)[FLOAT]], [242 -> (8)[FLOAT]],
Convolution input dimensions: (1, 8, 180, 240)
Registering layer: Conv_45 for ONNX node: Conv_45
Using kernel: (3, 3), strides: (1, 1), prepadding: (1, 1), postpadding: (1, 1), dilations: (1, 1), numOutputs: 8
Convolution output dimensions: (1, 8, 180, 240)
Registering tensor: 240 for ONNX tensor: 240
Conv_45 [Conv] outputs: [240 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Relu_46 [Relu]
Searching for input: 240
Relu_46 [Relu] inputs: [240 -> (1, 8, 180, 240)[FLOAT]],
Registering layer: Relu_46 for ONNX node: Relu_46
Registering tensor: 195 for ONNX tensor: 195
Relu_46 [Relu] outputs: [195 -> (1, 8, 180, 240)[FLOAT]],
Parsing node: Conv_47 [Conv]
Searching for input: 195
Searching for input: 244
Searching for input: 245
Conv_47 [Conv] inputs: [195 -> (1, 8, 180, 240)[FLOAT]], [244 -> (3, 8, 1, 1)[FLOAT]], [245 -> (3)[FLOAT]],
Convolution input dimensions: (1, 8, 180, 240)
Registering layer: Conv_47 for ONNX node: Conv_47
Using kernel: (1, 1), strides: (1, 1), prepadding: (0, 0), postpadding: (0, 0), dilations: (1, 1), numOutputs: 3
Convolution output dimensions: (1, 3, 180, 240)
Registering tensor: raw_conv_out_50 for ONNX tensor: raw_conv_out
Conv_47 [Conv] outputs: [raw_conv_out -> (1, 3, 180, 240)[FLOAT]],
Marking raw_conv_out_50 as output: raw_conv_out
Tensor DataType is determined at build time for tensors not marked as input or output.
[MemUsageSnapshot] Builder begin: CPU 352 MiB, GPU 397 MiB
Applying generic optimizations to the graph for inference.
Original: 49 layers
After dead-layer removal: 49 layers
After Myelin optimization: 49 layers
After scale fusion: 49 layers
ConvReluFusion: Fusing Conv_0 with Relu_1
ConvReluFusion: Fusing Conv_2 with Relu_3
ConvReluFusion: Fusing Conv_5 with Relu_6
ConvReluFusion: Fusing Conv_7 with Relu_8
ConvReluFusion: Fusing Conv_10 with Relu_11
ConvReluFusion: Fusing Conv_12 with Relu_13
ConvReluFusion: Fusing Conv_15 with Relu_16
ConvReluFusion: Fusing Conv_17 with Relu_18
ConvReluFusion: Fusing Conv_19 with Relu_20
DeconvScaleFusion: Fusing ConvTranspose_21 with BatchNormalization_22
DeconvReluFusion: Fusing ConvTranspose_21 + BatchNormalization_22 with Relu_23
ConvReluFusion: Fusing Conv_25 with Relu_26
ConvReluFusion: Fusing Conv_27 with Relu_28
DeconvolutionPaddingFusion: Fusing ConvTranspose_29 with Pad_30
DeconvScaleFusion: Fusing ConvTranspose_29 + Pad_30 with BatchNormalization_31
DeconvReluFusion: Fusing ConvTranspose_29 + Pad_30 + BatchNormalization_31 with Relu_32
ConvReluFusion: Fusing Conv_34 with Relu_35
ConvReluFusion: Fusing Conv_36 with Relu_37
DeconvolutionPaddingFusion: Fusing ConvTranspose_38 with Pad_39
DeconvScaleFusion: Fusing ConvTranspose_38 + Pad_39 with BatchNormalization_40
DeconvReluFusion: Fusing ConvTranspose_38 + Pad_39 + BatchNormalization_40 with Relu_41
ConvReluFusion: Fusing Conv_43 with Relu_44
ConvReluFusion: Fusing Conv_45 with Relu_46
After vertical fusions: 26 layers
After dupe layer removal: 26 layers
After final dead-layer removal: 26 layers
After tensor merging: 26 layers
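Each "Fusing ..." message above folds one layer into its predecessor, which accounts for the reported drop in layer count (the per-category totals below are counted from the fusion messages in this log):

```python
# 15 ConvReluFusion messages plus 8 deconvolution-chain fusion messages
# (DeconvolutionPaddingFusion / DeconvScaleFusion / DeconvReluFusion),
# each removing one layer: 49 original layers -> 26 after vertical fusions.
conv_relu_fusions = 15
deconv_fusions = 8
assert 49 - (conv_relu_fusions + deconv_fusions) == 26
```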
Eliminating concatenation Concat_24
Generating copy for 166 to 167 because input does not support striding.
Retargeting 153 to 167
Eliminating concatenation Concat_33
Generating copy for 177 to 178 because input does not support striding.
Retargeting 146 to 178
Eliminating concatenation Concat_42
Generating copy for 188 to 189 because input does not support striding.
Retargeting 139 to 189
After concat removal: 26 layers
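Concat elimination works by retargeting each producer to write directly into its slice of the concatenated buffer, so no separate concat kernel runs; a copy is generated only where an input cannot express the required striding, as the log notes. A toy sketch of the idea in pure Python (the producer functions are stand-ins, not TensorRT code):

```python
# Toy model of concat elimination: instead of producing two buffers and
# copying both into a third, each producer writes straight into its own
# channel slice of the preallocated output.
C1, C2, HW = 64, 32, 4  # channel counts from Concat_24; tiny spatial size
out = [0.0] * ((C1 + C2) * HW)

def producer_a(dst, offset):  # stands in for the upsampling branch
    for i in range(C1 * HW):
        dst[offset + i] = 1.0

def producer_b(dst, offset):  # stands in for the skip connection (tensor 153)
    for i in range(C2 * HW):
        dst[offset + i] = 2.0

producer_a(out, 0)          # writes channels [0, 64)
producer_b(out, C1 * HW)    # writes channels [64, 96)
assert out[:C1 * HW] == [1.0] * (C1 * HW)
assert out[C1 * HW:] == [2.0] * (C2 * HW)
```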
Graph construction and optimization completed in 0.0164805 seconds.
Using cublasLt as a tactic source
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +483, GPU +206, now: CPU 836, GPU 603 (MiB)
Using cuDNN as a tactic source
[MemUsageChange] Init cuDNN: CPU +469, GPU +204, now: CPU 1305, GPU 807 (MiB)
Detected invalid timing cache, set up a local cache instead
Constructing optimization profile number 0 [1/1].
*************** Autotuning Reformat:Float(43200,43200,240,1) -> Float(43200,1,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.006048
Tactic: 0 Time: 0.00416
Fastest Tactic: 0 Time: 0.00416
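The Float(43200,43200,240,1) notation lists per-dimension element strides for the (1, 1, 180, 240) NCHW input (this reading is inferred from the shapes; the format string itself does not label the dimensions). Row-major strides can be reproduced directly:

```python
def row_major_strides(shape):
    # stride[i] = number of elements spanned by one step along dimension i
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return tuple(strides)

# Linear NCHW layout of the network input (1, 1, 180, 240)
assert row_major_strides((1, 1, 180, 240)) == (43200, 43200, 240, 1)

# The Conv_0 output (1, 8, 180, 240) in the same layout
assert row_major_strides((1, 8, 180, 240)) == (345600, 43200, 240, 1)
```

The Float(345600,1,1920,8) variant that appears further down corresponds to a channels-last (NHWC-style) layout of the same (1, 8, 180, 240) tensor, with the channel stride equal to 1.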
*************** Autotuning format combination: Float(43200,43200,240,1) -> Float(345600,43200,240,1) ***************
--------------- Timing Runner: Conv_0 + Relu_1 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_0 + Relu_1 (FusedConvActConvolution)
Tactic: 393215 Time: 0.02896
Tactic: 1245183 Time: 0.030816
Tactic: 1572863 Time: 0.028672
Tactic: 4784127 Time: 0.045056
Tactic: 4849663 Time: 0.032768
Tactic: 5111807 Time: 0.032864
Tactic: 6553599 Time: 0.029856
Tactic: 6619135 Time: 0.030816
Tactic: 9306111 Time: 0.034496
Tactic: 9371647 Time: 0.034912
Tactic: 9633791 Time: 0.0328
Fastest Tactic: 1572863 Time: 0.028672
--------------- Timing Runner: Conv_0 + Relu_1 (CudnnConvolution)
Tactic: 0 Time: 0.036256
Tactic: 1 Time: 0.034976
Tactic: 2 Time: 0.044896
Tactic: 4 Time: 0.223616
Tactic: 5 Time: 0.559072
Tactic: 56 Time: 0.035072
Tactic: 57 Time: 0.035072
Tactic: 58 Time: 0.044032
Tactic: 60 Time: 0.2208
Tactic: 61 Time: 0.528384
Fastest Tactic: 1 Time: 0.034976
--------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.068544
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.041152
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.068832
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.093824
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.037888
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.038688
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.024416
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.022272
Fastest Tactic: -3946921629105938337 Time: 0.022272
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337
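Tactic selection is simply an argmin over the measured times across all candidate runners. A minimal sketch using the CaskConvolution numbers copied from the log lines above:

```python
# Times (ms) measured for Conv_0 + Relu_1 CaskConvolution tactics,
# taken from the log lines above.
cask_times = {
    1825138533642645384: 0.068544,
    2842488832350522458: 0.041152,
    3915320020053085238: 0.068832,
    6448355332020552203: 0.093824,
    6808617066150061604: 0.037888,
    -8060443123034038864: 0.038688,
    -4420849921117327522: 0.024416,
    -3946921629105938337: 0.022272,
}

best = min(cask_times, key=cask_times.get)
assert best == -3946921629105938337  # matches "Fastest Tactic" in the log
```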
*************** Autotuning format combination: Float(43200,1,240,1) -> Float(345600,1,1920,8) ***************
--------------- Timing Runner: Conv_0 + Relu_1 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_0 + Relu_1 (CaskConvolution)
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.269248
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.269536
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.270432
Fastest Tactic: 861694390046228376 Time: 0.269248
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 861694390046228376
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.034912
Tactic: 0 Time: 0.009504
Fastest Tactic: 0 Time: 0.009504
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.03584
Tactic: 0 Time: 0.012288
Fastest Tactic: 0 Time: 0.012288
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: Conv_2 + Relu_3 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_2 + Relu_3 (FusedConvActConvolution)
Tactic: 393215 Time: 0.0792
Tactic: 917503 Time: 0.075776
Tactic: 1114111 Time: 0.130528
Tactic: 1245183 Time: 0.128928
Tactic: 1572863 Time: 0.073632
Tactic: 2490367 Time: 0.07728
Tactic: 2555903 Time: 0.075776
Tactic: 2949119 Time: 0.149472
Tactic: 3211263 Time: 0.267648
Tactic: 3801087 Time: 0.077728
Tactic: 3866623 Time: 0.0768
Tactic: 4128767 Time: 0.132192
Tactic: 4456447 Time: 0.074752
Tactic: 4718591 Time: 0.13312
Tactic: 4784127 Time: 0.253952
Tactic: 4849663 Time: 0.134304
Tactic: 5111807 Time: 0.13312
Tactic: 5308415 Time: 0.139456
Tactic: 5505023 Time: 0.262144
Tactic: 6094847 Time: 0.078848
Tactic: 6356991 Time: 0.085184
Tactic: 6553599 Time: 0.079168
Tactic: 6619135 Time: 0.099488
Tactic: 6684671 Time: 0.268384
Tactic: 7471103 Time: 0.074912
Tactic: 7667711 Time: 0.133184
Tactic: 7929855 Time: 0.13824
Tactic: 8060927 Time: 0.082048
Tactic: 8126463 Time: 0.142784
Tactic: 8388607 Time: 0.14544
Tactic: 8519679 Time: 0.094112
Tactic: 8781823 Time: 0.16736
Tactic: 8912895 Time: 0.151456
Tactic: 9240575 Time: 0.136768
Tactic: 9306111 Time: 0.132352
Tactic: 9371647 Time: 0.135264
Tactic: 9437183 Time: 0.1496
Tactic: 9633791 Time: 0.134528
Tactic: 9699327 Time: 0.076704
Tactic: 9764863 Time: 0.075264
Tactic: 10158079 Time: 0.077152
Tactic: 10420223 Time: 0.153696
Tactic: 10616831 Time: 0.077632
Tactic: 10878975 Time: 0.074688
Fastest Tactic: 1572863 Time: 0.073632
--------------- Timing Runner: Conv_2 + Relu_3 (CudnnConvolution)
Tactic: 0 Time: 0.141216
Tactic: 1 Time: 0.09792
Tactic: 2 Time: 0.481376
Tactic: 4 Time: 0.484928
Tactic: 5 Time: 0.593632
Tactic: 56 Time: 0.139584
Tactic: 57 Time: 0.096672
Tactic: 58 Time: 0.484736
Tactic: 60 Time: 0.483424
Tactic: 61 Time: 0.592768
Fastest Tactic: 57 Time: 0.096672
--------------- Timing Runner: Conv_2 + Relu_3 (CaskConvolution)
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.274272
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.145312
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.274272
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.305248
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.14656
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.147456
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.086112
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.082016
Fastest Tactic: -3946921629105938337 Time: 0.082016
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 1572863
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(691200,1,3840,16) ***************
--------------- Timing Runner: Conv_2 + Relu_3 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_2 + Relu_3 (CaskConvolution)
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.268352
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.2664
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.280224
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.268384
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.323296
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.26624
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.278688
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.267424
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.270336
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.276384
Conv_2 + Relu_3 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.325728
Fastest Tactic: -9147980667639709536 Time: 0.26624
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536
*************** Autotuning Reformat:Float(691200,43200,240,1) -> Float(691200,1,3840,16) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.034976
Tactic: 0 Time: 0.010016
Fastest Tactic: 0 Time: 0.010016
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.036288
Tactic: 0 Time: 0.012352
Fastest Tactic: 0 Time: 0.012352
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.03616
Tactic: 0 Time: 0.01216
Fastest Tactic: 0 Time: 0.01216
*************** Autotuning format combination: Float(691200,43200,240,1) -> Float(86400,10800,120,1) ***************
--------------- Timing Runner: MaxPool_4 (TiledPooling)
Tactic: 5505281 Time: 0.008288
Tactic: 5570817 Time: 0.00624
Tactic: 5636353 Time: 0.006208
Tactic: 5701889 Time: 0.006176
Tactic: 5767425 Time: 0.006048
Tactic: 5832961 Time: 0.00624
Tactic: 5898497 Time: 0.006144
Tactic: 5964033 Time: 0.006048
Tactic: 6029569 Time: 0.008288
Tactic: 6095105 Time: 0.006208
Tactic: 6160641 Time: 0.00624
Tactic: 6226177 Time: 0.005984
Tactic: 6291713 Time: 0.006016
Tactic: 6357249 Time: 0.006176
Tactic: 6422785 Time: 0.00624
Tactic: 6488321 Time: 0.005984
Fastest Tactic: 6226177 Time: 0.005984
--------------- Timing Runner: MaxPool_4 (CudnnPooling)
Tactic: -1 Time: 0.006048
Fastest Tactic: -1 Time: 0.006048
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 6226177
*************** Autotuning Reformat:Float(86400,10800,120,1) -> Float(86400,1,960,8) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.01424
Tactic: 0 Time: 0.00576
Fastest Tactic: 0 Time: 0.00576
*************** Autotuning Reformat:Float(86400,10800,120,1) -> Float(86400,1,960,8) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.013152
Tactic: 0 Time: 0.005824
Fastest Tactic: 0 Time: 0.005824
*************** Autotuning Reformat:Float(86400,1,960,8) -> Float(86400,10800,120,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.013984
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
*************** Autotuning format combination: Float(86400,10800,120,1) -> Float(172800,10800,120,1) ***************
--------------- Timing Runner: Conv_5 + Relu_6 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_5 + Relu_6 (FusedConvActConvolution)
Tactic: 524287 Time: 0.020352
Tactic: 720895 Time: 0.020576
Tactic: 983039 Time: 0.020544
Tactic: 1048575 Time: 0.018272
Tactic: 1703935 Time: 0.01648
Tactic: 1769471 Time: 0.032928
Tactic: 1966079 Time: 0.023936
Tactic: 2031615 Time: 0.024672
Tactic: 2228223 Time: 0.018304
Tactic: 2621439 Time: 0.018432
Tactic: 2752511 Time: 0.022528
Tactic: 2818047 Time: 0.034176
Tactic: 2883583 Time: 0.032864
Tactic: 3014655 Time: 0.01792
Tactic: 3145727 Time: 0.020576
Tactic: 3473407 Time: 0.03488
Tactic: 3604479 Time: 0.017568
Tactic: 3735551 Time: 0.030816
Tactic: 4390911 Time: 0.02464
Tactic: 5046271 Time: 0.01648
Tactic: 5963775 Time: 0.02672
Tactic: 6160383 Time: 0.020544
Tactic: 6488063 Time: 0.018336
Tactic: 6881279 Time: 0.022592
Tactic: 7274495 Time: 0.025824
Tactic: 7864319 Time: 0.018496
Tactic: 7995391 Time: 0.02048
Tactic: 8585215 Time: 0.01984
Tactic: 8847359 Time: 0.019424
Tactic: 8978431 Time: 0.026688
Tactic: 9043967 Time: 0.016448
Tactic: 9175039 Time: 0.017408
Tactic: 9502719 Time: 0.02464
Tactic: 9830399 Time: 0.032672
Tactic: 10027007 Time: 0.017664
Tactic: 10092543 Time: 0.024736
Tactic: 10289151 Time: 0.023936
Tactic: 10485759 Time: 0.016192
Tactic: 10682367 Time: 0.018336
Tactic: 10813439 Time: 0.020384
Fastest Tactic: 10485759 Time: 0.016192
--------------- Timing Runner: Conv_5 + Relu_6 (CudnnConvolution)
Tactic: 0 Time: 0.024672
Tactic: 1 Time: 0.02352
Tactic: 2 Time: 0.061504
Tactic: 4 Time: 0.254528
Tactic: 5 Time: 0.15216
Tactic: 6 Time: 0.02464
Tactic: 56 Time: 0.025888
Tactic: 57 Time: 0.02432
Tactic: 58 Time: 0.062592
Tactic: 60 Time: 0.251744
Tactic: 61 Time: 0.153216
Tactic: 62 Time: 0.025728
Fastest Tactic: 1 Time: 0.02352
--------------- Timing Runner: Conv_5 + Relu_6 (CaskConvolution)
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.034912
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.016128
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.022624
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.034848
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.039904
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.02256
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.022752
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.015904
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.0224
Fastest Tactic: -4420849921117327522 Time: 0.015904
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -4420849921117327522
*************** Autotuning format combination: Float(86400,1,960,8) -> Float(172800,1,1920,16) ***************
--------------- Timing Runner: Conv_5 + Relu_6 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_5 + Relu_6 (CaskConvolution)
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.032832
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.0328
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.034272
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.032832
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.039008
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.032768
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.03424
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.032832
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.032832
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.03408
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.038912
Fastest Tactic: -9147980667639709536 Time: 0.032768
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.013984
Tactic: 0 Time: 0.007296
Fastest Tactic: 0 Time: 0.007296
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.014336
Tactic: 0 Time: 0.008256
Fastest Tactic: 0 Time: 0.008256
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: Conv_7 + Relu_8 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_7 + Relu_8 (FusedConvActConvolution)
Tactic: 524287 Time: 0.02816
Tactic: 720895 Time: 0.032768
Tactic: 983039 Time: 0.03232
Tactic: 1048575 Time: 0.026112
Tactic: 1703935 Time: 0.020576
Tactic: 1769471 Time: 0.044992
Tactic: 1966079 Time: 0.03664
Tactic: 2031615 Time: 0.040128
Tactic: 2228223 Time: 0.024128
Tactic: 2424831 Time: 0.030784
Tactic: 2621439 Time: 0.023968
Tactic: 2752511 Time: 0.035968
Tactic: 2818047 Time: 0.057408
Tactic: 2883583 Time: 0.055296
Tactic: 3014655 Time: 0.02256
Tactic: 3145727 Time: 0.032672
Tactic: 3473407 Time: 0.060416
Tactic: 3604479 Time: 0.022528
Tactic: 3735551 Time: 0.054304
Tactic: 4390911 Time: 0.038816
Tactic: 5046271 Time: 0.02256
Tactic: 5963775 Time: 0.043104
Tactic: 6160383 Time: 0.027648
Tactic: 6488063 Time: 0.024864
Tactic: 6881279 Time: 0.034912
Tactic: 7274495 Time: 0.03712
Tactic: 7864319 Time: 0.02352
Tactic: 7995391 Time: 0.032768
Tactic: 8585215 Time: 0.026176
Tactic: 8847359 Time: 0.024576
Tactic: 8978431 Time: 0.043072
Tactic: 9043967 Time: 0.02144
Tactic: 9175039 Time: 0.022528
Tactic: 9502719 Time: 0.038816
Tactic: 9830399 Time: 0.056416
Tactic: 9961471 Time: 0.032768
Tactic: 10027007 Time: 0.022624
Tactic: 10092543 Time: 0.038816
Tactic: 10289151 Time: 0.036
Tactic: 10485759 Time: 0.020544
Tactic: 10682367 Time: 0.022528
Tactic: 10813439 Time: 0.031744
Fastest Tactic: 10485759 Time: 0.020544
--------------- Timing Runner: Conv_7 + Relu_8 (CudnnConvolution)
Tactic: 0 Time: 0.038816
Tactic: 1 Time: 0.032576
Tactic: 2 Time: 0.114656
Tactic: 4 Time: 0.372416
Tactic: 5 Time: 0.171648
Tactic: 6 Time: 0.028768
Tactic: 56 Time: 0.038592
Tactic: 57 Time: 0.032512
Tactic: 58 Time: 0.114752
Tactic: 60 Time: 0.374464
Tactic: 61 Time: 0.17264
Tactic: 62 Time: 0.028704
Fastest Tactic: 62 Time: 0.028704
--------------- Timing Runner: Conv_7 + Relu_8 (CaskConvolution)
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.058528
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.020288
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.034656
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.057344
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.063104
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.034912
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.036256
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.024032
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.034912
Fastest Tactic: 2775507031594384867 Time: 0.020288
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(345600,1,3840,32) ***************
--------------- Timing Runner: Conv_7 + Relu_8 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_7 + Relu_8 (CaskConvolution)
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.056416
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.056992
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.0336
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.055392
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.039008
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.055392
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.0344
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.055392
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.0568
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.034368
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.039072
Fastest Tactic: 5258189349241541167 Time: 0.0336
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5258189349241541167
*************** Autotuning Reformat:Float(345600,10800,120,1) -> Float(345600,1,3840,32) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.013952
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.014432
Tactic: 0 Time: 0.008288
Fastest Tactic: 0 Time: 0.008288
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.014432
Tactic: 0 Time: 0.009952
Fastest Tactic: 0 Time: 0.009952
*************** Autotuning format combination: Float(345600,10800,120,1) -> Float(43200,2700,60,1) ***************
--------------- Timing Runner: MaxPool_9 (TiledPooling)
Tactic: 5505281 Time: 0.00624
Tactic: 5570817 Time: 0.005312
Tactic: 5636353 Time: 0.005088
Tactic: 5701889 Time: 0.00512
Tactic: 5767425 Time: 0.00512
Tactic: 5832961 Time: 0.005344
Tactic: 5898497 Time: 0.005472
Tactic: 5964033 Time: 0.00512
Tactic: 6029569 Time: 0.00624
Tactic: 6095105 Time: 0.00512
Tactic: 6160641 Time: 0.005792
Tactic: 6226177 Time: 0.005344
Tactic: 6291713 Time: 0.005088
Tactic: 6357249 Time: 0.005344
Tactic: 6422785 Time: 0.005504
Tactic: 6488321 Time: 0.005312
Fastest Tactic: 5636353 Time: 0.005088
--------------- Timing Runner: MaxPool_9 (CudnnPooling)
Tactic: -1 Time: 0.005504
Fastest Tactic: -1 Time: 0.005504
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 5636353
*************** Autotuning Reformat:Float(43200,2700,60,1) -> Float(43200,1,960,16) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008288
Tactic: 0 Time: 0.00416
Fastest Tactic: 0 Time: 0.00416
*************** Autotuning Reformat:Float(43200,2700,60,1) -> Float(43200,1,960,16) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008256
Tactic: 0 Time: 0.004224
Fastest Tactic: 0 Time: 0.004224
*************** Autotuning Reformat:Float(43200,1,960,16) -> Float(43200,2700,60,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008224
Tactic: 0 Time: 0.00512
Fastest Tactic: 0 Time: 0.00512
*************** Autotuning format combination: Float(43200,2700,60,1) -> Float(86400,2700,60,1) ***************
--------------- Timing Runner: Conv_10 + Relu_11 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_10 + Relu_11 (FusedConvActConvolution)
Tactic: 524287 Time: 0.018496
Tactic: 720895 Time: 0.020224
Tactic: 983039 Time: 0.015232
Tactic: 1048575 Time: 0.01824
Tactic: 1703935 Time: 0.013408
Tactic: 1769471 Time: 0.017376
Tactic: 1966079 Time: 0.02464
Tactic: 2031615 Time: 0.022528
Tactic: 2228223 Time: 0.018528
Tactic: 2424831 Time: 0.01808
Tactic: 2621439 Time: 0.0136
Tactic: 2752511 Time: 0.01664
Tactic: 2818047 Time: 0.021856
Tactic: 2883583 Time: 0.027744
Tactic: 3014655 Time: 0.014496
Tactic: 3145727 Time: 0.014432
Tactic: 3473407 Time: 0.023552
Tactic: 3604479 Time: 0.015616
Tactic: 3735551 Time: 0.022624
Tactic: 4390911 Time: 0.0256
Tactic: 5046271 Time: 0.016
Tactic: 5963775 Time: 0.023808
Tactic: 6160383 Time: 0.01808
Tactic: 6488063 Time: 0.02048
Tactic: 6881279 Time: 0.02048
Tactic: 7274495 Time: 0.0144
Tactic: 7864319 Time: 0.01408
Tactic: 7995391 Time: 0.020608
Tactic: 8585215 Time: 0.019776
Tactic: 8847359 Time: 0.014368
Tactic: 8978431 Time: 0.02448
Tactic: 9043967 Time: 0.014432
Tactic: 9175039 Time: 0.016288
Tactic: 9502719 Time: 0.026528
Tactic: 9830399 Time: 0.024192
Tactic: 9961471 Time: 0.01808
Tactic: 10027007 Time: 0.018336
Tactic: 10092543 Time: 0.026272
Tactic: 10289151 Time: 0.024608
Tactic: 10485759 Time: 0.013888
Tactic: 10682367 Time: 0.013408
Tactic: 10813439 Time: 0.01392
Fastest Tactic: 1703935 Time: 0.013408
--------------- Timing Runner: Conv_10 + Relu_11 (CudnnConvolution)
Tactic: 0 Time: 0.022528
Tactic: 1 Time: 0.022528
Tactic: 2 Time: 0.051296
Tactic: 4 Time: 0.137216
Tactic: 5 Time: 0.070432
Tactic: 6 Time: 0.017408
Tactic: 56 Time: 0.022528
Tactic: 57 Time: 0.022624
Tactic: 58 Time: 0.0512
Tactic: 60 Time: 0.136672
Tactic: 61 Time: 0.072928
Tactic: 62 Time: 0.018336
Fastest Tactic: 6 Time: 0.017408
--------------- Timing Runner: Conv_10 + Relu_11 (CaskConvolution)
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.032832
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.010304
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.024416
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.03264
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.032832
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.02256
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.023456
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.02048
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.024672
Fastest Tactic: 2775507031594384867 Time: 0.010304
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(43200,1,960,16) -> Float(86400,1,1920,32) ***************
--------------- Timing Runner: Conv_10 + Relu_11 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_10 + Relu_11 (CaskConvolution)
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.03072
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.030816
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.02032
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.030816
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.020576
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.032128
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.02032
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.030816
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.031872
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.020384
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.02048
Fastest Tactic: 5258189349241541167 Time: 0.02032
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5258189349241541167
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008288
Tactic: 0 Time: 0.005984
Fastest Tactic: 0 Time: 0.005984
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.009056
Tactic: 0 Time: 0.00768
Fastest Tactic: 0 Time: 0.00768
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: Conv_12 + Relu_13 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_12 + Relu_13 (FusedConvActConvolution)
Tactic: 524287 Time: 0.02672
Tactic: 720895 Time: 0.032352
Tactic: 983039 Time: 0.024064
Tactic: 1048575 Time: 0.026272
Tactic: 1703935 Time: 0.017664
Tactic: 1769471 Time: 0.02432
Tactic: 1966079 Time: 0.04
Tactic: 2031615 Time: 0.03472
Tactic: 2228223 Time: 0.024128
Tactic: 2424831 Time: 0.022592
Tactic: 2621439 Time: 0.018336
Tactic: 2752511 Time: 0.036864
Tactic: 2818047 Time: 0.034816
Tactic: 2883583 Time: 0.045056
Tactic: 3014655 Time: 0.020576
Tactic: 3145727 Time: 0.021984
Tactic: 3473407 Time: 0.03904
Tactic: 3604479 Time: 0.021888
Tactic: 3735551 Time: 0.039008
Tactic: 4390911 Time: 0.041024
Tactic: 5046271 Time: 0.022464
Tactic: 5963775 Time: 0.03696
Tactic: 6160383 Time: 0.024672
Tactic: 6488063 Time: 0.032096
Tactic: 6881279 Time: 0.03168
Tactic: 7274495 Time: 0.020544
Tactic: 7864319 Time: 0.018496
Tactic: 7995391 Time: 0.032832
Tactic: 8585215 Time: 0.028704
Tactic: 8847359 Time: 0.019648
Tactic: 8978431 Time: 0.03696
Tactic: 9043967 Time: 0.019808
Tactic: 9175039 Time: 0.02064
Tactic: 9502719 Time: 0.042816
Tactic: 9830399 Time: 0.040672
Tactic: 9961471 Time: 0.023936
Tactic: 10027007 Time: 0.025824
Tactic: 10092543 Time: 0.042208
Tactic: 10289151 Time: 0.040032
Tactic: 10485759 Time: 0.01744
Tactic: 10682367 Time: 0.017696
Tactic: 10813439 Time: 0.020384
Fastest Tactic: 10485759 Time: 0.01744
--------------- Timing Runner: Conv_12 + Relu_13 (CudnnConvolution)
Tactic: 0 Time: 0.034016
Tactic: 1 Time: 0.033952
Tactic: 2 Time: 0.085088
Tactic: 4 Time: 0.212992
Tactic: 5 Time: 0.114688
Tactic: 6 Time: 0.021568
Tactic: 56 Time: 0.0328
Tactic: 57 Time: 0.032864
Tactic: 58 Time: 0.083968
Tactic: 60 Time: 0.212992
Tactic: 61 Time: 0.114688
Tactic: 62 Time: 0.020672
Fastest Tactic: 62 Time: 0.020672
--------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution)
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.056704
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.013824
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.037888
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.05504
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.05632
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.034912
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.038976
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.034272
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.041056
Fastest Tactic: 2775507031594384867 Time: 0.013824
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(259200,1,5760,96) ***************
--------------- Timing Runner: Conv_12 + Relu_13 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_12 + Relu_13 (CaskConvolution)
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.05504
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.054528
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.032064
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.053248
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.020576
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.05456
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.032608
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.053344
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.055072
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.03216
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.020544
Fastest Tactic: -423878181466897819 Time: 0.020544
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819
*************** Autotuning Reformat:Float(259200,2700,60,1) -> Float(259200,1,5760,96) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.009408
Tactic: 0 Time: 0.005952
Fastest Tactic: 0 Time: 0.005952
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.009408
Tactic: 0 Time: 0.007392
Fastest Tactic: 0 Time: 0.007392
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.00944
Tactic: 0 Time: 0.007296
Fastest Tactic: 0 Time: 0.007296
*************** Autotuning format combination: Float(259200,2700,60,1) -> Float(9600,300,20,1) ***************
--------------- Timing Runner: MaxPool_14 (TiledPooling)
Tactic: 7536897 Time: 0.016448
Tactic: 7536898 Time: 0.01024
Tactic: 7536900 Time: 0.00624
Tactic: 7536903 Time: 0.006112
Tactic: 7536904 Time: 0.006144
Tactic: 7536911 Time: 0.00624
Tactic: 7536912 Time: 0.006144
Tactic: 7536916 Time: 0.005632
Tactic: 7537153 Time: 0.010304
Tactic: 7537154 Time: 0.007488
Tactic: 7537156 Time: 0.007296
Tactic: 7537159 Time: 0.005856
Tactic: 7537160 Time: 0.00624
Tactic: 7537167 Time: 0.006432
Tactic: 7537168 Time: 0.006176
Tactic: 7537172 Time: 0.00624
Tactic: 7537409 Time: 0.008288
Tactic: 7537410 Time: 0.00624
Tactic: 7537412 Time: 0.006048
Tactic: 7537415 Time: 0.00496
Tactic: 7537416 Time: 0.00624
Tactic: 7537423 Time: 0.007392
Tactic: 7537424 Time: 0.007264
Tactic: 7537428 Time: 0.007232
Tactic: 7537665 Time: 0.007712
Tactic: 7537666 Time: 0.00624
Tactic: 7537668 Time: 0.006048
Tactic: 7537671 Time: 0.005824
Tactic: 7537672 Time: 0.00624
Tactic: 7537679 Time: 0.008224
Tactic: 7537680 Time: 0.008224
Tactic: 7537684 Time: 0.006144
Tactic: 7537921 Time: 0.00624
Tactic: 7537922 Time: 0.004928
Tactic: 7537924 Time: 0.005824
Tactic: 7537927 Time: 0.005024
Tactic: 7537928 Time: 0.00624
Tactic: 7537935 Time: 0.007744
Tactic: 7537936 Time: 0.007712
Tactic: 7537940 Time: 0.005888
Tactic: 7538177 Time: 0.006208
Tactic: 7538178 Time: 0.006208
Tactic: 7538180 Time: 0.006144
Tactic: 7538183 Time: 0.00624
Tactic: 7538184 Time: 0.00624
Tactic: 7538191 Time: 0.008
Tactic: 7538192 Time: 0.007168
Tactic: 7538433 Time: 0.006208
Tactic: 7538434 Time: 0.006144
Tactic: 7538436 Time: 0.00624
Tactic: 7538439 Time: 0.00624
Tactic: 7538440 Time: 0.00624
Tactic: 7538447 Time: 0.00736
Tactic: 7538448 Time: 0.00816
Tactic: 7538689 Time: 0.00624
Tactic: 7538690 Time: 0.006048
Tactic: 7538692 Time: 0.00624
Tactic: 7538695 Time: 0.006048
Tactic: 7538696 Time: 0.00624
Tactic: 7538945 Time: 0.00624
Tactic: 7538946 Time: 0.006144
Tactic: 7538948 Time: 0.00624
Tactic: 7538951 Time: 0.007392
Tactic: 7538952 Time: 0.00752
Tactic: 7539201 Time: 0.00624
Tactic: 7539202 Time: 0.00624
Tactic: 7539204 Time: 0.006208
Tactic: 7539207 Time: 0.00752
Tactic: 7539208 Time: 0.007872
Tactic: 7539457 Time: 0.00624
Tactic: 7539458 Time: 0.006208
Tactic: 7539460 Time: 0.00624
Tactic: 7539463 Time: 0.007712
Tactic: 7539464 Time: 0.007712
Tactic: 7539713 Time: 0.006208
Tactic: 7539714 Time: 0.006208
Tactic: 7539716 Time: 0.006208
Tactic: 7539719 Time: 0.008096
Tactic: 7539720 Time: 0.007872
Fastest Tactic: 7537922 Time: 0.004928
--------------- Timing Runner: MaxPool_14 (CudnnPooling)
Tactic: -1 Time: 0.005472
Fastest Tactic: -1 Time: 0.005472
>>>>>>>>>>>>>>> Chose Runner Type: TiledPooling Tactic: 7537922
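The autotuner above prints one `Tactic: <id> Time: <ms>` line per candidate kernel, then a `Fastest Tactic` summary per runner, and finally a `Chose Runner Type` line for the overall winner. As a sanity check when reading long logs like this, the summary can be cross-checked against the raw timings with a small script (a hypothetical helper written for this log format, not part of TensorRT):

```python
import re

def fastest_tactic(log_lines):
    """Return (tactic_id, time_ms) for the minimum-time 'Tactic: <id> Time: <t>' line.

    Lines without a Time field (e.g. 'Set Tactic Name: ...') and the
    'Fastest Tactic' summary lines themselves are ignored.
    """
    pat = re.compile(r"Tactic: (-?\d+) Time: ([\d.]+)")
    timings = [
        (int(m.group(1)), float(m.group(2)))
        for line in log_lines
        if not line.startswith("Fastest") and (m := pat.search(line))
    ]
    return min(timings, key=lambda t: t[1])

# A few TiledPooling candidates from the MaxPool_14 section above:
sample = [
    "Tactic: 7537921 Time: 0.00624",
    "Tactic: 7537922 Time: 0.004928",
    "Tactic: 7537927 Time: 0.005024",
    "Fastest Tactic: 7537922 Time: 0.004928",
]
print(fastest_tactic(sample))  # -> (7537922, 0.004928)
```

Negative tactic IDs (as in the CaskConvolution sections) are handled by the `-?` in the pattern; the regex assumes exactly the `Tactic: ... Time: ...` layout this TRT 8 build emits.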
*************** Autotuning Reformat:Float(9600,300,20,1) -> Float(9600,1,640,32) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008256
Tactic: 0 Time: 0.004096
Fastest Tactic: 0 Time: 0.004096
*************** Autotuning Reformat:Float(9600,300,20,1) -> Float(9600,1,640,32) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008192
Tactic: 0 Time: 0.004192
Fastest Tactic: 0 Time: 0.004192
*************** Autotuning Reformat:Float(9600,1,640,32) -> Float(9600,300,20,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008192
Tactic: 0 Time: 0.004192
Fastest Tactic: 0 Time: 0.004192
*************** Autotuning format combination: Float(9600,300,20,1) -> Float(19200,300,20,1) ***************
--------------- Timing Runner: Conv_15 + Relu_16 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_15 + Relu_16 (FusedConvActConvolution)
Tactic: 524287 Time: 0.024416
Tactic: 720895 Time: 0.020608
Tactic: 983039 Time: 0.01408
Tactic: 1048575 Time: 0.017408
Tactic: 1703935 Time: 0.012096
Tactic: 1769471 Time: 0.015424
Tactic: 1966079 Time: 0.03824
Tactic: 2031615 Time: 0.032096
Tactic: 2228223 Time: 0.019936
Tactic: 2424831 Time: 0.012384
Tactic: 2621439 Time: 0.0112
Tactic: 2752511 Time: 0.023808
Tactic: 2818047 Time: 0.022624
Tactic: 2883583 Time: 0.042656
Tactic: 3014655 Time: 0.014048
Tactic: 3145727 Time: 0.01536
Tactic: 3473407 Time: 0.024608
Tactic: 3604479 Time: 0.014112
Tactic: 3735551 Time: 0.018496
Tactic: 4390911 Time: 0.040384
Tactic: 5046271 Time: 0.015936
Tactic: 5963775 Time: 0.03472
Tactic: 6160383 Time: 0.02224
Tactic: 6488063 Time: 0.020384
Tactic: 6881279 Time: 0.028736
Tactic: 7274495 Time: 0.011616
Tactic: 7864319 Time: 0.012256
Tactic: 7995391 Time: 0.022432
Tactic: 8585215 Time: 0.02672
Tactic: 8847359 Time: 0.012384
Tactic: 8978431 Time: 0.03472
Tactic: 9043967 Time: 0.012384
Tactic: 9175039 Time: 0.01424
Tactic: 9502719 Time: 0.040448
Tactic: 9830399 Time: 0.019648
Tactic: 9961471 Time: 0.013312
Tactic: 10027007 Time: 0.01648
Tactic: 10092543 Time: 0.040544
Tactic: 10289151 Time: 0.038304
Tactic: 10485759 Time: 0.010336
Tactic: 10682367 Time: 0.011616
Tactic: 10813439 Time: 0.014464
Fastest Tactic: 10485759 Time: 0.010336
--------------- Timing Runner: Conv_15 + Relu_16 (CudnnConvolution)
Tactic: 0 Time: 0.023648
Tactic: 1 Time: 0.022624
Tactic: 2 Time: 0.06768
Tactic: 4 Time: 0.082816
Tactic: 5 Time: 0.082688
Tactic: 6 Time: 0.019712
Tactic: 56 Time: 0.022624
Tactic: 57 Time: 0.02272
Tactic: 58 Time: 0.06768
Tactic: 60 Time: 0.082752
Tactic: 61 Time: 0.082016
Tactic: 62 Time: 0.019552
Fastest Tactic: 62 Time: 0.019552
--------------- Timing Runner: Conv_15 + Relu_16 (CaskConvolution)
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.05632
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.013536
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.038816
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.05504
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.057152
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.034848
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.04064
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.03488
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.04096
Fastest Tactic: 2775507031594384867 Time: 0.013536
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10485759
*************** Autotuning format combination: Float(9600,1,640,32) -> Float(19200,1,1280,64) ***************
--------------- Timing Runner: Conv_15 + Relu_16 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_15 + Relu_16 (CaskConvolution)
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.053344
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.055104
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.032
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.053344
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.020576
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.053312
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.032416
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.053344
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.055104
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.032352
Conv_15 + Relu_16 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.02048
Fastest Tactic: -423878181466897819 Time: 0.02048
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008096
Tactic: 0 Time: 0.005664
Fastest Tactic: 0 Time: 0.005664
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.008288
Tactic: 0 Time: 0.004192
Fastest Tactic: 0 Time: 0.004192
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(19200,300,20,1) ***************
--------------- Timing Runner: Conv_17 + Relu_18 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_17 + Relu_18 (FusedConvActConvolution)
Tactic: 524287 Time: 0.041056
Tactic: 720895 Time: 0.036288
Tactic: 983039 Time: 0.020576
Tactic: 1048575 Time: 0.028768
Tactic: 1703935 Time: 0.017664
Tactic: 1769471 Time: 0.022432
Tactic: 1966079 Time: 0.06768
Tactic: 2031615 Time: 0.056768
Tactic: 2228223 Time: 0.03216
Tactic: 2424831 Time: 0.01648
Tactic: 2621439 Time: 0.015456
Tactic: 2752511 Time: 0.039008
Tactic: 2818047 Time: 0.036864
Tactic: 2883583 Time: 0.077824
Tactic: 3014655 Time: 0.020512
Tactic: 3145727 Time: 0.024
Tactic: 3473407 Time: 0.042656
Tactic: 3604479 Time: 0.020576
Tactic: 3735551 Time: 0.030816
Tactic: 4390911 Time: 0.071776
Tactic: 5046271 Time: 0.026464
Tactic: 5963775 Time: 0.061536
Tactic: 6160383 Time: 0.0368
Tactic: 6488063 Time: 0.03248
Tactic: 6881279 Time: 0.051008
Tactic: 7274495 Time: 0.016096
Tactic: 7864319 Time: 0.01648
Tactic: 7995391 Time: 0.03696
Tactic: 8585215 Time: 0.0472
Tactic: 8847359 Time: 0.016544
Tactic: 8978431 Time: 0.061536
Tactic: 9043967 Time: 0.01856
Tactic: 9175039 Time: 0.020576
Tactic: 9502719 Time: 0.071776
Tactic: 9830399 Time: 0.032864
Tactic: 9961471 Time: 0.018528
Tactic: 10027007 Time: 0.02672
Tactic: 10092543 Time: 0.071776
Tactic: 10289151 Time: 0.067648
Tactic: 10485759 Time: 0.016288
Tactic: 10682367 Time: 0.014432
Tactic: 10813439 Time: 0.022592
Fastest Tactic: 10682367 Time: 0.014432
--------------- Timing Runner: Conv_17 + Relu_18 (CudnnConvolution)
Tactic: 0 Time: 0.036768
Tactic: 1 Time: 0.03696
Tactic: 2 Time: 0.097504
Tactic: 4 Time: 0.134496
Tactic: 5 Time: 0.137856
Tactic: 6 Time: 0.02608
Tactic: 56 Time: 0.036768
Tactic: 57 Time: 0.036832
Tactic: 58 Time: 0.096352
Tactic: 60 Time: 0.134464
Tactic: 61 Time: 0.137504
Tactic: 62 Time: 0.026336
Fastest Tactic: 6 Time: 0.02608
--------------- Timing Runner: Conv_17 + Relu_18 (CaskConvolution)
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.102496
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.019648
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.067328
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.100032
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.103808
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.061536
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.071552
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.063264
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.075328
Fastest Tactic: 2775507031594384867 Time: 0.019648
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10682367
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(19200,1,1280,64) ***************
--------------- Timing Runner: Conv_17 + Relu_18 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_17 + Relu_18 (CaskConvolution)
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.099904
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.099648
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.055296
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.099008
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.032864
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.099552
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.056416
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.098368
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.100448
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.055392
Conv_17 + Relu_18 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.032864
Fastest Tactic: 5863767799113001648 Time: 0.032864
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) ***************
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) ***************
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(19200,300,20,1) ***************
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(19200,1,1280,64) ***************
*************** Autotuning Reformat:Float(19200,300,20,1) -> Float(19200,1,1280,64) ***************
*************** Autotuning Reformat:Float(19200,1,1280,64) -> Float(19200,300,20,1) ***************
*************** Autotuning format combination: Float(19200,300,20,1) -> Float(172800,2700,60,1) ***************
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CudnnDeconvolution)
Tactic: 0 Time: 0.030656
Tactic: 1 Time: 0.251808
Fastest Tactic: 0 Time: 0.030656
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (GemmDeconvolution)
Tactic: 0 Time: 0.018464
Fastest Tactic: 0 Time: 0.018464
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0
*************** Autotuning format combination: Float(19200,1,1280,64) -> Float(172800,1,3840,64) ***************
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CudnnDeconvolution)
CudnnDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (GemmDeconvolution)
GemmDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_21 + BatchNormalization_22 + Relu_23 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
*************** Autotuning Reformat:Float(172800,2700,60,1) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: 166 copy (Reformat)
Tactic: 1002 Time: 0.008192
Tactic: 0 Time: 0.005344
Fastest Tactic: 0 Time: 0.005344
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(172800,2700,60,1) -> Float(259200,1,5760,96) ***************
--------------- Timing Runner: 166 copy (Reformat)
Tactic: 1002 Time: 0.008224
Tactic: 0 Time: 0.008096
Fastest Tactic: 0 Time: 0.008096
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(172800,1,3840,64) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: 166 copy (Reformat)
Tactic: 1002 Time: 0.010112
Tactic: 0 Time: 0.010144
Fastest Tactic: 1002 Time: 0.010112
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 1002
*************** Autotuning Reformat:Float(172800,1,3840,64) -> Float(259200,1,5760,96) ***************
--------------- Timing Runner: 166 copy (Reformat)
Tactic: 1002 Time: 0.008288
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(259200,2700,60,1) -> Float(259200,1,5760,96) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.009856
Tactic: 0 Time: 0.010368
Fastest Tactic: 1002 Time: 0.009856
*************** Autotuning Reformat:Float(259200,1,5760,96) -> Float(259200,2700,60,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.012288
Tactic: 0 Time: 0.012288
Fastest Tactic: 1002 Time: 0.012288
*************** Autotuning format combination: Float(259200,2700,60,1) -> Float(86400,2700,60,1) ***************
--------------- Timing Runner: Conv_25 + Relu_26 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_25 + Relu_26 (FusedConvActConvolution)
Tactic: 524287 Time: 0.06096
Tactic: 720895 Time: 0.081024
Tactic: 983039 Time: 0.057216
Tactic: 1048575 Time: 0.059296
Tactic: 1703935 Time: 0.034144
Tactic: 1769471 Time: 0.051104
Tactic: 1966079 Time: 0.099904
Tactic: 2031615 Time: 0.084064
Tactic: 2228223 Time: 0.049088
Tactic: 2424831 Time: 0.04064
Tactic: 2621439 Time: 0.032832
Tactic: 2752511 Time: 0.092256
Tactic: 2818047 Time: 0.08912
Tactic: 2883583 Time: 0.116544
Tactic: 3014655 Time: 0.043104
Tactic: 3145727 Time: 0.050304
Tactic: 3473407 Time: 0.103712
Tactic: 3604479 Time: 0.043104
Tactic: 3735551 Time: 0.105728
Tactic: 4390911 Time: 0.104544
Tactic: 5046271 Time: 0.049216
Tactic: 5963775 Time: 0.090272
Tactic: 6160383 Time: 0.054784
Tactic: 6488063 Time: 0.071776
Tactic: 6881279 Time: 0.07568
Tactic: 7274495 Time: 0.04512
Tactic: 7864319 Time: 0.034912
Tactic: 7995391 Time: 0.08528
Tactic: 8585215 Time: 0.06768
Tactic: 8847359 Time: 0.03696
Tactic: 8978431 Time: 0.091744
Tactic: 9043967 Time: 0.040672
Tactic: 9175039 Time: 0.043104
Tactic: 9502719 Time: 0.106464
Tactic: 9830399 Time: 0.108064
Tactic: 9961471 Time: 0.043264
Tactic: 10027007 Time: 0.0552
Tactic: 10092543 Time: 0.105632
Tactic: 10289151 Time: 0.09968
Tactic: 10485759 Time: 0.032672
Tactic: 10682367 Time: 0.032864
Tactic: 10813439 Time: 0.045888
Fastest Tactic: 10485759 Time: 0.032672
--------------- Timing Runner: Conv_25 + Relu_26 (CudnnConvolution)
Tactic: 0 Time: 0.13088
Tactic: 1 Time: 0.061536
Tactic: 2 Time: 0.171936
Tactic: 4 Time: 0.570912
Tactic: 5 Time: 0.2448
Tactic: 6 Time: 0.033056
Tactic: 56 Time: 0.130912
Tactic: 57 Time: 0.062432
Tactic: 58 Time: 0.171872
Tactic: 60 Time: 0.56704
Tactic: 61 Time: 0.241792
Tactic: 62 Time: 0.033952
Fastest Tactic: 6 Time: 0.033056
--------------- Timing Runner: Conv_25 + Relu_26 (CaskConvolution)
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.147552
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.026368
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.095488
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.145184
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.148864
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.08816
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.0984
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.088128
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.106592
Fastest Tactic: 2775507031594384867 Time: 0.026368
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(259200,1,5760,96) -> Float(86400,1,1920,32) ***************
--------------- Timing Runner: Conv_25 + Relu_26 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_25 + Relu_26 (CaskConvolution)
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.145248
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.14544
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.079776
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.14416
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.045152
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.145536
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.079968
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.144672
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.147392
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.079744
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.046272
Fastest Tactic: 5863767799113001648 Time: 0.045152
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) ***************
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) ***************
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(86400,2700,60,1) ***************
--------------- Timing Runner: Conv_27 + Relu_28 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_27 + Relu_28 (FusedConvActConvolution)
Tactic: 524287 Time: 0.02672
Tactic: 720895 Time: 0.031584
Tactic: 983039 Time: 0.023552
Tactic: 1048575 Time: 0.026464
Tactic: 1703935 Time: 0.017664
Tactic: 1769471 Time: 0.024128
Tactic: 1966079 Time: 0.04016
Tactic: 2031615 Time: 0.034912
Tactic: 2228223 Time: 0.024672
Tactic: 2424831 Time: 0.022624
Tactic: 2621439 Time: 0.018016
Tactic: 2752511 Time: 0.03696
Tactic: 2818047 Time: 0.034816
Tactic: 2883583 Time: 0.04512
Tactic: 3014655 Time: 0.020576
Tactic: 3145727 Time: 0.022144
Tactic: 3473407 Time: 0.038912
Tactic: 3604479 Time: 0.02048
Tactic: 3735551 Time: 0.039008
Tactic: 4390911 Time: 0.04192
Tactic: 5046271 Time: 0.022592
Tactic: 5963775 Time: 0.036896
Tactic: 6160383 Time: 0.024672
Tactic: 6488063 Time: 0.030816
Tactic: 6881279 Time: 0.03088
Tactic: 7274495 Time: 0.020576
Tactic: 7864319 Time: 0.018432
Tactic: 7995391 Time: 0.0328
Tactic: 8585215 Time: 0.030304
Tactic: 8847359 Time: 0.019328
Tactic: 8978431 Time: 0.036928
Tactic: 9043967 Time: 0.019968
Tactic: 9175039 Time: 0.02048
Tactic: 9502719 Time: 0.042656
Tactic: 9830399 Time: 0.039904
Tactic: 9961471 Time: 0.023904
Tactic: 10027007 Time: 0.02608
Tactic: 10092543 Time: 0.041056
Tactic: 10289151 Time: 0.040224
Tactic: 10485759 Time: 0.01648
Tactic: 10682367 Time: 0.017984
Tactic: 10813439 Time: 0.02048
Fastest Tactic: 10485759 Time: 0.01648
--------------- Timing Runner: Conv_27 + Relu_28 (CudnnConvolution)
Tactic: 0 Time: 0.033056
Tactic: 1 Time: 0.032864
Tactic: 2 Time: 0.083936
Tactic: 4 Time: 0.212864
Tactic: 5 Time: 0.113216
Tactic: 6 Time: 0.020576
Tactic: 56 Time: 0.033024
Tactic: 57 Time: 0.032864
Tactic: 58 Time: 0.08368
Tactic: 60 Time: 0.214688
Tactic: 61 Time: 0.11376
Tactic: 62 Time: 0.020576
Fastest Tactic: 6 Time: 0.020576
--------------- Timing Runner: Conv_27 + Relu_28 (CaskConvolution)
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.056576
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.014016
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.0384
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.055168
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.055392
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.034912
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.039008
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.034592
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.04112
Fastest Tactic: 2775507031594384867 Time: 0.014016
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(86400,1,1920,32) ***************
--------------- Timing Runner: Conv_27 + Relu_28 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_27 + Relu_28 (CaskConvolution)
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.053344
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.054496
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.032128
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.053344
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.020576
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.05328
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.032544
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.05328
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.055168
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.032192
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.020576
Fastest Tactic: 5863767799113001648 Time: 0.020576
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 5863767799113001648
*************** Autotuning Reformat:Float(86400,2700,60,1) -> Float(86400,1,1920,32) ***************
*************** Autotuning Reformat:Float(86400,1,1920,32) -> Float(86400,2700,60,1) ***************
*************** Autotuning format combination: Float(86400,2700,60,1) -> Float(172800,10800,120,1) ***************
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CudnnDeconvolution)
CudnnDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (GemmDeconvolution)
Tactic: 0 Time: 0.026464
Fastest Tactic: 0 Time: 0.026464
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0
*************** Autotuning format combination: Float(86400,1,1920,32) -> Float(172800,1,1920,16) ***************
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CudnnDeconvolution)
CudnnDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (GemmDeconvolution)
GemmDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: 177 copy (Reformat)
Tactic: 1002 Time: 0.008096
Tactic: 0 Time: 0.005344
Fastest Tactic: 0 Time: 0.005344
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(345600,1,3840,32) ***************
--------------- Timing Runner: 177 copy (Reformat)
Tactic: 1002 Time: 0.013984
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: 177 copy (Reformat)
Tactic: 1002 Time: 0.0144
Tactic: 0 Time: 0.008224
Fastest Tactic: 0 Time: 0.008224
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(345600,1,3840,32) ***************
--------------- Timing Runner: 177 copy (Reformat)
Tactic: 1002 Time: 0.012288
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(345600,10800,120,1) -> Float(345600,1,3840,32) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.014432
Tactic: 0 Time: 0.011264
Fastest Tactic: 0 Time: 0.011264
*************** Autotuning Reformat:Float(345600,1,3840,32) -> Float(345600,10800,120,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.016288
Tactic: 0 Time: 0.014432
Fastest Tactic: 0 Time: 0.014432
*************** Autotuning format combination: Float(345600,10800,120,1) -> Float(172800,10800,120,1) ***************
--------------- Timing Runner: Conv_34 + Relu_35 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_34 + Relu_35 (FusedConvActConvolution)
Tactic: 524287 Time: 0.045152
Tactic: 720895 Time: 0.055392
Tactic: 983039 Time: 0.055392
Tactic: 1048575 Time: 0.041056
Tactic: 1703935 Time: 0.033152
Tactic: 1769471 Time: 0.06768
Tactic: 1966079 Time: 0.061472
Tactic: 2031615 Time: 0.069312
Tactic: 2228223 Time: 0.03872
Tactic: 2424831 Time: 0.043104
Tactic: 2621439 Time: 0.03488
Tactic: 2752511 Time: 0.063008
Tactic: 2818047 Time: 0.102304
Tactic: 2883583 Time: 0.101504
Tactic: 3014655 Time: 0.036864
Tactic: 3145727 Time: 0.055296
Tactic: 3473407 Time: 0.112352
Tactic: 3604479 Time: 0.036544
Tactic: 3735551 Time: 0.100256
Tactic: 4390911 Time: 0.06688
Tactic: 5046271 Time: 0.03696
Tactic: 5963775 Time: 0.076896
Tactic: 6160383 Time: 0.04496
Tactic: 6488063 Time: 0.038976
Tactic: 6881279 Time: 0.059488
Tactic: 7274495 Time: 0.059488
Tactic: 7864319 Time: 0.036704
Tactic: 7995391 Time: 0.05744
Tactic: 8585215 Time: 0.041024
Tactic: 8847359 Time: 0.035968
Tactic: 8978431 Time: 0.075872
Tactic: 9043967 Time: 0.033024
Tactic: 9175039 Time: 0.036704
Tactic: 9502719 Time: 0.066912
Tactic: 9830399 Time: 0.103904
Tactic: 9961471 Time: 0.047008
Tactic: 10027007 Time: 0.036928
Tactic: 10092543 Time: 0.066944
Tactic: 10289151 Time: 0.061536
Tactic: 10485759 Time: 0.0328
Tactic: 10682367 Time: 0.034912
Tactic: 10813439 Time: 0.055296
Fastest Tactic: 10485759 Time: 0.0328
--------------- Timing Runner: Conv_34 + Relu_35 (CudnnConvolution)
Tactic: 0 Time: 0.153504
Tactic: 1 Time: 0.048736
Tactic: 2 Time: 0.203904
Tactic: 4 Time: 0.64512
Tactic: 5 Time: 0.178976
Tactic: 6 Time: 0.03632
Tactic: 56 Time: 0.153696
Tactic: 57 Time: 0.049024
Tactic: 58 Time: 0.204384
Tactic: 60 Time: 0.650464
Tactic: 61 Time: 0.182496
Tactic: 62 Time: 0.03616
Fastest Tactic: 62 Time: 0.03616
--------------- Timing Runner: Conv_34 + Relu_35 (CaskConvolution)
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.103904
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.028064
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.059296
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.102432
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.10864
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.059488
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.061536
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.04048
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.061536
Fastest Tactic: 2775507031594384867 Time: 0.028064
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(345600,1,3840,32) -> Float(172800,1,1920,16) ***************
--------------- Timing Runner: Conv_34 + Relu_35 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_34 + Relu_35 (CaskConvolution)
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.102496
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.102304
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.05744
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.102496
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.039008
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.102144
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.058528
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.101696
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.102496
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.05744
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.038912
Fastest Tactic: -423878181466897819 Time: 0.038912
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -423878181466897819
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) ***************
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) ***************
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(172800,10800,120,1) ***************
--------------- Timing Runner: Conv_36 + Relu_37 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_36 + Relu_37 (FusedConvActConvolution)
Tactic: 524287 Time: 0.02784
Tactic: 720895 Time: 0.032672
Tactic: 983039 Time: 0.032032
Tactic: 1048575 Time: 0.025856
Tactic: 1703935 Time: 0.021792
Tactic: 1769471 Time: 0.044992
Tactic: 1966079 Time: 0.036736
Tactic: 2031615 Time: 0.039104
Tactic: 2228223 Time: 0.024224
Tactic: 2424831 Time: 0.031936
Tactic: 2621439 Time: 0.02448
Tactic: 2752511 Time: 0.03584
Tactic: 2818047 Time: 0.058624
Tactic: 2883583 Time: 0.055392
Tactic: 3014655 Time: 0.022624
Tactic: 3145727 Time: 0.032672
Tactic: 3473407 Time: 0.061344
Tactic: 3604479 Time: 0.022624
Tactic: 3735551 Time: 0.053472
Tactic: 4390911 Time: 0.038816
Tactic: 5046271 Time: 0.022624
Tactic: 5963775 Time: 0.043104
Tactic: 6160383 Time: 0.027648
Tactic: 6488063 Time: 0.024576
Tactic: 6881279 Time: 0.034912
Tactic: 7274495 Time: 0.036928
Tactic: 7864319 Time: 0.023584
Tactic: 7995391 Time: 0.032864
Tactic: 8585215 Time: 0.0256
Tactic: 8847359 Time: 0.024576
Tactic: 8978431 Time: 0.043008
Tactic: 9043967 Time: 0.02144
Tactic: 9175039 Time: 0.022624
Tactic: 9502719 Time: 0.038912
Tactic: 9830399 Time: 0.056448
Tactic: 9961471 Time: 0.0328
Tactic: 10027007 Time: 0.022528
Tactic: 10092543 Time: 0.039008
Tactic: 10289151 Time: 0.03648
Tactic: 10485759 Time: 0.020576
Tactic: 10682367 Time: 0.022624
Tactic: 10813439 Time: 0.032224
Fastest Tactic: 10485759 Time: 0.020576
--------------- Timing Runner: Conv_36 + Relu_37 (CudnnConvolution)
Tactic: 0 Time: 0.038752
Tactic: 1 Time: 0.032416
Tactic: 2 Time: 0.113664
Tactic: 4 Time: 0.376224
Tactic: 5 Time: 0.172736
Tactic: 6 Time: 0.028704
Tactic: 56 Time: 0.038752
Tactic: 57 Time: 0.032416
Tactic: 58 Time: 0.114688
Tactic: 60 Time: 0.374304
Tactic: 61 Time: 0.170656
Tactic: 62 Time: 0.029824
Fastest Tactic: 6 Time: 0.028704
--------------- Timing Runner: Conv_36 + Relu_37 (CaskConvolution)
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.05744
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.020032
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.034688
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.05744
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.063008
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.034912
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.03632
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.024032
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.034912
Fastest Tactic: 2775507031594384867 Time: 0.020032
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: 2775507031594384867
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(172800,1,1920,16) ***************
--------------- Timing Runner: Conv_36 + Relu_37 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_36 + Relu_37 (CaskConvolution)
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.055392
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.055392
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.034272
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.05648
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.039008
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.056896
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.03472
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.05536
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.056864
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.034208
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.039008
Fastest Tactic: -3263369460438823196 Time: 0.034208
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3263369460438823196
*************** Autotuning Reformat:Float(172800,10800,120,1) -> Float(172800,1,1920,16) ***************
*************** Autotuning Reformat:Float(172800,1,1920,16) -> Float(172800,10800,120,1) ***************
*************** Autotuning format combination: Float(172800,10800,120,1) -> Float(345600,43200,240,1) ***************
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CudnnDeconvolution)
CudnnDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (GemmDeconvolution)
Tactic: 0 Time: 0.03664
Fastest Tactic: 0 Time: 0.03664
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
>>>>>>>>>>>>>>> Chose Runner Type: GemmDeconvolution Tactic: 0
*************** Autotuning format combination: Float(172800,1,1920,16) -> Float(345600,1,1920,8) ***************
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CudnnDeconvolution)
CudnnDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (GemmDeconvolution)
GemmDeconvolution has no valid tactics for this config, skipping
--------------- Timing Runner: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 (CaskDeconvolution)
CaskDeconvolution has no valid tactics for this config, skipping
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: 188 copy (Reformat)
Tactic: 1002 Time: 0.010304
Tactic: 0 Time: 0.007296
Fastest Tactic: 0 Time: 0.007296
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(691200,1,3840,16) ***************
--------------- Timing Runner: 188 copy (Reformat)
Tactic: 1002 Time: 0.034912
Tactic: 0 Time: 0.010144
Fastest Tactic: 0 Time: 0.010144
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: 188 copy (Reformat)
Tactic: 1002 Time: 0.036416
Tactic: 0 Time: 0.012192
Fastest Tactic: 0 Time: 0.012192
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(691200,1,3840,16) ***************
--------------- Timing Runner: 188 copy (Reformat)
Tactic: 1002 Time: 0.032096
Tactic: 0 Time: 0.009664
Fastest Tactic: 0 Time: 0.009664
>>>>>>>>>>>>>>> Chose Runner Type: Reformat Tactic: 0
*************** Autotuning Reformat:Float(691200,43200,240,1) -> Float(691200,1,3840,16) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.036544
Tactic: 0 Time: 0.020224
Fastest Tactic: 0 Time: 0.020224
*************** Autotuning Reformat:Float(691200,1,3840,16) -> Float(691200,43200,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.03696
Tactic: 0 Time: 0.018464
Fastest Tactic: 0 Time: 0.018464
*************** Autotuning format combination: Float(691200,43200,240,1) -> Float(345600,43200,240,1) ***************
--------------- Timing Runner: Conv_43 + Relu_44 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_43 + Relu_44 (FusedConvActConvolution)
Tactic: 524287 Time: 0.061536
Tactic: 720895 Time: 0.102496
Tactic: 983039 Time: 0.099776
Tactic: 1048575 Time: 0.060672
Tactic: 1703935 Time: 0.059232
Tactic: 1769471 Time: 0.149504
Tactic: 1966079 Time: 0.106592
Tactic: 2031615 Time: 0.096352
Tactic: 2228223 Time: 0.061536
Tactic: 2424831 Time: 0.102496
Tactic: 2621439 Time: 0.067648
Tactic: 2752511 Time: 0.105632
Tactic: 2818047 Time: 0.199552
Tactic: 2883583 Time: 0.190272
Tactic: 3014655 Time: 0.061216
Tactic: 3145727 Time: 0.104544
Tactic: 3473407 Time: 0.185472
Tactic: 3604479 Time: 0.060384
Tactic: 3735551 Time: 0.186272
Tactic: 4390911 Time: 0.106304
Tactic: 5046271 Time: 0.05904
Tactic: 5963775 Time: 0.106592
Tactic: 6160383 Time: 0.0632
Tactic: 6488063 Time: 0.0608
Tactic: 6881279 Time: 0.099328
Tactic: 7274495 Time: 0.118368
Tactic: 7864319 Time: 0.068992
Tactic: 7995391 Time: 0.100352
Tactic: 8585215 Time: 0.06128
Tactic: 8847359 Time: 0.071584
Tactic: 8978431 Time: 0.106464
Tactic: 9043967 Time: 0.059104
Tactic: 9175039 Time: 0.06064
Tactic: 9502719 Time: 0.106176
Tactic: 9830399 Time: 0.19056
Tactic: 9961471 Time: 0.100512
Tactic: 10027007 Time: 0.059392
Tactic: 10092543 Time: 0.105792
Tactic: 10289151 Time: 0.107904
Tactic: 10485759 Time: 0.058848
Tactic: 10682367 Time: 0.06672
Tactic: 10813439 Time: 0.09808
Fastest Tactic: 10485759 Time: 0.058848
--------------- Timing Runner: Conv_43 + Relu_44 (CudnnConvolution)
Tactic: 0 Time: 0.109856
Tactic: 1 Time: 0.078688
Tactic: 2 Time: 0.202848
Tactic: 4 Time: 0.757632
Tactic: 5 Time: 0.561088
Tactic: 6 Time: 0.075424
Tactic: 56 Time: 0.108544
Tactic: 57 Time: 0.07792
Tactic: 58 Time: 0.202848
Tactic: 60 Time: 0.759904
Tactic: 61 Time: 0.565248
Tactic: 62 Time: 0.075584
Fastest Tactic: 6 Time: 0.075424
--------------- Timing Runner: Conv_43 + Relu_44 (CaskConvolution)
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.204864
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.063392
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.10864
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.204608
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.235424
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.109888
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.110688
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.06768
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.063584
Fastest Tactic: 2775507031594384867 Time: 0.063392
>>>>>>>>>>>>>>> Chose Runner Type: FusedConvActConvolution Tactic: 10485759
*************** Autotuning format combination: Float(691200,1,3840,16) -> Float(345600,1,1920,8) ***************
--------------- Timing Runner: Conv_43 + Relu_44 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_43 + Relu_44 (CaskConvolution)
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.199808
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.198336
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.11264
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.200608
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.141344
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.197952
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.112736
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.197728
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.201728
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.112416
Conv_43 + Relu_44 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.14288
Fastest Tactic: -3263369460438823196 Time: 0.112416
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3263369460438823196
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) ***************
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) ***************
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(345600,43200,240,1) ***************
--------------- Timing Runner: Conv_45 + Relu_46 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_45 + Relu_46 (FusedConvActConvolution)
Tactic: 524287 Time: 0.041056
Tactic: 720895 Time: 0.058144
Tactic: 983039 Time: 0.057344
Tactic: 1048575 Time: 0.038944
Tactic: 1703935 Time: 0.041056
Tactic: 1769471 Time: 0.105824
Tactic: 1966079 Time: 0.059392
Tactic: 2031615 Time: 0.055392
Tactic: 2228223 Time: 0.041952
Tactic: 2621439 Time: 0.049184
Tactic: 2752511 Time: 0.060864
Tactic: 2818047 Time: 0.10944
Tactic: 2883583 Time: 0.102496
Tactic: 3014655 Time: 0.042144
Tactic: 3145727 Time: 0.061344
Tactic: 3473407 Time: 0.104512
Tactic: 3604479 Time: 0.041024
Tactic: 3735551 Time: 0.100416
Tactic: 4390911 Time: 0.060192
Tactic: 5046271 Time: 0.037056
Tactic: 5963775 Time: 0.060224
Tactic: 6160383 Time: 0.04096
Tactic: 6488063 Time: 0.038816
Tactic: 6881279 Time: 0.056832
Tactic: 7274495 Time: 0.075776
Tactic: 7864319 Time: 0.050368
Tactic: 7995391 Time: 0.05632
Tactic: 8585215 Time: 0.03984
Tactic: 8847359 Time: 0.053248
Tactic: 8978431 Time: 0.059424
Tactic: 9043967 Time: 0.040096
Tactic: 9175039 Time: 0.041056
Tactic: 9502719 Time: 0.060512
Tactic: 9830399 Time: 0.103776
Tactic: 10027007 Time: 0.036928
Tactic: 10092543 Time: 0.059488
Tactic: 10289151 Time: 0.060448
Tactic: 10485759 Time: 0.038912
Tactic: 10682367 Time: 0.049088
Tactic: 10813439 Time: 0.055296
Fastest Tactic: 10027007 Time: 0.036928
--------------- Timing Runner: Conv_45 + Relu_46 (CudnnConvolution)
Tactic: 0 Time: 0.061344
Tactic: 1 Time: 0.051072
Tactic: 2 Time: 0.108448
Tactic: 4 Time: 0.485376
Tactic: 5 Time: 0.459936
Tactic: 6 Time: 0.060992
Tactic: 56 Time: 0.060544
Tactic: 57 Time: 0.051008
Tactic: 58 Time: 0.106688
Tactic: 60 Time: 0.482208
Tactic: 61 Time: 0.457024
Tactic: 62 Time: 0.060704
Fastest Tactic: 57 Time: 0.051008
--------------- Timing Runner: Conv_45 + Relu_46 (CaskConvolution)
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.114528
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Tactic: 2775507031594384867 Time: 0.048928
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_xregs_large_nn_v1 Tactic: 2842488832350522458
Tactic: 2842488832350522458 Time: 0.065024
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.114624
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_xregs_large_nn_v1 Tactic: 6448355332020552203
Tactic: 6448355332020552203 Time: 0.145312
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.062464
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.063552
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.040288
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.036896
Fastest Tactic: -3946921629105938337 Time: 0.036896
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(345600,1,1920,8) ***************
--------------- Timing Runner: Conv_45 + Relu_46 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_45 + Relu_46 (CaskConvolution)
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_medium_nhwc_tn_v1 Tactic: 861694390046228376
Tactic: 861694390046228376 Time: 0.108544
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: 1017870653102653567
Tactic: 1017870653102653567 Time: 0.108224
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5258189349241541167
Tactic: 5258189349241541167 Time: 0.112704
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_small_nhwc_tn_v1 Tactic: 5821621277990374316
Tactic: 5821621277990374316 Time: 0.10864
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: 5863767799113001648
Tactic: 5863767799113001648 Time: 0.140832
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -9147980667639709536
Tactic: -9147980667639709536 Time: 0.107648
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -8850904373104590857
Tactic: -8850904373104590857 Time: 0.113888
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_ldg4_relu_exp_small_nhwc_tn_v1 Tactic: -7751035352149795660
Tactic: -7751035352149795660 Time: 0.108384
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x128_relu_exp_large_nhwc_tn_v1 Tactic: -3853827649136781465
Tactic: -3853827649136781465 Time: 0.109568
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x64_sliced1x2_ldg4_relu_exp_large_nhwc_tn_v1 Tactic: -3263369460438823196
Tactic: -3263369460438823196 Time: 0.112384
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_sliced1x4_ldg4_relu_exp_medium_nhwc_tn_v1 Tactic: -423878181466897819
Tactic: -423878181466897819 Time: 0.141408
Fastest Tactic: -9147980667639709536 Time: 0.107648
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -9147980667639709536
*************** Autotuning Reformat:Float(345600,43200,240,1) -> Float(345600,1,1920,8) ***************
*************** Autotuning Reformat:Float(345600,1,1920,8) -> Float(345600,43200,240,1) ***************
*************** Autotuning format combination: Float(345600,43200,240,1) -> Float(129600,43200,240,1) ***************
--------------- Timing Runner: Conv_47 (CudaDepthwiseConvolution)
CudaDepthwiseConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_47 (FusedConvActConvolution)
Tactic: 589823 Time: 0.026144
Tactic: 786431 Time: 0.023424
Tactic: 1310719 Time: 0.02432
Tactic: 1638399 Time: 0.034048
Tactic: 1835007 Time: 0.023872
Tactic: 4194303 Time: 0.022624
Tactic: 4325375 Time: 0.024672
Tactic: 4521983 Time: 0.023552
Tactic: 4980735 Time: 0.022624
Tactic: 5439487 Time: 0.028672
Tactic: 5767167 Time: 0.053344
Tactic: 6946815 Time: 0.028128
Tactic: 7143423 Time: 0.028768
Tactic: 7602175 Time: 0.022624
Tactic: 7798783 Time: 0.023744
Tactic: 8191999 Time: 0.0248
Tactic: 8323071 Time: 0.024416
Tactic: 8650751 Time: 0.02464
Tactic: 9895935 Time: 0.022624
Tactic: 10551295 Time: 0.02608
Tactic: 10944511 Time: 0.02256
Fastest Tactic: 10944511 Time: 0.02256
--------------- Timing Runner: Conv_47 (CudnnConvolution)
Tactic: 0 Time: 0.016384
Tactic: 1 Time: 0.01648
Tactic: 2 Time: 0.018176
Tactic: 4 Time: 0.284576
Tactic: 5 Time: 0.048064
Tactic: 56 Time: 0.016384
Tactic: 57 Time: 0.016384
Tactic: 58 Time: 0.017376
Tactic: 60 Time: 0.282624
Tactic: 61 Time: 0.047104
Fastest Tactic: 0 Time: 0.016384
--------------- Timing Runner: Conv_47 (CublasConvolution)
CublasConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_47 (CaskConvolution)
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_interior_nn_v1 Tactic: 1754569683116234317
Tactic: 1754569683116234317 Time: 0.044352
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_medium_nn_v1 Tactic: 1825138533642645384
Tactic: 1825138533642645384 Time: 0.04448
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_interior_nn_v1 Tactic: 2733356012094739613
Tactic: 2733356012094739613 Time: 0.01552
Conv_47 Set Tactic Name: volta_scudnn_128x128_relu_small_nn_v1 Tactic: 3915320020053085238
Tactic: 3915320020053085238 Time: 0.044864
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_small_nn_v1 Tactic: 6808617066150061604
Tactic: 6808617066150061604 Time: 0.02352
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_interior_nn_v1 Tactic: 9091006216302412844
Tactic: 9091006216302412844 Time: 0.024416
Conv_47 Set Tactic Name: volta_scudnn_128x64_relu_medium_nn_v1 Tactic: -8060443123034038864
Tactic: -8060443123034038864 Time: 0.024576
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Tactic: -4420849921117327522 Time: 0.016384
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Tactic: -3946921629105938337 Time: 0.015488
Fastest Tactic: -3946921629105938337 Time: 0.015488
>>>>>>>>>>>>>>> Chose Runner Type: CaskConvolution Tactic: -3946921629105938337
*************** Autotuning format combination: Float(345600,1,1920,8) -> Float(129600,1,720,3) ***************
--------------- Timing Runner: Conv_47 (CudnnConvolution)
CudnnConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_47 (CublasConvolution)
CublasConvolution has no valid tactics for this config, skipping
--------------- Timing Runner: Conv_47 (CaskConvolution)
CaskConvolution has no valid tactics for this config, skipping
*************** Autotuning Reformat:Float(129600,1,720,3) -> Float(129600,43200,240,1) ***************
--------------- Timing Runner: Optimizer Reformat (Reformat)
Tactic: 1002 Time: 0.035008
Tactic: 0 Time: 0.00624
Fastest Tactic: 0 Time: 0.00624
*************** Autotuning format combination: Float(129600,43200,240,1) -> Float(43200,43200,240,1), Int32(43200,43200,240,1) ***************
--------------- Timing Runner: (Unnamed Layer* 48) [TopK] (TopK)
Tactic: 0 Time: 0.503712
Tactic: 1 Time: 2.4536
Tactic: 3 Time: 0.011872
Tactic: 2 Time: 12.5644
Fastest Tactic: 3 Time: 0.011872
>>>>>>>>>>>>>>> Chose Runner Type: TopK Tactic: 3
Formats and tactics selection completed in 2.53455 seconds.
After reformat layers: 26 layers
Block size 268435456
Block size 2764800
Block size 1382400
Block size 1036800
Block size 691200
Block size 345600
Block size 76800
Total Activation Memory: 274733056
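The Total Activation Memory figure is just the sum of the block sizes listed above; a quick arithmetic check:

```python
# Activation block sizes (bytes) as reported in the log
block_sizes = [268435456, 2764800, 1382400, 1036800, 691200, 345600, 76800]
total = sum(block_sizes)
print(total)  # 274733056, matching "Total Activation Memory: 274733056"
```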
Detected 1 inputs and 2 output network tensors.
Conv_0 + Relu_1 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Conv_5 + Relu_6 Set Tactic Name: volta_scudnn_128x32_relu_medium_nn_v1 Tactic: -4420849921117327522
Conv_7 + Relu_8 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_10 + Relu_11 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_12 + Relu_13 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_25 + Relu_26 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_27 + Relu_28 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_34 + Relu_35 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_36 + Relu_37 Set Tactic Name: volta_scudnn_winograd_128x128_ldg1_ldg4_relu_tile148t_nt_v1 Tactic: 2775507031594384867
Conv_45 + Relu_46 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Conv_47 Set Tactic Name: volta_scudnn_128x32_relu_small_nn_v1 Tactic: -3946921629105938337
Layer: Conv_0 + Relu_1 HostPersistent: 1664 DevicePersistent: 260608
Layer: Conv_2 + Relu_3 HostPersistent: 2192 DevicePersistent: 0
Layer: MaxPool_4 HostPersistent: 0 DevicePersistent: 0
Layer: Conv_5 + Relu_6 HostPersistent: 2176 DevicePersistent: 69632
Layer: Conv_7 + Relu_8 HostPersistent: 512 DevicePersistent: 42496
Layer: MaxPool_9 HostPersistent: 0 DevicePersistent: 0
Layer: Conv_10 + Relu_11 HostPersistent: 512 DevicePersistent: 51712
Layer: Conv_12 + Relu_13 HostPersistent: 512 DevicePersistent: 102912
Layer: MaxPool_14 HostPersistent: 0 DevicePersistent: 0
Layer: Conv_15 + Relu_16 HostPersistent: 2192 DevicePersistent: 0
Layer: Conv_17 + Relu_18 HostPersistent: 2192 DevicePersistent: 0
Layer: Conv_19 + Relu_20 HostPersistent: 2192 DevicePersistent: 0
Layer: ConvTranspose_21 + BatchNormalization_22 + Relu_23 HostPersistent: 0 DevicePersistent: 0
Layer: 166 copy HostPersistent: 0 DevicePersistent: 0
Layer: Conv_25 + Relu_26 HostPersistent: 512 DevicePersistent: 307712
Layer: Conv_27 + Relu_28 HostPersistent: 512 DevicePersistent: 102912
Layer: ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32 HostPersistent: 0 DevicePersistent: 0
Layer: 177 copy HostPersistent: 0 DevicePersistent: 0
Layer: Conv_34 + Relu_35 HostPersistent: 512 DevicePersistent: 84480
Layer: Conv_36 + Relu_37 HostPersistent: 512 DevicePersistent: 42496
Layer: ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41 HostPersistent: 0 DevicePersistent: 0
Layer: 188 copy HostPersistent: 0 DevicePersistent: 0
Layer: Conv_43 + Relu_44 HostPersistent: 2192 DevicePersistent: 0
Layer: Conv_45 + Relu_46 HostPersistent: 1664 DevicePersistent: 262144
Layer: Conv_47 HostPersistent: 1664 DevicePersistent: 259584
Layer: (Unnamed Layer* 48) [TopK] HostPersistent: 0 DevicePersistent: 0
Total Host Persistent Memory: 21712
Total Device Persistent Memory: 1586688
Total Scratch Memory: 4147200
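The two persistent-memory totals are likewise sums over the 26 per-layer HostPersistent / DevicePersistent values listed above; a quick check:

```python
# Per-layer HostPersistent and DevicePersistent values (bytes), in log order
host_persistent = [1664, 2192, 0, 2176, 512, 0, 512, 512, 0, 2192, 2192, 2192,
                   0, 0, 512, 512, 0, 0, 512, 512, 0, 0, 2192, 1664, 1664, 0]
device_persistent = [260608, 0, 0, 69632, 42496, 0, 51712, 102912, 0, 0, 0, 0,
                     0, 0, 307712, 102912, 0, 0, 84480, 42496, 0, 0, 0,
                     262144, 259584, 0]
print(sum(host_persistent))    # 21712
print(sum(device_persistent))  # 1586688
```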
[MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 4 MiB
Using cublasLt as a tactic source
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1704, GPU 979 (MiB)
Using cuDNN as a tactic source
[MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1704, GPU 987 (MiB)
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1704, GPU 971 (MiB)
Engine generation completed in 3.18454 seconds.
Deleting timing cache: 66 entries, 18 hits
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1704, GPU 953 (MiB)
Engine Layer Information:
Layer(CaskConvolution): Conv_0 + Relu_1, Tactic: -3946921629105938337, input[Float(1,1,180,240)] -> 136[Float(1,8,180,240)]
Layer(FusedConvActConvolution): Conv_2 + Relu_3, Tactic: 1572863, 136[Float(1,8,180,240)] -> 189[Float(1,8,180,240)]
Layer(TiledPooling): MaxPool_4, Tactic: 6226177, 189[Float(1,8,180,240)] -> 140[Float(1,8,90,120)]
Layer(CaskConvolution): Conv_5 + Relu_6, Tactic: -4420849921117327522, 140[Float(1,8,90,120)] -> 143[Float(1,16,90,120)]
Layer(CaskConvolution): Conv_7 + Relu_8, Tactic: 2775507031594384867, 143[Float(1,16,90,120)] -> 178[Float(1,16,90,120)]
Layer(TiledPooling): MaxPool_9, Tactic: 5636353, 178[Float(1,16,90,120)] -> 147[Float(1,16,45,60)]
Layer(CaskConvolution): Conv_10 + Relu_11, Tactic: 2775507031594384867, 147[Float(1,16,45,60)] -> 150[Float(1,32,45,60)]
Layer(CaskConvolution): Conv_12 + Relu_13, Tactic: 2775507031594384867, 150[Float(1,32,45,60)] -> 167[Float(1,32,45,60)]
Layer(TiledPooling): MaxPool_14, Tactic: 7537922, 167[Float(1,32,45,60)] -> 154[Float(1,32,15,20)]
Layer(FusedConvActConvolution): Conv_15 + Relu_16, Tactic: 10485759, 154[Float(1,32,15,20)] -> 157[Float(1,64,15,20)]
Layer(FusedConvActConvolution): Conv_17 + Relu_18, Tactic: 10682367, 157[Float(1,64,15,20)] -> 160[Float(1,64,15,20)]
Layer(FusedConvActConvolution): Conv_19 + Relu_20, Tactic: 10682367, 160[Float(1,64,15,20)] -> 163[Float(1,64,15,20)]
Layer(GemmDeconvolution): ConvTranspose_21 + BatchNormalization_22 + Relu_23, Tactic: 0, 163[Float(1,64,15,20)] -> 166[Float(1,64,45,60)]
Layer(Reformat): 166 copy, Tactic: 0, 166[Float(1,64,45,60)] -> 167[Float(1,64,45,60)]
Layer(CaskConvolution): Conv_25 + Relu_26, Tactic: 2775507031594384867, 167[Float(1,96,45,60)] -> 170[Float(1,32,45,60)]
Layer(CaskConvolution): Conv_27 + Relu_28, Tactic: 2775507031594384867, 170[Float(1,32,45,60)] -> 173[Float(1,32,45,60)]
Layer(GemmDeconvolution): ConvTranspose_29 + Pad_30 + BatchNormalization_31 + Relu_32, Tactic: 0, 173[Float(1,32,45,60)] -> 177[Float(1,16,90,120)]
Layer(Reformat): 177 copy, Tactic: 0, 177[Float(1,16,90,120)] -> 178[Float(1,16,90,120)]
Layer(CaskConvolution): Conv_34 + Relu_35, Tactic: 2775507031594384867, 178[Float(1,32,90,120)] -> 181[Float(1,16,90,120)]
Layer(CaskConvolution): Conv_36 + Relu_37, Tactic: 2775507031594384867, 181[Float(1,16,90,120)] -> 184[Float(1,16,90,120)]
Layer(GemmDeconvolution): ConvTranspose_38 + Pad_39 + BatchNormalization_40 + Relu_41, Tactic: 0, 184[Float(1,16,90,120)] -> 188[Float(1,8,180,240)]
Layer(Reformat): 188 copy, Tactic: 0, 188[Float(1,8,180,240)] -> 189[Float(1,8,180,240)]
Layer(FusedConvActConvolution): Conv_43 + Relu_44, Tactic: 10485759, 189[Float(1,16,180,240)] -> 192[Float(1,8,180,240)]
Layer(CaskConvolution): Conv_45 + Relu_46, Tactic: -3946921629105938337, 192[Float(1,8,180,240)] -> 195[Float(1,8,180,240)]
Layer(CaskConvolution): Conv_47, Tactic: -3946921629105938337, 195[Float(1,8,180,240)] -> raw_conv_out[Float(1,3,180,240)]
Layer(TopK): (Unnamed Layer* 48) [TopK], Tactic: 3, raw_conv_out[Float(1,3,180,240)] -> (Unnamed Layer* 48) [TopK]_output_1[Float(1,1,180,240)], dynamic_final_out[Int32(1,1,180,240)]
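The engine's final stage pairs Conv_47's 3-channel score map (raw_conv_out, Float(1,3,180,240)) with a TopK over the channel axis; since both TopK outputs are (1,1,180,240), K must be 1, i.e. a per-pixel max score plus class index (dynamic_final_out). A tiny pure-Python sketch of that argmax-over-channels, on a hypothetical 2x2 image instead of 180x240:

```python
# Stand-in scores with shape (classes=3, H=2, W=2) instead of (3, 180, 240)
raw = [
    [[0.1, 0.9], [0.3, 0.2]],  # class 0 scores
    [[0.8, 0.1], [0.4, 0.7]],  # class 1 scores
    [[0.2, 0.5], [0.9, 0.6]],  # class 2 scores
]

H, W = 2, 2
# TopK with K=1 over channels: per-pixel max score and winning class index
values = [[max(raw[c][y][x] for c in range(3)) for x in range(W)] for y in range(H)]
indices = [[max(range(3), key=lambda c: raw[c][y][x]) for x in range(W)] for y in range(H)]
print(indices)  # [[1, 0], [2, 1]]
```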
[MemUsageSnapshot] Builder end: CPU 1704 MiB, GPU 953 MiB
[MemUsageSnapshot] ExecutionContext creation begin: CPU 1703 MiB, GPU 953 MiB
Using cublasLt as a tactic source
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1703, GPU 963 (MiB)
Using cuDNN as a tactic source
[MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1703, GPU 971 (MiB)
Total per-runner device memory is 1586688
Total per-runner host memory is 21712
Allocated activation device memory of size 9830400
[MemUsageSnapshot] ExecutionContext creation end: CPU 1703 MiB, GPU 981 MiB
[MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1702, GPU 963 (MiB)