@nicolasvasilache
Created December 2, 2021 10:30
Depthwise conv 1d
> export MLIR_RUNNER_UTILS_LIB=${IREE_LLVM_SANDBOX_BUILD_DIR}/lib/libmlir_runner_utils.so; cd ${IREE_LLVM_SANDBOX_SOURCE_DIR}; python -m python.examples.depthwise_conv.depthwise_conv_1d_bench
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [1], 'dilations': [1]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [1], 'dilations': [1]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.09872s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.8e-05   4.7e-07   4.5e-07   4.5e-07   4.1e-07   4.1e-07   4.0e-07   4.0e-07   3.9e-07   seconds
1.38      52.74     54.49     55.10     59.36     60.24     60.83     62.06     62.38     GFlops/s
2.89      110.70    114.38    115.66    124.60    126.43    127.68    130.26    130.92    GBs/s
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [1], 'dilations': [2]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [1], 'dilations': [2]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.1022s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.0e-05   3.6e-07   3.4e-07   3.4e-07   3.3e-07   3.3e-07   3.3e-07   3.2e-07   3.2e-07   seconds
2.46      68.08     72.28     73.14     74.02     74.70     75.62     76.56     77.28     GFlops/s
5.37      148.57    157.74    159.62    161.54    163.02    165.02    167.08    168.65    GBs/s
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [1]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [1]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.1083s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.9e-05   7.9e-07   6.9e-07   6.8e-07   6.7e-07   6.6e-07   6.5e-07   6.4e-07   6.3e-07   seconds
1.29      31.07     35.57     36.19     36.74     37.29     37.69     38.34     38.89     GFlops/s
3.50      84.63     96.88     98.59     100.07    101.58    102.67    104.44    105.92    GBs/s
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [2]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [2]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.1046s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.9e-06   5.7e-07   5.4e-07   5.3e-07   5.2e-07   5.1e-07   5.0e-07   4.9e-07   4.7e-07   seconds
12.87     43.27     45.09     46.20     47.26     48.38     49.15     50.46     52.07     GFlops/s
36.12     121.46    126.59    129.68    132.68    135.81    137.98    141.67    146.17    GBs/s
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [3]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [2], 'dilations': [3]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.1172s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.5e-05   1.1e-06   8.9e-07   8.8e-07   8.7e-07   8.0e-07   7.1e-07   7.0e-07   6.8e-07   seconds
1.69      21.77     27.64     27.96     28.22     30.76     34.61     35.31     36.25     GFlops/s
4.89      62.92     79.91     80.82     81.56     88.91     100.06    102.07    104.78    GBs/s
###############################################################
Compile-time problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [3], 'dilations': [2]}
Runtime problem size {'N': 8, 'W': 16, 'C': 32, 'KW': 3, 'strides': [3], 'dilations': [2]}
Problem types [<class 'numpy.float32'>, <class 'numpy.float32'>, <class 'numpy.float32'>]
Compilation expert <python.examples.core.transform.TransformationList object at 0x7f4fccc7c1f0>
compilation in 0.1183s
xxxxxxxxxx : 1000 iters time on 1 threads
------------------------------------------------------------------------------------------------------------------------
slowest   p1        p10       p25       p50       p75       p90       p99       fastest   unit
------------------------------------------------------------------------------------------------------------------------
1.9e-05   1.2e-06   1.2e-06   1.2e-06   1.2e-06   1.2e-06   1.2e-06   9.9e-07   9.7e-07   seconds
1.31      20.28     20.60     20.76     20.92     21.08     21.30     24.95     25.23     GFlops/s
4.50      69.60     70.71     71.24     71.79     72.34     73.10     85.64     86.60     GBs/s
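
The GFlops/s and GBs/s rows appear to be derived from the timing row and the problem size. The sketch below reproduces that arithmetic under assumptions inferred from the printed numbers, not taken from the benchmark source: W is treated as the output width, each multiply-accumulate counts as 2 flops, elements are 4-byte float32, and the output buffer is counted twice (one read, one write). All function names are hypothetical.

```python
def input_width(W, KW, stride, dilation):
    # Input extent needed to produce W outputs with no padding (assumption).
    return (W - 1) * stride + (KW - 1) * dilation + 1


def flop_count(N, W, C, KW):
    # One multiply and one add per (n, w, c, kw) point.
    return 2 * N * W * C * KW


def byte_count(N, W, C, KW, stride, dilation, itemsize=4):
    # Input read + kernel read + output read and written (assumption).
    inp = N * input_width(W, KW, stride, dilation) * C
    ker = KW * C
    out = N * W * C
    return itemsize * (inp + ker + 2 * out)


def rates(seconds, N, W, C, KW, stride, dilation):
    # Returns (GFlops/s, GB/s) for one measured run time.
    gflops = flop_count(N, W, C, KW) / seconds / 1e9
    gbytes = byte_count(N, W, C, KW, stride, dilation) / seconds / 1e9
    return gflops, gbytes


if __name__ == "__main__":
    # First configuration above: strides [1], dilations [1].
    print(flop_count(8, 16, 32, 3), byte_count(8, 16, 32, 3, 1, 1))
```

With these assumptions the first configuration gives 24576 flops and 51584 bytes, a bytes-to-flops ratio of about 2.10, which agrees with the reported 130.92 / 62.38. The printed times are rounded to two significant digits, so rates recomputed from them only match the table approximately.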