@connorgoggins
Created February 1, 2020 00:34

Runtime Features

  1. BLAS_APPLE : ✖
  2. BLAS_ATLAS : ✖
  3. BLAS_MKL : ✖
  4. BLAS_OPEN : ✔
  5. CAFFE : ✖
  6. CPU_AVX : ✔
  7. CPU_AVX2 : ✖
  8. CPU_SSE : ✔
  9. CPU_SSE2 : ✔
  10. CPU_SSE3 : ✔
  11. CPU_SSE4A : ✖
  12. CPU_SSE4_1 : ✔
  13. CPU_SSE4_2 : ✔
  14. CUDA : ✔
  15. CUDA_RTC : ✖
  16. CUDNN : ✔
  17. CXX14 : ✖
  18. DEBUG : ✖
  19. DIST_KVSTORE : ✖
  20. F16C : ✔
  21. INT64_TENSOR_SIZE : ✖
  22. JEMALLOC : ✖
  23. LAPACK : ✔
  24. MKLDNN : ✔
  25. NCCL : ✖
  26. OPENCV : ✖
  27. OPENMP : ✔
  28. PROFILER : ✖
  29. SIGNAL_HANDLER : ✖
  30. SSE : ✖
  31. TENSORRT : ✖
  32. TVM_OP : ✖
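
To compare this build against another one, it can help to turn the listing into a dictionary. The following is only a minimal sketch: `parse_features` and `SAMPLE` are hypothetical names introduced here, and the regular expression assumes each entry has the form `index. NAME : mark`, ignoring anything that follows the mark.

```python
import re

# A few entries from the listing above, trimmed to index, name, and mark.
SAMPLE = """\
14. CUDA : ✔
15. CUDA_RTC : ✖
24. MKLDNN : ✔
"""

# Matches "<index>. <NAME> : <mark>"; text after the mark is ignored.
_LINE = re.compile(r"^\s*\d+\.\s*(\w+)\s*:\s*([✔✖])")


def parse_features(text):
    """Return {feature_name: enabled} for every line that matches the pattern."""
    features = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            name, mark = match.groups()
            features[name] = (mark == "✔")
    return features


features = parse_features(SAMPLE)
enabled = sorted(name for name, on in features.items() if on)
print(enabled)  # ['CUDA', 'MKLDNN']
```

The listing looks like MXNet's runtime feature report; if that is right, `mxnet.runtime.Features().is_enabled('CUDA')` should answer the same question directly on a machine where MXNet is installed.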

Benchmark Results

| Operator | Avg Forward Time (ms) | Avg Backward Time (ms) | Max Mem Usage (Storage) (Bytes) | Inputs |
| :---: | :---: | :---: | :---: | :---: |
| batch_dot | 28.8235 | --- | 67108.8672 | {'lhs': (32, 1024, 1024), 'rhs': (32, 1024, 1024)} |
| batch_dot | 2.749 | --- | 64000.0 | {'lhs': (32, 1000, 10), 'rhs': (32, 1000, 10), 'transpose_b': True} |
| batch_dot | 0.4239 | --- | 6.4 | {'lhs': (32, 1000, 1), 'rhs': (32, 100, 1000), 'transpose_a': True, 'transpose_b': True} |
| dot | 1.331 | 3.1084 | 2097.1521 | {'lhs': (1024, 1024), 'rhs': (1024, 1024)} |
| dot | 0.2314 | 0.465 | 2000.0 | {'lhs': (1000, 10), 'rhs': (1000, 10), 'transpose_b': True} |
| dot | 0.1888 | 0.094 | 0.2 | {'lhs': (1000, 1), 'rhs': (100, 1000), 'transpose_a': True, 'transpose_b': True} |
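
The forward times are easier to compare across shapes when converted to an approximate throughput. The sketch below assumes the usual count of 2·m·k·n floating-point operations per matrix product, and assumes that `transpose_a` and `transpose_b` swap the last two dimensions of the respective operand. `matmul_flops` and `tflops` are hypothetical helpers written for this illustration, not part of the benchmark tool, and the reported times may include overhead beyond the multiplication itself.

```python
def matmul_flops(lhs, rhs, transpose_a=False, transpose_b=False):
    """Estimate FLOPs for dot/batch_dot, counting 2*m*k*n per matrix product."""
    # A leading batch dimension is present when the shapes are 3-D.
    batch = lhs[0] if len(lhs) == 3 else 1
    a_rows, a_cols = lhs[-2:]
    b_rows, b_cols = rhs[-2:]
    if transpose_a:
        a_rows, a_cols = a_cols, a_rows
    if transpose_b:
        b_rows, b_cols = b_cols, b_rows
    if a_cols != b_rows:
        raise ValueError("inner dimensions do not match")
    return 2 * batch * a_rows * a_cols * b_cols


def tflops(flops, forward_ms):
    """Convert a FLOP count and a time in milliseconds to TFLOP/s."""
    return flops / (forward_ms * 1e-3) / 1e12


# (operator, average forward time in ms, inputs) taken from the table above.
rows = [
    ("batch_dot", 28.8235, {'lhs': (32, 1024, 1024), 'rhs': (32, 1024, 1024)}),
    ("dot", 1.331, {'lhs': (1024, 1024), 'rhs': (1024, 1024)}),
    ("dot", 0.2314, {'lhs': (1000, 10), 'rhs': (1000, 10), 'transpose_b': True}),
]
for name, ms, inputs in rows:
    flops = matmul_flops(**inputs)
    print(f"{name}: {flops:.3e} FLOPs, {tflops(flops, ms):.3f} TFLOP/s")
```

Under these assumptions the small, skinny shapes come out far below the large square ones, which would be consistent with fixed per-call overhead dominating at that size.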