
The main inspiration comes from [here][1].

Here is what a deep learning system stack looks like nowadays:

  1. Operator-level graph description languages: the IR of whatever DL framework you care about, plus [ONNX][2].
  2. Tensor-primitive-level graph description languages: [NNVM][3], [HLO/XLA][4], [nGraph][5]. This layer is close enough to the first that you can also build graph optimizations on the first layer and bypass it.
  3. DSLs for description and codegen: TVM, and image-processing languages like [Halide][6] and [Darkroom][7].
  4. Hand-optimized kernel libraries: [nnpack][8], [cudnn][9], [libdnn][10].
  5. Device-dependent libraries: [maxas][11] (an assembler for the NVIDIA Maxwell architecture).
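As a rough illustration of layer 1, an operator-level graph can be sketched as a plain data structure plus an interpreter. The node and operator names below are invented for illustration; this is not the IR of ONNX or any real framework.

```python
import math

# Minimal sketch of an operator-level graph IR: each node names an operator
# and its input nodes; an interpreter walks the graph in order, mimicking
# what a framework's graph executor does.
GRAPH = [
    {"name": "x", "op": "input"},
    {"name": "y", "op": "add_const", "inputs": ["x"], "attrs": {"value": 1.0}},
    {"name": "z", "op": "exp",       "inputs": ["y"]},
]

OPS = {
    "add_const": lambda vals, attrs: [v + attrs["value"] for v in vals[0]],
    "exp":       lambda vals, attrs: [math.exp(v) for v in vals[0]],
}

def run_graph(graph, feed):
    env = dict(feed)  # node name -> computed value
    for node in graph:
        if node["op"] == "input":
            continue
        vals = [env[i] for i in node["inputs"]]
        env[node["name"]] = OPS[node["op"]](vals, node.get("attrs", {}))
    return env

env = run_graph(GRAPH, {"x": [0.0, 1.0]})
print(env["z"])  # elementwise exp(x + 1)
```

Everything below this layer (tensor primitives, codegen DSLs, kernel libraries) is concerned with turning such a graph into fast device code.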
snowolfhawk / tvm.cpp — Created August 28, 2019 12:33 — forked from log0div0/tvm.cpp
#include <iostream>
#include <chrono>
#include <stdexcept>
#include <memory>
#include <fstream>
#include <iterator>
#include <algorithm>
#include <tvm/runtime/module.h>
#include <tvm/runtime/registry.h>
snowolfhawk / Block-Sparse GEMM.ipynb — Created August 28, 2019 12:07 — forked from ajtulloch/Block-Sparse GEMM.ipynb
snowolfhawk / relay_autodiff_test.py — Created August 28, 2019 12:01 — forked from grwlf/relay_autodiff_test.py
import tvm
import topi
import numpy as np
from tvm.testing import check_numerical_grads, estimate_performance, PerformanceEstimate
import time
import inspect
import sys
from exercise.runners import run_tvm
# Whether to dump the generated code
import tvm
from tvm import relay
def test_fuse_simple():
    """Simple testcase."""
    def before():
        x = relay.var("x", shape=(10, 20))
        y = relay.add(x, relay.const(1, "float32"))
        z = relay.exp(y)
        return relay.Function([x], z)
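The pass under test fuses the `add` and `exp` above into a single kernel. What fusion buys can be sketched without TVM at all; this plain-Python model (names invented, real fusion operates on Relay IR, not lists) just shows the memory-traffic difference:

```python
import math

x = [0.5 * i for i in range(10)]

# Unfused: two passes over the data and one intermediate buffer y.
y = [v + 1.0 for v in x]               # add
z_unfused = [math.exp(v) for v in y]   # exp

# Fused: one pass, no intermediate buffer -- the point of operator fusion.
z_fused = [math.exp(v + 1.0) for v in x]

assert all(abs(a - b) < 1e-12 for a, b in zip(z_unfused, z_fused))
```

On real hardware the fused form avoids materializing `y` in memory, which is where the speedup comes from.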
#include <random>
#include <iomanip>
#include <array>
#include <exception>
#define NDEBUG
#include <tvm/tvm.h>
#include <tvm/build_module.h>
#include <topi/broadcast.h>
#undef NDEBUG
"""
Quantization module for generating the calibration tables will be used by
quantized (INT8) models from FP32 models.with bucket split,[k, k, cin, cout]
cut into "cout" buckets.
This tool is based on Caffe Framework.
"""
from __future__ import division
from __future__ import print_function
import argparse
import numpy as np
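A per-output-channel ("bucket") calibration step can be sketched in plain Python. The scale formula below assumes symmetric INT8 quantization into [-127, 127]; it illustrates the bucket-split idea, not this tool's exact algorithm.

```python
# Hypothetical per-channel INT8 calibration: one scale per output channel
# (the "cout" buckets), assuming symmetric quantization.

def per_channel_scales(weights):
    """weights: nested list indexed as [k][k][cin][cout]."""
    cout = len(weights[0][0][0])
    maxima = [0.0] * cout
    for plane in weights:                        # k
        for row in plane:                        # k
            for channel in row:                  # cin
                for c, w in enumerate(channel):  # cout
                    maxima[c] = max(maxima[c], abs(w))
    # each scale maps the float range [-max, max] onto int8 [-127, 127]
    return [m / 127.0 if m > 0 else 1.0 for m in maxima]

# 1x1 kernel, cin=1, cout=2 -> two buckets, two scales
w = [[[[0.5, -2.0]]]]
print(per_channel_scales(w))  # [0.5/127, 2.0/127]
```

Calibrating per output channel rather than per tensor keeps a single large-magnitude filter from wasting the int8 range of every other filter.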

External Tensor Functions

Although TVM supports transparent code generation, it is sometimes helpful to incorporate hand-written code into the pipeline. For example, we might want to use cuDNN for some of the convolution kernels and define the remaining stages ourselves.

TVM natively supports such black-box function calls. TVM supports every DLPack-compatible tensor function, which means we can call any function that takes POD types (pointer, int, float) or a pointer to a DLTensor as its arguments.

from __future__ import absolute_import, print_function

import tvm
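The black-box calling convention described above can be modeled without TVM: a registry maps names to functions whose arguments are restricted to POD values and tensor handles. This is a toy model of the idea, not TVM's actual PackedFunc or `te.extern` API.

```python
# Toy model of a black-box "packed" call: every argument is either a POD
# value (int, float) or a handle to a tensor buffer (here, a plain list).
REGISTRY = {}

def register(name):
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

def call_packed(name, *args):
    # The caller never sees the callee's internals -- only the convention
    # that arguments are PODs or tensor handles.
    for a in args:
        assert isinstance(a, (int, float, list)), "POD or tensor handle only"
    return REGISTRY[name](*args)

@register("my.scale_add")
def scale_add(x, scale, offset):
    # x is a "tensor" handle; scale and offset are PODs
    return [v * scale + offset for v in x]

print(call_packed("my.scale_add", [1.0, 2.0], 2.0, 0.5))  # [2.5, 4.5]
```

In real TVM the same shape of call lets hand-written cuDNN or BLAS kernels slot into a generated pipeline, since DLPack fixes a common tensor layout both sides agree on.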