
The main inspiration comes from [here][1].

Here is what a deep learning system stack looks like nowadays:

  1. Operator-level graph description languages: the IR of whatever DL framework you care about, plus [ONNX][2].
  2. Tensor-primitive-level graph description languages: [NNVM][3], [HLO/XLA][4], [nGraph][5]. This layer is close enough to the first that you can also build graph optimizations on the first layer and bypass it.
  3. DSLs for description and codegen: TVM, and image-processing languages like [Halide][6] and [Darkroom][7].
  4. Hand-optimized kernel libraries: [nnpack][8], [cudnn][9], [libdnn][10].
  5. Device-dependent libraries: [maxas][11] (an assembler for the NVIDIA Maxwell architecture).
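As a rough illustration of layer 1, an operator-level graph can be sketched as a plain data structure plus an interpreter. The node and operator names below are invented for illustration; this is not the IR of ONNX or any real framework.

```python
import math

# Minimal sketch of an operator-level graph IR: each node names an operator
# and its input nodes; an interpreter walks the graph in order, mimicking
# what a framework's graph executor does.
GRAPH = [
    {"name": "x", "op": "input"},
    {"name": "y", "op": "add_const", "inputs": ["x"], "attrs": {"value": 1.0}},
    {"name": "z", "op": "exp",       "inputs": ["y"]},
]

OPS = {
    "add_const": lambda vals, attrs: [v + attrs["value"] for v in vals[0]],
    "exp":       lambda vals, attrs: [math.exp(v) for v in vals[0]],
}

def run_graph(graph, feed):
    env = dict(feed)  # node name -> computed value
    for node in graph:
        if node["op"] == "input":
            continue
        vals = [env[i] for i in node["inputs"]]
        env[node["name"]] = OPS[node["op"]](vals, node.get("attrs", {}))
    return env

env = run_graph(GRAPH, {"x": [0.0, 1.0]})
print(env["z"])  # elementwise exp(x + 1)
```

Everything below this layer (tensor primitives, codegen DSLs, kernel libraries) is concerned with turning such a graph into fast device code.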
snowolfhawk / tvm.cpp — Created August 28, 2019 12:33 — forked from log0div0/tvm.cpp
#include <iostream>
#include <chrono>
#include <stdexcept>
#include <memory>
#include <fstream>
#include <iterator>
#include <algorithm>
#include <tvm/runtime/module.h>
#include <tvm/runtime/registry.h>
snowolfhawk / Block-Sparse GEMM.ipynb — Created August 28, 2019 12:07 — forked from ajtulloch/Block-Sparse GEMM.ipynb
snowolfhawk / relay_autodiff_test.py — Created August 28, 2019 12:01 — forked from grwlf/relay_autodiff_test.py
import tvm
import topi
import numpy as np
from tvm.testing import check_numerical_grads, estimate_performance, PerformanceEstimate
import time
import inspect
import sys
from exercise.runners import run_tvm
# Whether to dump the generated code
import tvm
from tvm import relay
def test_fuse_simple():
    """Simple testcase."""
    def before():
        x = relay.var("x", shape=(10, 20))
        y = relay.add(x, relay.const(1, "float32"))
        z = relay.exp(y)
        return relay.Function([x], z)
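The pass under test fuses the `add` and `exp` above into a single kernel. What fusion buys can be sketched without TVM at all; this plain-Python model (names invented, real fusion operates on Relay IR, not lists) just shows the memory-traffic difference:

```python
import math

x = [0.5 * i for i in range(10)]

# Unfused: two passes over the data and one intermediate buffer y.
y = [v + 1.0 for v in x]               # add
z_unfused = [math.exp(v) for v in y]   # exp

# Fused: one pass, no intermediate buffer -- the point of operator fusion.
z_fused = [math.exp(v + 1.0) for v in x]

assert all(abs(a - b) < 1e-12 for a, b in zip(z_unfused, z_fused))
```

On real hardware the fused form avoids materializing `y` in memory, which is where the speedup comes from.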
#include <random>
#include <iomanip>
#include <array>
#include <exception>
#define NDEBUG
#include <tvm/tvm.h>
#include <tvm/build_module.h>
#include <topi/broadcast.h>
#undef NDEBUG
"""
Quantization module for generating the calibration tables will be used by
quantized (INT8) models from FP32 models.with bucket split,[k, k, cin, cout]
cut into "cout" buckets.
This tool is based on Caffe Framework.
"""
from __future__ import division
from __future__ import print_function
import argparse
import numpy as np
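A per-output-channel ("bucket") calibration step can be sketched in plain Python. The scale formula below assumes symmetric INT8 quantization into [-127, 127]; it illustrates the bucket-split idea, not this tool's exact algorithm.

```python
# Hypothetical per-channel INT8 calibration: one scale per output channel
# (the "cout" buckets), assuming symmetric quantization.

def per_channel_scales(weights):
    """weights: nested list indexed as [k][k][cin][cout]."""
    cout = len(weights[0][0][0])
    maxima = [0.0] * cout
    for plane in weights:                        # k
        for row in plane:                        # k
            for channel in row:                  # cin
                for c, w in enumerate(channel):  # cout
                    maxima[c] = max(maxima[c], abs(w))
    # each scale maps the float range [-max, max] onto int8 [-127, 127]
    return [m / 127.0 if m > 0 else 1.0 for m in maxima]

# 1x1 kernel, cin=1, cout=2 -> two buckets, two scales
w = [[[[0.5, -2.0]]]]
print(per_channel_scales(w))  # [0.5/127, 2.0/127]
```

Calibrating per output channel rather than per tensor keeps a single large-magnitude filter from wasting the int8 range of every other filter.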

External Tensor Functions

Although TVM supports transparent code generation, it is sometimes helpful to incorporate hand-written code into the pipeline. For example, we might want to use cuDNN for some of the convolution kernels and define the remaining stages ourselves.

TVM natively supports such black-box function calls. TVM supports every DLPack-compatible tensor function, which means we can call any function that takes POD types (pointer, int, float) or a pointer to a DLTensor as its arguments.

from __future__ import absolute_import, print_function

import tvm
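The black-box calling convention described above can be modeled without TVM: a registry maps names to functions whose arguments are restricted to POD values and tensor handles. This is a toy model of the idea, not TVM's actual PackedFunc or `te.extern` API.

```python
# Toy model of a black-box "packed" call: every argument is either a POD
# value (int, float) or a handle to a tensor buffer (here, a plain list).
REGISTRY = {}

def register(name):
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap

def call_packed(name, *args):
    # The caller never sees the callee's internals -- only the convention
    # that arguments are PODs or tensor handles.
    for a in args:
        assert isinstance(a, (int, float, list)), "POD or tensor handle only"
    return REGISTRY[name](*args)

@register("my.scale_add")
def scale_add(x, scale, offset):
    # x is a "tensor" handle; scale and offset are PODs
    return [v * scale + offset for v in x]

print(call_packed("my.scale_add", [1.0, 2.0], 2.0, 0.5))  # [2.5, 4.5]
```

In real TVM the same shape of call lets hand-written cuDNN or BLAS kernels slot into a generated pipeline, since DLPack fixes a common tensor layout both sides agree on.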