{
"cells": [
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import tvm\nfrom tvm import relay\nimport logging\nlogging.basicConfig(level=logging.DEBUG)\n",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "X = tvm.placeholder((10,), name=\"X\")\nY = tvm.placeholder((10,), name=\"Y\")\n\nZ = tvm.compute(X.shape, lambda i: X[i] + Y[i], name=\"Z\")\nZ_relu = tvm.compute(Z.shape, lambda i: tvm.max(Z[i], 0), name=\"Z_relu\")",
"execution_count": 2,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Default Schedule\n\nFirst, we see that the default schedule does two separate passes, one to compute the sum, another to compute the ReLU."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "s = tvm.create_schedule(Z_relu.op)\nprint(tvm.lower(s, [X, Y, Z_relu], simple_mode=True))",
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": "// attr [Z] storage_scope = \"global\"\nallocate Z[float32 * 10]\nproduce Z {\n for (i, 0, 10) {\n Z[i] = (X[i] + Y[i])\n }\n}\nproduce Z_relu {\n for (i, 0, 10) {\n Z_relu[i] = max(Z[i], 0.000000f)\n }\n}\n\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Fused Schedule\n\nNow, we compute the addition 'inline' - that is, we compute it at the point where it is used (in the ReLU). This allows us to compute the entire expression in a single pass over the input data."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "s = tvm.create_schedule(Z_relu.op)\ns[Z].compute_inline()\nprint(tvm.lower(s, [X, Y, Z_relu], simple_mode=True))",
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"text": "produce Z_relu {\n for (i, 0, 10) {\n Z_relu[i] = max((X[i] + Y[i]), 0.000000f)\n }\n}\n\n",
"name": "stdout"
}
]
},
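{
"metadata": {},
"cell_type": "markdown",
"source": "As a sanity check, we can build the fused schedule and compare it against a NumPy reference. (This is a minimal sketch, assuming an LLVM-enabled build of the same TVM 0.x release used above.)"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import numpy as np\n\n# Compile the fused schedule into a callable module (assumes the LLVM backend is enabled).\nfadd_relu = tvm.build(s, [X, Y, Z_relu], \"llvm\")\n\nctx = tvm.cpu(0)\nx_np = np.random.uniform(-1, 1, size=10).astype(\"float32\")\ny_np = np.random.uniform(-1, 1, size=10).astype(\"float32\")\nz_tvm = tvm.nd.array(np.zeros(10, dtype=\"float32\"), ctx)\n\nfadd_relu(tvm.nd.array(x_np, ctx), tvm.nd.array(y_np, ctx), z_tvm)\n\n# The single fused pass should match max(x + y, 0) computed in NumPy.\nnp.testing.assert_allclose(z_tvm.asnumpy(), np.maximum(x_np + y_np, 0))",
"execution_count": null,
"outputs": []
},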
{
"metadata": {},
"cell_type": "markdown",
"source": "## Fusion at the Relay level\n\nNow, let's construct a simple graph of Relay IR. This is a simple Add -> Exp -> ReLU graph."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "x = relay.var(\"x\", shape=(10, 32))\ny = relay.add(x, relay.const(1, \"float32\"))\nz = relay.exp(y)\nw = relay.maximum(z, relay.const(0, \"float32\"))\n\nf = relay.Function([x], w)\nf = relay.ir_pass.infer_type(f)",
"execution_count": 5,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "We see that the graph is a single function with three instructions (add, exp, maximum)."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "print(f.astext(show_meta_data=False))",
"execution_count": 6,
"outputs": [
{
"output_type": "stream",
"text": "v0.0.1\n%3 = fn (%x: Tensor[(10, 32), float32]) -> Tensor[(10, 32), float32] {\n %0 = add(%x, 1f) // ty=Tensor[(10, 32), float32]\n %1 = exp(%0) // ty=Tensor[(10, 32), float32]\n %2 = maximum(%1, 0f) // ty=Tensor[(10, 32), float32]\n %2\n}\n%3\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Fusion Pass\n\nNow, we invoke operator fusion, we see that the graph is decomposed into a separate Relay function \n\n```\n %3 = fn (%p0: Tensor[(10, 20), float32], __dict__=meta[StrMap][0]) -> Tensor[(10, 20), float32] {\n %0 = add(%p0, 1f) // ty=Tensor[(10, 20), float32]\n %1 = exp(%0) // ty=Tensor[(10, 20), float32]\n %2 = maximum(%1, 0f) // ty=Tensor[(10, 20), float32]\n %2\n }\n```\n\nWe will then generate the fused HalideIR for this subgraph directly."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "ff = relay.ir_pass.fuse_ops(f, opt_level=2)\nff = relay.ir_pass.infer_type(ff)\n\nprint(ff.astext(show_meta_data=False))",
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": "v0.0.1\n%5 = fn (%x: Tensor[(10, 32), float32]) -> Tensor[(10, 32), float32] {\n %3 = fn (%p0: Tensor[(10, 32), float32], __dict__=meta[StrMap][0]) -> Tensor[(10, 32), float32] {\n %0 = add(%p0, 1f) // ty=Tensor[(10, 32), float32]\n %1 = exp(%0) // ty=Tensor[(10, 32), float32]\n %2 = maximum(%1, 0f) // ty=Tensor[(10, 32), float32]\n %2\n }\n %4 = %3(%x) // ty=Tensor[(10, 32), float32]\n %4\n}\n%5\n// meta data omitted. you can use show_meta_data=True to include meta data\n",
"name": "stdout"
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Generated HalideIR for fused blocks\n\nWe can now invoke the compilation flow and see the exact HalideIR we generate for our fused block. We produce a function called `fused_add_exp_maximum`, where the HalideIR is what we'd expect:\n\n```\nproduce tensor {\n parallel (ax0, 0, 10) {\n for (ax1.outer, 0, 2) {\n tensor[ramp((((ax0*2) + ax1.outer)*16), 1, 16)] = max(exp((placeholder[ramp((((ax0*2) + ax1.outer)*16), 1, 16)] + x16(1.000000f))), x16(0.000000f))\n }\n }\n}\n```"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "\n_ = relay.build(ff, target=\"llvm -mcpu=core-avx2\")",
"execution_count": 8,
"outputs": [
{
"output_type": "stream",
"text": "DEBUG:autotvm:Finish loading 35 records\nDEBUG:root:lower function fused_add_exp_maximum\nDEBUG:root:produce tensor {\n parallel (ax0, 0, 10) {\n for (ax1.outer, 0, 2) {\n tensor[ramp((((ax0*2) + ax1.outer)*16), 1, 16)] = max(exp((placeholder[ramp((((ax0*2) + ax1.outer)*16), 1, 16)] + x16(1.000000f))), x16(0.000000f))\n }\n }\n}\n\n",
"name": "stderr"
}
]
},
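{
"metadata": {},
"cell_type": "markdown",
"source": "The build above discards its results. To actually execute the compiled graph, we can capture the graph JSON, the compiled library, and the parameters, and drive them with the graph runtime. (A minimal sketch, assuming `tvm.contrib.graph_runtime` from the same TVM 0.x release.)"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "import numpy as np\nfrom tvm.contrib import graph_runtime\n\n# Capture the artifacts that were discarded by `_ = relay.build(...)` above.\ngraph, lib, params = relay.build(ff, target=\"llvm -mcpu=core-avx2\")\n\nmodule = graph_runtime.create(graph, lib, tvm.cpu())\nmodule.set_input(\"x\", tvm.nd.array(np.random.uniform(size=(10, 32)).astype(\"float32\")))\nmodule.set_input(**params)\nmodule.run()\n\n# The single output is the result of the fused add -> exp -> maximum.\nprint(module.get_output(0).asnumpy().shape)",
"execution_count": null,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Fusion with complex operators\n\nNext, a larger graph: MaxPool2d -> Exp -> Conv2d -> ReLU."
},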
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "\nx = relay.var(\"x\", shape=(1, 16, 32, 32))\nk = relay.var(\"k\", shape=(32, 16, 3, 3))\n\ny = relay.nn.max_pool2d(x, pool_size=[2, 2])\nz = relay.exp(y)\n\nz_conv = relay.nn.conv2d(z, k)\nz_conv_relu = relay.nn.relu(z_conv)\n\nf = relay.Function([x, k], z_conv_relu)\nf = relay.ir_pass.infer_type(f)",
"execution_count": 9,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "print(f.astext(show_meta_data=False))",
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"text": "v0.0.1\n%4 = fn (%x: Tensor[(1, 16, 32, 32), float32], %k: Tensor[(32, 16, 3, 3), float32]) -> Tensor[(1, 32, 29, 29), float32] {\n %0 = nn.max_pool2d(%x, pool_size=[2, 2]) // ty=Tensor[(1, 16, 31, 31), float32]\n %1 = exp(%0) // ty=Tensor[(1, 16, 31, 31), float32]\n %2 = nn.conv2d(%1, %k) // ty=Tensor[(1, 32, 29, 29), float32]\n %3 = nn.relu(%2) // ty=Tensor[(1, 32, 29, 29), float32]\n %3\n}\n%4\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "ff = relay.ir_pass.fuse_ops(f, opt_level=2)\nff = relay.ir_pass.infer_type(ff)\n\nprint(ff.astext(show_meta_data=False))",
"execution_count": 11,
"outputs": [
{
"output_type": "stream",
"text": "v0.0.1\n%8 = fn (%x: Tensor[(1, 16, 32, 32), float32], %k: Tensor[(32, 16, 3, 3), float32]) -> Tensor[(1, 32, 29, 29), float32] {\n %2 = fn (%p0: Tensor[(1, 16, 32, 32), float32], __dict__=meta[StrMap][0]) -> Tensor[(1, 16, 31, 31), float32] {\n %0 = nn.max_pool2d(%p0, pool_size=[2, 2]) // ty=Tensor[(1, 16, 31, 31), float32]\n %1 = exp(%0) // ty=Tensor[(1, 16, 31, 31), float32]\n %1\n }\n %3 = %2(%x) // ty=Tensor[(1, 16, 31, 31), float32]\n %6 = fn (%p01: Tensor[(1, 16, 31, 31), float32], %p1: Tensor[(32, 16, 3, 3), float32], __dict__=meta[StrMap][1]) -> Tensor[(1, 32, 29, 29), float32] {\n %4 = nn.conv2d(%p01, %p1) // ty=Tensor[(1, 32, 29, 29), float32]\n %5 = nn.relu(%4) // ty=Tensor[(1, 32, 29, 29), float32]\n %5\n }\n %7 = %6(%3, %k) // ty=Tensor[(1, 32, 29, 29), float32]\n %7\n}\n%8\n// meta data omitted. you can use show_meta_data=True to include meta data\n",
"name": "stdout"
}
]
},
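{
"metadata": {},
"cell_type": "markdown",
"source": "Note that fusion produces two groups here rather than one: `max_pool2d` absorbs the elementwise `exp` that follows it, and `conv2d` absorbs the elementwise `relu`. Complex operators like these each anchor their own fused group, so the two groups remain separate. As before, we could invoke the compilation flow to see the HalideIR generated for each fused block (a sketch mirroring the earlier build call):"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "_ = relay.build(ff, target=\"llvm -mcpu=core-avx2\")",
"execution_count": null,
"outputs": []
},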
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "",
"execution_count": null,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "python2",
"display_name": "Python 2",
"language": "python"
},
"_draft": {
"nbviewer_url": "https://gist.github.com/7d3ff88981f0aab03ac4a8e0538e1844"
},
"language_info": {
"mimetype": "text/x-python",
"nbconvert_exporter": "python",
"name": "python",
"pygments_lexer": "ipython2",
"version": "2.7.15",
"file_extension": ".py",
"codemirror_mode": {
"version": 2,
"name": "ipython"
}
},
"gist": {
"id": "7d3ff88981f0aab03ac4a8e0538e1844",
"data": {
"description": "RelayTVMFusionE2E.ipynb",
"public": false
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}