@driazati
Created July 13, 2022 23:49

Introduction

The TVM community has worked since the v0.8 release to deliver many exciting features and improvements. This is the first release on the new quarterly release schedule, and it includes highlights such as:

  • TBD

RFCs

These RFCs have been merged in apache/tvm-rfcs since the last release.

What's Changed

Note that this list does not cover every PR and discussion since v0.8. For a complete view, see the full listing of commits: https://github.com/apache/tvm/compare/v0.8.0...v0.9.0.rc0.

Misc

  • #11465 - Add cooldown interval logic for the profiling functional
  • #11888 - [LLVM] Include LLVM headers in files that use them, not in llvm_common.h
  • #11646 - [Arith] Simplification of ceil, log2, and left_shift
  • #11464 - [MLF] Add support for multiple modules in Model Library Format
  • #11632 - [AutoTVM][Autoscheduler] Default build funcs inherit PassContext
  • #11543 - [OpenCL] Implement conv2d_winograd algorithm for Adreno
  • #11287 - [Arith] Merge surjective/non-surjective iter mapping detections
  • #11393 - Add utility to replace direct call to pytest.main
  • #11252 - [ROOFLINE] Roofline analysis over RPC
  • #10794 - bump PyTorch version to 1.11
  • #10821 - [REFACTOR] Remove legacy nnvm folder
  • #10567 - [Refactor] Reduced repetition in CodeGenLLVM's buffer access
  • #7401 - RFC: initial stab at TorchScript fallback
  • #9808 - [Rust] Update Rust bindings
  • #9611 - [CMAKE] Automatically detect newly added source files
  • Profiler - #11530, #11066
  • Docs - #10921, #11403, #10774, #10912, #9633, #9906, #9534, #9654, #9580
  • Android - #11241
  • ETHOSN - #11261, #10486, #10018, #9596
  • TVMC - #11012, #10962, #10722, #9817, #9529, #9229

USMP

  • #11015 - U3 use case
  • #10189 - Adding support for U1 usecase for constant pools
  • #10785 - Adding support for U4 usecase
  • #10193 - adding support for U2 and U3 usecases
  • #10005 - Add performance characteristics to PoolInfo
  • #9565 - [TIR]Integrating USMP to AoT Executor
  • #9704 - Hill Climb allocator
  • #9418 - [TIR]adding the pass to convert to pool offsets
  • #9649 - [TIR]Augmenting the algo interface with memory pressure
  • #9214 - [TIR]Greedy memory planning algorithm
  • #8468 - [TIR]Added buffer info extraction pass

BYOC

  • #11474 - Two helper passes for external codegen using RelayToTIR custom pass machinery
  • #11144 - Remove support for run-time linked-params from codegen
  • #10590 - Add order to functions in C Codegen
  • #11638 - [DNNL][CBLAS]Unifies all MKLDNN/DNNL to DNNL
  • #11619 - RelayToTIR custom codegen passes can still depend on dynamic shape functions
  • DNNL - #11902, #11642, #11513, #11571, #11560, #11345, #11111, #10837, #10421, #9995, #9797
  • TensorRT - #11923, #11203, #10759, #10772, #10388
  • CMSIS-NN - #11732, #11625, #10939, #11013, #10817, #10563, #10224, #10148, #10100, #9338, #9531, #9409, #9331
  • OpenCLML - #10243
  • CUTLASS - #11631, #10185, #10177, #10110, #10036, #9899, #9820, #9800, #9795, #9746, #9737, #9698, #9595, #9571
  • CUDNN - #10997, #9986, #9948
  • ACL - #10801
  • PTX - #10855, #10339, #9909
  • CUBLAS - #10826, #10820

Hexagon

  • #11549 - Initial clip operator for Hexagon
  • #11834 - Add op resize2d for hexagon
  • #11559 - Softmax slice op initial version
  • #11529 - Slice ops added - add, subtract, multiply
  • #11720 - [testing] add max_pool2d benchmark
  • #11417 - Implement avg_pool2d slice op
  • #11653 - Add HexagonThreadManager
  • #11547 - Run single RPC server on Android in each testing session
  • #11490 - [testing] add TVMScript elemwise-add
  • #11400 - [testing] refactor benchmark-table code
  • #11277 - moves conftest.py to tvm.contrib.hexagon so outside repos can access the testing fixtures
  • #11319 - Add unit tests for Hexagon Device API
  • #11279 - Add USMP tests
  • #11283 - Update Readme
  • #11239 - capture gtest output and return over FFI
  • #11175 - Add schedule and test for conv2d_transpose_nchw
  • #11018 - [Runtime] Add QuRT thread pool backend
  • #11145 - Add support for on-device unit testing using gtest
  • #11138 - Add test for depthwise conv2d schedule
  • #11016 - Add test for registered schedules
  • #11104 - Add mobilenet test
  • #11090 - Delete offload runtime, move files to right places
  • #11065 - AoT with LLVM Codegen on Hexagon
  • #11025 - Deprecate USE_HEXAGON_DEVICE, introduce USE_HEXAGON
  • #10604 - HVX scheduling and benchmarking of TE element-wise add
  • #10905 - [LLVM] Enable/test tensorized Hexagon DMA on 2d transformed layout
  • #10907 - Move aot/graph_executor interactions into launcher
  • #10919 - Register basic strategies and schedules for common operators
  • #10904 - Add unit tests executing 2-d VTCM usage
  • #10910 - Refactor to keep HexagonBuffer private to the device api
  • #10908 - [LLVM][CodeGen] Make CodeGenHexagon a subclass of CodeGenCPU
  • #10878 - Generalized HexagonBuffer::CopyTo/CopyFrom
  • #10846 - Support both 1-d and 2-d VTCM allocations
  • #10581 - Improved ergonomics of HexagonLauncher in unit tests.
  • #10616 - Refactor tvm.contrib.hexagon, NFC
  • #10612 - Deprecate SDK 3.x, rewrite HexagonSDK.cmake
  • #10586 - Codegen for 2d Load/Store
  • #10558 - Generalize builtin for Nd memory alloc with storage scope and add lowering for VTCM / Hexagon
  • #10543 - [Runtime][PipelineExecutor] Add the pipeline internal forwarding logic.
  • #10507 - Add doc on TVM - Hexagon RPC flow
  • #10520 - Resolve breakage in test_hexagon/test_cache_read_write
  • #10311 - [runtime]AOTExecutor implementation for C Codegen
  • #10454 - Allow execution on target or simulator from HexagonLauncher
  • #10365 - Lower cache_read and cache_write to Hexagon DMA via tensorize
  • #10361 - RPC server/client for simulator
  • #10302 - [CI]Add Hexagon Tests to pipeline
  • #10263 - [Docker]Add docker file and scripts
  • #10227 - Refactor Hexagon.cmake
  • #10217 - Adding support for Hexagon User DMA Engine
  • #10068 - Update hexagon API build instruction and cleanup hexagon_proxy_rpc
  • #9970 - Do not auto-build apps when building TVM
  • #9736 - Add unit tests for HexagonBuffer
  • #9525 - Add Hexagon VTCM and discontiguous allocation support
  • #9631 - Add RPC Mechanism for Hexagon
  • #9473 - cleanup Hexagon conv2d tests

AOT

  • #11208 - Calculate used memory at the callsite of primitive functions
  • #11365 - Fix function number datatype from char to uint16_t
  • #11091 - Enable A-Normal Form in the AOT executor
  • #10753 - Support LLVM backend with C++ runtime
  • #10518 - Use python temporary directory for AOT tests
  • #10337 - BugFix of workspace calculation
  • #10282 - [runtime] Add Metadata classes for AOTExecutor
  • #9501 - [3/3][DeviceAPI] Wire up cpacked Device API context
  • #9500 - [2/3][DeviceAPI] Add Hooks for Activate/Deactivate/Open/Close
  • #9395 - [1/3][DeviceAPI] Connecting devices structure to relevant operators

MetaSchedule

  • #11884 - Postproc: Rewrite-Layout
  • #11848 - [OpStrategy] Support MetaSchedule Layout
  • #11845 - [Relay][Pass] Meta-Schedule-Layout-Rewrite
  • #11758 - [Runtime] Enhance Runner RandomFill
  • #11683 - Distributed Measurement
  • #11751 - [Minor] Organize Testing Scripts
  • #11735 - Modify Profiler Timers
  • #11727 - Developer Ergonomics Enhancement II
  • #11692 - Apply-History-Best Task Filtering
  • #11486 - Add Profiler Support For Tuning Efficiency Optimization
  • #11680 - JSONDatabase Utilities
  • #11641 - Generate MetaSchedule Dataset
  • #11622 - Developer Ergonomics Enhancement
  • #11604 - Resolve dependencies between header files
  • #11587 - Add Testing Script with ONNX Support
  • #11590 - Evo Independence from TaskScheduler
  • #11534 - No explicit unrolling for spatial PrimFunc
  • #11512 - Enable Task Filtering
  • #11177 - AutoBind rule and MutateThreadBinding
  • #11157 - Logging Interface Unification
  • #11088 - Auto tensorization for CPU / GPU dot product
  • #10986 - [Refactor] Introduce TuneConfig
  • #11020 - [Metaschedule, Refactor] Move MultiLevelTilingNode decl to a header
  • #10927 - [Refactor] Clarify Integration Logic
  • #10876 - Add utility API to ease using manual schedules
  • #10885 - [BugFix] Fix skipped tests
  • #10366 - Add Gradient Based Task Scheduler
  • #10823 - Fine-Grained Rewrite Unbound Block
  • #10793 - Add demonstration of selectively tuning relay ops with TIR schedules
  • #10811 - Support grouping in the cost model
  • #10810 - Extract task weights during task extraction
  • #10782 - [TIR]Estimate TIR FLOPs
  • #10776 - Misc updates for tuning end-to-end workloads
  • #10689 - Upstream the leftover changes
  • #10648 - [Meta Schedule] Refactor meta schedule testing utils
  • #10578 - New relay backend for meta schedule task extraction
  • #10534 - Bug Fix for Relay Integration
  • #10501 - Update scripts for subgraph tuning
  • #10497 - Refactor testing workloads
  • #10461 - Enable AutoTVM-style template-based search space
  • #10368 - Fix Cyclic Dependency in PyClass Family
  • #10403 - Arithmetic analysis
  • #10367 - Update Tuning Interfaces.
  • #10079 - [M4a] User-API: Tune-TE/TIR/Relay
  • #10081 - [M4a] Rewrite-Cooperative-Fetch
  • #10055 - [M4b] Testcases for TensorRT builder/runner
  • #10092 - [M4a] Mutator: Mutate-Tile-Size
  • #10096 - [M4a] Mutator: Mutate Parallel
  • #10071 - [M4a] PostProcessor: Rewrite-Parallel-Vectorize-Unroll
  • #10043 - [M4a] Schedule Rule: Multi-Level-Tiling
  • #10045 - Mutator: Mutate-Unroll
  • #10033 - [M4a] Schedule Rule: Parallelize-Vectorize-Unroll
  • #10027 - [M4a] PostProcessor: Rewrite-Unbound-Block
  • #10028 - Mutator: Mutate-Compute-Location
  • #9997 - [M4a] PostProcessor: Disallow-Dynamic-Loop
  • #9994 - [M4a] Schedule Rule: Cross-Thread-Reduction
  • #10013 - [M4a] PostProcessor: Rewrite Reduction Block
  • #9975 - [M4a] Schedule Rule: Add-RFactor
  • #9945 - [M4a] PostProcessor: Verify-GPU-Code
  • #9940 - [M4a] Schedule Rule: Random-Compute-Location
  • #9943 - [M4a] Schedule Rule: Auto-Inline
  • #9860 - [M3c] Add Per-Store-Feature
  • #9859 - [M3c] XGB-based Cost Model
  • #9836 - [M4a] Add EvolutionarySearch Search Strategy
  • #9799 - [M4a] Add ReplayFunc Search Strategy
  • #9789 - [M3c] Update TuneContext, TaskScheduler & Search Strategy Design
  • #9780 - [M3c] Add More Measure Callbacks
  • #9761 - [M4a] Add ScheduleRule class & PostOrderApply space generator
  • #9760 - [M3c] Random Feature Extractor

TIR

  • #11592 - HoistExpression, generalization of HoistIfThenElse
  • #11870 - [Pass] Remove-Weight-Layout-Rewrite-Block
  • #11740 - [TIR, analysis] Add GetAutoTensorizeMappingInfo to generate transforms for auto tensorization
  • #11585 - Add preserve-unit-iters
  • #11677 - Register CUDA WMMA tensor intrinsics
  • #11658 - [TIR, CUDA] Add pass to replace global to shared memory copy with cp.async
  • #11624 - [Schedule] Allow named block and buffer arguments in Schedule
  • #11628 - [PASS] Refactor a couple of TIR passes - BindTarget, AnnotateEntryFunc, Filter, LowerInitBlock
  • #11574 - CSE pass : Restrict the equivalence to be decided by a normal form - avoids comparison of terms
  • #11575 - Schedule Primitive: Add-Unit-Loop
  • #11515 - Add schedule primitive ReIndex
  • #11524 - [Arith] Additional Simplifications Inside Conditionals
  • #11485 - Add schedule primitive TransformBlockLayout
  • #11495 - [Software pipeline] Fix hardcoded index in access_ptr rewriting, add a GPU test with depth 4
  • #11269 - [Schedule] Transform layout quality of life
  • #11355 - Support tensorization using ldmatrix + MMA
  • #11289 - [Schedule] Allowed typing.Tuple in tir.schedule._type_checker
  • #11317 - Support affine expressions as indices in reverse compute inline
  • #11235 - [Arith] Implemented padded inverses in IndexMap
  • #11238 - [ROOFLINE] Calculate roofline from existing TIR PrimFunc
  • #11225 - Add schedule primitive SetAxisSeparator
  • #11110 - Get read/write access precisely for opaque access.
  • #11106 - Enhance software pipeline validation and fix predicate of epilogue
  • #10843 - StmtFunctor RenewDefs
  • #11075 - Add function to tile a block according to a given tensor intrinsic
  • #11050 - Utility function to decide loop mapping for auto tensorization
  • #10925 - VNNI and ARM dot product intrinsic for tensorization
  • #10887 - [Schedule] Relax reorder primitive's affine binding check
  • #10732 - [Analysis] Add SuggestIndexMap for layout rewriting
  • #10538 - [Schedule] Transform layout
  • #10638 - Change the behavior of read/write region analysis for reduction blocks.
  • #10671 - Tuple Reduction Support in CreatePrimFunc
  • #9727 - [TE]Implement layout transformations, non-flat memory buffers
  • #10405 - [TensorIR] Update VerifyGPU
  • #10401 - [TensorIR] Renormalize split pattern
  • #10112 - [TIR, Relay] improve bfloat16 support
  • #8509 - Tir constants integration into compilation pipeline
  • #9996 - add support for multi-blocking layout and their transformation
  • #10066 - Add software pipelining
  • #9482 - Implementation of Common Subexpression Elimination for TIR
  • #9527 - Allow compute_at create block predicate for non-trivial bounds and support floordiv pattern
  • #10158 - [Schedule] Update compact_dataflow constraint
  • #9871 - [Schedule] Blockize and Tensorize
  • #10016 - [BugFix]Fix cross-thread reduction when single reduction loop with predicate
  • #9880 - Encode conditional accesses info into block read/write regions
  • #9699 - Affine utility support iter lowerbound and diagnostics
  • #9742 - [Schedule] Add Annotate/Unannotate primitive
  • #9738 - [TensorIR] Primitive "SetScope"
  • #9743 - [Schedule] Analysis functions to check if compute_inline and com…
  • #9689 - Allow memory (aka storage) scopes to be retrieved/applied to PrimFuncs
  • #9559 - [TensorIR][UX] Type annotation-based runtime type checking
  • #9360 - [TensorIR] Cross-Thread Reduction

Relay

  • #11825 - [relay][pass]add split infer shape with convert op layout pass
  • #11674 - Finish implementations of WithFields
  • #11481 - IndexedGraph improvements in preparation for Collage
  • #11432 - Plumb external codegen target via Target.current()
  • #11494 - [Pass] Add MaxPool, AvgPool to FoldExplicitPadding
  • #11183 - Add unidirectional sequence lstm
  • #11442 - Add 'static_library' runtime::Module
  • #11413 - [Topi]Support for FP16 ERF on CPU.
  • #11382 - Finish support for list-of-targets
  • #11386 - [Tests] Replace the Relay interpreter with the VM in the op tests
  • #11224 - Support i16, f16 scalars in Relay text
  • #11337 - Fix eltwise alter op layout for broadcast axis
  • #11199 - Flexible shape dispatch transformation
  • #11173 - Support 'external codegen targets'.
  • #10996 - Add FlattenAtrousConv transformation
  • #10871 - [CUDNN] Add cuDNN as a Relay partitioning target (BYOC)
  • #10787 - [Pass][Bugfix] Disable re-use of non-flat buffers in StorageRewrite.
  • #10378 - [FQ2I] Add leaky relu to FQ2I
  • #10400 - RelayViz graphviz renderer
  • #10352 - [VIRTUALDEVICE] Change syntax for device planning and store parameter virtual devices in virtual_device_ field
  • #10156 - Fix broadcast InferCorrectLayout
  • #10026 - [VM] Relay VM memory liveness/lifetime analysis
  • #10089 - [Pass] Add a relay pass to extract fake quantized ops
  • #9690 - Change function constructors to WithFields
  • #10069 - [DefuseOps pass] bug fix: To support function body types other…
  • #9954 - Add conv2d_backward_weight op (without topi)
  • #9723 - [Frontend] Add Span filling for frontends to Relay
  • #9749 - Fix invalid shape function for "copy" operator
  • #9759 - s/SEScope/VirtualDevice/g
  • #9734 - Support large constants saved/loaded outside of VM executable
  • #9613 - Re-run PlanDevices after LowerTE to flow new memory scope constraints.
  • #9693 - PlanDevices supports 'free' on_device annotations
  • #9641 - [AST] Add virtual_device as a first class field in Relay
  • #9483 - Switch the VM to use the LowerTE pass instead of TECompiler::{Lower,LowerShapeFunc}.
  • #9569 - WithFields method for Call, Function, Var, TupleGetItem, If, Let, RefCreate, RefRead, RefWrite, Match, and Clause
  • #9533 - WithFields for Tuples
  • #9550 - Prepare for switching VM to LowerTEPass.
  • #9542 - Prepare DeadCodeElimination for running post LowerTEPass/ManifestAlloc.
  • #9352 - [TVMC]Introduce executor and runtime parameters
  • #9326 - Switch PlanDevices pass to be w.r.t. SEScopes instead of DLDeviceTypes.
  • QNN - #11228, #10718, #10086, #10053, #9982

CI

  • #11313 - Refactor of tvm.testing.requires_* annotations
  • #11666 - Enable pylint for tests/python/ci
  • #11657 - Apply linting rules to AOT tests
  • #11380 - Restructure Jenkinsfile
  • Automation - #11813, #11775, #11480, #11437, #10833, #10056, #9973, #9934
  • User experience improvements - #11470, #11329, #11553, #11497, #11051, #10933, #10960, #10525, #10425, #10322, #10121, #9971, #9554, #9752, #9556
  • Reduce CI runtime - #11402, #11349, #11258, #11132, #10946, #10743, #10359
  • Code cleanups - #10968, #10740

microTVM

  • #11741 - Refactor RVM scripts and fix DNS network issue
  • #11472 - [ARM]Add tests for arm schedules
  • #11634 - Update pyproject to python3.7
  • Zephyr support - #11650
  • RPC - #11227, #10967

TE

  • #11589 - Support schedulable TIR compute definitions in TOPI
  • #11531 - [TOPI] TE implementation of LSTM using scan
  • #11341 - Optimized version of concatenation layer

microNPU

  • #11468 - Optimize separate padding operation for conv2d
  • #11453 - Add transform matrices and part matcher to identity op
  • #11410 - add E2E tests with cascader without striping
  • #11288 - Expose compute cycle annotations to TIR lowering
  • #10959 - Add a pass to reorder copy and compute nodes
  • #10509 - Add various options to the cascader
  • #11263 - Adding an option to enable striping
  • #10251 - Add support for conv2d running on two cores on U65
  • #10862 - Integrate the cascader
  • #10344 - Integrate rolling buffers in Arm(R) Ethos(TM)-U
  • #10824 - Some housekeeping in the test_ethosu folder
  • #10763 - Tweak a layout transform matrix
  • #10725 - Add a pass to move allocate nodes to the outer scope
  • #10695 - Determine block configs using the cascader
  • #10599 - Refactor Relay to TIR hook
  • #10508 - Improve cascader memory transfer estimates
  • #10345 - Add support for TFLite FULLY_CONNECTED
  • #10254 - Introduce a pass to remove redundant identity operations
  • #10062 - [5] Convert Proposals to te.Schedules
  • #9959 - [4] Add the cascader Proposal generator
  • #10022 - enable USMP
  • #10127 - Add support for LeakyReLU
  • #10004 - Add FreeRTOS variant of NPU demo
  • #10060 - Refactor type inference data type checks
  • #9960 - Add support for pack and unpack
  • #10143 - Fix layout assignment in layout optimizer pass
  • #9890 - [3] Plan generation for the cascader
  • #9855 - Add support for transpose convolution
  • #9841 - Add support for nearest neighbor and bilinear upsampling
  • #9951 - Removing constant args from PrimFunc
  • #9929 - Refactor base address determination to codegen
  • #9910 - Add support for requantize
  • #9831 - Move optimization passes to be a module pass and ensure they are running
  • #9785 - [2d] Add more Part matchers to cascader
  • #9778 - [2c] Add performance modelling to cascader
  • #9471 - [2b] Create CascaderGraphs from TE graphs
  • #9469 - [2a] Add CascaderGraph for cascading analysis
  • #9621 - Add support for SPLIT and SPLIT_V
  • #9508 - Update Conv2D Tests to Use TF API to Gen Test Cases
  • #9627 - Add support for SIGMOID
  • #9589 - Add support for TFLite concatenate
  • #9623 - Refactor codegen tests
  • #9561 - Add NHWC -> NHCWB16 layout transformation pass
  • #9576 - Mean legalization support
  • #9597 - Move the compilation to use Target Hooks.
  • #9458 - [1] Add affine analysis structures for the cascader
  • #9547 - Add the infrastructure for lookup table and TANH
  • #9521 - Support binary elementwise with non-4D inputs
  • #9560 - Fix incorrectly calculated stride when converting NHWC to NHCWB16
  • #9530 - Add unary elementwise operator infrastructure with ABS
  • #9514 - Adding rounding mode attribute to operators
  • #9515 - Allow constants to be given as input to an operator

Frontends

  • PaddlePaddle - #11537, #9724, #9564
  • TFLite - #10915, #10566
  • Oneflow - #11321, #11036, #8790
  • PyTorch - #11190, #10504, #10184, #10091
  • ONNX - #10949, #9438, #9186, #9493, #9475
  • Keras - #7006

microTVM

  • #11250 - [ARM] Add Relay tests for conv2d registered schedules
  • #11232 - [rpc] Implemented rpc logging
  • #11044 - Add support for host-driven AoT Executor
  • #11043 - Better version handling for Arduino
  • #10555 - Enable micro tvmc tutorial testing in CI
  • #10194 - [RVM] Add scripts for automated build and testing
  • #10144 - TVMCon 2021 Zephyr Demo with CMSIS-NN
  • #10024 - [tvmc] Add TVMC Micro tutorial for Zephyr
  • #9684 - Fix zephyr/test_zephyr_armv7m test
  • #9584 - [TVMC] Add TVMC test for Arduino and Zephyr
  • #9526 - Add minimal forwarding RPC server for host driven python execution on Hexagon
  • Zephyr support - #11362, #10138

Runtime

  • #11334 - [PipelineExecutor] Add graph manually splitting logic into the unit test.
  • #11133 - [PipelineExecutor] Refactor PipelineExecutor.py and Add cross compile support for pipeline executor.
  • #10990 - [PipelineExecutor]Add forwarding queue logic for set input.
  • #10953 - [Vulkan] Add RGP support to TVM for vulkan device
  • #10723 - [PipelineExecutor] Getting the asynchronous output
  • #10283 - AOTExecutor implementation and c target code-generator
  • #9802 - [ThreadPool]Refactor affinity function and support CPU affinity list setting.
  • #10234 - [Pipeline Executor] multiple threads management and the data forwarding notification mechanism.
  • #10326 - Improved log information with function signature
  • #10032 - [PackedFunc] Bring PackedFunc into TVM Object System
  • #9751 - [Pipeline Executor] Add the map logic of global input and subgraph input.

TVMScript

  • #11308 - Represent ramp as index slice
  • #10099 - Support T.buffer_decl using data pointer from Let/Allocate
  • #9680 - Improve printer for TIR syntax sugar
  • #9492 - Add syntax sugar for T.handle and T.match_buffer
  • #9620 - Add for loop syntax sugar
  • #9543 - Misc error message improvements
  • #9505 - [Fix] Add type hints for more uncovered cases