With TorchDynamo, we can dispatch a PyTorch model to other deep learning compilers for acceleration. Hidet is one such compiler: it accelerates your model with a series of graph-level and kernel-level optimizations (e.g., subgraph fusion, graph rewriting, and kernel tuning). To use Hidet, first install it via
$ pip install hidet
Then you can enable it via torch.compile(model, backend='hidet')
as shown in the code snippet below:
import torch
import hidet
# Define pytorch model
model = torch.hub.load('pytorch/vision:v0.6.0', 'resnet18', pretrained=True).cuda().eval()
x = torch.rand(1, 3, 224, 224).cuda()
# Compile the model through Hidet
hidet.torch.dynamo_config.search_space(2) # tune kernel performance; a larger search space finds faster kernels at the cost of longer compilation
model_opt = torch.compile(model, backend='hidet')
# Run the optimized model
y = model_opt(x)
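As a quick sanity check (not part of the original snippet), you can compare the optimized model's output against the eager model on the same input; a small numerical difference is expected because the tuned kernels may reorder floating-point operations:

# Illustrative sanity check: compare Hidet-compiled output against eager PyTorch.
with torch.no_grad():
    y_ref = model(x)      # eager PyTorch output
    y_opt = model_opt(x)  # Hidet-compiled output
# Allow a small tolerance for floating-point differences between kernels.
print(torch.allclose(y_ref, y_opt, atol=1e-3, rtol=1e-3))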
Here are some benchmarks (batch size = 1, NVIDIA RTX 3090, BERT sequence length = 128, float32 data type):
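The exact benchmark script is not shown here; as a rough sketch (assuming a CUDA device and reusing the model, model_opt, and x defined above), you could time the eager and compiled models with CUDA events, after a few warm-up iterations to exclude compilation and tuning time:

# Rough latency-measurement sketch (assumes model, model_opt, and x from the snippet above).
def benchmark(fn, warmup=10, repeats=100):
    with torch.no_grad():
        for _ in range(warmup):  # warm-up triggers compilation/tuning and fills caches
            fn(x)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(repeats):
            fn(x)
        end.record()
        torch.cuda.synchronize()
    return start.elapsed_time(end) / repeats  # average latency in milliseconds

print('eager:', benchmark(model), 'ms')
print('hidet:', benchmark(model_opt), 'ms')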
Learn more about Hidet and its optimization options in the tutorial and GitHub repository. Hidet originates from our research work on simplifying the writing of tensor programs with our proposed task-mapping programming paradigm. Please check out our paper for more details.
My point is that you can't just rely on everyone in the MLSys community reading every ASPLOS paper to learn about Hidet (maybe they would eventually get to it in a year, but by then it would already be too late for the purpose of building a community); you need a medium like PyTorch's release blog to get as wide a coverage as possible.