Jiong Gong jgong5

  • Intel Corporation
  • Shanghai PRC

gRPC Programming Model

  • Unary: the traditional request-response pattern.
  • Server streaming: the client sends a single request; the server returns a stream of responses.
  • Client streaming: the client sends a stream of requests; the server returns a single response.
  • Bidirectional streaming: both sides send data concurrently over independent streams.

Key points

  1. The programming pattern for unary calls resembles an ordinary function call.
  2. For server streaming (a streamed return value), the client must explicitly call Finish on the reader to mark the end of the communication.
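The four call shapes above can be sketched with plain Python generators. This is only an illustration of the calling patterns, not real gRPC; the handler names are hypothetical and stand in for generated service stubs:

```python
from typing import Iterator

# Hypothetical handlers, one per gRPC call shape.

def unary(request: str) -> str:
    # Unary: one request in, one response out.
    return f"echo:{request}"

def server_streaming(request: str) -> Iterator[str]:
    # Server streaming: one request in, a stream of responses out.
    for i in range(3):
        yield f"{request}-{i}"

def client_streaming(requests: Iterator[str]) -> str:
    # Client streaming: a stream of requests in, one response out.
    return ",".join(requests)

def bidi_streaming(requests: Iterator[str]) -> Iterator[str]:
    # Bidirectional streaming: respond to each message as it arrives.
    for r in requests:
        yield r.upper()
```

In real gRPC the streaming shapes are driven by reader/writer objects rather than generators, and the client signals completion explicitly (e.g. Finish on the reader), as noted above.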
@jgong5
jgong5 / TorchInductor_Update_4.md
Last active November 25, 2022 07:28
TorchInductor Update 4: CPU backend started to show promising performance boost

It’s Jiong Gong (@jgong5) from the Intel team working on PyTorch optimization for CPU. In this post, I’d like to give an update on the recent progress of the CPU backend of TorchInductor, the new DL compiler of PyTorch. Designed to support multiple device backends, TorchInductor provides backends for both CPU and NVIDIA GPU. There has been great progress on GPU backend optimization for training workloads (see this for details). On the CPU side, since a significant portion of DL workloads running on CPU are inference, we started off optimizing CPU inference as our first step. We began this effort in early October from a low performance baseline (see tables 1-3 below [^1]) at that point in time, and we are pleased to bring the improvements to the table.

@jgong5
jgong5 / IPEX_1_13.md
Created November 10, 2022 11:04
IPEX 1.13 release note

We are pleased to announce the release of Intel® Extension for PyTorch* 1.13.0-cpu, which accompanies PyTorch 1.13. This release highlights quite a few usability features that help users get good performance and accuracy on CPU with less effort. As always, we also added a couple of performance features. Check out the feature summary below.

  • Usability Features
  1. Automatic channels last format conversion: Channels last conversion is now applied automatically to PyTorch modules with ipex.optimize by default. Users don't have to explicitly convert input and weight for CV models.
  2. Code-free optimization (experimental): ipex.optimize is automatically applied to PyTorch modules without the need for code changes when the PyTorch program is started with the IPEX launcher via the new --auto-ipex option.
  3. Graph capture mode of ipex.optimize (experimental): A new boolean flag graph_mode (default off) was added to ipex.optimize; when turned on, it converts the eager-mode PyTorch module into graph mode for better performance.
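As background for the automatic channels-last feature above, here is a minimal sketch of what the memory-format conversion itself does, using plain PyTorch only (no IPEX required):

```python
import torch

# A 4-D (NCHW) tensor: default contiguous strides follow N, C, H, W order.
x = torch.randn(1, 2, 3, 4)
print(x.stride())  # (24, 12, 4, 1): channels-first layout

# channels_last keeps the logical NCHW shape but lays the data out
# as NHWC in memory; the channel stride becomes 1.
y = x.to(memory_format=torch.channels_last)
print(y.stride())  # (24, 1, 8, 2)
```

With the 1.13 release, ipex.optimize applies this conversion to module weights automatically, so users no longer need to convert inputs and weights by hand for CV models.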
@jgong5
jgong5 / output_code_bad.py
Last active October 14, 2022 09:06
Bad Triton Code
from ctypes import c_void_p, c_long
import torch
import random
from torch import empty_strided, as_strided, device
from torchinductor.codecache import AsyncCompile
aten = torch.ops.aten
async_compile = AsyncCompile()
@jgong5
jgong5 / output_code_good.py
Created October 14, 2022 08:52
Good Triton Code
from ctypes import c_void_p, c_long
import torch
import random
from torch import empty_strided, as_strided, device
from torchinductor.codecache import AsyncCompile
aten = torch.ops.aten
async_compile = AsyncCompile()
@jgong5
jgong5 / IPEX_INT8_Calibration.py
Last active May 6, 2022 11:30
IPEX INT8 Calibration
import os
import torch
model = Model()
model.eval()
data = torch.rand(<shape>)
# Applying torch.fx.experimental.optimization.fuse to the model performs conv-batchnorm folding for better performance.
import torch.fx.experimental.optimization as optimization
model = optimization.fuse(model, inplace=True)
#################### code changes ####################
import intel_extension_for_pytorch as ipex
@jgong5
jgong5 / IPEX_BF16_Inference.py
Last active May 6, 2022 11:28
IPEX BF16 Inference
...
import torch
...
model = Model()
model = model.to(memory_format=torch.channels_last)
model.eval()
#################### code changes ####################
import intel_extension_for_pytorch as ipex
model = ipex.optimize(model, dtype=torch.bfloat16)
######################################################
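The BF16 inference recipe above prepares the model with ipex.optimize; at run time, mixed precision is typically driven by PyTorch's native autocast. A minimal sketch of the autocast side alone, with a toy linear layer and plain PyTorch (no IPEX):

```python
import torch

# Toy stand-in for the Model() in the snippet above.
model = torch.nn.Linear(4, 2).eval()
x = torch.randn(1, 4)

# Under CPU autocast with bfloat16, autocast-eligible ops such as
# linear run in bf16 and produce bf16 outputs.
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)  # torch.bfloat16
```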
@jgong5
jgong5 / IPEX_BF16_Training.py
Last active May 6, 2022 11:29
IPEX BF16 Training
...
import torch
...
model = Model()
model = model.to(memory_format=torch.channels_last)
criterion = ...
optimizer = ...
model.train()
#################### code changes ####################
import intel_extension_for_pytorch as ipex
@jgong5
jgong5 / BF16trainingwithIPEXBF16automixedprecision.py
Created March 29, 2021 08:26 — forked from pytorchsam/BF16trainingwithIPEXBF16automixedprecision.py
BF16 training with IPEX BF16 auto-mixed precision
import torch
# Step 1: Register IPEX optimizations
import intel_pytorch_extension as ipex
from my_models import SomeModel
# Step 2: Enable BF16 auto-mixed-precision
ipex.enable_auto_mixed_precision(mixed_dtype=torch.bfloat16)
data_loader = …
# Step 3: Enable IPEX optimizations
model = SomeModel().to(ipex.DEVICE)
opt = torch.optim.SGD(model.parameters(), ...)
@jgong5
jgong5 / IPEXBF16automixedprecisiongraphfusion.py
Created March 29, 2021 08:26 — forked from pytorchsam/IPEXBF16automixedprecisiongraphfusion.py
IPEX BF16 auto-mixed precision and graph fusion
import torch
# Step 1: Register IPEX optimizations
import intel_pytorch_extension as ipex
from my_models import SomeModel
# Step 2: Enable BF16 auto-mixed-precision
ipex.enable_auto_mixed_precision(mixed_dtype=torch.bfloat16)
# Step 3: Enable IPEX optimizations
model = SomeModel().to(ipex.DEVICE).eval()
model = torch.jit.script(model)
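The torch.jit.script call in the snippet above compiles the module into a TorchScript graph, which is what the fusion passes operate on. A minimal standalone sketch with a toy module (SomeModel and the IPEX device from the gist are not reproduced here):

```python
import torch

class ToyModel(torch.nn.Module):
    # Toy stand-in for SomeModel: two ops a graph fuser could combine.
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x))

model = ToyModel().eval()
scripted = torch.jit.script(model)  # compile to a TorchScript graph

x = torch.randn(2, 4)
with torch.no_grad():
    out = scripted(x)
print(out.shape)  # torch.Size([2, 4])
```

Scripting is done after eval() so that inference-only graph optimizations can apply; the scripted module behaves like the original but carries a graph that backends such as IPEX can rewrite with fused kernels.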