- Unary: the traditional request-response pattern.
- Server streaming: the client sends a single request and the server returns a stream of responses.
- Client streaming: the client sends a stream of requests and the server returns a single response.
- Bidirectional streaming: both sides send data concurrently over independent streams.
- The programming model for unary calls resembles an ordinary function call.
- For server streaming (a streamed return value), the client needs to explicitly call Finish on the reader to mark the end of the exchange (see the sketch after this list).
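A minimal client-side sketch of the first two modes, assuming hypothetical generated modules echo_pb2/echo_pb2_grpc for an Echo service with a unary UnaryEcho method and a server-streaming ServerStreamingEcho method. The reader/Finish detail above applies to the C++ synchronous API; in Python the response stream is simply iterated until the server completes it.

```python
import grpc

# echo_pb2 / echo_pb2_grpc are assumed to be generated from a hypothetical
# echo.proto declaring UnaryEcho (unary) and ServerStreamingEcho (server streaming).
import echo_pb2
import echo_pb2_grpc

with grpc.insecure_channel("localhost:50051") as channel:
    stub = echo_pb2_grpc.EchoStub(channel)

    # Unary: one request in, one response out, much like a normal function call.
    reply = stub.UnaryEcho(echo_pb2.EchoRequest(message="hello"))
    print(reply.message)

    # Server streaming: one request in, an iterator of responses out;
    # the call ends when the server finishes the stream.
    for reply in stub.ServerStreamingEcho(echo_pb2.EchoRequest(message="hello")):
        print(reply.message)
```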
This is Jiong Gong (@jgong5) from the Intel team working on PyTorch optimization for CPU. In this post, I'd like to give an update on the recent progress of the CPU backend of TorchInductor, the new DL compiler of PyTorch. Designed to support multiple device backends, TorchInductor provides backends for both CPU and NVIDIA GPU. There has been great progress on GPU backend optimization for training workloads (see this for details). On the CPU side, since a significant portion of DL workloads running on CPU are inference, we started off by optimizing CPU inference as our first step. We began the effort in early October from a low performance baseline at that point in time (see tables 1-3 below [^1]), and we are pleased to share the improvements we have made since then.
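For reference, here is a minimal sketch of routing a model through TorchInductor on CPU using the torch.compile entry point (which dispatches to TorchInductor by default); the toy model and input shapes are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder model; any eager-mode nn.Module works the same way.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 8)).eval()
example_input = torch.randn(16, 64)

# torch.compile traces the model with TorchDynamo and lowers it through
# TorchInductor, which generates C++/OpenMP kernels for CPU tensors.
compiled_model = torch.compile(model)

with torch.no_grad():
    out = compiled_model(example_input)
```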
We are pleased to announce the release of Intel® Extension for PyTorch* 1.13.0-cpu, which accompanies PyTorch 1.13. This release is highlighted by quite a few usability features that help users get good performance and accuracy on CPU with less effort. We also added a couple of performance features, as always. Check out the feature summary below.
- Automatic channels last format conversion: channels last conversion is now applied to PyTorch modules with ipex.optimize by default. Users don't have to explicitly convert input and weight for CV models.
- Code-free optimization (experimental): ipex.optimize is automatically applied to PyTorch modules without the need of code changes when the PyTorch program is started with the IPEX launcher via the new --auto-ipex option.
- Graph capture mode of ipex.optimize (experimental): a new boolean flag graph_mode (default off) was added to ipex.optimize; when turned on, it converts the eager-mode PyTorch module into graph(s) to get the best of graph optimization.
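A rough sketch of how the new graph_mode flag might be used; the model here is a placeholder, and depending on the IPEX version a sample_input argument may also be needed for graph capture.

```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex

# Placeholder model; any eager-mode nn.Module can be used.
model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10)).eval()

# graph_mode=True asks ipex.optimize to capture the module into graph(s)
# on top of the usual operator and weight-layout optimizations.
model = ipex.optimize(model, dtype=torch.bfloat16, graph_mode=True)

with torch.no_grad(), torch.cpu.amp.autocast():
    model(torch.randn(32, 128))
```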
Below is the preamble of the Python wrapper code that TorchInductor generates for a compiled graph:

from ctypes import c_void_p, c_long
import torch
import random
from torch import empty_strided, as_strided, device
from torchinductor.codecache import AsyncCompile

aten = torch.ops.aten
async_compile = AsyncCompile()
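The examples that follow show how Intel® Extension for PyTorch* is applied in user code. The first one covers FP32 inference, where conv-batchnorm folding is applied through torch.fx before the model is handed to ipex.optimize: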
import os
import torch
model = Model()
model.eval()
data = torch.rand(<shape>)
# Applying torch.fx.experimental.optimization.fuse against the model performs conv-batchnorm folding for better performance.
import torch.fx.experimental.optimization as optimization
model = optimization.fuse(model, inplace=True)
#################### code changes ####################
import intel_extension_for_pytorch as ipex
model = ipex.optimize(model)
######################################################
# Run inference.
with torch.no_grad():
    model(data)
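For BFloat16 inference, the model is converted to the channels last memory format and passed to ipex.optimize with dtype=torch.bfloat16; the forward pass then runs under CPU autocast: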
...
import torch
...
model = Model()
model = model.to(memory_format=torch.channels_last)
model.eval()
data = torch.rand(<shape>)
#################### code changes ####################
import intel_extension_for_pytorch as ipex
model = ipex.optimize(model, dtype=torch.bfloat16)
######################################################
# Run inference under BF16 autocast.
with torch.no_grad(), torch.cpu.amp.autocast():
    model(data)
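For BFloat16 training, the optimizer is passed to ipex.optimize together with the model so that both are prepared for mixed-precision training: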
...
import torch
...
model = Model()
model = model.to(memory_format=torch.channels_last)
criterion = ...
optimizer = ...
model.train()
#################### code changes ####################
import intel_extension_for_pytorch as ipex
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)
######################################################
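For comparison, the earlier intel_pytorch_extension package exposed a different API: BF16 auto-mixed-precision was enabled globally and the model was moved to ipex.DEVICE. Training setup looked like this: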
import torch
# Step 1: Register IPEX optimizations
import intel_pytorch_extension as ipex
from my_models import SomeModel
# Step 2: Enable BF16 auto-mixed-precision
ipex.enable_auto_mixed_precision(mixed_dtype=torch.bfloat16)
data_loader = …
# Step 3: Enable IPEX optimizations
model = SomeModel().to(ipex.DEVICE)
opt = torch.optim.SGD(model.parameters(), ...)
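The corresponding inference flow with that earlier API moves the model to ipex.DEVICE, switches it to eval mode, and scripts it with TorchScript: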
import torch
# Step 1: Register IPEX optimizations
import intel_pytorch_extension as ipex
from my_models import SomeModel
# Step 2: Enable BF16 auto-mixed-precision
ipex.enable_auto_mixed_precision(mixed_dtype=torch.bfloat16)
# Step 3: Enable IPEX optimizations
model = SomeModel().to(ipex.DEVICE).eval()
model = torch.jit.script(model)