[Hardware Acceleration] #dl

NVIDIA Jetson AGX Xavier

Deep Learning Accelerator (DLA)

The DLA is not as fast as the GPU, but it is extremely power efficient. Its functional blocks, with rough PyTorch equivalents, are listed below; a small example module follows the list.

  • Convolution Core accelerates various convolutional layers. (equivalent to nn.Conv2d in PyTorch)
  • Single Data Point Processor accelerates activation functions. It supports all commonly used activations like ReLU, PReLU, sigmoid, tanh and can even perform batch normalization and bias addition. (equivalent to nn.ReLU in PyTorch)
  • Planar Data Processor accelerates max-pooling and average pooling layers. (equivalent to nn.MaxPool2d in PyTorch)
  • Cross-channel Data Processor accelerates cross-channel operations like layer normalization. (equivalent to nn.LayerNorm in PyTorch)
  • Data Reshape Engine performs operations such as splitting, slicing, merging, reshaping, and transposing tensors. (equivalent to torch.reshape in PyTorch)
  • Bridge DMA: In deep learning inference, moving data usually takes far longer and uses more energy than the computation itself. The Bridge DMA (Direct Memory Access) engine copies data between system RAM and the DLA’s internal memory, and it does so asynchronously, so the DLA can keep computing on data already in its local memory instead of stalling while data is transferred.
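
As a concrete illustration, the sketch below builds a small PyTorch module out of exactly the layer types listed above, so each line maps onto one DLA engine. The layer sizes and shapes are arbitrary assumptions chosen only to make the example runnable.

```python
# A minimal sketch (layer sizes are illustrative assumptions): each layer below
# corresponds to one of the DLA engines described above.
import torch
import torch.nn as nn


class DlaFriendlyBlock(nn.Module):
    def __init__(self, in_ch=3, out_ch=16):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)  # Convolution Core
        self.act = nn.ReLU()                                 # Single Data Point Processor
        self.pool = nn.MaxPool2d(2)                          # Planar Data Processor
        self.norm = nn.LayerNorm([out_ch, 16, 16])           # Cross-channel Data Processor

    def forward(self, x):
        x = self.pool(self.act(self.conv(x)))                # conv -> activation -> pooling
        x = self.norm(x)                                      # cross-channel normalization
        return torch.reshape(x, (x.size(0), -1))              # Data Reshape Engine


if __name__ == "__main__":
    model = DlaFriendlyBlock().eval()
    out = model(torch.randn(1, 3, 32, 32))
    print(out.shape)  # torch.Size([1, 4096])
```

On Jetson, such a network is typically placed on the DLA by exporting it (e.g., to ONNX) and building a TensorRT engine with the DLA enabled; layers the DLA cannot run fall back to the GPU (for example, trtexec exposes --useDLACore and --allowGPUFallback for this).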

Programmable Vision Accelerators (PVA)

An ASIC for running traditional (i.e., not deep-learning-based) computer vision algorithms.

Video Image Compositor (VIC)

A fixed-function engine for image resizing and color space conversion.

Vision Programming Interface (VPI)
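
VPI is NVIDIA's software library that exposes the PVA, VIC, GPU, and CPU as interchangeable backends behind a single image-processing API. The snippet below is a rough sketch based on VPI's Python bindings; the exact function names, format enums, and signatures are assumptions and should be checked against the VPI documentation for the installed JetPack release.

```python
# Rough sketch of VPI's Python bindings (assumed API; verify against the VPI
# docs for your JetPack release).
import numpy as np
import vpi

# Wrap a NumPy frame as a VPI image (format assumed to be RGB8).
frame = (np.random.rand(1080, 1920, 3) * 255).astype(np.uint8)
img = vpi.asimage(frame, vpi.Format.RGB8)

# Resizing and color-space conversion are exactly what the VIC accelerates,
# so run these two steps on the VIC backend.
with vpi.Backend.VIC:
    gray = img.convert(vpi.Format.U8)   # color space conversion
    small = gray.rescale((960, 540))    # image resizing
```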

Drive
