NVIDIA Tesla V100 GPU Architecture Whitepaper : NOTES

Highlighted notes on:
NVIDIA Tesla V100 GPU Architecture Whitepaper
While doing research work with Prof. Dip Sankar Banerjee, Prof. Kishore Kothapalli.

Here is my short summary of the NVIDIA Tesla GV100 (Volta) architecture from the whitepaper:

  • 84 SMs (full GV100; Tesla V100 has 80 enabled), each with 64 FP32 + 64 INT32 cores that can execute independently.
  • Shared mem. size config. up to 96KB / SM.
  • 8 512-bit mem. controllers (total 4096-bit).
  • Up to 6 bidirectional NVLink links, 25 GB/s per direction per link (also links to IBM POWER9 CPUs).
  • 4 dies / HBM stack, 4 stacks. 16 GB w/ 900 GB/s HBM2 (Samsung).
  • SECDED (1-bit err. correcting, 2-bit err. detecting) native/sideband ECC (HBM, REG, L1, L2) (1 bit / byte).
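
The headline numbers above can be cross-checked with a little arithmetic and a device query. The sketch below is a rough illustration, not something from the whitepaper: the helper names `peak_bandwidth_gbs` and `nvlink_total_gbs` are hypothetical, and the 877 MHz HBM2 clock mentioned in the usage note is an assumption about the Tesla V100 rather than a figure from these notes.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: theoretical peak bandwidth in GB/s for double-data-rate
// memory = 2 transfers per clock * clock (Hz) * bus width (bytes).
double peak_bandwidth_gbs(int mem_clock_khz, int bus_width_bits) {
    return 2.0 * mem_clock_khz * 1e3 * (bus_width_bits / 8.0) / 1e9;
}

// Hypothetical helper: aggregate NVLink bandwidth over both directions.
double nvlink_total_gbs(int links, double gbs_per_direction) {
    return links * gbs_per_direction * 2.0;
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA device found\n");
        return 1;
    }
    int dev = 0, sms = 0, bus_bits = 0, mem_khz = 0, smem = 0, ecc = 0;
    cudaDeviceGetAttribute(&sms, cudaDevAttrMultiProcessorCount, dev);
    cudaDeviceGetAttribute(&bus_bits, cudaDevAttrGlobalMemoryBusWidth, dev);
    cudaDeviceGetAttribute(&mem_khz, cudaDevAttrMemoryClockRate, dev);
    cudaDeviceGetAttribute(&smem, cudaDevAttrMaxSharedMemoryPerMultiprocessor, dev);
    cudaDeviceGetAttribute(&ecc, cudaDevAttrEccEnabled, dev);

    printf("SMs: %d\n", sms);
    printf("Memory bus width: %d bits\n", bus_bits);
    printf("Max shared mem. per SM: %d KB\n", smem / 1024);
    printf("ECC enabled: %s\n", ecc ? "yes" : "no");
    printf("Peak mem. bandwidth: %.1f GB/s\n", peak_bandwidth_gbs(mem_khz, bus_bits));
    printf("NVLink total (6 links @ 25 GB/s/dir): %.0f GB/s\n", nvlink_total_gbs(6, 25.0));
    return 0;
}
```

Assuming an 877 MHz memory clock on a 4096-bit bus, `peak_bandwidth_gbs(877000, 4096)` gives about 898 GB/s, in line with the ~900 GB/s figure, and 6 links at 25 GB/s per direction add up to 300 GB/s in total.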

A few additional points:

  • Each SM has 4 processing blocks (each handles 1 warp of 32 threads).
  • L1 data cache is combined w/ shared mem. = 128 KB / SM (explicit caching not as imp.).
  • Volta also supports write-caching (not just load, as prev. arch.).
  • NVLink supports coherency allowing data reads from GPU mem. to be stored in CPU cache.
  • Addr. Translation Serv. (ATS) allows GPU to access CPU page tables directly (malloc ptr).
  • Copy engines don't need pinned memory (that's why I saw ~no speedup w/ pinned mem. in PR).
  • Volta has a per-thread PC and call stack, which allows interleaved exec. of threads within a warp and makes fine-grained sync. safe (__syncwarp()).
  • Cooperative groups enable sync. at sub-warp, cross-warp (thread block), grid-wide, and multi-GPU scope.
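
The last two points can be sketched in a few lines of CUDA. This is a minimal illustration under stated assumptions, not code from the whitepaper: the kernel names `warp_sum_syncwarp` and `tile_sum_cg`, the host helper `run_demo`, the tile size of 8, and the single-warp launch are all hypothetical choices made for the example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

constexpr int TILE = 8;  // sub-warp group size (assumption for this sketch)

// Warp-wide sum through shared memory. With independent thread scheduling the
// lanes of a warp are not guaranteed to run in lockstep, so each step is
// fenced with __syncwarp(). Assumes a launch with exactly 32 threads.
__global__ void warp_sum_syncwarp(const int *in, int *out) {
    __shared__ int buf[32];
    int lane = threadIdx.x;
    buf[lane] = in[lane];
    __syncwarp();
    for (int offset = 16; offset > 0; offset /= 2) {
        int v = 0;
        if (lane < offset) v = buf[lane] + buf[lane + offset];
        __syncwarp();  // all reads finish before any lane writes
        if (lane < offset) buf[lane] = v;
        __syncwarp();  // writes are visible before the next round
    }
    if (lane == 0) *out = buf[0];
}

// Sub-warp sum with cooperative groups: the block is split into tiles of
// TILE threads, and each tile reduces its values with tile-scoped shuffles.
__global__ void tile_sum_cg(const int *in, int *out) {
    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<TILE> tile = cg::tiled_partition<TILE>(block);
    int v = in[block.thread_rank()];
    for (int offset = TILE / 2; offset > 0; offset /= 2)
        v += tile.shfl_down(v, offset);
    if (tile.thread_rank() == 0) out[block.thread_rank() / TILE] = v;
}

// Runs both kernels on the inputs 0..31. h_out[0] receives the warp sum,
// h_out[1..4] the four tile sums. Returns false on any CUDA error.
bool run_demo(int h_out[5]) {
    int h_in[32];
    for (int i = 0; i < 32; ++i) h_in[i] = i;
    int *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, sizeof(h_in)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, 5 * sizeof(int)) != cudaSuccess) { cudaFree(d_in); return false; }
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    warp_sum_syncwarp<<<1, 32>>>(d_in, d_out);
    tile_sum_cg<<<1, 32>>>(d_in, d_out + 1);
    cudaError_t err = cudaMemcpy(h_out, d_out, 5 * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}

int main() {
    int r[5] = {0};
    if (!run_demo(r)) { printf("CUDA error\n"); return 1; }
    printf("warp sum: %d\n", r[0]);
    printf("tile sums: %d %d %d %d\n", r[1], r[2], r[3], r[4]);
    return 0;
}
```

With the inputs 0..31 the warp sum should be 496 and the tile sums 28, 92, 156, 220. The sketch only covers warp and sub-warp scope; grid-wide sync. additionally needs a cooperative launch (e.g. via `cudaLaunchCooperativeKernel`), which is left out here.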