In general, check the `crt/host_config.h` file to find out which versions are supported.
Sometimes it is possible to hack the requirements there to get some newer versions working, too :)
Thrust version can be found in `$CUDA_ROOT/include/thrust/version.h`.
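As a small illustration of reading that header, the sketch below decodes the packed `THRUST_VERSION` number. It assumes the usual encoding from `thrust/version.h` (major * 100000 + minor * 100 + subminor); the `ThrustVersion` struct and `decode_thrust_version` function are hypothetical helpers, not part of Thrust.

```cpp
// Pull in the real macro only when the toolkit headers are on the include path.
#if defined(__has_include)
#  if __has_include(<thrust/version.h>)
#    include <thrust/version.h>  // defines THRUST_VERSION
#  endif
#endif

// Hypothetical helper type (not part of Thrust).
struct ThrustVersion {
    int major_v;
    int minor_v;
    int subminor_v;
};

// Assumed encoding: major * 100000 + minor * 100 + subminor.
constexpr ThrustVersion decode_thrust_version(int packed) {
    return ThrustVersion{packed / 100000, (packed / 100) % 1000, packed % 100};
}

#if defined(THRUST_VERSION)
// Version of the Thrust headers actually found at compile time.
constexpr ThrustVersion installed_thrust = decode_thrust_version(THRUST_VERSION);
#endif
```

Under that assumption, a packed value of 100907 decodes to 1.9.7, which is the Thrust release listed for CUDA 10.2.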
Download Archives: https://developer.nvidia.com/cuda-toolkit-archive
Release notes for CUDA Toolkit (CTK):
- 11.2: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
- 11.1: https://docs.nvidia.com/cuda/archive/11.1.1/index.html
- 11.0: https://docs.nvidia.com/cuda/archive/11.0/cuda-toolkit-release-notes/index.html
- 10.2: https://developer.download.nvidia.com/compute/cuda/10.2/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 10.1: https://developer.download.nvidia.com/compute/cuda/10.1/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 10.0: https://developer.download.nvidia.com/compute/cuda/10.0/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 9.2: https://developer.download.nvidia.com/compute/cuda/9.2/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 9.1: https://developer.download.nvidia.com/compute/cuda/9.1/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 9.0: https://developer.download.nvidia.com/compute/cuda/9.0/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 8.0: https://developer.nvidia.com/compute/cuda/8.0/Prod2/docs/sidebar/CUDA_Toolkit_Release_Notes-pdf
- 7.5: http://developer.download.nvidia.com/compute/cuda/7.5/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf
- 7.0: http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Toolkit_Release_Notes.pdf
- 6.5: http://developer.download.nvidia.com/compute/cuda/6_5/rel/docs/CUDA_Toolkit_Release_Notes.pdf
- 6.0: http://developer.download.nvidia.com/compute/cuda/6_0/rel/docs/CUDA_Toolkit_Release_Notes.pdf
- 5.5: http://developer.download.nvidia.com/compute/cuda/5_5/rel/docs/CUDA_Toolkit_Release_Notes.pdf
Version notes Nvidia HPC SDK:
nvcc
Latest, official compiler requirements: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
CUDA version | SM Arch | g++ | icpc | pgc++ | xlC | MSVC | clang++ | Linux driver | thrust | note |
---|---|---|---|---|---|---|---|---|---|---|
1.0 | 1.0-1.1 | ? | ? | ? | ||||||
1.1 | 1.0-1.1 | ? | ? | ? | ||||||
2.0 | 1.0-1.1 | ? | ? | ? | ||||||
2.1 | 1.0-1.3 | ? | ? | ? | ||||||
2.3.1 | 1.0-1.3 | ? | ? | ? | ||||||
3.0 | 1.0-2.0 | ? | ? | ? | ||||||
3.1 | 1.0-2.0 | ? | ? | ? | ||||||
3.2 | 1.0-2.1 | ? | 11.1 | ? | ||||||
4.0 | 1.0-2.1 | <=4.4 | 11.1 | ? | ||||||
4.1 | 1.0-2.1 | <=4.5 | 11.1 | ? | ||||||
4.2 | 1.0-2.1 | <=4.6 | 11.1 | ? | ||||||
5.0 | 1.0-3.? | <=4.6 | 11.1 | ? | ? | 1.5.3 | ||||
5.5 | 1.0-3.? | <=4.8 | 12.1 | ? | ? | 1.7.0 | C++11 on host side supported; ICC fixed to build 20110811 |
|||
6.0 | 1.0-5.0 | <=4.8 | 13.1 | ? | 331.62 | 1.7.1 | ||||
6.5 | 1.1-5.X | <=4.8 | 14.0 | ? | ? | ? | 1.7.2 | experimental device-side C++11 support; including this version, <thrust/sort.h> screws up __CUDA_ARCH__ (must be undefined on host); deprecation of SM 11-13 (10 removed) |
||
7.0.17 (RC) | s. below | <=4.9 | 15.0 | >=14.9 | 13.1.1 | ? | 346.29 | 1.8.0 | first official PGI support, first xlc string found; powerpc64 w. little endian supported | |
7.0.27 | 2.0-5.X | <=4.9 | 15.0 | >=14.9 | 13.1.1 | 2010-13 | 346.46 | 1.8.1 | official C++11 support on device side | |
7.5 | <=4.9 | 15.0 | 15.4 | 13.1 | 2010-13 | 3.5-3.6 | 352.41? | 1.8.2 | clang (host) on linux supported, __CUDACC_VER__ macros added |
|
7.5.18 | 2.0-5.X | <=4.9 | 15.0 | 15.4 | 13.1 | 2010-13 | 352.39 | 1.8.2 | ||
8.0.44 | 2.0-6.X | <=5.3 | 15.0(.4)-16.0 | 16(.3)+ | 13.1(.2) | 2012-15 | 3.8-3.9 | 367.48 | 1.8.3-patch2 | sm_60 (pascal) support added |
8.0.61 | 2.0-6.X | <=5.3 | 15.0(.4)-17.0 | 16(.3)+ | 13.1(.2) | 2012-15 | 3.8-3.9 | 375.26 | 1.8.3-patch2 | nvcc 8 is incompatible with std::tuple in gcc 5.4+ |
9.0.69 (RC) | 3.0-7.0 | <=5.5 (<=6) | 15.0(.4)-17.0 | 17 | 13.1(.2) | 2012-17 | 3.8-3.9 | ???.?? | 1.9.0-patch4 | device-side C++14; __CUDACC_VER__ deprecated for __CUDACC_VER_MAJOR/MINOR/BUILD__ |
9.0.103 (RC) | 3.0-7.0 | <=5.5 (<=6) | 15.0(.4)-17.0 | 17 | 13.1(.2) | 2012-17 | 3.8-3.9 | 384.59 | 1.9.0-patch4 | same as above, __CUDACC_VER__ defined as #error rendering it fully broken |
9.0.176 | 3.0-7.0 | <=5.5 (<=6) | (15.0-)17.0 | 17.1 | 13.1(.5) | 2012-17 | (3.8-)3.9 | 384.81 | 1.9.0-patch5 | same as above |
9.1.85 | 3.0-7.2 | <=5.5 (<=6) | (15.0-)17.0 | 17.X | 13.1(.6) | 2012-17 | (3.8-)4.0 | 390.46 | 1.9.1-patch2 | math_functions.hpp moved to crt/ |
9.1.85.1 | cuBLAS 9.1.128: Volta GEMM kernels optimized | |||||||||
9.1.85.2 | ptxas: fix address calculations using large immediate operands | |||||||||
9.1.85.3 | cuBLAS: fixes to GEMM optimizations for convolutional sequence to sequence (seq2seq) models. | |||||||||
9.0-9.1 | nvcc 9.0-9.1 is incompatible with std::tuple in gcc 6+ |
|||||||||
9.2.88 | 3.0-7.2 | <=7.3.0 (<=7) | (15.0-)17.0 | 17-18.X | 13.1(.6),16.1 | 2012-17 | (3.8-)5.0 | 396.26 | 1.9.2 | CUTLASS 1.0 added; std::tuple fixed (prior GCC 6 issues) |
9.2.148 | 396.37 | 1.9.2 | ||||||||
10.0.130 | 3.0-7.5 | <=7 | (15.0-)18.0 | 17-18.X | 13.1, 16.1 | 2013-17 | (3.8-)6.0 | 410.48 | 1.9.3 | |
10.1.105 | 3.0-7.5 | <=8 | (15.0-)19.0 | 17-19.X | 2013-19 | (3.8-)7.0 | 418.39 | 1.9.4 | ||
10.1.168 | (3.8-)8.0 | 418.67 | 10.1 "Update 1" | |||||||
10.1.243 | 418.87 | 10.1 "Update 2" | ||||||||
10.2.89 | 3.0-7.5 | <=8 | (15.0-)19.0 | 18-19.X | 13.1, 16.1 | 2015-19 | (3.3-)8.* | 440.33.01 | 1.9.7 | sm_30,35,37,50 deprecated; nvcc : -allow-unsupported-compiler |
11.0.1 (RC) NVCC:11.0.167 | 3.5-8.0 | (5-)9.* | (15.0-)19.1 | 18-20.1 | 13.1, 16.1 | 2015-19 | 3.2-9.0.0 | 450.36.06 | 1.9.9 | macOS dropped; libs drop pre-C++11, deprecate pre-C++14 (GCC < 5, Clang < 6, and MSVC < 2017); Arm C/C++ 19.2 support |
11.0.2-1 NVCC:11.0.194 | (3.3/)6-9.0.0 | 450.51.05 | nvcc : --Wext-lambda-captures-this |
|||||||
11.0.3 NVCC:11.0.221 | ? | ? | ? | ? | ? | ? | ? | 450.51.06 | ? | 11.0 "Update 1"; nvcc : --forward-unknown-to-host-compiler , --forward-unknown-to-host-linker flags |
11.1.0 NVCC:11.1.74 | 3.5-8.6 | 3.5-10.0 | (15.0-)19.1 | 18-20.1 | 13.1, 16.1 | 2017-19 | (3.3/)6-10.0.0 | 455.23.05 | 1.9.10-1 | Ubuntu@ppc64le deprecated |
11.1.1 NVCC:11.1.? | ? | ? | ? | |||||||
11.2.0 NVCC:11.2.67 | 460.27.04 | 1.10.0 | ||||||||
Note: empty cells generally mean "same as above" for readability.
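To make the g++ column usable from code, here is a minimal sketch that copies a few upper bounds from the table above into a lookup. It only models the newest listed g++ major version per release (lower bounds and patch levels are ignored), and every name in it is hypothetical.

```cpp
#include <cstddef>

// Hypothetical row type: one CUDA release and its newest listed g++ major.
struct GccLimit {
    int cuda;     // CUDA release as major * 10 + minor, e.g. 102 for 10.2
    int max_gcc;  // newest g++ major version listed in the table
};

// Excerpt of the table: 9.2, 10.0, 10.1, 10.2, 11.0.
constexpr GccLimit kGccLimits[] = {
    {92, 7}, {100, 7}, {101, 8}, {102, 8}, {110, 9},
};
constexpr std::size_t kNumLimits = sizeof(kGccLimits) / sizeof(kGccLimits[0]);

// Newest listed g++ major for a release, or -1 if the release is not in
// this excerpt. Written recursively so it stays valid C++11 constexpr.
constexpr int max_gcc_for(int cuda, std::size_t i = 0) {
    return i == kNumLimits ? -1
         : kGccLimits[i].cuda == cuda ? kGccLimits[i].max_gcc
         : max_gcc_for(cuda, i + 1);
}

// True if the table lists this g++ major as acceptable for the release.
constexpr bool gcc_listed_as_supported(int cuda, int gcc_major) {
    return max_gcc_for(cuda) >= 0 && gcc_major <= max_gcc_for(cuda);
}
```

A build script or header could call `gcc_listed_as_supported(102, __GNUC__)` to warn early, before `crt/host_config.h` rejects the host compiler.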
macOS: As of 7.0, clang seems to be the only supported compiler on OSX (but no version check found). CUDA 10.1.243 adds support for Xcode 10.2. CUDA 11.0 dropped macOS support.
Compilers such as pgC, icc, and xlC are only supported on x86 Linux and little-endian systems.
Dynamic parallelism was added with `sm_35` and CUDA 5.0.
Newer CUDA releases have a per-release support matrix for compilers, which also lists supported kernel and glibc versions: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements
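Since the table notes that `__CUDACC_VER__` is broken from 9.0 on and was replaced by `__CUDACC_VER_MAJOR__`, `__CUDACC_VER_MINOR__`, and `__CUDACC_VER_BUILD__`, a version check can be sketched with the split macros only. The packing scheme and the names `pack_version`, `kNvccVersion`, and `nvcc_at_least` are assumptions of this sketch, not something nvcc provides.

```cpp
// Pack a version triple so releases compare with plain integer operators.
// The packing (major * 1000000 + minor * 10000 + build) is an assumption of
// this sketch; it leaves room for three-digit build numbers such as 9.0.176.
constexpr long pack_version(int major_v, int minor_v, int build) {
    return major_v * 1000000L + minor_v * 10000L + build;
}

#if defined(__CUDACC_VER_MAJOR__)
// Compiled with the split version macros available: use them and never
// touch __CUDACC_VER__ itself.
constexpr long kNvccVersion =
    pack_version(__CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__, __CUDACC_VER_BUILD__);
#else
// Not a CUDA compilation, or a release that predates the split macros.
constexpr long kNvccVersion = 0;
#endif

// True if the current compiler reports at least the given release.
constexpr bool nvcc_at_least(int major_v, int minor_v, int build = 0) {
    return kNvccVersion >= pack_version(major_v, minor_v, build);
}
```

With this, a workaround for the `std::tuple` issue listed for 9.0-9.1 could be guarded by something like `!nvcc_at_least(9, 2)`.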
clang++ -x cuda
clang++ can compile CUDA C++ to ptx as well. Give it a whirl!
clang++ | supported CUDA release | supported SMs |
---|---|---|
3.9-5.0 | 7.0-8.0 | 2.0-(5.0)6.0 |
6.0 | 7.0-9.0 | (2.0)3.0-7.0 |
7.0 | 7.0-9.2 | (2.0)3.0-7.2 |
8.0 | 7.0-10.0 | (2.0)3.0-7.5 |
9.0 | 7.0-10.1 | (2.0)3.0-7.5 |
10.0 | 7.0-10.1 | (2.0)3.0-7.5 |
11.0 | 7.0-11.0 | (2.0)3.0-8.0 |
trunk | 7.0-11.0 | (2.0)3.0-8.0 |
https://llvm.org/docs/CompileCudaWithLLVM.html
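Code that should build with both front ends sometimes needs to know which one is running. The following sketch uses the predefined macros described in the LLVM document linked above (`__NVCC__` for nvcc, `__clang__` together with `__CUDA__` for clang in CUDA mode, `__CUDA_ARCH__` during device compilation); the enum and function names are hypothetical.

```cpp
// Hypothetical classification of the CUDA front end in use.
enum class CudaFrontend { None, Nvcc, Clang };

constexpr CudaFrontend cuda_frontend() {
#if defined(__NVCC__)
    // Checked first: nvcc may itself use clang as its host compiler.
    return CudaFrontend::Nvcc;
#elif defined(__clang__) && defined(__CUDA__)
    return CudaFrontend::Clang;
#else
    return CudaFrontend::None;  // plain host-only C++ compilation
#endif
}

// __CUDA_ARCH__ is only defined while device code is being compiled.
constexpr bool compiling_device_code() {
#if defined(__CUDA_ARCH__)
    return true;
#else
    return false;
#endif
}

// Human-readable label for logging or diagnostics.
constexpr const char* frontend_name(CudaFrontend f) {
    return f == CudaFrontend::Nvcc  ? "nvcc"
         : f == CudaFrontend::Clang ? "clang++ -x cuda"
                                    : "host-only C++";
}
```

Keeping the check in one place avoids scattering compiler-specific `#if` blocks through the code base.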
Device-Side C++ Standard Support
C++ core language features:
compiler | supported C++ standard | notes |
---|---|---|
nvcc <=6.0 | c++03 | |
nvcc 6.5 | c++03, exp. c++11 | undocumented |
nvcc 7.0-8.0 | c++03,11 | only c++11 switch |
nvcc 9.0-10.2 | c++03,11,14 | 10.2 adds libcu++ (atomics); open repository: https://github.com/NVIDIA/libcudacxx/releases |
nvcc 11.0.167+ | c++03,11,14,17 | C++11 host compiler needed for math libs; ships C++11-compatible backport of the C++20 synchronization library; device LTO added; starting with CUDA Toolkit 11.0.1, nvcc and CUDA Toolkit versions are not equivalent anymore |
clang 5+ | c++03,11,14,17 | |
clang 6+ | c++03,11,14,17,2a | |
clang 10+ | c++03,11,14,17,20 | |
clang trunk | c++03,11,14,17,20 | status |
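Which of these standards is actually active depends on the `-std=` switch passed to the compiler, and code can check it through the standard `__cplusplus` macro. This is a minimal sketch; `cxx_standard` and `kActiveStandard` are hypothetical names.

```cpp
// Map a __cplusplus value to the short standard number used in the table
// (3 stands for c++03/98). Draft modes such as c++2a report intermediate
// values and are rounded down to the last published standard.
constexpr int cxx_standard(long cplusplus) {
    return cplusplus >= 202002L ? 20
         : cplusplus >= 201703L ? 17
         : cplusplus >= 201402L ? 14
         : cplusplus >= 201103L ? 11
                                : 3;
}

// Standard selected for the current translation unit.
// Note: MSVC reports 199711L unless /Zc:__cplusplus is passed.
constexpr int kActiveStandard = cxx_standard(__cplusplus);
```

Device code could then guard C++14-only constructs with `kActiveStandard >= 14` or the equivalent preprocessor test.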
CUDA-enabled C++ standard library libcu++, based on LLVM's libc++ (docs):
CUDA release | introduced components | notes |
---|---|---|
CUDA 10.2 | `<atomic>` (SM6.0+), `<type_traits>` | introduction of libcu++ |
CUDA 11.0 | `atomic<T>::wait/notify`, `<barrier>`, `<latch>`, `<counting_semaphore>` (SM7.0+), `<chrono>`, `<ratio>`, `<functional>` w/o `function` | anticipated with GTC 2020 slides |
CUDA 11.2 | `cuda::std::tuple`, `pair` | notes |
CUDA next | `cuda::std::complex`, backports: `chrono`, `type_traits` | notes |
newer | see the release notes and api docs | all open source now |
Incremental libcu++ release goals (GTC 2020):
- Version 1 (CUDA 10.2): `<atomic>` (SM6.0+), `<type_traits>`.
- Version 2 (CUDA next): `atomic<T>::wait/notify`, `<barrier>`, `<latch>`, `<counting_semaphore>` (SM7.0+), `<chrono>`, `<ratio>`, `<functional>` minus `function`.
- Future priorities: `atomic_ref<T>`, `<complex>`, `<tuple>`, `<array>`, `<utility>`, `<cmath>`, string processing, ...
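Because libcu++ only ships with CUDA 10.2 and newer, portable code may want to fall back to the host standard library when the headers are missing. Below is a hedged sketch of that idea using `<cuda/std/atomic>`; the `stdx` alias, the `SKETCH_HAS_LIBCUDACXX` macro, and `count_to` are hypothetical names, and the function is host-only for simplicity.

```cpp
#include <cassert>

// Prefer libcu++ when its headers are on the include path.
#if defined(__has_include)
#  if __has_include(<cuda/std/atomic>)
#    include <cuda/std/atomic>
#    define SKETCH_HAS_LIBCUDACXX 1
#  endif
#endif

#if defined(SKETCH_HAS_LIBCUDACXX)
namespace stdx = cuda::std;  // libcu++ (CUDA 10.2+)
#else
#  include <atomic>
namespace stdx = std;        // fallback: host standard library
#endif

// Increment an atomic counter n times and return its final value.
int count_to(int n) {
    assert(n >= 0);
    stdx::atomic<int> counter(0);
    for (int i = 0; i < n; ++i) {
        counter.fetch_add(1);
    }
    return counter.load();
}
```

The same alias trick would extend to the later components in the list (for example `<barrier>` or `<chrono>`) once the targeted CUDA release provides them.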
NVC++
NVC++ is a unified C++ compiler and GPU-accelerated STL for the CUDA platform. It also seems to support OpenACC. NVC++ does not currently support the CUDA C++ language.
compiler | supported C++ standard | notes |
---|---|---|
nvc++ 11.0 | ...,c++17 | initial release, ships C++11-compatible backport of the C++20 synchronization library |
All GPU compilers are cheese.
It appears that release notes are in a different location now:
https://developer.download.nvidia.com/compute/cuda/X.Y/Prod/docs/sidebar/CUDA_Toolkit_Release_Notes.pdf