@rharriso
Created October 14, 2018 23:26
Looking at Thrust: CUDA Prof
==21393== Profiling application: ./main-cuda.cuda
==21393== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   73.42%  227.76ms         2  113.88ms  113.37ms  114.40ms  void initRandom<float>(float*, float, float)
                   26.58%  82.465ms        10  8.2465ms  4.8140ms  39.102ms  void add<float>(float*, float*, float*)
      API calls:   61.88%  310.24ms        11  28.204ms  4.8177ms  227.74ms  cudaDeviceSynchronize
                   32.78%  164.31ms         3  54.771ms  49.680us  164.11ms  cudaMallocManaged
                    4.87%  24.428ms         3  8.1428ms  8.1220ms  8.1685ms  cudaFree
                    0.32%  1.6288ms        12  135.74us  19.770us  559.03us  cudaLaunch
                    0.11%  528.26us        94  5.6190us     240ns  229.88us  cuDeviceGetAttribute
                    0.02%  115.80us         1  115.80us  115.80us  115.80us  cuDeviceTotalMem
                    0.01%  53.541us         1  53.541us  53.541us  53.541us  cuDeviceGetName
                    0.00%  8.6900us        36     241ns     130ns  1.3400us  cudaSetupArgument
                    0.00%  6.8300us        12     569ns     250ns  2.1200us  cudaConfigureCall
                    0.00%  1.9900us         3     663ns     240ns  1.4500us  cuDeviceGetCount
                    0.00%  1.1000us         2     550ns     250ns     850ns  cuDeviceGet