@ptrblck
Created November 25, 2019 08:14
pytorch_cuda_pow_test
import torch
import torch.nn as nn
import time
torch.backends.cudnn.benchmark = True
a = torch.randn(1024, 1024, 10).cuda()
b = torch.randn(1024, 1024, 10).cuda()
nb_iters = 1000
# warm-up iterations, excluded from the timing
for _ in range(10):
    c = (a**2 + b**2) ** 0.5
torch.cuda.synchronize()
t0 = time.time()
for _ in range(nb_iters):
    c = (a**2 + b**2) ** 0.5
torch.cuda.synchronize()
t1 = time.time()
print('slow took {}s'.format((t1 - t0)/nb_iters))
# warm-up iterations, excluded from the timing
for _ in range(10):
    c = (a*a + b*b) ** 0.5
torch.cuda.synchronize()
t0 = time.time()
for _ in range(nb_iters):
    c = (a*a + b*b) ** 0.5
torch.cuda.synchronize()
t1 = time.time()
print('fast took {}s'.format((t1 - t0)/nb_iters))
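The script above repeats the same warm-up, synchronize, and timing steps for both variants. A minimal sketch of how that pattern could be factored into a helper is shown below. The names `time_op`, `pow_version`, and `mul_version` are hypothetical and not part of the original gist, the tensors are smaller than the originals, and the code falls back to the CPU when no GPU is available, so the numbers it prints are not comparable to the ones above. It also checks that both expressions produce matching results.

```python
import time
import torch

def time_op(fn, warmup=10, iters=100):
    """Return the mean wall-clock seconds per call of fn() (hypothetical helper)."""
    use_cuda = torch.cuda.is_available()
    # warm-up calls, excluded from the timing
    for _ in range(warmup):
        fn()
    if use_cuda:
        # CUDA kernels run asynchronously; wait before starting the timer
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    if use_cuda:
        # wait for all queued kernels before stopping the timer
        torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters

# assumption: use the GPU when present, otherwise run the same code on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(256, 256, device=device)
b = torch.randn(256, 256, device=device)

pow_version = lambda: (a**2 + b**2) ** 0.5
mul_version = lambda: (a*a + b*b) ** 0.5

# both variants should agree up to floating-point tolerance
same = torch.allclose(pow_version(), mul_version())
print('results match: {}'.format(same))
print('pow took {}s'.format(time_op(pow_version)))
print('mul took {}s'.format(time_op(mul_version)))
```

Passing each variant as a callable keeps the measurement code identical for both, so any difference in the printed times comes from the expressions themselves.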