@dirkgr
Created December 14, 2021 19:23
Multiplying a 60000x60000 matrix with itself in torch
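The gist records timings only, not the script that produced them (the output format resembles IPython's `%timeit`). A measurement of this kind could be set up roughly as follows. This is a minimal sketch, not the author's code: `time_matmul` and `run` are hypothetical names, and it uses `time.perf_counter` with explicit CUDA synchronization rather than `%timeit`.

```python
import time

import torch


def time_matmul(n, dtype, device, repeats=3):
    """Return the best wall-clock time in seconds for an n x n matrix times itself.

    Hypothetical helper for illustration; not taken from the gist.
    """
    a = torch.randn(n, n, device=device, dtype=dtype)
    best = float("inf")
    for _ in range(repeats):
        # CUDA kernels launch asynchronously, so synchronize before and
        # after the multiplication to time the actual computation.
        if device.type == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        a @ a
        if device.type == "cuda":
            torch.cuda.synchronize()
        best = min(best, time.perf_counter() - start)
    return best


def run(n=60_000):
    """Time each dtype on the GPU, then float32 again with TF32 disabled."""
    device = torch.device("cuda")
    for dtype in (torch.float32, torch.float16, torch.bfloat16):
        print(dtype, time_matmul(n, dtype, device))
    # Force full-precision float32 matmuls instead of TF32 tensor cores.
    torch.backends.cuda.matmul.allow_tf32 = False
    print("float32, allow_tf32 = False", time_matmul(n, torch.float32, device))
```

Calling `run()` needs a CUDA device with plenty of memory: one 60000x60000 float32 matrix takes 14.4 GB, so the input plus the result come to about 28.8 GB. A smaller `n` is enough to try the sketch out.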
float32: 3.71 s ± 2.95 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
float16: 2.29 s ± 8.15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
bfloat16: 2.29 s ± 9.57 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
torch.backends.cuda.matmul.allow_tf32 = False
float32: 24.4 s ± 41.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
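To put these timings in perspective, they can be converted to effective throughput. Multiplying two dense n x n matrices takes about 2·n³ floating-point operations (n³ multiplications and roughly as many additions). The sketch below applies that count to the numbers above; `matmul_tflops` is a hypothetical helper name, and the labels are only descriptive.

```python
def matmul_tflops(n, seconds):
    """Effective throughput in TFLOP/s, counting 2 * n**3 operations."""
    return 2 * n**3 / seconds / 1e12


N = 60_000
# Mean times in seconds, copied from the results above.
timings = {
    "float32": 3.71,
    "float16": 2.29,
    "bfloat16": 2.29,
    "float32, allow_tf32 = False": 24.4,
}

for label, seconds in timings.items():
    print(f"{label}: {matmul_tflops(N, seconds):.1f} TFLOP/s")

# Slowdown of float32 once TF32 is disabled.
slowdown = timings["float32, allow_tf32 = False"] / timings["float32"]
print(f"slowdown without TF32: {slowdown:.1f}x")
```

By this count, the first float32 run reaches about 116 TFLOP/s, float16 and bfloat16 about 189 TFLOP/s, and float32 with `allow_tf32 = False` about 18 TFLOP/s, a slowdown of roughly 6.6x.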