import collections
import timeit

import numpy as np
import torch


def loop_expand(values, repeats):
    output = []
    for v, r in zip(values, repeats):
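The preview above is cut off before the loop body, so the author's implementation is not shown. As a hedged illustration only, a per-element repeat expansion (the operation that `np.repeat` and `torch.repeat_interleave` perform on 1-D inputs) can be sketched in pure Python as follows. The name `expand_by_repeats` and the input checks are assumptions for this sketch, not taken from the gist.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def expand_by_repeats(values: Sequence[T], repeats: Sequence[int]) -> List[T]:
    """Repeat values[i] exactly repeats[i] times, preserving order.

    Hypothetical helper for illustration; not the gist's loop_expand.
    """
    if len(values) != len(repeats):
        raise ValueError("values and repeats must have the same length")
    output: List[T] = []
    for v, r in zip(values, repeats):
        if r < 0:
            raise ValueError("repeat counts must be non-negative")
        # A repeat count of 0 drops the value entirely.
        output.extend([v] * r)
    return output


if __name__ == "__main__":
    print(expand_by_repeats([10, 20, 30], [2, 0, 3]))  # [10, 10, 30, 30, 30]
```

A plain loop like this is usually the slow reference point that vectorized variants are compared against.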
Percent of index_select baseline. Lower is better.
  [0] use gather (SmallVector)
  [1] use gather (std::vector)
  [2] sharded loop (shard_size = 2048)

Quadratic spacing. (Sparse)
////////////////////////////////////////////////////////////
Size (output) | Baseline (us)      [0]       [1]       [2]
          841 |           8.8    61.1%     60.7%     30.4%
         2430 |          18.2    35.0%     34.2%     24.3%
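The percentage columns read as each variant's runtime divided by the baseline runtime. A minimal sketch of that arithmetic follows, assuming times are collected with `timeit` and the best of several repeats is used. The helper names and the stand-in workloads are hypothetical; the gist's own measurement code is not visible in this preview.

```python
import timeit
from typing import Callable


def time_us(fn: Callable[[], object], number: int = 1000, repeat: int = 5) -> float:
    """Return the best observed per-call time of fn, in microseconds."""
    best_total = min(timeit.repeat(fn, number=number, repeat=repeat))
    return best_total / number * 1e6


def percent_of_baseline(candidate_us: float, baseline_us: float) -> float:
    """Express a candidate time as a percentage of the baseline time."""
    if baseline_us <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * candidate_us / baseline_us


if __name__ == "__main__":
    data = list(range(1000))
    # Stand-in workloads; a real benchmark would time the ops under test.
    baseline = time_us(lambda: [x for x in data])
    candidate = time_us(lambda: list(data))
    pct = percent_of_baseline(candidate, baseline)
    print(f"baseline: {baseline:.1f} us, candidate: {pct:.1f}% of baseline")
```

Read this way, 30.4% of an 8.8 us baseline corresponds to roughly 2.7 us.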
Percent of index_select baseline. Lower is better.
  [0] use gather (SmallVector)
  [1] use gather (std::vector)
  [2] sharded loop (shard_size = 2048)

Quadratic spacing. (Sparse)
////////////////////////////////////////////////////////////
Size (output) | Baseline (us)      [0]       [1]       [2]
          841 |           4.8   115.8%    113.9%     46.7%
         2430 |           5.7   119.4%    117.3%     88.9%
import collections
import json
import sys
import time
import timeit

import numpy as np
import torch

torch.set_num_threads(1)
#!/bin/bash
set -e

source ~/miniconda3/etc/profile.d/conda.sh

RESULTS="/tmp/${USER}/results.txt"
> ${RESULTS}

measure () {
    local conda_env=$1
    conda activate ${conda_env}
Improved  (>10%):   188  (17%)
Regressed (>10%):   344  (32%)
Within 10%:         553  (51%)

Improvement   Absolute | dtype     numel   mask_reuse   mask_true_pct   x_layout     mask_layout
==================================================================================================
       -98%     1.6 us | float64      33           33             67%   contiguous   contiguous
       -97%     1.6 us | float32      33           33             67%   contiguous   contiguous
       -97%     1.4 us | int8         10            1             92%   contiguous   contiguous
       -91%     1.5 us | float32       6            6             36%   contiguous   contiguous
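The summary counts above bucket each measurement by its relative change against a 10% threshold. A minimal sketch of that bucketing follows, assuming relative change is defined as (new - old) / old, so a negative value means the new time is lower. The function names, the sign convention, and the sample timings are illustrative assumptions, not taken from the gist.

```python
from collections import Counter
from typing import Iterable, Tuple


def bucket(old_us: float, new_us: float, threshold: float = 0.10) -> str:
    """Classify one (old, new) timing pair by relative change."""
    rel = (new_us - old_us) / old_us
    if rel < -threshold:
        return "improved"   # new time is lower by more than the threshold
    if rel > threshold:
        return "regressed"  # new time is higher by more than the threshold
    return "within"


def summarize(pairs: Iterable[Tuple[float, float]], threshold: float = 0.10) -> Counter:
    """Count how many timing pairs fall into each bucket."""
    return Counter(bucket(old, new, threshold) for old, new in pairs)


if __name__ == "__main__":
    # Made-up timings in microseconds: (old, new).
    samples = [(10.0, 5.0), (10.0, 10.5), (10.0, 12.0), (10.0, 9.5)]
    counts = summarize(samples)
    total = sum(counts.values())
    for name in ("improved", "regressed", "within"):
        print(f"{name}: {counts[name]} ({100 * counts[name] / total:.0f}%)")
```

The percentages in such a summary are each count over the total, for example 188 of 1085 samples is about 17%.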
gpu: 128 / 128
27 samples were culled, 1893 remain

========================================
== GPU =================================
========================================
Improved  (>5%):    462  ( 24%)
Regressed (>5%):    266  ( 14%)
Within 5%:         1165  ( 62%)
gpu: 128 / 128    cpu: 128 / 128
814 samples were culled, 1746 remain

========================================
== CPU =================================
========================================
Improved  (>5%):     36  (  5%)
Regressed (>5%):    626  ( 93%)
Within 5%:            8  (  1%)
gpu: 128 / 128    cpu: 128 / 128
1066 samples were culled, 2774 remain

========================================
== CPU =================================
========================================
Improved  (>5%):    385  ( 40%)
Regressed (>5%):    294  ( 31%)
Within 5%:          272  ( 29%)