Tokenizers: Instrumented compared to Release
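The "change:" figures below are produced by Criterion, which compares each run against the previously saved baseline under the project's target directory. A minimal sketch of how such a comparison could be reproduced, assuming "Instrumented" refers to a PGO-instrumented build made with rustc's -C profile-generate flag and that the baseline was an ordinary release run (the exact commands, and the /tmp/pgo-data path, are assumptions and are not shown in this log; the target/x86_64-unknown-linux-gnu path in the log suggests an explicit --target was passed):

# Baseline: ordinary release benchmarks, saved by Criterion as the comparison point
cd tokenizers/tokenizers
cargo bench --target x86_64-unknown-linux-gnu

# Instrumented: rebuild with PGO instrumentation and benchmark again;
# Criterion then prints the "change:" deltas against the release baseline
RUSTFLAGS="-C profile-generate=/tmp/pgo-data" cargo bench --target x86_64-unknown-linux-gnu

Read this way, the regressions reported below (roughly +24% up to about +1590%) would reflect the overhead of the instrumentation counters rather than a change in the tokenizers code itself.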
Running `/home/zamazan4ik/open_source/tokenizers/tokenizers/target/x86_64-unknown-linux-gnu/release/deps/bert_benchmark-265329d2ad95192b --bench`
Benchmarking WordPiece BERT encode
Benchmarking WordPiece BERT encode: Warming up for 3.0000 s
Benchmarking WordPiece BERT encode: Collecting 20 samples in estimated 5.0017 s (208k iterations)
Benchmarking WordPiece BERT encode: Analyzing
WordPiece BERT encode time: [24.207 µs 24.210 µs 24.213 µs]
change: [+61.027% +61.214% +61.393%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 20 measurements (10.00%)
  1 (5.00%) low mild
  1 (5.00%) high severe
Benchmarking WordPiece BERT encode batch
Benchmarking WordPiece BERT encode batch: Warming up for 3.0000 s
Benchmarking WordPiece BERT encode batch: Collecting 20 samples in estimated 5.7485 s (100 iterations)
Benchmarking WordPiece BERT encode batch: Analyzing
WordPiece BERT encode batch
time: [57.032 ms 57.226 ms 57.431 ms]
change: [+1575.8% +1585.2% +1593.9%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking WordPiece Train vocabulary (small)
Benchmarking WordPiece Train vocabulary (small): Warming up for 3.0000 s
Benchmarking WordPiece Train vocabulary (small): Collecting 10 samples in estimated 6.8133 s (110 iterations)
Benchmarking WordPiece Train vocabulary (small): Analyzing
WordPiece Train vocabulary (small)
time: [60.248 ms 60.786 ms 61.230 ms]
change: [+106.76% +109.96% +112.70%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking WordPiece Train vocabulary (big)
Benchmarking WordPiece Train vocabulary (big): Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 38.3s.
Benchmarking WordPiece Train vocabulary (big): Collecting 10 samples in estimated 38.301 s (10 iterations)
Benchmarking WordPiece Train vocabulary (big): Analyzing
WordPiece Train vocabulary (big)
time: [3.7710 s 3.7951 s 3.8188 s]
change: [+328.18% +334.44% +340.41%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
Running `/home/zamazan4ik/open_source/tokenizers/tokenizers/target/x86_64-unknown-linux-gnu/release/deps/bpe_benchmark-6e02138d48001b9d --bench`
Benchmarking BPE GPT2 encode
Benchmarking BPE GPT2 encode: Warming up for 3.0000 s
Benchmarking BPE GPT2 encode: Collecting 20 samples in estimated 5.0007 s (346k iterations)
Benchmarking BPE GPT2 encode: Analyzing
BPE GPT2 encode time: [14.469 µs 14.509 µs 14.558 µs]
change: [+36.503% +37.071% +37.816%] (p = 0.00 < 0.05)
Performance has regressed.
Found 7 outliers among 20 measurements (35.00%)
  1 (5.00%) low severe
  2 (10.00%) low mild
  4 (20.00%) high severe
Benchmarking BPE GPT2 encode batch
Benchmarking BPE GPT2 encode batch: Warming up for 3.0000 s
Warning: Unable to complete 20 samples in 5.0s. You may wish to increase target time to 5.2s, enable flat sampling, or reduce sample count to 10.
Benchmarking BPE GPT2 encode batch: Collecting 20 samples in estimated 5.1965 s (210 iterations)
Benchmarking BPE GPT2 encode batch: Analyzing
BPE GPT2 encode batch time: [24.331 ms 24.474 ms 24.650 ms]
change: [+483.43% +487.49% +491.52%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 20 measurements (10.00%)
  2 (10.00%) high mild
Benchmarking BPE GPT2 encode, no cache
Benchmarking BPE GPT2 encode, no cache: Warming up for 3.0000 s
Benchmarking BPE GPT2 encode, no cache: Collecting 20 samples in estimated 5.0020 s (185k iterations)
Benchmarking BPE GPT2 encode, no cache: Analyzing
BPE GPT2 encode, no cache
time: [26.906 µs 26.917 µs 26.927 µs]
change: [+36.510% +36.733% +36.942%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 20 measurements (10.00%)
  1 (5.00%) low mild
  1 (5.00%) high mild
Benchmarking BPE GPT2 encode batch, no cache
Benchmarking BPE GPT2 encode batch, no cache: Warming up for 3.0000 s
Warning: Unable to complete 20 samples in 5.0s. You may wish to increase target time to 9.5s, enable flat sampling, or reduce sample count to 10.
Benchmarking BPE GPT2 encode batch, no cache: Collecting 20 samples in estimated 9.5234 s (210 iterations)
Benchmarking BPE GPT2 encode batch, no cache: Analyzing
BPE GPT2 encode batch, no cache
time: [45.166 ms 45.350 ms 45.531 ms]
change: [+1041.3% +1046.3% +1050.9%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking BPE Train vocabulary (small)
Benchmarking BPE Train vocabulary (small): Warming up for 3.0000 s
Benchmarking BPE Train vocabulary (small): Collecting 10 samples in estimated 6.6338 s (110 iterations)
Benchmarking BPE Train vocabulary (small): Analyzing
BPE Train vocabulary (small)
time: [59.898 ms 60.111 ms 60.491 ms]
change: [+125.30% +128.63% +132.18%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking BPE Train vocabulary (big)
Benchmarking BPE Train vocabulary (big): Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 38.8s.
Benchmarking BPE Train vocabulary (big): Collecting 10 samples in estimated 38.804 s (10 iterations)
Benchmarking BPE Train vocabulary (big): Analyzing
BPE Train vocabulary (big)
time: [3.8338 s 3.8530 s 3.8725 s]
change: [+330.01% +335.95% +341.73%] (p = 0.00 < 0.05)
Performance has regressed.
Running `/home/zamazan4ik/open_source/tokenizers/tokenizers/target/x86_64-unknown-linux-gnu/release/deps/layout_benchmark-1b6429727c41181c --bench`
Benchmarking TemplateProcessing single encode
Benchmarking TemplateProcessing single encode: Warming up for 3.0000 s
Benchmarking TemplateProcessing single encode: Collecting 20 samples in estimated 5.0000 s (2.5M iterations)
Benchmarking TemplateProcessing single encode: Analyzing
TemplateProcessing single encode
time: [1.2197 µs 1.2248 µs 1.2322 µs]
change: [+28.641% +34.412% +40.339%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 20 measurements (10.00%)
  1 (5.00%) high mild
  1 (5.00%) high severe
Benchmarking TemplateProcessing pair encode
Benchmarking TemplateProcessing pair encode: Warming up for 3.0000 s
Benchmarking TemplateProcessing pair encode: Collecting 20 samples in estimated 5.0006 s (1.3M iterations)
Benchmarking TemplateProcessing pair encode: Analyzing
TemplateProcessing pair encode
time: [2.8160 µs 2.8354 µs 2.8682 µs]
change: [+23.616% +31.051% +38.425%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 20 measurements (15.00%)
  1 (5.00%) high mild
  2 (10.00%) high severe
Running `/home/zamazan4ik/open_source/tokenizers/tokenizers/target/x86_64-unknown-linux-gnu/release/deps/unigram_benchmark-60e4f6f5f848a010 --bench`
Benchmarking Unigram Train vocabulary (small)
Benchmarking Unigram Train vocabulary (small): Warming up for 3.0000 s
Benchmarking Unigram Train vocabulary (small): Collecting 10 samples in estimated 5.6442 s (330 iterations)
Benchmarking Unigram Train vocabulary (small): Analyzing
Unigram Train vocabulary (small)
time: [16.982 ms 17.052 ms 17.117 ms]
change: [+112.62% +115.04% +117.39%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking Unigram Train vocabulary (medium)
Benchmarking Unigram Train vocabulary (medium): Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 35.8s.
Benchmarking Unigram Train vocabulary (medium): Collecting 10 samples in estimated 35.817 s (10 iterations)
Benchmarking Unigram Train vocabulary (medium): Analyzing
Unigram Train vocabulary (medium)
time: [3.5309 s 3.5400 s 3.5500 s]
change: [+453.13% +459.35% +465.02%] (p = 0.00 < 0.05)
Performance has regressed.