@Schwad
Last active February 23, 2023 20:20
RUNS = 100
results = Hash.new { |h, k| h[k] = [] }
RUNS.times do |i|
  puts i
  run = `benchmark-driver so_k_nucleotide.yml --chruby '2.7.5;3.0.5;3.1.3;3.2.0' -o simple`
  run.scan(/\d\.\d\.\d/).each_with_index do |version, index|
    results[version] << run.scan(/\d\.\d\d\d/)[index]
  end
end
require 'csv'
columns = results.keys
outdata = CSV.generate do |csv|
  csv << columns
  RUNS.times do |i|
    csv << columns.map { |c| results[c][i] }
  end
end
File.write("output.csv", outdata)
2.7.5 3.0.5 3.1.3 3.2.0
1.741 1.544 1.504 1.454
1.746 1.582 1.491 1.416
1.710 1.537 1.453 1.425
1.697 1.463 1.514 1.412
1.731 1.585 1.490 1.435
1.747 1.593 1.506 1.368
1.669 1.560 1.511 1.457
1.639 1.513 1.416 1.365
1.694 1.512 1.427 1.402
1.714 1.480 1.415 1.399
1.659 1.542 1.500 1.361
1.572 1.489 1.436 1.371
1.546 1.485 1.440 1.431
1.672 1.504 1.386 1.431
1.749 1.560 1.499 1.404
1.717 1.565 1.508 1.455
1.715 1.586 1.512 1.448
1.729 1.576 1.514 1.440
1.704 1.585 1.501 1.436
1.735 1.583 1.508 1.430
1.735 1.542 1.512 1.451
1.722 1.564 1.460 1.419
1.749 1.552 1.504 1.441
1.740 1.590 1.504 1.449
1.722 1.582 1.495 1.448
1.720 1.563 1.497 1.445
1.720 1.585 1.500 1.447
1.732 1.564 1.513 1.449
1.737 1.568 1.516 1.453
1.730 1.579 1.507 1.431
1.694 1.579 1.519 1.454
1.712 1.592 1.513 1.450
1.650 1.567 1.508 1.418
1.747 1.589 1.507 1.385
1.687 1.405 1.435 1.372
1.629 1.563 1.506 1.425
1.710 1.569 1.466 1.418
1.730 1.568 1.493 1.433
1.725 1.567 1.506 1.427
1.734 1.565 1.481 1.456
1.721 1.571 1.506 1.421
1.725 1.586 1.519 1.447
1.732 1.576 1.518 1.457
1.743 1.579 1.503 1.409
1.738 1.580 1.492 1.459
1.713 1.549 1.505 1.412
1.589 1.522 1.455 1.411
1.718 1.573 1.495 1.394
1.736 1.585 1.508 1.436
1.751 1.577 1.505 1.433
1.722 1.580 1.502 1.417
1.731 1.587 1.506 1.437
1.729 1.573 1.425 1.408
1.670 1.572 1.481 1.439
1.701 1.591 1.510 1.457
1.725 1.570 1.506 1.448
1.715 1.581 1.491 1.420
1.632 1.555 1.479 1.429
1.736 1.581 1.503 1.443
1.742 1.570 1.499 1.443
1.731 1.567 1.508 1.444
1.692 1.584 1.507 1.450
1.738 1.583 1.515 1.452
1.719 1.573 1.483 1.432
1.729 1.569 1.470 1.458
1.735 1.558 1.507 1.431
1.707 1.585 1.481 1.446
1.715 1.567 1.477 1.435
1.702 1.575 1.508 1.419
1.739 1.569 1.491 1.433
1.730 1.562 1.503 1.458
1.728 1.572 1.505 1.451
1.729 1.583 1.510 1.448
1.718 1.568 1.496 1.447
1.711 1.573 1.499 1.447
1.706 1.591 1.503 1.435
1.734 1.535 1.497 1.433
1.731 1.567 1.481 1.448
1.718 1.570 1.497 1.427
1.741 1.558 1.495 1.427
1.715 1.567 1.504 1.422
1.730 1.367 1.417 1.311
1.565 1.422 1.423 1.403
1.734 1.554 1.507 1.458
1.686 1.536 1.490 1.447
1.681 1.560 1.500 1.442
1.731 1.565 1.462 1.448
1.731 1.559 1.504 1.432
1.734 1.564 1.511 1.430
1.740 1.572 1.490 1.443
1.712 1.557 1.505 1.445
1.702 1.587 1.506 1.446
1.733 1.568 1.490 1.365
1.605 1.485 1.457 1.360
1.672 1.558 1.466 1.366
1.642 1.551 1.480 1.422
1.702 1.573 1.476 1.426
1.720 1.568 1.489 1.432
1.730 1.535 1.506 1.429
1.714 1.569 1.509 1.427

Introduction

Recently I had been going through some of the old benchmarks in the Ruby Great Implementation Shootout from around 2010.

As an experiment, one night I ran the benchmarks against Ruby 3.2.0, Ruby 3.2.0 --yjit, TruffleRuby, TruffleRuby +GraalVM, and Ruby 2.6.10.

Most results were as expected. However, there was one benchmark on which Ruby 2.6.10 consistently outperformed all of the newer Rubies.

Method

While pairing with @eightbitraptor, we discovered that this old benchmark is remarkably similar to an existing benchmark in the /benchmark directory, so_k_nucleotide.yml. For brevity I have not included the full 150 lines of the benchmark here.

I tested this out with 100 runs using benchmark-driver against Ruby 2.7.5, 3.0.5, 3.1.3, and 3.2.0. (I had discovered that 2.7 was even faster than 2.6.)

It appears that about half of the regression occurred between 2.7 and 3.0, and the other half between 3.0 and 3.2.
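One way to check how the regression splits across releases is to average each column of output.csv and compare the drops. The sketch below is illustrative rather than the analysis actually used: it embeds only the first five rows of the data, written as CSV the way the script's CSV.generate would emit them, and it assumes the figures are throughput-style numbers where higher is better (which is what "2.7.5 was faster" implies). The helper names `column_means` and `regression_shares` are made up for this example.

```ruby
require 'csv'

# First five rows of the collected data, used here as a small sample.
sample_csv = <<~CSV
  2.7.5,3.0.5,3.1.3,3.2.0
  1.741,1.544,1.504,1.454
  1.746,1.582,1.491,1.416
  1.710,1.537,1.453,1.425
  1.697,1.463,1.514,1.412
  1.731,1.585,1.490,1.435
CSV

# Mean of every column, keyed by the Ruby version in the header row.
def column_means(csv_text)
  table = CSV.parse(csv_text, headers: true)
  table.headers.to_h do |version|
    values = table[version].map(&:to_f)
    [version, values.sum / values.size]
  end
end

# Fraction of the total drop (from -> to) that happened before and after `via`.
def regression_shares(means, from:, via:, to:)
  total = means[from] - means[to]
  first = means[from] - means[via]
  {
    "#{from} -> #{via}" => first / total,
    "#{via} -> #{to}" => (total - first) / total
  }
end

means = column_means(sample_csv)
shares = regression_shares(means, from: '2.7.5', via: '3.0.5', to: '3.2.0')

means.each { |version, mean| puts format('%s  %.3f', version, mean) }
shares.each { |step, share| puts format('%s  %.0f%% of the drop', step, share * 100) }
```

Running the same helpers on the full output.csv (for example via `File.read("output.csv")`) would give the split over all 100 runs; five rows are too few to draw conclusions from.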

Code

This is my benchmark-running code and harness. The full code and data can be found here.

RUNS = 100

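# Collect one list of results per Ruby version, keyed by version string.
results = Hash.new { |h, k| h[k] = [] }
RUNS.times do |i|
  puts i # progress indicator
  run = `benchmark-driver so_k_nucleotide.yml --chruby '2.7.5;3.0.5;3.1.3;3.2.0' -o simple`
  # Version strings (e.g. 2.7.5) and results (e.g. 1.741) appear in the same
  # order in the report, so the n-th version is paired with the n-th result.
  run.scan(/\d\.\d\.\d/).each_with_index do |version, index|
    results[version] << run.scan(/\d\.\d\d\d/)[index]
  end
end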

require 'csv'

columns = results.keys
outdata = CSV.generate do |csv|
  csv << columns
  RUNS.times do |i|
    csv << columns.map { |c| results[c][i] }
  end
end

File.write("output.csv", outdata)
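The harness relies on two regular expressions lining up: the n-th version string in the report is paired with the n-th three-decimal number. The sketch below shows that pairing in isolation. The sample report text is hypothetical, shaped only so that versions and results appear in the same order; the real benchmark-driver output may look different. The method name `pair_versions_with_results` is also invented for this example.

```ruby
# Hypothetical report text (an assumption, not captured benchmark-driver output).
sample_report = <<~REPORT
                     2.7.5   3.0.5   3.1.3   3.2.0
  so_k_nucleotide    1.741   1.544   1.504   1.454
REPORT

# Pair each version string with the result found at the same position.
# Both patterns are the ones used in the harness.
def pair_versions_with_results(report)
  versions = report.scan(/\d\.\d\.\d/)   # e.g. "2.7.5"
  results  = report.scan(/\d\.\d\d\d/)   # e.g. "1.741"
  versions.zip(results).to_h
end

pairs = pair_versions_with_results(sample_report)
pairs.each { |version, result| puts "#{version}: #{result}" }
```

Scanning once and zipping gives the same pairs as re-scanning inside the loop. Note that both patterns assume single-digit components: a version such as 3.0.10 would be captured as "3.0.1", and a result of 10.123 would be captured as "0.123", so the patterns would need widening for other version sets or slower machines.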

Data

Ruby 2.7.5 was consistently ~18-20% faster than Ruby 3.2.0 in this benchmark.

[Image: Screenshot 2023-02-15 at 13 16 10]
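A per-run comparison is one way to back up a "consistently faster" claim, since it looks at each paired run rather than only at the averages. The following is a minimal sketch using the 2.7.5 and 3.2.0 values from the first five rows of the data, again assuming higher numbers are better. The helpers `percent_faster` and `median` are written for this example.

```ruby
# Per-run results for the oldest and newest Ruby, copied from the first five
# rows of the data (assumed to be higher-is-better figures).
ruby_275 = [1.741, 1.746, 1.710, 1.697, 1.731]
ruby_320 = [1.454, 1.416, 1.425, 1.412, 1.435]

# Percentage by which `fast` exceeds `slow` in each paired run.
def percent_faster(fast, slow)
  fast.zip(slow).map { |f, s| (f / s - 1.0) * 100.0 }
end

# Middle value of a list (mean of the two middle values for even sizes).
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

gains = percent_faster(ruby_275, ruby_320)
puts format('min %.1f%%  median %.1f%%  max %.1f%%', gains.min, median(gains), gains.max)
```

The exact percentage depends on the baseline: "2.7.5 is X% faster than 3.2.0" divides by the 3.2.0 figure, while "3.2.0 is Y% slower than 2.7.5" divides by the 2.7.5 figure, and Y comes out smaller than X for the same pair of numbers.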

Next Steps

I am happy to help investigate or learn more about this regression if anyone has any ideas.

prelude: |
  bm_so_fasta = <<'EOS'
  # The Computer Language Shootout
  # http://shootout.alioth.debian.org/
  # Contributed by Sokolov Yura
  $last = 42.0
  def gen_random(max, im=139968, ia=3877, ic=29573)
    (max * ($last = ($last * ia + ic) % im)) / im
  end
  alu =
    "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"+
    "GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"+
    "CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"+
    "ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"+
    "GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"+
    "AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"+
    "AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"
  iub = [
    ["a", 0.27],
    ["c", 0.12],
    ["g", 0.12],
    ["t", 0.27],
    ["B", 0.02],
    ["D", 0.02],
    ["H", 0.02],
    ["K", 0.02],
    ["M", 0.02],
    ["N", 0.02],
    ["R", 0.02],
    ["S", 0.02],
    ["V", 0.02],
    ["W", 0.02],
    ["Y", 0.02],
  ]
  homosapiens = [
    ["a", 0.3029549426680],
    ["c", 0.1979883004921],
    ["g", 0.1975473066391],
    ["t", 0.3015094502008],
  ]
  def make_repeat_fasta(id, desc, src, n)
    puts ">#{id} #{desc}"
    v = nil
    width = 60
    l = src.length
    s = src * ((n / l) + 1)
    s.slice!(n, l)
    puts(s.scan(/.{1,#{width}}/).join("\n"))
  end
  def make_random_fasta(id, desc, table, n)
    puts ">#{id} #{desc}"
    rand, v = nil,nil
    width = 60
    chunk = 1 * width
    prob = 0.0
    table.each{|v| v[1]= (prob += v[1])}
    for i in 1..(n/width)
      puts((1..width).collect{
        rand = gen_random(1.0)
        table.find{|v| v[1]>rand}[0]
      }.join)
    end
    if n%width != 0
      puts((1..(n%width)).collect{
        rand = gen_random(1.0)
        table.find{|v| v[1]>rand}[0]
      }.join)
    end
  end
  n = (ARGV[0] or 250_000).to_i
  make_repeat_fasta('ONE', 'Homo sapiens alu', alu, n*2)
  make_random_fasta('TWO', 'IUB ambiguity codes', iub, n*3)
  make_random_fasta('THREE', 'Homo sapiens frequency', homosapiens, n*5)
  EOS
benchmark:
  - name: so_k_nucleotide
    prelude: |
      script = File.join(File.dirname($0), 'bm_so_fasta.rb')
      File.write(script, bm_so_fasta)
      def prepare_fasta_output n
        filebase = File.join(File.dirname($0), 'fasta.output')
        script = File.join(File.dirname($0), 'bm_so_fasta.rb')
        file = "#{filebase}.#{n}"
        unless FileTest.exist?(file)
          STDERR.puts "preparing #{file}"
          open(file, 'w'){|f|
            ARGV[0] = n
            $stdout = f
            load script
            $stdout = STDOUT
          }
        end
      end
      prepare_fasta_output(100_000)
    script: |
      # The Computer Language Shootout
      # http://shootout.alioth.debian.org
      #
      # contributed by jose fco. gonzalez
      # modified by Sokolov Yura
      seq = String.new
      def frecuency( seq,length )
        n, table = seq.length - length + 1, Hash.new(0)
        f, i = nil, nil
        (0 ... length).each do |f|
          (f ... n).step(length) do |i|
            table[seq[i,length]] += 1
          end
        end
        [n,table]
      end
      def sort_by_freq( seq,length )
        n,table = frecuency( seq,length )
        a, b, v = nil, nil, nil
        table.sort{|a,b| b[1] <=> a[1]}.each do |v|
          puts "%s %.3f" % [v[0].upcase,((v[1]*100).to_f/n)]
        end
        puts
      end
      def find_seq( seq,s )
        n,table = frecuency( seq,s.length )
        puts "#{table[s].to_s}\t#{s.upcase}"
      end
      input = open(File.join(File.dirname($0), 'fasta.output.100000'), 'rb')
      line = input.gets while line !~ /^>THREE/
      line = input.gets
      while (line !~ /^>/) & line do
        seq << line.chomp
        line = input.gets
      end
      [1,2].each {|i| sort_by_freq( seq,i ) }
      %w(ggt ggta ggtatt ggtattttaatt ggtattttaatttatagt).each{|s| find_seq( seq,s) }
    loop_count: 1