@mintuhouse
Last active November 10, 2021 20:03
Benchmark to see if Ruby IO follows Amdahl's law with s = num_of_threads

While going through https://www.speedshop.co/2020/05/11/the-ruby-gvl-and-scaling.html, I read:

Amdahl's Law is simply 1 / (1 - p + p/s), where p is the percentage of the task that could be done in parallel, and s is the speedup factor from the part of the task that gained improved resources (the parallel part).

So, in our example, let's say that half of SatelliteDataProcessorJob is GVL-bound and half is IO-bound. In this case, p is 0.5 and s is 10, because we can wait for IO in parallel and there are 10 threads. In this case, Amdahl's Law shows that a Sidekiq process would go through our jobs up to 1.81x faster than a single-threaded Resque or DelayedJob process.

Question: In this case, shouldn't the speedup be 2x, since the IO-bound portion takes practically zero CPU time, i.e., s = ∞ rather than s = 10? Wondering what I am missing here? 🤔

Why do I think the speedup will be 2x? Since an IO-bound task does not take any CPU time, I am assuming that s is effectively unlimited once there are at least 1/(1-p) threads.

e.g., for 2 threads and 50% IO, with each job taking one second:
in the first half of the second
  we run the first job's CPU-bound portion and the second job's IO-bound portion
in the second half of the second
  we run the first job's IO-bound portion and the second job's CPU-bound portion
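
The schedule above can be sketched as a tiny tick-based simulation. This is only an idealized model, not how the Ruby VM actually schedules threads: it assumes at most one thread runs CPU work per tick (standing in for the GVL), IO waits overlap freely, and switching threads costs nothing. The name `simulated_ticks` and its parameters are hypothetical.

```ruby
# Idealized model: each job is cpu_ticks of CPU work followed by io_ticks of IO wait.
# Only one thread may do CPU work in a given tick (a stand-in for the GVL);
# any number of threads may wait on IO in the same tick. Context switches are free.
def simulated_ticks(thread_count, jobs_per_thread, cpu_ticks: 1, io_ticks: 1)
  threads = Array.new(thread_count) do
    { cpu: cpu_ticks, io: io_ticks, jobs: jobs_per_thread }
  end

  ticks = 0
  until threads.all? { |t| t[:jobs].zero? }
    ticks += 1
    gvl_taken = false
    threads.each do |t|
      next if t[:jobs].zero?

      if t[:cpu] > 0
        # CPU phase: needs the (simulated) GVL, so at most one thread advances.
        next if gvl_taken
        gvl_taken = true
        t[:cpu] -= 1
      else
        # IO phase: proceeds regardless of who holds the GVL.
        t[:io] -= 1
        if t[:io].zero?
          t[:jobs] -= 1
          t[:cpu] = cpu_ticks
          t[:io] = io_ticks
        end
      end
    end
  end
  ticks
end

single = simulated_ticks(1, 100)
[2, 4, 10, 20].each do |n|
  speedup = single.to_f / simulated_ticks(n, 100 / n)
  puts format("%2d threads: simulated speedup %.2fx", n, speedup)
end
```

In this model the speedup for a 50% IO job sits just under 2x for any thread count of 2 or more, because the single CPU slot is busy on almost every tick.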

If I had to formulate it, I would say the speedup from using n threads on jobs that spend a fraction p of their time in IO would be [n, 1/(1-p)].min
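
To make the two predictions concrete, here is a small sketch that evaluates both formulas side by side. The method names `amdahl_speedup` and `proposed_speedup` are made up for illustration; the first is the formula quoted from the article with s set to the thread count, the second is the `[n, 1/(1-p)].min` guess from this gist.

```ruby
# Amdahl's Law as quoted from the article: 1 / (1 - p + p/s).
# p is the fraction of the job that can overlap (IO), s the speedup of that part.
def amdahl_speedup(p, s)
  1.0 / ((1.0 - p) + p / s)
end

# The alternative proposed in this gist: IO costs no CPU time, so the speedup
# is capped by the thread count n or by the CPU-bound fraction, whichever is lower.
def proposed_speedup(n, p)
  [n.to_f, 1.0 / (1.0 - p)].min
end

p_io = 0.5
[1, 2, 4, 10, 20].each do |n|
  puts format("%2d threads: Amdahl (s = n) %.2fx, [n, 1/(1-p)].min %.2fx",
              n, amdahl_speedup(p_io, n), proposed_speedup(n, p_io))
end
```

For p = 0.5 and 10 threads, the first formula gives about 1.82x and the second gives 2x, which is the gap the question is about.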

require 'benchmark'

JOB_COUNT = 100
ITERATION_COUNT = 10
MAX_INT_WHICH_TAKES_50ms = 750000 # Varies slightly based on ruby version

# CPU-bound half of the job: a busy loop tuned to take roughly 50 ms.
def cpu_task
  for i in 1..MAX_INT_WHICH_TAKES_50ms; a = "1"; end
end

# A ~100 ms job: ~50 ms of CPU work followed by 50 ms of simulated IO (sleep).
def ms100_job_with_50_percent_io
  cpu_task
  sleep(0.05)
end

# Split JOB_COUNT jobs evenly across thread_count threads and wait for all of them.
# JOB_COUNT divides evenly by every thread count benchmarked below.
def job(thread_count)
  jobs_per_thread = JOB_COUNT / thread_count
  (1..thread_count).map { Thread.new { jobs_per_thread.times { ms100_job_with_50_percent_io } } }.each(&:join)
end

Benchmark.bm(10) do |x|
  x.report("1 thread:") { ITERATION_COUNT.times { job(1) } }
  x.report("2 threads:") { ITERATION_COUNT.times { job(2) } }
  x.report("4 threads:") { ITERATION_COUNT.times { job(4) } }
  x.report("10 threads:") { ITERATION_COUNT.times { job(10) } }
  x.report("20 threads:") { ITERATION_COUNT.times { job(20) } }
end

Ran on a Heroku one-off performance-L dyno

Timings as we increase the number of threads for a job that is 50% IO

./ruby-2.3.8/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   47.920000   0.020000  47.940000 ( 97.990803)
2 threads:  48.520000   0.050000  48.570000 ( 61.070696)
4 threads:  49.410000   0.060000  49.470000 ( 51.510326)
10 threads: 50.810000   0.050000  50.860000 ( 51.523674)
20 threads: 50.110000   0.060000  50.170000 ( 50.679642)

./ruby-2.4.10/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   47.430000   0.030000  47.460000 ( 97.510844)
2 threads:  48.110000   0.030000  48.140000 ( 60.229533)
4 threads:  49.070000   0.060000  49.130000 ( 54.895647)
10 threads: 51.210000   0.060000  51.270000 ( 51.790397)
20 threads: 50.710000   0.020000  50.730000 ( 51.283382)

./ruby-2.5.9/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   45.868000   0.020000  45.888000 ( 95.940620)
2 threads:  46.620000   0.056000  46.676000 ( 58.659435)
4 threads:  48.188000   0.056000  48.244000 ( 51.806961)
10 threads: 51.912000   0.064000  51.976000 ( 52.656030)
20 threads: 51.832000   0.068000  51.900000 ( 52.404683)

./ruby-2.6.8/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   47.316000   0.024000  47.340000 ( 97.395988)
2 threads:  48.180000   0.020000  48.200000 ( 49.746869)
4 threads:  49.536000   0.056000  49.592000 ( 55.840695)
10 threads: 53.656000   0.092000  53.748000 ( 54.384962)
20 threads: 53.436000   0.068000  53.504000 ( 54.031722)

./ruby-2.7.4/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   47.168000   0.008000  47.176000 ( 97.233165)
2 threads:  47.864000   0.012000  47.876000 ( 49.559120)
4 threads:  48.900000   0.040000  48.940000 ( 54.893521)
10 threads: 52.612000   0.044000  52.656000 ( 53.192105)
20 threads: 54.072000   0.048000  54.120000 ( 54.671252)

./ruby-3.0.2/bin/ruby amdahl.rb

                 user     system      total        real
1 thread:   48.112000   0.028000  48.140000 ( 98.195615)
2 threads:  49.044000   0.000000  49.044000 ( 50.229227)
4 threads:  50.164000   0.056000  50.220000 ( 53.945347)
10 threads: 53.656000   0.068000  53.724000 ( 54.264653)
20 threads: 54.788000   0.064000  54.852000 ( 55.323151)
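
The tables report raw times rather than speedups, so one way to read them is to divide the 1-thread "real" time by each multi-threaded "real" time. The sketch below does that for the Ruby 3.0.2 run only, with the numbers copied from the table above; the constant and method names are hypothetical, and the two formula columns are printed purely for comparison.

```ruby
# Wall-clock ("real") seconds copied from the Ruby 3.0.2 run above,
# keyed by thread count.
REAL_SECONDS = {
  1  => 98.195615,
  2  => 50.229227,
  4  => 53.945347,
  10 => 54.264653,
  20 => 55.323151
}.freeze

# Speedup of each run relative to the single-threaded run.
def observed_speedups(real_seconds)
  baseline = real_seconds.fetch(1)
  real_seconds.map { |threads, seconds| [threads, baseline / seconds] }.to_h
end

# Amdahl's Law as quoted from the article, with s set to the thread count.
def amdahl(p, s)
  1.0 / ((1.0 - p) + p / s)
end

observed_speedups(REAL_SECONDS).each do |threads, speedup|
  puts format("%2d threads: observed %.2fx, Amdahl (s = n) %.2fx, [n, 1/(1-p)].min %.2fx",
              threads, speedup, amdahl(0.5, threads), [threads.to_f, 2.0].min)
end
```

For this run the 2-thread result comes out just under 2x. Swapping in the "real" column from another Ruby version's table gives the corresponding figures for that version.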
# Downloading custom versions of Ruby on Heroku
export RUBY_VERSION=3.0.2 # Change to whatever version of Ruby you want to download
mkdir -p "ruby-$RUBY_VERSION"
curl "https://s3-external-1.amazonaws.com/heroku-buildpack-ruby/heroku-18/ruby-$RUBY_VERSION.tgz" | tar -xz -C "ruby-$RUBY_VERSION"
# Note: for Ruby 2.3.8, use heroku-16 instead of heroku-18 in the URL