@tenderlove
Created April 9, 2019 00:47

Benchmarks for GC Compactor

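GC benchmarks for trunk (compare-ruby) vs gc-compact (built-ruby) seem to be about the same. Most results are within 4% of each other. The outliers are vm1_gc_wb_ary and vm1_gc_wb_obj, where gc-compact is 1.16x and 1.28x faster, and vm3_gc_old_immediate and vm3_gc_old_lazy, where it is 1.10x and 1.08x slower: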

$ make benchmark ITEM=gc
./revision.h unchanged
/Users/aaron/.rbenv/shims/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \
	            --executables="compare-ruby::/Users/aaron/.rbenv/shims/ruby --disable=gems -I.ext/common --disable-gem" \
	            --executables="built-ruby::./miniruby -I./lib -I. -I.ext/common  -r./prelude --disable-gem" \
	            $(ls ./benchmark/*gc*.{yml,rb} 2>/dev/null) 
Calculating -------------------------------------
                               compare-ruby  built-ruby 
            vm1_gc_short_lived       6.654M      6.863M i/s -     30.000M times in 4.508284s 4.371307s
vm1_gc_short_with_complex_long       7.717M      7.952M i/s -     30.000M times in 3.887685s 3.772617s
        vm1_gc_short_with_long       5.953M      6.075M i/s -     30.000M times in 5.039375s 4.937911s
      vm1_gc_short_with_symbol       7.593M      7.298M i/s -     30.000M times in 3.950842s 4.110539s
                 vm1_gc_wb_ary      63.941M     74.356M i/s -     30.000M times in 0.469183s 0.403465s
        vm1_gc_wb_ary_promoted      64.257M     62.325M i/s -     30.000M times in 0.466874s 0.481350s
                 vm1_gc_wb_obj      72.445M     92.515M i/s -     30.000M times in 0.414109s 0.324271s
        vm1_gc_wb_obj_promoted      77.314M     74.770M i/s -     30.000M times in 0.388029s 0.401229s
                        vm3_gc        0.937       0.949 i/s -       1.000 times in 1.067345s 1.053194s
               vm3_gc_old_full        0.377       0.373 i/s -       1.000 times in 2.650514s 2.677493s
          vm3_gc_old_immediate        0.599       0.543 i/s -       1.000 times in 1.669253s 1.840736s
               vm3_gc_old_lazy        0.451       0.416 i/s -       1.000 times in 2.218040s 2.403648s

Comparison:
                         vm1_gc_short_lived
                    built-ruby:   6862936.0 i/s 
                  compare-ruby:   6654416.6 i/s - 1.03x  slower

             vm1_gc_short_with_complex_long
                    built-ruby:   7952039.7 i/s 
                  compare-ruby:   7716674.6 i/s - 1.03x  slower

                     vm1_gc_short_with_long
                    built-ruby:   6075443.6 i/s 
                  compare-ruby:   5953119.2 i/s - 1.02x  slower

                   vm1_gc_short_with_symbol
                  compare-ruby:   7593318.1 i/s 
                    built-ruby:   7298312.9 i/s - 1.04x  slower

                              vm1_gc_wb_ary
                    built-ruby:  74355892.1 i/s 
                  compare-ruby:  63940935.6 i/s - 1.16x  slower

                     vm1_gc_wb_ary_promoted
                  compare-ruby:  64257165.7 i/s 
                    built-ruby:  62324711.7 i/s - 1.03x  slower

                              vm1_gc_wb_obj
                    built-ruby:  92515211.0 i/s 
                  compare-ruby:  72444694.5 i/s - 1.28x  slower

                     vm1_gc_wb_obj_promoted
                  compare-ruby:  77313809.0 i/s 
                    built-ruby:  74770268.3 i/s - 1.03x  slower

                                     vm3_gc
                    built-ruby:         0.9 i/s 
                  compare-ruby:         0.9 i/s - 1.01x  slower

                            vm3_gc_old_full
                  compare-ruby:         0.4 i/s 
                    built-ruby:         0.4 i/s - 1.01x  slower

                       vm3_gc_old_immediate
                  compare-ruby:         0.6 i/s 
                    built-ruby:         0.5 i/s - 1.10x  slower

                            vm3_gc_old_lazy
                  compare-ruby:         0.5 i/s 
                    built-ruby:         0.4 i/s - 1.08x  slower
                    
[aaron@TC-275 ~/g/ruby (gc-compact)]$ /Users/aaron/.rbenv/shims/ruby -v
ruby 2.7.0dev (2019-04-08 trunk 67472) [x86_64-darwin18]
[aaron@TC-275 ~/g/ruby (gc-compact)]$ ./ruby -v
ruby 2.7.0dev (2019-04-08 gc-compact 67472) [x86_64-darwin18]
last_commit=fix compiler warning

Heap Impact

To test compaction impact, I recorded the heap just before processing the first request, compacted the heap after the first request finished, then recorded the heap after compaction.

This is the graph of the heap before compaction:

[Image: before_compact-0]

This is a graph of the heap after compaction:

[Image: after_compact-0]

Each column is a page and each square is a slot. Red slots are pinned (cannot move), black slots are filled but can move, and white slots are empty.

Before compaction there were 595 pages. 99 / 595 were full (had no empty slots). 495 / 595 were fragmented (contained objects and free slots).

After compaction there were 519 pages. 451 / 519 were full (had no empty slots). 68 / 519 were fragmented (contained objects and free slots).
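To put the page counts in perspective, here is a small sketch that turns them into percentages. It assumes the post-compaction figures are fractions of the 519 pages that remained; `HeapSnapshot` is a name made up for this illustration and is not part of Ruby or the gc-compact branch.

```ruby
# Hypothetical helper for illustration only: page counts copied from the text above.
HeapSnapshot = Struct.new(:pages, :full, :fragmented) do
  # Share of this snapshot's pages, as a percentage rounded to one decimal.
  def pct(count)
    (100.0 * count / pages).round(1)
  end
end

before = HeapSnapshot.new(595, 99, 495)
after  = HeapSnapshot.new(519, 451, 68)   # assumes 519 is the denominator after compaction

pages_freed = before.pages - after.pages

puts "before: #{before.pct(before.full)}% full, #{before.pct(before.fragmented)}% fragmented"
puts "after:  #{after.pct(after.full)}% full, #{after.pct(after.fragmented)}% fragmented"
puts "pages freed: #{pages_freed} (#{before.pct(pages_freed)}% of the original heap)"
```

Under that assumption, full pages go from about 17% of the heap to about 87%, and the heap shrinks by 76 pages.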

Here is a graph of the pinned objects vs unpinned objects:

[Image: pinned vs unpinned objects]

Most objects are unpinned so they can move around.
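The graphs above came from a per-slot recording of the heap. A much coarser version of the same before/after comparison can be sketched with `GC.stat` alone. This is not the tooling used for the graphs: it only reports aggregate page and slot counts, and the `heap_snapshot` and `compact_and_compare` helpers are names invented for this example. It assumes a Ruby where `GC.stat` exposes the three keys listed, and it skips compaction where `GC.compact` is unavailable.

```ruby
# Aggregate counters to compare; assumed to be present in GC.stat on this Ruby.
KEYS = %i[heap_allocated_pages heap_live_slots heap_free_slots].freeze

# Hypothetical helper: capture only the counters we care about.
def heap_snapshot
  GC.stat.slice(*KEYS)
end

# Hypothetical helper: record the heap, compact it, record it again,
# and return { key => [before, after] } for each counter.
def compact_and_compare
  GC.start
  before = heap_snapshot
  begin
    GC.compact if GC.respond_to?(:compact)
  rescue NotImplementedError
    # Compaction is not supported on this platform; the snapshots will match closely.
  end
  after = heap_snapshot
  KEYS.to_h { |key| [key, [before[key], after[key]]] }
end

comparison = compact_and_compare
comparison.each do |key, (before, after)|
  puts "#{key}: #{before} -> #{after}"
end
```

In a real application the first snapshot would be taken just before the first request and the compaction would run after that request finishes, as described at the top of this section.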

Compaction Performance

For actual compaction performance I used a benchmark like this:

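# Requires the benchmark-ips gem and a Ruby that provides GC.compact.
require 'benchmark/ips'

# Baseline: compact a freshly collected heap.
GC.start
puts "Baseline #{GC.stat(:heap_live_slots)}"
Benchmark.ips do |x|
  x.report("compact") { GC.compact }
end

# Holds references so the new objects stay live across compactions.
garbage = []

# Grow the live heap by 100,000 objects per round and re-measure.
5.times do
  100_000.times { garbage << Object.new }

  puts "Larger #{GC.stat(:heap_live_slots)}"
  Benchmark.ips do |x|
    x.report("compact") { GC.compact }
  end
end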

Results of the benchmark are below:

Live Objects, Iterations per Second
21260, 274.572
121267, 98.238
221271, 59.184
321306, 43.179
421306, 33.998
521306, 27.901

As expected, the more live objects, the longer compaction takes. I think we can improve this.
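To make the trend concrete, the sketch below converts the iterations-per-second figures into time per compaction and into the marginal cost of each additional live object. The rows are copied from the results above; `ms_per_compaction` and `marginal_ns_per_object` are helper names introduced only for this example.

```ruby
# Rows copied from the results above: [live objects, iterations per second].
RESULTS = [
  [21_260, 274.572],
  [121_267, 98.238],
  [221_271, 59.184],
  [321_306, 43.179],
  [421_306, 33.998],
  [521_306, 27.901],
].freeze

# Milliseconds spent in one GC.compact call at the given throughput.
def ms_per_compaction(ips)
  1000.0 / ips
end

# Extra nanoseconds per additional live object, for each pair of adjacent rows.
def marginal_ns_per_object(rows)
  rows.each_cons(2).map do |(objs_a, ips_a), (objs_b, ips_b)|
    delta_ms = ms_per_compaction(ips_b) - ms_per_compaction(ips_a)
    delta_ms * 1_000_000 / (objs_b - objs_a)
  end
end

RESULTS.each do |objs, ips|
  puts format("%7d live objects: %6.2f ms per compaction", objs, ms_per_compaction(ips))
end

marginal_ns_per_object(RESULTS).each do |ns|
  puts format("%.1f ns per additional live object", ns)
end
```

On these numbers a compaction takes roughly 3.6 ms at about 21k live objects and roughly 35.8 ms at about 521k, and each step of 100,000 objects adds a little over 6 ms. That is consistent with cost growing approximately linearly in the number of live objects, at somewhere between 60 and 70 ns per object on this machine.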
