Skip to content

Instantly share code, notes, and snippets.

@acareaga
Forked from JoshCheek/performance_of_code.md
Created September 22, 2015 21:15
Show Gist options
  • Save acareaga/32e04c47999fe5bb4bd0 to your computer and use it in GitHub Desktop.
Save acareaga/32e04c47999fe5bb4bd0 to your computer and use it in GitHub Desktop.
Performance of code

Performance of code

Time.now

Check the time. Do a thing. Check the time. What's the difference in time?

a = (1..60).to_a
start_time = Time.now
10000.times { a.shuffle }
Time.now - start_time # => 0.018757

Exercise:

  • Put this into a convenient method
  • Record how long it takes one of your SalesEngine repos to not find a record, with 100 records, with 1000 records, with 10,000 records

Benchmarking

Does what we did above, but records several numbers, and compares them. Here I had to benchmark a big pull-request to RSpec before they would pull it in. What do the numbers mean? Here is an explanation

require 'benchmark'
arr = (1..100000).to_a
Benchmark.bmbm do |performance|
  performance.report('shuffling') { 10.times { arr.shuffle } }
  performance.report('join')      { 10.times { arr.join    } }
end

# >> Rehearsal ---------------------------------------------
# >> shuffling   0.030000   0.000000   0.030000 (  0.029996)
# >> join        0.260000   0.010000   0.270000 (  0.263311)
# >> ------------------------------------ total: 0.300000sec
# >>
# >>                 user     system      total        real
# >> shuffling   0.030000   0.000000   0.030000 (  0.028052)
# >> join        0.260000   0.000000   0.260000 (  0.256900)

Benchmark is in the standard library (meaning it ships with Ruby), but if it's not good enough for your use, you can use benchmark-ips, a gem to do this same thing ...but better :)

Here, we can see the differences in time for different search algorithms. One does a linear search through the records, taking linear y=x more work; one does a binary search, taking y=log x more work; and one stores them in a hash, which is takes the same amount of time no matter how many items are in the array y=1. Obviously, the hash wins :)

require '/Users/josh/deleteme/josh_sales_engine/lib/sales_engine'  # => true
require 'benchmark'

n            = 100_000
customers    = n.times.map { |i| {id: i} }
sales_engine = SimpleSalesEngine.new({customers: customers}).startup
customers    = sales_engine.customer_repository

Benchmark.bmbm do |performance|
  performance.report('.find') do
    customers.array_find_by_id n
  end
  performance.report('with a hash') do
    customers.hash_find_by_id n
  end
  performance.report('with a hash') do
    customers.bsearch_find_by_id n
  end
end

# >> Rehearsal -----------------------------------------------
# >> .find         0.040000   0.000000   0.040000 (  0.039789)
# >> with a hash   0.000000   0.000000   0.000000 (  0.000005)
# >> with a hash   0.000000   0.000000   0.000000 (  0.000021)
# >> -------------------------------------- total: 0.040000sec
# >>
# >>                   user     system      total        real
# >> .find         0.040000   0.000000   0.040000 (  0.039286)
# >> with a hash   0.000000   0.000000   0.000000 (  0.000005)
# >> with a hash   0.000000   0.000000   0.000000 (  0.000024)

Exercise:

  • Update your SalesEngine repo to use this.
  • Do 2 or 3 exercises from here

Profiling

Looks at time, memory, web request, anything you might want to understand. Periodically polls the runtime to identify which methods are called, from where, how long they took, etc.

We'll use ruby-prof, you may need to:

gem install ruby-prof

If we require it, we can then start / stop it around the code we want to profile. And then we can print the data in a handful of different output report formats:

# load the lib
require "ruby-prof"

# profile our code
RubyProf.start
puts fib(number)
result = RubyProf.stop

# write a handful of different reports to files
printer = RubyProf::MultiPrinter.new(result)
printer.print(:path => "./tmp", :profile => "profile")

Note that we need to make the tmp dir for it to save the file in:

$ mkdir tmp

Then we can run the code

$ ruby fib_ruby_prof.rb

This gives us tmp/profile.flat.txt, which we can look at in the shell:

$ cat tmp/profile.flat.txt
Measure Mode: wall_time
Thread ID: 70331259870160
Fiber ID: 70331278239480
Total: 1.704768
Sort by: self_time

 %self      total      self      wait     child     calls  name
 24.01      0.673     0.409     0.000     0.264        1   Range#each
 22.02      1.031     0.375     0.000     0.656        1   Integer#times
 21.65      0.656     0.369     0.000     0.287  1000000   Range#include?
 16.83      0.287     0.287     0.000     0.000  2000000   Fixnum#<=>
 15.48      0.264     0.264     0.000     0.000  1000000   String#[]
  0.00      0.000     0.000     0.000     0.000        2   IO#write
  0.00      1.705     0.000     0.000     1.705        1   Global#[No method]
  0.00      0.000     0.000     0.000     0.000        1   IO#puts
  0.00      0.000     0.000     0.000     0.000        1   Kernel#puts
  0.00      1.705     0.000     0.000     1.705        1   Object#fib
  0.00      0.673     0.000     0.000     0.673        1   Object#lol
  0.00      0.000     0.000     0.000     0.000        1   Range#to_s
  0.00      0.000     0.000     0.000     0.000        2   Fixnum#to_s

* indicates recursively called methods

It also gives us tmp/profile.graph.html which we can look at in our browser:

profile.graph.html.png

And tmp/profile.stack.html, also in the browser:

profile.stack.html.png

And if we add a printer for "dot" files:

printer = RubyProf::MultiPrinter.new(result)
printer.print(:path => "./tmp", :profile => "profile")

Then we can see a visual depiction of where our code spent its time. First, though, we need to turn tmp/profile.dot into an image using the dot command-line utility (if you don't have it, try brew install graphviz)

ruby fib_ruby_prof.rb
dot -T pdf -o ./tmp/profile.pdf ./tmp/profile.dot
open tmp/profile.pdf

profile.dot.png

You can even wrap this around your slow tests to figure out why it's taking so long!

# ...
  describe 'can find all customers by' do
    specify 'first_name' do
      RubyProf.start
      repo = repo_for [
        {id: 1, first_name: 'Jack'},
        {id: 2, first_name: 'Jill'},
        {id: 3, first_name: 'Jack'},
      ]
      expect(repo.find_all_by_first_name('Jack').map(&:id)).to eq [1, 3]
      expect(repo.find_all_by_first_name('Jill').map(&:id)).to eq [2]
      expect(repo.find_all_by_first_name('Jan').map(&:id)).to eq []
      result = RubyProf.stop
      printer = RubyProf::MultiPrinter.new(result)
      printer.print(:path => "./tmp", :profile => "profile")
    end
# ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment