Check the time. Do a thing. Check the time. What's the difference in time?
a = (1..60).to_a
start_time = Time.now
10000.times { a.shuffle }
Time.now - start_time # => 0.018757
- Put this into a convenient method
- Record how long it takes one of your SalesEngine repos to fail to find a record: with 100 records, with 1,000 records, with 10,000 records
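The timing pattern above can be wrapped in a small helper so you don't repeat the Time.now bookkeeping each time. A minimal sketch (the method name is my own, not part of any library):

```ruby
# Runs the block once and returns the elapsed time in seconds (a Float).
def time_it
  start_time = Time.now
  yield
  Time.now - start_time
end

elapsed = time_it { 10_000.times { (1..60).to_a.shuffle } }
elapsed.class # => Float
```

Pass any code you want measured as the block; the return value is the number of seconds it took.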
Benchmark does what we did above, but it records several numbers and compares them. Here, I had to benchmark a big pull request to RSpec before they would pull it in. What do the numbers mean? Here is an explanation
require 'benchmark'
arr = (1..100000).to_a
Benchmark.bmbm do |performance|
performance.report('shuffling') { 10.times { arr.shuffle } }
performance.report('join') { 10.times { arr.join } }
end
# >> Rehearsal ---------------------------------------------
# >> shuffling 0.030000 0.000000 0.030000 ( 0.029996)
# >> join 0.260000 0.010000 0.270000 ( 0.263311)
# >> ------------------------------------ total: 0.300000sec
# >>
# >> user system total real
# >> shuffling 0.030000 0.000000 0.030000 ( 0.028052)
# >> join 0.260000 0.000000 0.260000 ( 0.256900)
Benchmark is in the standard library (meaning it ships with Ruby), but if it's not good enough for your use case, there's benchmark-ips, a gem that does this same thing ...but better :)
Here, we can see the differences in time for different search algorithms. One does a linear search through the records, so the work grows linearly with the number of records (y = x); one does a binary search, so the work grows logarithmically (y = log x); and one stores them in a hash, which takes the same amount of time no matter how many records there are (y = 1). Obviously, the hash wins :)
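To make the three approaches concrete before looking at the SalesEngine version below, here is a standalone sketch of each lookup strategy on plain hashes (illustrative code, not the SalesEngine implementation):

```ruby
records = 100_000.times.map { |i| { id: i } }

# Linear search: scan every record until a match; work grows like y = x.
linear = records.find { |r| r[:id] == 99_999 }

# Binary search: requires records sorted by id; work grows like y = log x.
binary = records.bsearch { |r| 99_999 <=> r[:id] }

# Hash lookup: build an index once, then each lookup is y = 1.
by_id  = records.each_with_object({}) { |r, h| h[r[:id]] = r }
hashed = by_id[99_999]

[linear, binary, hashed].map { |r| r[:id] } # => [99999, 99999, 99999]
```

All three find the same record; the difference is how much of the array they have to touch to get there.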
require '/Users/josh/deleteme/josh_sales_engine/lib/sales_engine' # => true
require 'benchmark'
n = 100_000
customers = n.times.map { |i| {id: i} }
sales_engine = SimpleSalesEngine.new({customers: customers}).startup
customers = sales_engine.customer_repository
Benchmark.bmbm do |performance|
performance.report('.find') do
customers.array_find_by_id n
end
performance.report('with a hash') do
customers.hash_find_by_id n
end
performance.report('with bsearch') do
customers.bsearch_find_by_id n
end
end
# >> Rehearsal -----------------------------------------------
# >> .find 0.040000 0.000000 0.040000 ( 0.039789)
# >> with a hash 0.000000 0.000000 0.000000 ( 0.000005)
# >> with bsearch 0.000000 0.000000 0.000000 ( 0.000021)
# >> -------------------------------------- total: 0.040000sec
# >>
# >> user system total real
# >> .find 0.040000 0.000000 0.040000 ( 0.039286)
# >> with a hash 0.000000 0.000000 0.000000 ( 0.000005)
# >> with bsearch 0.000000 0.000000 0.000000 ( 0.000024)
- Update your SalesEngine repo to use this.
- Do 2 or 3 exercises from here
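For the repo update, the core idea is to build the id index once at startup so every find is a single hash lookup. A minimal sketch, assuming a repository wrapping an array of hashes (the class and method names here are illustrative, not SalesEngine's actual API):

```ruby
class CustomerRepository
  def initialize(customers)
    @customers = customers
    # Build the index once, up front, so each find_by_id is one hash lookup.
    @by_id = customers.each_with_object({}) { |c, h| h[c[:id]] = c }
  end

  def find_by_id(id)
    @by_id[id]
  end
end

repo = CustomerRepository.new([{ id: 1, name: 'Ariel' }, { id: 2, name: 'Belen' }])
repo.find_by_id(2) # => { id: 2, name: 'Belen' }
```

Paying the indexing cost once at startup is what makes the per-lookup time constant in the benchmark above.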
A profiler looks at time, memory, web requests, anything you might want to understand. It watches the running program (by instrumenting method calls, or by periodically polling the runtime) to identify which methods are called, from where, how long they took, etc.
We'll use ruby-prof; you may need to install it first:
gem install ruby-prof
If we require it, we can then start / stop it around the code we want to profile. And then we can print the data in a handful of different output report formats:
# load the lib
require "ruby-prof"
# profile our code (`fib` and `number` are assumed to be defined earlier in the script)
RubyProf.start
puts fib(number)
result = RubyProf.stop
# write a handful of different reports to files
printer = RubyProf::MultiPrinter.new(result)
printer.print(:path => "./tmp", :profile => "profile")
Note that we need to make the tmp dir for it to save the file in:
$ mkdir tmp
Then we can run the code
$ ruby fib_ruby_prof.rb
This gives us tmp/profile.flat.txt, which we can look at in the shell:
$ cat tmp/profile.flat.txt
Measure Mode: wall_time
Thread ID: 70331259870160
Fiber ID: 70331278239480
Total: 1.704768
Sort by: self_time
%self total self wait child calls name
24.01 0.673 0.409 0.000 0.264 1 Range#each
22.02 1.031 0.375 0.000 0.656 1 Integer#times
21.65 0.656 0.369 0.000 0.287 1000000 Range#include?
16.83 0.287 0.287 0.000 0.000 2000000 Fixnum#<=>
15.48 0.264 0.264 0.000 0.000 1000000 String#[]
0.00 0.000 0.000 0.000 0.000 2 IO#write
0.00 1.705 0.000 0.000 1.705 1 Global#[No method]
0.00 0.000 0.000 0.000 0.000 1 IO#puts
0.00 0.000 0.000 0.000 0.000 1 Kernel#puts
0.00 1.705 0.000 0.000 1.705 1 Object#fib
0.00 0.673 0.000 0.000 0.673 1 Object#lol
0.00 0.000 0.000 0.000 0.000 1 Range#to_s
0.00 0.000 0.000 0.000 0.000 2 Fixnum#to_s
* indicates recursively called methods
It also gives us tmp/profile.graph.html and tmp/profile.stack.html, which we can look at in our browser.
And if we add a printer for "dot" files:
printer = RubyProf::DotPrinter.new(result)
File.open("./tmp/profile.dot", "w") { |f| printer.print(f) }
Then we can see a visual depiction of where our code spent its time. First, though, we need to turn tmp/profile.dot into an image using the dot command-line utility (if you don't have it, try brew install graphviz):
$ ruby fib_ruby_prof.rb
$ dot -T pdf -o ./tmp/profile.pdf ./tmp/profile.dot
$ open tmp/profile.pdf
You can even wrap this around your slow tests to figure out why they're taking so long!
# ...
describe 'can find all customers by' do
specify 'first_name' do
RubyProf.start
repo = repo_for [
{id: 1, first_name: 'Jack'},
{id: 2, first_name: 'Jill'},
{id: 3, first_name: 'Jack'},
]
expect(repo.find_all_by_first_name('Jack').map(&:id)).to eq [1, 3]
expect(repo.find_all_by_first_name('Jill').map(&:id)).to eq [2]
expect(repo.find_all_by_first_name('Jan').map(&:id)).to eq []
result = RubyProf.stop
printer = RubyProf::MultiPrinter.new(result)
printer.print(:path => "./tmp", :profile => "profile")
end
# ...