Ruby Transducers benchmark
> ruby test.rb
I, [2018-09-14T05:14:28.929558 #30046]  INFO -- : true
Warming up --------------------------------------
              native     1.000  i/100ms
            baseline     2.000  i/100ms
                lazy     1.000  i/100ms
       via_transduce     1.000  i/100ms
Calculating -------------------------------------
              native     11.293  (± 0.0%) i/s -     57.000  in   5.054247s
            baseline     21.711  (± 4.6%) i/s -    110.000  in   5.077083s
                lazy      5.798  (± 0.0%) i/s -     29.000  in   5.009277s
       via_transduce      3.549  (± 0.0%) i/s -     18.000  in   5.078086s

Comparison:
            baseline:       21.7 i/s
              native:       11.3 i/s - 1.92x  slower
                lazy:        5.8 i/s - 3.74x  slower
       via_transduce:        3.5 i/s - 6.12x  slower

Calculating -------------------------------------
              native   169.602M memsize (     0.000  retained)
                        40.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
            baseline    10.216M memsize (     0.000  retained)
                        10.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
                lazy    52.649M memsize (     0.000  retained)
                         1.060M objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
       via_transduce    10.236M memsize (     0.000  retained)
                       340.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
            baseline:   10215920 allocated
       via_transduce:   10235920 allocated - 1.00x more
                lazy:   52648720 allocated - 5.15x more
              native:  169601920 allocated - 16.60x more
# frozen_string_literal: true

require 'benchmark/ips'
require 'benchmark/memory'
require 'logger'

ARRAY = (0..530000).to_a.freeze

def native
  ARRAY
    .map { |x| x + 10 }
    .map { |x| x * 2 }
    .select { |x| x.even? }
    .select { |x| x % 5 == 0 }
end

def lazy
  ARRAY
    .lazy
    .map { |x| x + 10 }
    .map { |x| x * 2 }
    .select { |x| x.even? }
    .select { |x| x % 5 == 0 }
    .to_a
end

def baseline
  ARRAY.each_with_object([]) do |x, result|
    x = (x + 10) * 2
    result << x if x.even? && x % 5 == 0
  end
end

require 'ramda'

def transduce(transformation, reducing_fn, initial, input)
  input.reduce(initial, &transformation.call(reducing_fn))
end

PUSHES = -> list, item { list.push(item) }

def mapping(&fn)
  -> reducing_fn {
    -> result, input {
      reducing_fn.call(result, fn.call(input))
    }
  }
end

def selecting(&predicate_fn)
  -> reducing_fn {
    -> result, input {
      if predicate_fn.call(input)
        reducing_fn.call(result, input)
      else
        result
      end
    }
  }
end

def via_transduce
  transformation = Ramda.compose(
    mapping { |x| x + 10 },
    mapping { |x| x * 2 },
    selecting { |x| x.even? },
    selecting { |x| x % 5 == 0 }
  )

  transduce(
    transformation,
    PUSHES,
    [],
    ARRAY
  )
end

def logger
  return @logger if defined? @logger

  @logger = Logger.new($stdout)
  @logger.level = Logger::INFO
  @logger
end

def test
  logger.debug native
  logger.debug lazy
  logger.debug baseline
  logger.debug via_transduce

  result = (native == lazy) && (lazy == baseline) && (baseline == via_transduce)
  logger.info result
  result
end

exit unless test

Benchmark.ips do |x|
  x.report('native') { native }
  x.report('baseline') { baseline }
  x.report('lazy') { lazy }
  x.report('via_transduce') { via_transduce }
  x.compare!
end

Benchmark.memory do |x|
  x.report('native') { 10.times { native } }
  x.report('baseline') { 10.times { baseline } }
  x.report('lazy') { 10.times { lazy } }
  x.report('via_transduce') { 10.times { via_transduce } }
  x.compare!
end
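For readers unfamiliar with transducers, here is a small self-contained sketch of the same pipeline on a tiny input (an assumption of this sketch: plain `Proc#<<` composition stands in for `Ramda.compose`, which behaves the same way here). Note the ordering quirk: although compose is right-to-left for ordinary functions, the first-listed transducer transforms each input first, because each transducer wraps the reducing function of the one after it.

```ruby
# Self-contained transducer sketch; Proc#<< stands in for Ramda.compose.
mapping   = ->(fn)   { ->(rf) { ->(acc, x) { rf.call(acc, fn.call(x)) } } }
selecting = ->(pred) { ->(rf) { ->(acc, x) { pred.call(x) ? rf.call(acc, x) : acc } } }

transformation = [
  mapping.call(->(x) { x + 10 }),
  mapping.call(->(x) { x * 2 }),
  selecting.call(->(x) { x.even? }),
  selecting.call(->(x) { (x % 5).zero? })
].reduce(:<<) # f << g wraps right-to-left, so the first transducer sees the input first

pushes = ->(list, item) { list.push(item) }
result = (0..20).reduce([], &transformation.call(pushes))
result # => [20, 30, 40, 50, 60]
```

The whole pipeline runs in a single `reduce` pass with no intermediate arrays, which is why `via_transduce` allocates roughly the same memory as `baseline` above.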
konung commented Sep 14, 2018

Thanks for the reply and for putting up a benchmark. My point was that lazy is slow, and slower than 'baseline'. The other commenter made it sound like lazy was fast, which it is not.

Also, there is a big caveat about lazy: its efficiency highly depends on the size of the array. (By the way, my quip about it being slow comes from the horse's mouth, the author of lazy himself: https://railsware.com/blog/2012/03/13/ruby-2-0-enumerablelazy/)
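For what it's worth, here is a minimal sketch (my illustration, not from the thread) of the case lazy is actually designed for: it pays off when the chain can stop early or the source is unbounded, not when you materialize the whole result as these benchmarks do.

```ruby
# Lazy shines with early termination: the eager version of this chain
# could never finish, because the source range is infinite.
first_five = (1..Float::INFINITY).lazy
  .map { |x| x * 2 }
  .select { |x| (x % 3).zero? }
  .first(5)
first_five # => [6, 12, 18, 24, 30]
```

With a finite array that is consumed entirely, lazy only adds per-element enumerator overhead, which matches the numbers below.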

Here are my results for your benchmark, running on a 2015 MacBook with High Sierra and the Ruby 2.6.0 previews.

Warming up --------------------------------------
              native     1.000  i/100ms
            baseline     1.000  i/100ms
                lazy     1.000  i/100ms
       via_transduce     1.000  i/100ms
Calculating -------------------------------------
              native      7.622  (± 0.0%) i/s -     39.000  in   5.127945s
            baseline     16.002  (± 6.2%) i/s -     80.000  in   5.005739s
                lazy      4.784  (± 0.0%) i/s -     24.000  in   5.021066s
       via_transduce      4.746  (± 0.0%) i/s -     24.000  in   5.057447s

Comparison:
            baseline:       16.0 i/s
              native:        7.6 i/s - 2.10x  slower
                lazy:        4.8 i/s - 3.34x  slower
       via_transduce:        4.7 i/s - 3.37x  slower

Calculating -------------------------------------
              native   169.602M memsize (     0.000  retained)
                        40.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
            baseline    10.216M memsize (     0.000  retained)
                        10.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
                lazy    52.649M memsize (     0.000  retained)
                         1.060M objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
       via_transduce    10.236M memsize (     0.000  retained)
                       340.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
            baseline:   10215920 allocated
       via_transduce:   10235920 allocated - 1.00x more
                lazy:   52648720 allocated - 5.15x more
              native:  169601920 allocated - 16.60x more

Memory consumption is exactly the same, but ips is off.

HOWEVER, if you run it for smaller arrays (say, 50 elements and 1000 elements), the numbers actually jump around quite a bit. In my initial test I ran it for 50 elements, which is the size of the sets I often have to deal with (lots of small arrays).

For 50 elements

I, [2018-09-14T17:11:13.291584 #51737]  INFO -- : true
Warming up --------------------------------------
              native     6.675k i/100ms
            baseline    14.603k i/100ms
                lazy     2.875k i/100ms
       via_transduce     3.038k i/100ms
Calculating -------------------------------------
              native     70.851k (± 2.4%) i/s -    360.450k in   5.090445s
            baseline    152.145k (± 2.5%) i/s -    773.959k in   5.090214s
                lazy     29.004k (± 2.2%) i/s -    146.625k in   5.057872s
       via_transduce     30.447k (± 2.4%) i/s -    154.938k in   5.091914s

Comparison:
            baseline:   152145.1 i/s
              native:    70851.4 i/s - 2.15x  slower
       via_transduce:    30447.3 i/s - 5.00x  slower
                lazy:    29003.6 i/s - 5.25x  slower

Calculating -------------------------------------
              native    18.880k memsize (     0.000  retained)
                        40.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
            baseline     2.000k memsize (     0.000  retained)
                        10.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
                lazy    38.800k memsize (     0.000  retained)
                       550.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
       via_transduce    22.000k memsize (     0.000  retained)
                       340.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
            baseline:       2000 allocated
              native:      18880 allocated - 9.44x more
       via_transduce:      22000 allocated - 11.00x more
                lazy:      38800 allocated - 19.40x more

For 1000 elements

Warming up --------------------------------------
              native   401.000  i/100ms
            baseline   869.000  i/100ms
                lazy   250.000  i/100ms
       via_transduce   245.000  i/100ms
Calculating -------------------------------------
              native      4.247k (± 3.4%) i/s -     21.253k in   5.010516s
            baseline      8.829k (± 1.8%) i/s -     44.319k in   5.021735s
                lazy      2.545k (± 1.7%) i/s -     12.750k in   5.011131s
       via_transduce      2.478k (± 1.9%) i/s -     12.495k in   5.044822s

Comparison:
            baseline:     8828.6 i/s
              native:     4246.9 i/s - 2.08x  slower
                lazy:     2545.1 i/s - 3.47x  slower
       via_transduce:     2477.7 i/s - 3.56x  slower

Calculating -------------------------------------
              native   321.920k memsize (     0.000  retained)
                        40.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
            baseline    23.600k memsize (     0.000  retained)
                        10.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
                lazy   136.400k memsize (     0.000  retained)
                         2.450k objects (     0.000  retained)
                         0.000  strings (     0.000  retained)
       via_transduce    43.600k memsize (     0.000  retained)
                       340.000  objects (     0.000  retained)
                         0.000  strings (     0.000  retained)

Comparison:
            baseline:      23600 allocated
       via_transduce:      43600 allocated - 1.85x more
                lazy:     136400 allocated - 5.78x more
              native:     321920 allocated - 13.64x more

@AlexWayfer (Author)

@konung, the speed is almost the same across all three cases. There is a big difference in memory consumption, though, and not in favor of lazy.

@baweaver

The other thing to consider is that the Enumerable methods are all optimized in C. If you were to throw this up before your test, it'd skew the results quite a bit:

class Array
  def map
    new_array = []
    each { |v| new_array << yield(v) }
    new_array
  end

  def select
    new_array = []
    each { |v| new_array << v if yield(v) }
    new_array
  end
end
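A side note on that monkey patch (my sketch, not from the thread): if you do want to benchmark pure-Ruby reimplementations like these, a refinement keeps the patch scoped to the file that opts in, instead of globally redefining `Array#map` and `Array#select` for the benchmark harness too.

```ruby
# Sketch: scope the pure-Ruby map/select with a refinement so the
# monkey patch cannot leak into the benchmark harness itself.
module PureEnumerable
  refine Array do
    def map
      new_array = []
      each { |v| new_array << yield(v) }
      new_array
    end

    def select
      new_array = []
      each { |v| new_array << v if yield(v) }
      new_array
    end
  end
end

using PureEnumerable

refined = [1, 2, 3, 4].map { |x| x * 2 }.select { |x| x > 4 }
refined # => [6, 8]
```

Code outside the `using` scope (including benchmark-ips itself) still sees the optimized built-ins.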

Now if someone could figure out how to write both transduction and composition in a C extension, those numbers would get really interesting. Unfortunately I do not have that skillset currently, but may pursue it later for the sake of amusement.

This JS benchmark was something else: https://jlongster.com/Transducers.js-Round-2-with-Benchmarks

Thanks @konung for the benchmarks!

I'd meant to reply, but I was still digging around my own benchmarks to see if there was a clean way to demonstrate a few corner cases; that does quite nicely.

@AlexWayfer (Author)

If you were to throw this up before your test, it'd skew the result quite a bit:

It would not be lazy anymore; this code is not about laziness.

V8 doesn't have optimizations for basic JavaScript functions?
