Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
require 'benchmark'
require 'prime'
def primes_up_to(n)
s = [nil, nil] + (2..n).to_a
(2..(n ** 0.5).to_i).reject { |i| s[i].nil? }.each do |i|
(i ** 2).step(n, i) { |j| s[j] = nil }
end
s.compact
end
Benchmark.bm(12) do |x|
x.report('primes_up_to') { primes_up_to(2000000).inject(0) { |memo,obj| memo + obj } }
x.report('Prime.each') { Prime.each(2000000).inject(0) { |memo,obj| memo + obj } }
end
$ ruby -v
ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin10.8.0]
$ ruby lol.rb
user system total real
primes_up_to 1.470000 0.020000 1.490000 ( 1.491340)
Prime.each 7.820000 0.010000 7.830000 ( 7.820969)

I'd think Ruby's internal implementation would be an efficient implementation in C or something of the most efficient algorithm that exists. Or, I'd think that if I didn't know anything about Ruby.

Owner

mconigliaro commented Sep 28, 2011

I assumed the same thing. I can't see why a non-math guy like me should be able to come up with a more efficient algorithm simply by reading blogs and articles on Wikipedia. So here's my best guess (without actually having looked at the Prime source yet) as to what's going on here:

My primes_up_to method generates the whole list of primes up front and just returns the whole thing in one shot, but this means you have to tell it how many primes to generate. Ruby's Prime.each is implemented as an enumerator, which means you can use it to keep generating primes indefinitely. This suggests to me that it's only generating the next prime on demand. But in this test, I did tell it how many primes to generate, so I don't understand why they wouldn't just use a sieve to generate all the primes upfront (which has to be faster). Maybe they are doing that, and I'm just seeing the overhead of calling next() 2000000 times. If that's the case, it seems like if I know how many primes I want, there should be a way to get the whole list rather than an enumerable.

Owner

mconigliaro commented Sep 28, 2011

A co-worker suggested that maybe my prime number generator is flawed, but according to the test below, both methods generate the same list of primes:

require 'benchmark'
require 'pp'
require 'prime'

def primes_up_to(n)
  s = [nil, nil] + (2..n).to_a
  (2..(n ** 0.5).to_i).reject { |i| s[i].nil? }.each do |i|
    (i ** 2).step(n, i) { |j| s[j] = nil }
  end
  s.compact
end

my_primes = []
ruby_primes = []

Benchmark.bm(12) do |x|
  x.report('primes_up_to') { my_primes = primes_up_to(2000000) }
  x.report('Prime.each') { ruby_primes = Prime.each(2000000).to_a }
end

pp my_primes - ruby_primes
pp ruby_primes - my_primes
$ ruby test.rb 
                  user     system      total        real
primes_up_to  1.430000   0.020000   1.450000 (  1.439108)
Prime.each    8.260000   0.010000   8.270000 (  8.263235)
[]
[]

A co-worker and I reviewed the algorithm and think it is a correct algorithm. It's pretty basic and standard. You could probably optimize it a bit.

Owner

mconigliaro commented Sep 28, 2011

Haha, that's the best part. I remember reading about further optimizations that can be done, and I know for sure that there are faster algorithms out there (e.g. the sieve of Atkin), but performance was good enough for my purposes, so I left it alone. But you would think someone would have optimized the hell out of the version in the standard library! I guess this is why Python is more popular for scientific programming.

Owner

mconigliaro commented Sep 28, 2011

I submitted a bug report: http://redmine.ruby-lang.org/issues/5378

If I'm reading this correctly, 'generators' are the solution to this - they remember what the last prime you looked up is, and can use different methods of finding prime numbers depending on your needs. You can even specify a generator when calling each. See http://www.ruby-doc.org/core-1.9/classes/Prime.html

Owner

mconigliaro commented Sep 28, 2011

Yeah, I actually tried the other generators last night (Prime::EratosthenesGenerator is the default). Prime::TrialDivisionGenerator was so slow that I gave up letting it finish for the above test, and Prime::Generator23 doesn't really find primes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment