-
-
Save arlyon/0d90932adcb5a8a05b0bb6fa21518a04 to your computer and use it in GitHub Desktop.
from itertools import count | |
from collections import defaultdict | |
def eratosthenes(): | |
""" | |
Infinitely generates primes using the sieve of Eratosthenes. | |
""" | |
yield 2 | |
candidate_factorizations = defaultdict(set) | |
for candidate in count(3, 2): | |
if candidate not in candidate_factorizations: | |
yield candidate | |
candidate_factorizations[candidate ** 2].add(candidate) | |
else: | |
for prime in candidate_factorizations[candidate]: | |
candidate_factorizations[candidate + prime * 2].add(prime) | |
del candidate_factorizations[candidate] |
Benchmarks:
I compared this to a naive trial division implementation as well as the fastest "classic" non-generator implementation I could find: rwh_primes2_python3.
The generator is, with the additional overhead of "rolling forward" primes, quite a bit slower than the bounded version by about a factor of 20 but still looks linear from my calculations. Not bad considering the overhead required to generate infinitely and comparing the general legibility of the code!
Note that the number given to the
rwh_primes2_python3
function is larger because the others look for the nth prime andrwh_primes2_python3
generates all primes less than n. For that reason, we need to give it the nth prime so it will calculate up to it.
With 10,000 primes:
python -mtimeit -s"import eratosthenes" "from itertools import islice" "next(islice(eratosthenes.eratosthenes(), 10000, None))" && python -mtimeit -s "import eratosthenes" "eratosthenes.rwh_primes2_python3(104729)" && python -mtimeit -s "import eratosthenes" "from itertools import islice" "next(islice(eratosthenes.trial_division(), 10000, None))"
- infinite trial division:
10 loops, best of 3: 4.65 sec per loop
- infinite sieve:
10 loops, best of 3: 39.7 msec per loop
- bounded sieve:
1000 loops, best of 3: 1.95 msec per loop
Removing the trial division allows us to up the numbers a little and compare 1,000,000 primes:
python -mtimeit -s"import eratosthenes" "from itertools import islice" "next(islice(eratosthenes.eratosthenes(), 1000000, None))" && python -mtimeit -s "import eratosthenes" "eratosthenes.rwh_primes2_python3(15485863)"
- infinite sieve:
10 loops, best of 3: 7.3 sec per loop
- bounded sieve:
10 loops, best of 3: 285 msec per loop
Some explanation:
It skips all even numbers, doubling the speed. Filtering out additional multiples such as 3 and 5 speeds it up even more. However, in the interest of brevity, that was left out. Additionally, it uses the prime square optimization on line 14 to reduce workload further. The last oddity is due to skipping all multiples of two. It means that on line 17 since a candidate is always odd, a prime is always odd, and an odd plus an odd is even, we need to skip it by simply multiplying the prime by two, ensuring it is even and that the addition will return an odd number meaning it wont be left behind.