Skip to content

Instantly share code, notes, and snippets.

@arlyon
Last active August 21, 2020 12:14
Show Gist options
  • Save arlyon/0d90932adcb5a8a05b0bb6fa21518a04 to your computer and use it in GitHub Desktop.
Save arlyon/0d90932adcb5a8a05b0bb6fa21518a04 to your computer and use it in GitHub Desktop.
A 2-cycle fast and pythonic implementation of the sieve or eratosthenes.
from itertools import count
from collections import defaultdict
def eratosthenes():
"""
Infinitely generates primes using the sieve of Eratosthenes.
"""
yield 2
candidate_factorizations = defaultdict(set)
for candidate in count(3, 2):
if candidate not in candidate_factorizations:
yield candidate
candidate_factorizations[candidate ** 2].add(candidate)
else:
for prime in candidate_factorizations[candidate]:
candidate_factorizations[candidate + prime * 2].add(prime)
del candidate_factorizations[candidate]
@arlyon
Copy link
Author

arlyon commented Aug 21, 2018

Some explanation:

It skips all even numbers, doubling the speed. Filtering out additional multiples such as 3 and 5 speeds it up even more. However, in the interest of brevity, that was left out. Additionally, it uses the prime square optimization on line 14 to reduce workload further. The last oddity is due to skipping all multiples of two. It means that on line 17 since a candidate is always odd, a prime is always odd, and an odd plus an odd is even, we need to skip it by simply multiplying the prime by two, ensuring it is even and that the addition will return an odd number meaning it wont be left behind.

@arlyon
Copy link
Author

arlyon commented Aug 21, 2018

Benchmarks:

I compared this to a naive trial division implementation as well as the fastest "classic" non-generator implementation I could find: rwh_primes2_python3.

The generator is, with the additional overhead of "rolling forward" primes, quite a bit slower than the bounded version by about a factor of 20 but still looks linear from my calculations. Not bad considering the overhead required to generate infinitely and comparing the general legibility of the code!

Note that the number given to the rwh_primes2_python3 function is larger because the others look for the nth prime and rwh_primes2_python3 generates all primes less than n. For that reason, we need to give it the nth prime so it will calculate up to it.

With 10,000 primes:

python -mtimeit -s"import eratosthenes" "from itertools import islice"  "next(islice(eratosthenes.eratosthenes(), 10000, None))" && python -mtimeit -s "import eratosthenes" "eratosthenes.rwh_primes2_python3(104729)" && python -mtimeit -s "import eratosthenes" "from itertools import islice" "next(islice(eratosthenes.trial_division(), 10000, None))"
  • infinite trial division: 10 loops, best of 3: 4.65 sec per loop
  • infinite sieve: 10 loops, best of 3: 39.7 msec per loop
  • bounded sieve: 1000 loops, best of 3: 1.95 msec per loop

Removing the trial division allows us to up the numbers a little and compare 1,000,000 primes:

python -mtimeit -s"import eratosthenes" "from itertools import islice"  "next(islice(eratosthenes.eratosthenes(), 1000000, None))" && python -mtimeit -s "import eratosthenes" "eratosthenes.rwh_primes2_python3(15485863)"
  • infinite sieve: 10 loops, best of 3: 7.3 sec per loop
  • bounded sieve: 10 loops, best of 3: 285 msec per loop

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment