Skip to content

Instantly share code, notes, and snippets.

@WillNess
Last active January 8, 2018 10:56
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save WillNess/dac5a45f46b0d936ea3ebc49b3ff6af2 to your computer and use it in GitHub Desktop.
Save WillNess/dac5a45f46b0d936ea3ebc49b3ff6af2 to your computer and use it in GitHub Desktop.

OK, so here's a short progression of implementations for comparison, for clarity:

 {-# LANGUAGE ViewPatterns #-}
 import Data.Array.Unboxed
 import Data.List (tails, inits)
 import qualified Data.List.Ordered as O     -- package 'data-ordlist'

 (\\) = O.minus

 tr = sv [2..]
      where
      sv (p:t) = [p] ++ sv [n | n <- t, rem n p > 0]

Turner's sieve is the classic, but it is trial division, and it is non-postponed (more below).

 er = sv [2..]
      where
      sv (p:t) = [p] ++ sv (t \\ [p, p+p..])      -- (t \\ map (p*) [p..])

This basic sieve of Eratosthenes (SoE) is not postponed either. Generating the multiples to be removed starting from p*p instead of just from p is a step in the right direction but not much of an improvement, because the removals are still being started way to early, so there'll be way too many of them, just as in the Turner's sieve.

 eu = sv [2..]
      where
      sv (p:t) = [p] ++ sv (t \\ map (p*) (p:t))

Euler's scheme is a detour, arranging to avoid SoE's duplicate composites production, like 5*9 == 3*15. It comes with a great price of making the postponement and thus segmentation impossible, because the multiples on each sieving step are not generated independently of each other, as it was before.

 pe = 2 : sv [[p*p, p*p+p..] | p <- pe] [3..]
      where
      sv ((q:cs):r) (span (< q) -> (h,_:t)) = h ++ sv r (t \\ cs)

At last, Postponed Eratosthenes starts the removals just at the right moment, so there's no more quadratic explosion in their numbers as it was in the non-postponed variants.

 br = 2 : [3..] \\ foldr (\p -> (p*p :) . O.union [p*p+p, p*p+2*p..]) [] br

Bird's sieve is also a classic.

    = 2 : [3..] \\ foldr (\(x:xs) -> (x:) . O.union xs) [] [[p*p, p*p+p..] | p <- br]
 
    = (2 :) . ([3..] \\) . foldr (\(x:xs) -> (x:) . O.union xs) [] 
                         . map (\p -> [p*p, p*p+p..]) $ br
 
    = fix $ (2 :) . ([3..] \\) . O.unionAll . map (\p -> [p*p, p*p+p..])
 
 tf = (2:) . _Y $ (3 :) . ([5,7..] \\) . O.unionAll . map (\p -> [p*p, p*p+2*p..])
 
 _Y g = g (_Y g)

Folding the multiples in a tree is much better, provides nice logarithmic advantage in complexity.

 se = 2 : [n | (r:q:_, px) <- (zip . tails . (2:) . map (^2)) se (inits se),
               (n,True)    <- assocs ( accumArray (const id) True (r+1,q-1)
                                [(m,False) | p <- px, s <- [ (r+p)`div`p * p ], 
                                             m <- [s,s+p..q-1]] :: UArray Int Bool )]

Segmentwise Eratosthenes is the same algorithm, only "transposed", making it much more of a natural fit to using arrays, for much much improved speed.

Wheel factorization is an orthogonal optimization.

Another is using _Y g = g (_Y g) instead of fix to express the data recursion (re: se = .... se ....) creating a (short) tower of independent primes producers self-arranged by the repeated squaring scheme feeding each into the one above it, to minimize the overall memory retention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment