Skip to content

Instantly share code, notes, and snippets.

@WillNess
Last active December 17, 2015 19:19
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save WillNess/5659214 to your computer and use it in GitHub Desktop.
Save WillNess/5659214 to your computer and use it in GitHub Desktop.
Actually, with just some minor and cosmetic edits, your code becomes the *very* nice:
> -- map (*i) [i ..] == map (*i) [i, i+1 ..] == [i*i, i*i+i ..]
> -- sieve 2 [] where sieve i [] = (i:) $ sieve (i + 1) [i*i, i*i+i ..]
> -- == 2 : sieve 3 [4, 6 ..]
> primesA :: [Integer]
> primesA = 2 : sieve 3 [4, 6 ..]
> --
> sieve p cs@(c : t)
> | p == c = sieve (p + 1) t
> | otherwise = p : sieve (p + 1) (unionI cs [p*p, p*p+p ..])
> --
> unionI a@(x:xs) b@(y:ys) | x < y = x : xs `unionI` b
> | x == y = x : xs `unionI` ys
> | otherwise = y : a `unionI` ys
It is *very* nice, because it uses "union" instead of the usual "minus", so it's already
one step closer towards the final goal. It's another step in the derivation chain, using
`xs - (a+b+c+...)` instead of the usual `(...(((xs-a)-b)-c) - ...)` . If we look closer,
we see that `a+b+c+...` is actually `(((a+b)+c) + ...)` which is less fortunate, but
that's just details at this step.
____
It can be further improved by pre-calculating one more initial step,
> primesA :: [Integer]
> primesA = 2 : 3 : sieve 5 [9, 15 ..]
> --
> sieve p cs@(c : t)
> | p == c = sieve (p + 2) t
> | otherwise = p : sieve (p + 2) (unionI cs [p*p, p*p+2*p ..])
or even
> primesA = 2 : 3 : 5 : sieve 7 ( scanl (+) 25 $ cycle [10,20] )
> -- [25,35..] \\ [45,75..]
> -- [25,55..] +/+ [35,65..]
> -- where (x:xs)+/+(y:ys)=x:y:(xs+/+ys)
> sieve p cs@(c : t)
> | p == c = sieve (p + 6) ......
> | ......
Now its problem is more readily apparent. It adds new composite streams much too early -
and will end up with way too many of them. To have an optimal - *minimal* - amount of
composites streams to deal with at any "point in time" i.e. for any candidate considered,
these streams must be added each at its proper time - only when a prime's square is reached
among the candidates (*9, 25, 49* ...), and not sooner.
......
This is achieved with (also creating the better `a+(b+(c+(...)))` calculation structure):
> primesA = [2,3,5] ++ sieve 7 (foldr (\p r-> p*p : unionI [p*p+2*p, p*p+4*p ..] r)
> [] $ tail primesA)
> sieve p cs@(c : t)
> | p == c = sieve (p + 2) t
> | otherwise = p : sieve (p + 2) cs
---------
Your code is actually very nice, with some minor and cosmetic edits (like using shorter
names, and changing `map (*i) [i..]` to `map (*i) [i, i+1..]` to `[i*i, i*i+i ..]`):
> primesA :: [Integer]
> primesA = sieve 2 []
> sieve i [] = i : sieve (i + 1) [i*i, i*i+i ..]
> sieve i cs@(h : t)
> | i == h = sieve (i + 1) t
> | otherwise = i : sieve (i + 1) (unionI cs [i*i, i*i+i ..])
> unionI [] b = b
> unionI a [] = a
> unionI a@(x:xs) b@(y:ys) | x < y = x : xs `unionI` b
> | x == y = x : xs `unionI` ys
> | otherwise = y : a `unionI` ys
Pre-calculating the first step gives us
> primesAB = 2 : sieve 3 [4, 6 ..]
> sieve p cs@(c : t)
> | p == c = sieve (p + 1) t
> | otherwise = p : sieve (p + 1) (unionI cs [p*p, p*p+p ..])
There are two problems with this code. There is the `(...(((a+b)+c)+d)+...)` calculation
structure, which puts the more frequently-producing streams lower in the structure than
the less frequently-producing.
But the more immediate concern is the premature handling of each prime's multiples. E.g.
when you reach 5, `[25, 30 ..]` is added into the composites. But that should only be
done when 25 is reached, which would radically reduce the total number of multiples
streams being processed (from `O(n)` to `O(sqrt(n))`, for *n*-th prime). This has very
significant impact on efficiency.
We can synchronize on squares of primes, taken from the primes sequence itself as it is
being produced (this just maintains a separate back-pointer `ps` into the sequence, which
is advanced at much slower pace than the production point):
> primesAC = [2,3] ++ sieve 4 (tail primesAC) 9 [4,6..]
> sieve k ps@(p:pt) q cs@(c:ct) -- "p" for prime, "k" for candidate
> | k == c = sieve (k + 1) ps q ct -- "c" for composite
> | k < q = k : sieve (k + 1) ps q cs
> | otherwise = sieve k pt (head pt^2) -- k == q == p*p
> (unionI cs [q, q+p ..])
or we can simplify this formulation (at cost of diminished efficiency) as
___________________________________________________________________________
> sieve k (p:pt) q cs | (h,t) <- span (< q) cs =
> diffIncr [k..q-1] h ++ sieve q pt (head pt^2) (unionI t [q, q+p ..])
---------------------------------------------------------------------------
This is just half a step away from Richard Bird's solution:
> primesAD = (2 :) . sieve 3 .
> foldr (\p r-> p*p : unionI [p*p+p, p*p+2*p ..] r) [] $ primesAD
> sieve p cs@(c : t) -- == diffIncr [p..] cs, if p<=c
> | p == c = sieve (p + 1) t
> | otherwise = p : sieve (p + 1) cs
____
Let's look into your attempted solution.
> primesB :: [Integer]
> primesB = [2..] `diffI` composites
> composites :: [Integer]
> composites = foldl f [4,6..] [3..]
> where
> f cs@(h:t) i | i == h = cs
> | otherwise = h : unionI t [i*i, i*i+i ..]
> diffI :: Ord a => [a] -> [a] -> [a]
> diffI [] b = []
> diffI a [] = a
> diffI (x:xs) (y:ys) | x < y = x : xs `diffI` (y:ys)
> | x == y = xs `diffI` ys
> | otherwise = (x:xs) `diffI` ys
> composites = foldl f [4,6..] [3..]
> = foldl f (f [4,6..] 3) [4..] -- no part of calculation of (f [4,6..] 3)
> = foldl f (4:unionI [6,8..] [9,12..]) [4..] -- is forced here, but if it were, ...
> = foldl f (4:unionI [6,8..] [9,12..]) [5..]
> = foldl f (4:union (unionI [6,8..] [9,12..]) [25,30..]) [6..]
> = foldl f (4:union (union (unionI [6,8..] [9,12..]) [25,30..]) [36,42..]) [7..]
> = ....
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment