Last active
December 17, 2015 19:19
-
-
Save WillNess/5659214 to your computer and use it in GitHub Desktop.
primesA w/ unionIncr and sieve==gaps http://stackoverflow.com/questions/16271592/why-wouldnt-my-sieve-terminate-when-i-rewrote-it-as-a-foldl
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Actually, with just some minor and cosmetic edits, your code becomes the *very* nice: | |
> -- map (*i) [i ..] == map (*i) [i, i+1 ..] == [i*i, i*i+i ..] | |
> -- sieve 2 [] where sieve i [] = (i:) $ sieve (i + 1) [i*i, i*i+i ..] | |
> -- == 2 : sieve 3 [4, 6 ..] | |
> primesA :: [Integer] | |
> primesA = 2 : sieve 3 [4, 6 ..] | |
> -- | |
> sieve p cs@(c : t) | |
> | p == c = sieve (p + 1) t | |
> | otherwise = p : sieve (p + 1) (unionI cs [p*p, p*p+p ..]) | |
> -- | |
> unionI a@(x:xs) b@(y:ys) | x < y = x : xs `unionI` b | |
> | x == y = x : xs `unionI` ys | |
> | otherwise = y : a `unionI` ys | |
It is *very* nice, because it uses "union" instead of the usual "minus", so it's already | |
one step closer towards the final goal. It's another step in the derivation chain, using | |
`xs - (a+b+c+...)` instead of the usual `(...(((xs-a)-b)-c) - ...)` . If we look closer, | |
we see that `a+b+c+...` is actually `(((a+b)+c) + ...)` which is less fortunate, but | |
that's just details at this step. | |
____ | |
It can be further improved by pre-calculating one more initial step, | |
> primesA :: [Integer] | |
> primesA = 2 : 3 : sieve 5 [9, 15 ..] | |
> -- | |
> sieve p cs@(c : t) | |
> | p == c = sieve (p + 2) t | |
> | otherwise = p : sieve (p + 2) (unionI cs [p*p, p*p+2*p ..]) | |
or even | |
> primesA = 2 : 3 : 5 : sieve 7 ( scanl (+) 25 $ cycle [10,20] ) | |
> -- [25,35..] \\ [45,75..] | |
> -- [25,55..] +/+ [35,65..] | |
> -- where (x:xs)+/+(y:ys)=x:y:(xs+/+ys) | |
> sieve p cs@(c : t) | |
> | p == c = sieve (p + 6) ...... | |
> | ...... | |
Now its problem is more readily apparent. It adds new composite streams much too early - | |
and will end up with way too many of them. To have an optimal - *minimal* - amount of | |
composites streams to deal with at any "point in time" i.e. for any candidate considered, | |
these streams must be added each at its proper time - only when a prime's square is reached | |
among the candidates (*9, 25, 49* ...), and not sooner. | |
...... | |
This is achieved with (also creating the better `a+(b+(c+(...)))` calculation structure): | |
> primesA = [2,3,5] ++ sieve 7 (foldr (\p r-> p*p : unionI [p*p+2*p, p*p+4*p ..] r) | |
> [] $ tail primesA) | |
> sieve p cs@(c : t) | |
> | p == c = sieve (p + 2) t | |
> | otherwise = p : sieve (p + 2) cs | |
--------- | |
Your code is actually very nice, with some minor and cosmetic edits (like using shorter | |
names, and changing `map (*i) [i..]` to `map (*i) [i, i+1..]` to `[i*i, i*i+i ..]`): | |
> primesA :: [Integer] | |
> primesA = sieve 2 [] | |
> sieve i [] = i : sieve (i + 1) [i*i, i*i+i ..] | |
> sieve i cs@(h : t) | |
> | i == h = sieve (i + 1) t | |
> | otherwise = i : sieve (i + 1) (unionI cs [i*i, i*i+i ..]) | |
> unionI [] b = b | |
> unionI a [] = a | |
> unionI a@(x:xs) b@(y:ys) | x < y = x : xs `unionI` b | |
> | x == y = x : xs `unionI` ys | |
> | otherwise = y : a `unionI` ys | |
Pre-calculating the first step gives us | |
> primesAB = 2 : sieve 3 [4, 6 ..] | |
> sieve p cs@(c : t) | |
> | p == c = sieve (p + 1) t | |
> | otherwise = p : sieve (p + 1) (unionI cs [p*p, p*p+p ..]) | |
There are two problems with this code. There is the `(...(((a+b)+c)+d)+...)` calculation | |
structure, which puts the more frequently-producing streams lower in the structure than | |
the less frequently-producing. | |
But the more immediate concern is the premature handling of each prime's multiples. E.g. | |
when you reach 5, `[25, 30 ..]` is added into the composites. But that should only be | |
done when 25 is reached, which would radically reduce the total number of multiples | |
streams being processed (from `O(n)` to `O(sqrt(n))`, for *n*-th prime). This has very | |
significant impact on efficiency. | |
We can synchronize on squares of primes, taken from the primes sequence itself as it is | |
being produced (this just maintains a separate back-pointer `ps` into the sequence, which | |
is advanced at much slower pace than the production point): | |
> primesAC = [2,3] ++ sieve 4 (tail primesAC) 9 [4,6..] | |
> sieve k ps@(p:pt) q cs@(c:ct) -- "p" for prime, "k" for candidate | |
> | k == c = sieve (k + 1) ps q ct -- "c" for composite | |
> | k < q = k : sieve (k + 1) ps q cs | |
> | otherwise = sieve k pt (head pt^2) -- k == q == p*p | |
> (unionI cs [q, q+p ..]) | |
or we can simplify this formulation (at cost of diminished efficiency) as | |
___________________________________________________________________________ | |
> sieve k (p:pt) q cs | (h,t) <- span (< q) cs = | |
> diffIncr [k..q-1] h ++ sieve q pt (head pt^2) (unionI t [q, q+p ..]) | |
--------------------------------------------------------------------------- | |
This is just half a step away from Richard Bird's solution: | |
> primesAD = (2 :) . sieve 3 . | |
> foldr (\p r-> p*p : unionI [p*p+p, p*p+2*p ..] r) [] $ primesAD | |
> sieve p cs@(c : t) -- == diffIncr [p..] cs, if p<=c | |
> | p == c = sieve (p + 1) t | |
> | otherwise = p : sieve (p + 1) cs | |
____ | |
Let's look into your attempted solution. | |
> primesB :: [Integer] | |
> primesB = [2..] `diffI` composites | |
> composites :: [Integer] | |
> composites = foldl f [4,6..] [3..] | |
> where | |
> f cs@(h:t) i | i == h = cs | |
> | otherwise = h : unionI t [i*i, i*i+i ..] | |
> diffI :: Ord a => [a] -> [a] -> [a] | |
> diffI [] b = [] | |
> diffI a [] = a | |
> diffI (x:xs) (y:ys) | x < y = x : xs `diffI` (y:ys) | |
> | x == y = xs `diffI` ys | |
> | otherwise = (x:xs) `diffI` ys | |
> composites = foldl f [4,6..] [3..] | |
> = foldl f (f [4,6..] 3) [4..] -- no part of calculation of (f [4,6..] 3) | |
> = foldl f (4:unionI [6,8..] [9,12..]) [4..] -- is forced here, but if it were, ... | |
> = foldl f (4:unionI [6,8..] [9,12..]) [5..] | |
> = foldl f (4:union (unionI [6,8..] [9,12..]) [25,30..]) [6..] | |
> = foldl f (4:union (union (unionI [6,8..] [9,12..]) [25,30..]) [36,42..]) [7..] | |
> = .... |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment