Skip to content

Instantly share code, notes, and snippets.

@Hermann-SW
Last active March 20, 2024 22:10
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save Hermann-SW/558728c025a1f2d3dace008292bac897 to your computer and use it in GitHub Desktop.
Save Hermann-SW/558728c025a1f2d3dace008292bac897 to your computer and use it in GitHub Desktop.
determine count of Proth primes until environment n
\\ detemine count of Proth primes until environment n
\\ environment l (loops) allows to tune work distribution
\\
\\ based on Karim's isProth2():
\\ https://pari.math.u-bordeaux.fr/archives/pari-users-2403/msg00064.html
\\ and Bill's parallel implementation:
\\ https://pari.math.u-bordeaux.fr/archives/pari-users-2403/msg00067.html
\\
isProth2(p) = !(p >> (valuation(p-1,2)<<1));
export(isProth2);
nbt=default(nbthreads);
doit(n,l)={
my(c=0,B=n\nbt\if(l,l,1));
parfor(ii=0,
(n+B-1)\B,
my(cc=0,i=ii*B+1);
forprime(p=i,
min(i+B-1,n),
if(isProth2(p),cc++));cc,
C,
c+=C
);
c
};
n=eval(getenv("n"));
gettime();
t0=getwalltime();
r=doit(n,eval(getenv("l")));
wt=getwalltime()-t0;
tt=gettime();
print(nbt);
printf("%s × %.3f = %s",strtime(wt),tt/wt/1.0,strtime(tt));
print(r);
t0=getwalltime();
c=0;forprime(p=3,n,if(isProth2(p),c++));
wts=getwalltime()-t0;
printf("%s (%.3f×)",strtime(wts),wts/wt/1.0);
print(c);
@Hermann-SW
Copy link
Author

Bill's original work partitioning corresponds to l=1:

hermann@7950x:~$ n=4636016641 gp -q < proth.gp 
32
3,888 ms × 28.742 = 1min, 51,748 ms
10000
47,880 ms (12.315×)
10000
hermann@7950x:~$

better factor 30.404× of overall time divided by walltime for l=2:

hermann@7950x:~$ n=4636016641 l=2 gp -q < proth.gp 
32
3,700 ms × 30.404 = 1min, 52,495 ms
10000
47,095 ms (12.728×)
10000
hermann@7950x:~$

l=4 slightly better 31.190×:

hermann@7950x:~$ n=4636016641 l=4 gp -q < proth.gp 
32
3,662 ms × 31.190 = 1min, 54,217 ms
10000
47,929 ms (13.088×)
10000
hermann@7950x:~$

even better 31.422× for l=8:

hermann@7950x:~$ n=4636016641 l=8 gp -q < proth.gp 
32
3,642 ms × 31.422 = 1min, 54,439 ms
10000
47,621 ms (13.076×)
10000
hermann@7950x:~$

Factor of parallel walltime by sequential time gets better, from 12.315× up to 13.076×.
Amdahl's law is limiting speedup.

@Hermann-SW
Copy link
Author

For 2.5× n even closer to factor 32×, but the other factor 13.046× seems to be maxed out:

hermann@7950x:~$ n=10^11 l=32 gp -q < proth.gp 
32
1min, 10,058 ms × 31.784 = 37min, 6,695 ms
39170
15min, 13,952 ms (13.046×)
39170
hermann@7950x:~$ 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment