@robertcampion
Created January 19, 2016 06:25
NOTE: Most of the tests in DIEHARD return a p-value, which
should be uniform on [0,1) if the input file contains truly
independent random bits. Those p-values are obtained by
p=F(X), where F is the assumed distribution of the sample
random variable X---often normal. But that assumed F is just
an asymptotic approximation, for which the fit will be worst
in the tails. Thus you should not be surprised by
occasional p-values near 0 or 1, such as .0012 or .9983.
When a bit stream really FAILS BIG, you will get p's of 0 or
1 to six or more places. By all means, do not, as a
Statistician might, think that a p < .025 or p > .975 means
that the RNG has "failed the test at the .05 level". Such
p's happen among the hundreds that DIEHARD produces, even
with good RNG's. So keep in mind that "p happens".
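A rough illustration of why "p happens": the sketch below assumes the p-values are independent and uniform, and asks how many of them would land outside [.025, .975]. The figure of 200 tests is illustrative only and is not a count taken from this run.

```python
def tail_expectations(num_tests, lo=0.025, hi=0.975):
    """Expected number of p-values outside [lo, hi], and the chance of
    seeing at least one, assuming independent uniform p-values."""
    tail_prob = lo + (1.0 - hi)
    expected = num_tests * tail_prob
    at_least_one = 1.0 - (1.0 - tail_prob) ** num_tests
    return expected, at_least_one

# 200 is a hypothetical number of p-values, chosen only for illustration.
expected, at_least_one = tail_expectations(200)
print(f"expected p-values in the tails: {expected:.1f}")
print(f"chance of at least one:         {at_least_one:.5f}")
```

With a few hundred p-values, several in the tails are the norm rather than a sign of failure.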
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BIRTHDAY SPACINGS TEST ::
:: Choose m birthdays in a year of n days. List the spacings ::
:: between the birthdays. If j is the number of values that ::
:: occur more than once in that list, then j is asymptotically ::
:: Poisson distributed with mean m^3/(4n). Experience shows n ::
:: must be quite large, say n>=2^18, for comparing the results ::
:: to the Poisson distribution with that mean. This test uses ::
:: n=2^24 and m=2^9, so that the underlying distribution for j ::
:: is taken to be Poisson with lambda=2^27/(2^26)=2. A sample ::
:: of 500 j's is taken, and a chi-square goodness of fit test ::
:: provides a p value. The first test uses bits 1-24 (counting ::
:: from the left) from integers in the specified file. ::
:: Then the file is closed and reopened. Next, bits 2-25 are ::
:: used to provide birthdays, then 3-26 and so on to bits 9-32. ::
:: Each set of bits provides a p-value, and the nine p-values ::
:: provide a sample for a KSTEST. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
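The description above can be sketched as follows. This is a minimal sketch, not the DIEHARD code: `duplicate_spacings` and `poisson_expected` are hypothetical helper names, Python's `random` stands in for the file under test, and details such as how the original treats the first spacing are not covered by the description.

```python
import math
import random
from collections import Counter

def duplicate_spacings(birthdays):
    """Return j, the number of spacing values that occur more than once."""
    days = sorted(birthdays)
    spacings = [b - a for a, b in zip(days, days[1:])]
    counts = Counter(spacings)
    return sum(1 for c in counts.values() if c > 1)

def poisson_expected(sample_size, lam, max_cell=6):
    """Expected counts for j = 0..max_cell-1, plus a pooled last cell
    covering 'max_cell to INF'."""
    cells = [sample_size * math.exp(-lam) * lam ** k / math.factorial(k)
             for k in range(max_cell)]
    cells.append(sample_size - sum(cells))
    return cells

m, n = 2 ** 9, 2 ** 24
lam = m ** 3 / (4 * n)                # 2^27 / 2^26 = 2
rng = random.Random(12345)            # stand-in source, not random.bin
j = duplicate_spacings(rng.randrange(n) for _ in range(m))
print("lambda =", lam, " one sample j =", j)
print([round(e, 3) for e in poisson_expected(500, lam)])
```

The second print gives the "expected" column used in the tables of this test, since those counts depend only on the sample size 500 and lambda = 2.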
BIRTHDAY SPACINGS TEST, M= 512 N=2**24 LAMBDA= 2.0000
Results for random.bin
For a sample of size 500: mean
random.bin using bits 1 to 24 2.066
duplicate number number
spacings observed expected
0 57. 67.668
1 127. 135.335
2 148. 135.335
3 96. 90.224
4 48. 45.112
5 16. 18.045
6 to INF 8. 8.282
Chisquare with 6 d.o.f. = 4.18 p-value= .347168
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 2 to 25 2.032
duplicate number number
spacings observed expected
0 64. 67.668
1 140. 135.335
2 127. 135.335
3 95. 90.224
4 41. 45.112
5 28. 18.045
6 to INF 5. 8.282
Chisquare with 6 d.o.f. = 8.29 p-value= .782614
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 3 to 26 1.972
duplicate number number
spacings observed expected
0 71. 67.668
1 142. 135.335
2 127. 135.335
3 84. 90.224
4 46. 45.112
5 27. 18.045
6 to INF 3. 8.282
Chisquare with 6 d.o.f. = 9.27 p-value= .840805
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 4 to 27 2.066
duplicate number number
spacings observed expected
0 66. 67.668
1 122. 135.335
2 144. 135.335
3 90. 90.224
4 46. 45.112
5 24. 18.045
6 to INF 8. 8.282
Chisquare with 6 d.o.f. = 3.90 p-value= .310185
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 5 to 28 2.050
duplicate number number
spacings observed expected
0 64. 67.668
1 140. 135.335
2 132. 135.335
3 88. 90.224
4 43. 45.112
5 19. 18.045
6 to INF 14. 8.282
Chisquare with 6 d.o.f. = 4.59 p-value= .403183
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 6 to 29 1.918
duplicate number number
spacings observed expected
0 77. 67.668
1 136. 135.335
2 131. 135.335
3 90. 90.224
4 46. 45.112
5 14. 18.045
6 to INF 6. 8.282
Chisquare with 6 d.o.f. = 2.98 p-value= .188967
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 7 to 30 2.006
duplicate number number
spacings observed expected
0 64. 67.668
1 142. 135.335
2 136. 135.335
3 80. 90.224
4 49. 45.112
5 22. 18.045
6 to INF 7. 8.282
Chisquare with 6 d.o.f. = 3.09 p-value= .202432
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 8 to 31 1.972
duplicate number number
spacings observed expected
0 71. 67.668
1 132. 135.335
2 145. 135.335
3 80. 90.224
4 46. 45.112
5 19. 18.045
6 to INF 7. 8.282
Chisquare with 6 d.o.f. = 2.36 p-value= .116355
:::::::::::::::::::::::::::::::::::::::::
For a sample of size 500: mean
random.bin using bits 9 to 32 1.950
duplicate number number
spacings observed expected
0 81. 67.668
1 133. 135.335
2 123. 135.335
3 92. 90.224
4 44. 45.112
5 20. 18.045
6 to INF 7. 8.282
Chisquare with 6 d.o.f. = 4.26 p-value= .359013
:::::::::::::::::::::::::::::::::::::::::
The 9 p-values were
.347168 .782614 .840805 .310185 .403183
.188967 .202432 .116355 .359013
A KSTEST for the 9 p-values yields .684805
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE OVERLAPPING 5-PERMUTATION TEST ::
:: This is the OPERM5 test. It looks at a sequence of one ::
:: million 32-bit random integers. Each set of five consecutive ::
:: integers can be in one of 120 states, for the 5! possible or- ::
:: derings of five numbers. Thus the 5th, 6th, 7th,...numbers ::
:: each provide a state. As many thousands of state transitions ::
:: are observed, cumulative counts are made of the number of ::
:: occurrences of each state. Then the quadratic form in the ::
:: weak inverse of the 120x120 covariance matrix yields a test ::
:: equivalent to the likelihood ratio test that the 120 cell ::
:: counts came from the specified (asymptotically) normal dis- ::
:: tribution with the specified 120x120 covariance matrix (with ::
:: rank 99). This version uses 1,000,000 integers, twice. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
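The state-counting step can be sketched as below. The numbering of the 120 states (a Lehmer code) is an assumption of this sketch; the description does not say how DIEHARD numbers them, and the quadratic form in the weak inverse of the covariance matrix is not reproduced here.

```python
import random
from math import factorial

def permutation_state(window):
    """Map a window of values to an index in 0..len(window)!-1 using the
    Lehmer code of its ordering. Ties are broken by position."""
    k = len(window)
    index = 0
    for i, value in enumerate(window):
        smaller_after = sum(1 for later in window[i + 1:] if later < value)
        index += smaller_after * factorial(k - 1 - i)
    return index

def overlapping_state_counts(values, k=5):
    """Count the state of every overlapping window of k consecutive values."""
    counts = [0] * factorial(k)
    for start in range(len(values) - k + 1):
        counts[permutation_state(values[start:start + k])] += 1
    return counts

rng = random.Random(1)                      # stand-in source, not random.bin
sample = [rng.getrandbits(32) for _ in range(10_000)]
counts = overlapping_state_counts(sample)
print(len(counts), "states,", sum(counts), "windows")
```

Because consecutive windows share four values, the 120 counts are correlated, which is why the test needs the covariance matrix rather than a plain Pearson chi-square.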
OPERM5 test for file random.bin
For a sample of 1,000,000 consecutive 5-tuples,
chisquare for 99 degrees of freedom=108.450; p-value= .757608
OPERM5 test for file random.bin
For a sample of 1,000,000 consecutive 5-tuples,
chisquare for 99 degrees of freedom= 96.791; p-value= .455899
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 31x31 matrices. The leftmost ::
:: 31 bits of 31 random integers from the test sequence are used ::
:: to form a 31x31 binary matrix over the field {0,1}. The rank ::
:: is determined. That rank can be from 0 to 31, but ranks < 28 ::
:: are rare, and their counts are pooled with those for rank 28. ::
:: Ranks are found for 40,000 such random matrices and a ::
:: chisquare test is performed on counts for ranks 31,30,29 and ::
:: <=28. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
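A minimal sketch of the rank computation, using Gaussian elimination over {0,1} with each row held as an integer. `gf2_rank` and `leftmost_31_bits` are hypothetical names, and this is not claimed to be the routine DIEHARD uses.

```python
def gf2_rank(rows, width):
    """Rank over the field {0,1} of a matrix whose rows are width-bit integers."""
    rows = list(rows)
    rank = 0
    for bit in reversed(range(width)):
        mask = 1 << bit
        # Find a row at or below the current rank with a 1 in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i] & mask), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (XOR is addition mod 2).
        for i in range(len(rows)):
            if i != rank and rows[i] & mask:
                rows[i] ^= rows[rank]
        rank += 1
    return rank

def leftmost_31_bits(word32):
    """Drop the rightmost bit of a 32-bit integer."""
    return (word32 >> 1) & 0x7FFFFFFF

print(gf2_rank([0b110, 0b011, 0b101], 3))   # third row is the XOR of the first two
```

For the 31x31 test, each matrix would be built from `leftmost_31_bits` applied to 31 successive integers and passed to `gf2_rank(rows, 31)`.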
Binary rank test for random.bin
Rank test for 31x31 binary matrices:
rows from leftmost 31 bits of each 32-bit integer
rank observed expected (o-e)^2/e sum
28 211 211.4 .000826 .001
29 5161 5134.0 .141886 .143
30 23170 23103.0 .194032 .337
31 11458 11551.5 .757200 1.094
chisquare= 1.094 for 3 d. of f.; p-value= .369264
--------------------------------------------------------------
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 32x32 matrices. A random 32x ::
:: 32 binary matrix is formed, each row a 32-bit random integer. ::
:: The rank is determined. That rank can be from 0 to 32, but ranks ::
:: less than 29 are rare, and their counts are pooled with those ::
:: for rank 29. Ranks are found for 40,000 such random matrices ::
:: and a chisquare test is performed on counts for ranks 32,31, ::
:: 30 and <=29. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
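The expected counts for this test can be checked against a standard closed form for the rank distribution of a random binary matrix. The source does not say how DIEHARD obtained its expected column, so the sketch below is only a cross-check; `rank_probability` is a hypothetical name.

```python
def rank_probability(rank, rows, cols):
    """P(rank = r) for a rows x cols matrix of independent fair bits over {0,1}."""
    prob = 2.0 ** (rank * (rows + cols - rank) - rows * cols)
    for i in range(rank):
        prob *= ((1 - 2.0 ** (i - rows)) * (1 - 2.0 ** (i - cols))
                 / (1 - 2.0 ** (i - rank)))
    return prob

num_matrices = 40_000
expected = {r: num_matrices * rank_probability(r, 32, 32) for r in (32, 31, 30)}
expected["<=29"] = num_matrices - sum(expected.values())
for label, value in expected.items():
    print(label, round(value, 1))
```

To one decimal place these values agree with the expected column printed for the 32x32 matrices.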
Binary rank test for random.bin
Rank test for 32x32 binary matrices:
rows from leftmost 32 bits of each 32-bit integer
rank observed expected (o-e)^2/e sum
29 201 211.4 .513367 .513
30 5206 5134.0 1.009449 1.523
31 23160 23103.0 .140400 1.663
32 11433 11551.5 1.216120 2.879
chisquare= 2.879 for 3 d. of f.; p-value= .634256
--------------------------------------------------------------
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the BINARY RANK TEST for 6x8 matrices. From each of ::
:: six random 32-bit integers from the generator under test, a ::
:: specified byte is chosen, and the resulting six bytes form a ::
:: 6x8 binary matrix whose rank is determined. That rank can be ::
:: from 0 to 6, but ranks 0,1,2,3 are rare; their counts are ::
:: pooled with those for rank 4. Ranks are found for 100,000 ::
:: random matrices, and a chi-square test is performed on ::
:: counts for ranks 6,5 and <=4. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
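Each 6x8 result is reported as p=1-exp(-SUM/2), which is the chi-square distribution function for 2 degrees of freedom (three cells). The sketch below recomputes SUM and p from the observed and expected counts printed for bits 1 to 8 in this report; `brank_p_value` is a hypothetical name.

```python
import math

def brank_p_value(observed, expected):
    """Pearson sum over the cells and the 2-d.o.f. chi-square CDF at that sum."""
    total = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return total, 1.0 - math.exp(-total / 2.0)

# Counts for r<=4, r=5, r=6 as printed for bits 1 to 8.
chi_sum, p = brank_p_value([992, 21985, 77023], [944.3, 21743.9, 77311.8])
print(f"SUM = {chi_sum:.3f}   p = {p:.5f}")
```

Small differences in the last digits are possible because the expected counts are printed to one decimal place only.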
Binary Rank Test for random.bin
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 1 to 8
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 992 944.3 2.409 2.409
r =5 21985 21743.9 2.673 5.083
r =6 77023 77311.8 1.079 6.162
p=1-exp(-SUM/2)= .95408
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 2 to 9
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 939 944.3 .030 .030
r =5 21927 21743.9 1.542 1.572
r =6 77134 77311.8 .409 1.981
p=1-exp(-SUM/2)= .62852
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 3 to 10
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 953 944.3 .080 .080
r =5 21824 21743.9 .295 .375
r =6 77223 77311.8 .102 .477
p=1-exp(-SUM/2)= .21227
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 4 to 11
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 937 944.3 .056 .056
r =5 21832 21743.9 .357 .413
r =6 77231 77311.8 .084 .498
p=1-exp(-SUM/2)= .22037
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 5 to 12
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 915 944.3 .909 .909
r =5 21621 21743.9 .695 1.604
r =6 77464 77311.8 .300 1.903
p=1-exp(-SUM/2)= .61393
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 6 to 13
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 964 944.3 .411 .411
r =5 21851 21743.9 .528 .938
r =6 77185 77311.8 .208 1.146
p=1-exp(-SUM/2)= .43629
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 7 to 14
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 959 944.3 .229 .229
r =5 21806 21743.9 .177 .406
r =6 77235 77311.8 .076 .482
p=1-exp(-SUM/2)= .21434
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 8 to 15
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 935 944.3 .092 .092
r =5 21549 21743.9 1.747 1.839
r =6 77516 77311.8 .539 2.378
p=1-exp(-SUM/2)= .69546
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 9 to 16
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 932 944.3 .160 .160
r =5 21437 21743.9 4.332 4.492
r =6 77631 77311.8 1.318 5.810
p=1-exp(-SUM/2)= .94525
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 10 to 17
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 993 944.3 2.511 2.511
r =5 21427 21743.9 4.619 7.130
r =6 77580 77311.8 .930 8.060
p=1-exp(-SUM/2)= .98223
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 11 to 18
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 945 944.3 .001 .001
r =5 21702 21743.9 .081 .081
r =6 77353 77311.8 .022 .103
p=1-exp(-SUM/2)= .05030
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 12 to 19
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 950 944.3 .034 .034
r =5 21555 21743.9 1.641 1.675
r =6 77495 77311.8 .434 2.110
p=1-exp(-SUM/2)= .65173
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 13 to 20
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 963 944.3 .370 .370
r =5 21573 21743.9 1.343 1.713
r =6 77464 77311.8 .300 2.013
p=1-exp(-SUM/2)= .63452
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 14 to 21
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 959 944.3 .229 .229
r =5 21674 21743.9 .225 .454
r =6 77367 77311.8 .039 .493
p=1-exp(-SUM/2)= .21843
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 15 to 22
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 889 944.3 3.239 3.239
r =5 21563 21743.9 1.505 4.744
r =6 77548 77311.8 .722 5.465
p=1-exp(-SUM/2)= .93495
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 16 to 23
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 878 944.3 4.655 4.655
r =5 21991 21743.9 2.808 7.463
r =6 77131 77311.8 .423 7.886
p=1-exp(-SUM/2)= .98061
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 17 to 24
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 987 944.3 1.931 1.931
r =5 21967 21743.9 2.289 4.220
r =6 77046 77311.8 .914 5.134
p=1-exp(-SUM/2)= .92322
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 18 to 25
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 963 944.3 .370 .370
r =5 21960 21743.9 2.148 2.518
r =6 77077 77311.8 .713 3.231
p=1-exp(-SUM/2)= .80122
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 19 to 26
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 986 944.3 1.841 1.841
r =5 21861 21743.9 .631 2.472
r =6 77153 77311.8 .326 2.798
p=1-exp(-SUM/2)= .75318
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 20 to 27
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 922 944.3 .527 .527
r =5 21645 21743.9 .450 .977
r =6 77433 77311.8 .190 1.167
p=1-exp(-SUM/2)= .44192
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 21 to 28
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 896 944.3 2.471 2.471
r =5 21755 21743.9 .006 2.476
r =6 77349 77311.8 .018 2.494
p=1-exp(-SUM/2)= .71266
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 22 to 29
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 929 944.3 .248 .248
r =5 21755 21743.9 .006 .254
r =6 77316 77311.8 .000 .254
p=1-exp(-SUM/2)= .11919
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 23 to 30
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 898 944.3 2.270 2.270
r =5 21678 21743.9 .200 2.470
r =6 77424 77311.8 .163 2.633
p=1-exp(-SUM/2)= .73190
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 24 to 31
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 962 944.3 .332 .332
r =5 21686 21743.9 .154 .486
r =6 77352 77311.8 .021 .507
p=1-exp(-SUM/2)= .22384
Rank of a 6x8 binary matrix,
rows formed from eight bits of the RNG random.bin
b-rank test for bits 25 to 32
OBSERVED EXPECTED (O-E)^2/E SUM
r<=4 928 944.3 .281 .281
r =5 21890 21743.9 .982 1.263
r =6 77182 77311.8 .218 1.481
p=1-exp(-SUM/2)= .52313
TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices
These should be 25 uniform [0,1] random variables:
.954077 .628519 .212272 .220366 .613930
.436290 .214336 .695462 .945246 .982229
.050296 .651731 .634522 .218434 .934951
.980610 .923221 .801217 .753177 .441921
.712661 .119193 .731902 .223842 .523125
brank test summary for random.bin
The KS test for those 25 supposed UNI's yields
KS p-value= .838396
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE BITSTREAM TEST ::
:: The file under test is viewed as a stream of bits. Call them ::
:: b1,b2,... . Consider an alphabet with two "letters", 0 and 1 ::
:: and think of the stream of bits as a succession of 20-letter ::
:: "words", overlapping. Thus the first word is b1b2...b20, the ::
:: second is b2b3...b21, and so on. The bitstream test counts ::
:: the number of missing 20-letter (20-bit) words in a string of ::
:: 2^21 overlapping 20-letter words. There are 2^20 possible 20 ::
:: letter words. For a truly random string of 2^21+19 bits, the ::
:: number of missing words j should be (very close to) normally ::
:: distributed with mean 141,909 and sigma 428. Thus ::
:: (j-141909)/428 should be a standard normal variate (z score) ::
:: that leads to a uniform [0,1) p value. The test is repeated ::
:: twenty times. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
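A minimal sketch of the missing-word count and the z-score conversion described above. The demonstration run uses short words and a stand-in bit source so that it finishes quickly; the full test would use word_len=20 and num_words=2**21. Function names are hypothetical.

```python
import math
import random

def count_missing_words(bits, word_len, num_words):
    """Count the word_len-bit words that never appear among num_words
    overlapping windows. Consumes num_words + word_len - 1 bits."""
    mask = (1 << word_len) - 1
    seen = bytearray(1 << word_len)
    bit_iter = iter(bits)
    word = 0
    for _ in range(word_len - 1):           # prime the first window
        word = (word << 1) | next(bit_iter)
    for _ in range(num_words):
        word = ((word << 1) | next(bit_iter)) & mask
        seen[word] = 1
    return (1 << word_len) - sum(seen)

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rng = random.Random(3)                      # stand-in source, not random.bin
demo_bits = (rng.getrandbits(1) for _ in range(2048 + 9))
print("missing 10-bit words:", count_missing_words(demo_bits, 10, 2048))

# z-score for a count of 141995 missing words, using the mean and sigma above.
z = (141995 - 141909) / 428
print(f"z = {z:.2f}   p = {normal_cdf(z):.4f}")
```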
THE OVERLAPPING 20-tuples BITSTREAM TEST, 20 BITS PER WORD, N words
This test uses N=2^21 and samples the bitstream 20 times.
No. missing words should average 141909. with sigma=428.
---------------------------------------------------------
tst no 1: 141995 missing words, .20 sigmas from mean, p-value= .57933
tst no 2: 141834 missing words, -.18 sigmas from mean, p-value= .43015
tst no 3: 142571 missing words, 1.55 sigmas from mean, p-value= .93894
tst no 4: 142079 missing words, .40 sigmas from mean, p-value= .65411
tst no 5: 142241 missing words, .77 sigmas from mean, p-value= .78081
tst no 6: 141772 missing words, -.32 sigmas from mean, p-value= .37416
tst no 7: 141768 missing words, -.33 sigmas from mean, p-value= .37062
tst no 8: 141415 missing words, -1.15 sigmas from mean, p-value= .12405
tst no 9: 142323 missing words, .97 sigmas from mean, p-value= .83311
tst no 10: 142288 missing words, .88 sigmas from mean, p-value= .81185
tst no 11: 142330 missing words, .98 sigmas from mean, p-value= .83717
tst no 12: 142406 missing words, 1.16 sigmas from mean, p-value= .87707
tst no 13: 141693 missing words, -.51 sigmas from mean, p-value= .30663
tst no 14: 142104 missing words, .45 sigmas from mean, p-value= .67539
tst no 15: 142240 missing words, .77 sigmas from mean, p-value= .78012
tst no 16: 142493 missing words, 1.36 sigmas from mean, p-value= .91367
tst no 17: 141980 missing words, .17 sigmas from mean, p-value= .56558
tst no 18: 142054 missing words, .34 sigmas from mean, p-value= .63233
tst no 19: 141953 missing words, .10 sigmas from mean, p-value= .54064
tst no 20: 142645 missing words, 1.72 sigmas from mean, p-value= .95718
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: The tests OPSO, OQSO and DNA ::
:: OPSO means Overlapping-Pairs-Sparse-Occupancy ::
:: The OPSO test considers 2-letter words from an alphabet of ::
:: 1024 letters. Each letter is determined by a specified ten ::
:: bits from a 32-bit integer in the sequence to be tested. OPSO ::
:: generates 2^21 (overlapping) 2-letter words (from 2^21+1 ::
:: "keystrokes") and counts the number of missing words---that ::
:: is 2-letter words which do not appear in the entire sequence. ::
:: That count should be very close to normally distributed with ::
:: mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should ::
:: be a standard normal variable. The OPSO test takes 32 bits at ::
:: a time from the test file and uses a designated set of ten ::
:: consecutive bits. It then restarts the file for the next ::
:: designated 10 bits, and so on. ::
:: ::
:: OQSO means Overlapping-Quadruples-Sparse-Occupancy ::
:: The test OQSO is similar, except that it considers 4-letter ::
:: words from an alphabet of 32 letters, each letter determined ::
:: by a designated string of 5 consecutive bits from the test ::
:: file, elements of which are assumed 32-bit random integers. ::
:: The mean number of missing words in a sequence of 2^21 four- ::
:: letter words, (2^21+3 "keystrokes"), is again 141909, with ::
:: sigma = 295. The mean is based on theory; sigma comes from ::
:: extensive simulation. ::
:: ::
:: The DNA test considers an alphabet of 4 letters: C,G,A,T, ::
:: determined by two designated bits in the sequence of random ::
:: integers being tested. It considers 10-letter words, so that ::
:: as in OPSO and OQSO, there are 2^20 possible words, and the ::
:: mean number of missing words from a string of 2^21 (over- ::
:: lapping) 10-letter words (2^21+9 "keystrokes") is 141909. ::
:: The standard deviation sigma=339 was determined as for OQSO ::
:: by simulation. (Sigma for OPSO, 290, is the true value (to ::
:: three places), not determined by simulation.) ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
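The three tests differ only in letter width, word length and sigma, so one sketch can cover them. Bits are numbered from 1 at the left, as in the report. The function names and the `SIGMA` table layout are hypothetical, and the mean is taken as the rounded 141909 quoted above.

```python
def extract_letter(word32, first_bit, num_bits):
    """Bits first_bit .. first_bit+num_bits-1 of a 32-bit integer,
    counting bit 1 as the leftmost."""
    shift = 32 - (first_bit - 1) - num_bits
    return (word32 >> shift) & ((1 << num_bits) - 1)

def sparse_occupancy_missing(integers, first_bit, bits_per_letter,
                             letters_per_word, num_words):
    """Count words that never appear among num_words overlapping words.
    Consumes num_words + letters_per_word - 1 integers ("keystrokes")."""
    word_bits = bits_per_letter * letters_per_word
    mask = (1 << word_bits) - 1
    seen = bytearray(1 << word_bits)
    source = iter(integers)
    word = 0
    for _ in range(letters_per_word - 1):   # prime the first word
        word = (word << bits_per_letter) | extract_letter(
            next(source), first_bit, bits_per_letter)
    for _ in range(num_words):
        letter = extract_letter(next(source), first_bit, bits_per_letter)
        word = ((word << bits_per_letter) | letter) & mask
        seen[word] = 1
    return (1 << word_bits) - sum(seen)

MEAN = 141909.0
SIGMA = {"OPSO": 290.0, "OQSO": 295.0, "DNA": 339.0}

def z_score(test_name, missing_words):
    return (missing_words - MEAN) / SIGMA[test_name]

# OPSO would be sparse_occupancy_missing(ints, first_bit, 10, 2, 2**21),
# OQSO (ints, first_bit, 5, 4, 2**21) and DNA (ints, first_bit, 2, 10, 2**21).
print(round(z_score("OPSO", 142201), 3))
```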
OPSO test for generator random.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
mw z p
OPSO for random.bin using bits 23 to 32 142201 1.006 .8427
OPSO for random.bin using bits 22 to 31 142123 .737 .7694
OPSO for random.bin using bits 21 to 30 142244 1.154 .8758
OPSO for random.bin using bits 20 to 29 142086 .609 .7288
OPSO for random.bin using bits 19 to 28 141812 -.336 .3686
OPSO for random.bin using bits 18 to 27 141612 -1.025 .1526
OPSO for random.bin using bits 17 to 26 141383 -1.815 .0348
OPSO for random.bin using bits 16 to 25 141632 -.956 .1695
OPSO for random.bin using bits 15 to 24 142315 1.399 .9191
OPSO for random.bin using bits 14 to 23 142417 1.751 .9600
OPSO for random.bin using bits 13 to 22 141542 -1.267 .1026
OPSO for random.bin using bits 12 to 21 141941 .109 .5435
OPSO for random.bin using bits 11 to 20 142110 .692 .7555
OPSO for random.bin using bits 10 to 19 141738 -.591 .2773
OPSO for random.bin using bits 9 to 18 141905 -.015 .4940
OPSO for random.bin using bits 8 to 17 141667 -.836 .2017
OPSO for random.bin using bits 7 to 16 142092 .630 .7356
OPSO for random.bin using bits 6 to 15 141956 .161 .5639
OPSO for random.bin using bits 5 to 14 141855 -.187 .4257
OPSO for random.bin using bits 4 to 13 141735 -.601 .2739
OPSO for random.bin using bits 3 to 12 141840 -.239 .4055
OPSO for random.bin using bits 2 to 11 142051 .489 .6874
OPSO for random.bin using bits 1 to 10 141442 -1.611 .0535
OQSO test for generator random.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
mw z p
OQSO for random.bin using bits 28 to 32 141832 -.262 .3966
OQSO for random.bin using bits 27 to 31 142033 .419 .6625
OQSO for random.bin using bits 26 to 30 141688 -.750 .2265
OQSO for random.bin using bits 25 to 29 141687 -.754 .2255
OQSO for random.bin using bits 24 to 28 141827 -.279 .3901
OQSO for random.bin using bits 23 to 27 141717 -.652 .2572
OQSO for random.bin using bits 22 to 26 141682 -.771 .2205
OQSO for random.bin using bits 21 to 25 141735 -.591 .2773
OQSO for random.bin using bits 20 to 24 142287 1.280 .8998
OQSO for random.bin using bits 19 to 23 142117 .704 .7593
OQSO for random.bin using bits 18 to 22 141880 -.099 .4604
OQSO for random.bin using bits 17 to 21 141470 -1.489 .0682
OQSO for random.bin using bits 16 to 20 141703 -.699 .2421
OQSO for random.bin using bits 15 to 19 141866 -.147 .4416
OQSO for random.bin using bits 14 to 18 141690 -.743 .2286
OQSO for random.bin using bits 13 to 17 141980 .240 .5947
OQSO for random.bin using bits 12 to 16 141862 -.160 .4363
OQSO for random.bin using bits 11 to 15 141769 -.476 .3171
OQSO for random.bin using bits 10 to 14 141295 -2.082 .0186
OQSO for random.bin using bits 9 to 13 142181 .921 .8215
OQSO for random.bin using bits 8 to 12 141771 -.469 .3196
OQSO for random.bin using bits 7 to 11 141703 -.699 .2421
OQSO for random.bin using bits 6 to 10 141898 -.038 .4847
OQSO for random.bin using bits 5 to 9 141493 -1.411 .0791
OQSO for random.bin using bits 4 to 8 142413 1.707 .9561
OQSO for random.bin using bits 3 to 7 141883 -.089 .4644
OQSO for random.bin using bits 2 to 6 142038 .436 .6686
OQSO for random.bin using bits 1 to 5 141559 -1.188 .1175
DNA test for generator random.bin
Output: No. missing words (mw), equiv normal variate (z), p-value (p)
mw z p
DNA for random.bin using bits 31 to 32 142018 .321 .6257
DNA for random.bin using bits 30 to 31 141234 -1.992 .0232
DNA for random.bin using bits 29 to 30 141567 -1.010 .1563
DNA for random.bin using bits 28 to 29 141759 -.443 .3287
DNA for random.bin using bits 27 to 28 142071 .477 .6833
DNA for random.bin using bits 26 to 27 141978 .203 .5803
DNA for random.bin using bits 25 to 26 141910 .002 .5008
DNA for random.bin using bits 24 to 25 142161 .742 .7711
DNA for random.bin using bits 23 to 24 141999 .265 .6043
DNA for random.bin using bits 22 to 23 142021 .329 .6291
DNA for random.bin using bits 21 to 22 142059 .442 .6706
DNA for random.bin using bits 20 to 21 141822 -.258 .3984
DNA for random.bin using bits 19 to 20 141862 -.140 .4445
DNA for random.bin using bits 18 to 19 141934 .073 .5290
DNA for random.bin using bits 17 to 18 142627 2.117 .9829
DNA for random.bin using bits 16 to 17 141617 -.862 .1943
DNA for random.bin using bits 15 to 16 142001 .270 .6066
DNA for random.bin using bits 14 to 15 141527 -1.128 .1297
DNA for random.bin using bits 13 to 14 141582 -.966 .1671
DNA for random.bin using bits 12 to 13 142042 .391 .6522
DNA for random.bin using bits 11 to 12 141798 -.328 .3713
DNA for random.bin using bits 10 to 11 142672 2.250 .9878
DNA for random.bin using bits 9 to 10 141514 -1.166 .1218
DNA for random.bin using bits 8 to 9 142654 2.197 .9860
DNA for random.bin using bits 7 to 8 142509 1.769 .9615
DNA for random.bin using bits 6 to 7 141610 -.883 .1886
DNA for random.bin using bits 5 to 6 141894 -.045 .4820
DNA for random.bin using bits 4 to 5 142311 1.185 .8820
DNA for random.bin using bits 3 to 4 141909 -.001 .4996
DNA for random.bin using bits 2 to 3 141822 -.258 .3984
DNA for random.bin using bits 1 to 2 142160 .739 .7702
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the COUNT-THE-1's TEST on a stream of bytes. ::
:: Consider the file under test as a stream of bytes (four per ::
:: 32 bit integer). Each byte can contain from 0 to 8 1's, ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let ::
:: the stream of bytes provide a string of overlapping 5-letter ::
:: words, each "letter" taking values A,B,C,D,E. The letters are ::
:: determined by the number of 1's in a byte: 0,1,or 2 yield A, ::
:: 3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus ::
:: we have a monkey at a typewriter hitting five keys with vari- ::
:: ous probabilities (37,56,70,56,37 over 256). There are 5^5 ::
:: possible 5-letter words, and from a string of 2,560,000 ::
:: (overlapping) 5-letter words, counts are made on the frequencies ::
:: for each word. The quadratic form in the weak inverse of ::
:: the covariance matrix of the cell counts provides a chisquare ::
:: test: Q5-Q4, the difference of the naive Pearson sums of ::
:: (OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
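The letter mapping and its probabilities can be checked directly. The sketch below covers only the mapping and the overlapping word counts; the Q5-Q4 statistic is not reproduced, and the base-5 word index is an assumption of this sketch.

```python
from math import comb

LETTERS = "ABCDE"

def letter_for_byte(byte):
    """0 for A (0,1,2 ones), 1 for B (3), 2 for C (4), 3 for D (5), 4 for E (6,7,8)."""
    ones = bin(byte).count("1")
    return min(max(ones, 2), 6) - 2

# Number of byte values (out of 256) that map to each letter.
letter_weights = [0] * 5
for byte in range(256):
    letter_weights[letter_for_byte(byte)] += 1
print(dict(zip(LETTERS, letter_weights)))

def overlapping_word_counts(byte_values, word_len=5):
    """Count overlapping word_len-letter words, each indexed as a base-5 number."""
    modulus = 5 ** word_len
    counts = [0] * modulus
    word = 0
    for position, byte in enumerate(byte_values):
        word = (word * 5 + letter_for_byte(byte)) % modulus
        if position >= word_len - 1:
            counts[word] += 1
    return counts
```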
Test results for random.bin
Chi-square with 5^5-5^4=2500 d.of f. for sample size:2560000
chisquare equiv normal p-value
Results for COUNT-THE-1's in successive bytes:
byte stream for random.bin 2527.13 .384 .649382
byte stream for random.bin 2567.96 .961 .831761
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the COUNT-THE-1's TEST for specific bytes. ::
:: Consider the file under test as a stream of 32-bit integers. ::
:: From each integer, a specific byte is chosen, say the ::
:: leftmost: bits 1 to 8. Each byte can contain from 0 to 8 1's, ::
:: with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let ::
:: the specified bytes from successive integers provide a string ::
:: of (overlapping) 5-letter words, each "letter" taking values ::
:: A,B,C,D,E. The letters are determined by the number of 1's, ::
:: in that byte: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D, ::
:: and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter ::
:: hitting five keys with various probabilities: 37,56,70, ::
:: 56,37 over 256. There are 5^5 possible 5-letter words, and ::
:: from a string of 256,000 (overlapping) 5-letter words, counts ::
:: are made on the frequencies for each word. The quadratic form ::
:: in the weak inverse of the covariance matrix of the cell ::
:: counts provides a chisquare test: Q5-Q4, the difference of ::
:: the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5- ::
:: and 4-letter cell counts. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Chi-square with 5^5-5^4=2500 d.of f. for sample size: 256000
chisquare equiv normal p value
Results for COUNT-THE-1's in specified bytes:
bits 1 to 8 2482.01 -.254 .399574
bits 2 to 9 2342.85 -2.222 .013127
bits 3 to 10 2413.19 -1.228 .109785
bits 4 to 11 2505.58 .079 .531458
bits 5 to 12 2584.23 1.191 .883199
bits 6 to 13 2506.29 .089 .535469
bits 7 to 14 2502.65 .037 .514937
bits 8 to 15 2483.86 -.228 .409749
bits 9 to 16 2458.99 -.580 .280954
bits 10 to 17 2555.30 .782 .782930
bits 11 to 18 2549.75 .704 .759151
bits 12 to 19 2589.73 1.269 .897773
bits 13 to 20 2614.83 1.624 .947809
bits 14 to 21 2517.38 .246 .597064
bits 15 to 22 2513.09 .185 .573439
bits 16 to 23 2508.87 .125 .549933
bits 17 to 24 2381.56 -1.675 .046972
bits 18 to 25 2498.22 -.025 .489966
bits 19 to 26 2495.24 -.067 .473146
bits 20 to 27 2521.61 .306 .620075
bits 21 to 28 2618.16 1.671 .952638
bits 22 to 29 2557.98 .820 .793892
bits 23 to 30 2449.60 -.713 .237995
bits 24 to 31 2477.11 -.324 .373098
bits 25 to 32 2573.74 1.043 .851478
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THIS IS A PARKING LOT TEST ::
:: In a square of side 100, randomly "park" a car---a circle of ::
:: radius 1. Then try to park a 2nd, a 3rd, and so on, each ::
:: time parking "by ear". That is, if an attempt to park a car ::
:: causes a crash with one already parked, try again at a new ::
:: random location. (To avoid path problems, consider parking ::
:: helicopters rather than cars.) Each attempt leads to either ::
:: a crash or a success, the latter followed by an increment to ::
:: the list of cars already parked. If we plot n, the number of ::
:: attempts, versus k, the number successfully parked, we get a ::
:: curve that should be similar to those provided by a perfect ::
:: random number generator. Theory for the behavior of such a ::
:: random curve seems beyond reach, and as graphics displays are ::
:: not available for this battery of tests, a simple ::
:: characterization of the random experiment is used: k, the ::
:: number of cars successfully parked after n=12,000 attempts. ::
:: Simulation shows that k should average 3523 with sigma 21.9 ::
:: and is very close to normally distributed. Thus ::
:: (k-3523)/21.9 should be a standard normal variable, which, ::
:: converted to a uniform variable, provides input to a KSTEST ::
:: based on a sample of 10. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
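A minimal sketch of the parking experiment. The crash rule used here (both coordinate offsets at most `min_gap`) is an assumption of this sketch; the description speaks of circles, which could also be read as a Euclidean distance rule, and the two rules give different counts. No claim is made that this reproduces the quoted mean of 3523.

```python
import random

def park_cars(num_attempts, side=100.0, min_gap=1.0, rng=None):
    """Make num_attempts random parking attempts in a side x side square
    and return the positions that were parked without a crash."""
    rng = rng or random.Random()
    grid = {}       # grid cell -> positions already parked in that cell
    parked = []
    for _ in range(num_attempts):
        x, y = rng.uniform(0.0, side), rng.uniform(0.0, side)
        cx, cy = int(x // min_gap), int(y // min_gap)
        # Only the 3x3 block of neighbouring cells can hold a conflicting car.
        crash = any(
            abs(px - x) <= min_gap and abs(py - y) <= min_gap   # assumed rule
            for gx in (cx - 1, cx, cx + 1)
            for gy in (cy - 1, cy, cy + 1)
            for px, py in grid.get((gx, gy), ())
        )
        if not crash:
            grid.setdefault((cx, cy), []).append((x, y))
            parked.append((x, y))
    return parked

k = len(park_cars(12_000, rng=random.Random(2016)))   # stand-in source
print("parked:", k, " z-score:", round((k - 3523) / 21.9, 3))
```

The grid of cells only speeds up the search for conflicts; it does not change which attempts succeed.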
CDPARK: result of ten tests on file random.bin
Of 12,000 tries, the average no. of successes
should be 3523 with sigma=21.9
Successes: 3530 z-score: .320 p-value: .625377
Successes: 3509 z-score: -.639 p-value: .261324
Successes: 3521 z-score: -.091 p-value: .463618
Successes: 3466 z-score: -2.603 p-value: .004624
Successes: 3522 z-score: -.046 p-value: .481790
Successes: 3517 z-score: -.274 p-value: .392053
Successes: 3548 z-score: 1.142 p-value: .873180
Successes: 3526 z-score: .137 p-value: .554479
Successes: 3551 z-score: 1.279 p-value: .899470
Successes: 3533 z-score: .457 p-value: .676028
square size avg. no. parked sample sigma
100. 3522.300 22.457
KSTEST for the above 10: p= .266079
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE MINIMUM DISTANCE TEST ::
:: It does this 100 times: choose n=8000 random points in a ::
:: square of side 10000. Find d, the minimum distance between ::
:: the (n^2-n)/2 pairs of points. If the points are truly inde- ::
:: pendent uniform, then d^2, the square of the minimum distance ::
:: should be (very close to) exponentially distributed with mean ::
:: .995 . Thus 1-exp(-d^2/.995) should be uniform on [0,1) and ::
:: a KSTEST on the resulting 100 values serves as a test of uni- ::
:: formity for random points in the square. Test numbers=0 mod 5 ::
:: are printed but the KSTEST is based on the full set of 100 ::
:: random choices of 8000 points in the 10000x10000 square. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
This is the MINIMUM DISTANCE test
for random integers in the file random.bin
Sample no. d^2 avg equiv uni
5 1.9790 2.1369 .863166
10 .5505 1.4008 .424959
15 .3998 1.2303 .330862
20 1.5178 1.2251 .782471
25 2.7155 1.3062 .934722
30 2.0228 1.1854 .869057
35 .4750 1.1153 .379576
40 1.3005 1.1325 .729392
45 .4269 1.0846 .348848
50 1.5156 1.1285 .781986
55 .6288 1.1075 .468451
60 2.0025 1.1368 .866357
65 1.4711 1.1256 .772020
70 .7313 1.0652 .520493
75 .3194 1.0348 .274557
80 2.3196 1.0481 .902824
85 3.7112 1.0580 .976004
90 .2516 1.0562 .223399
95 .6092 1.0361 .457874
100 1.4253 1.0534 .761275
MINIMUM DISTANCE TEST for random.bin
Result of KS test on 20 transformed mindist^2's:
p-value= .493300
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
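The "equiv uni" column in the MINIMUM DISTANCE output above is the transform 1-exp(-d^2/.995) from the description. The sketch below shows that transform together with a brute-force search for the smallest squared distance; the function names are hypothetical, and the brute-force loop is an assumption made for clarity (DIEHARD itself does not necessarily search all pairs this way, and for 8000 points this loop would be slow in pure Python).

```python
import math
from itertools import combinations
from typing import Iterable, Tuple

MEAN_D2 = 0.995  # mean of d^2 quoted in the test description


def min_squared_distance(points: Iterable[Tuple[float, float]]) -> float:
    """Smallest squared distance over all (n^2-n)/2 pairs of points."""
    return min(
        (x1 - x2) ** 2 + (y1 - y2) ** 2
        for (x1, y1), (x2, y2) in combinations(list(points), 2)
    )


def d2_to_uniform(d2: float, mean: float = MEAN_D2) -> float:
    """Exponential CDF: maps d^2 to a value that should be uniform on [0,1)."""
    return 1.0 - math.exp(-d2 / mean)


if __name__ == "__main__":
    # d^2 values taken from samples 5 and 10 of the table above.
    for d2 in (1.9790, 0.5505):
        print(f"d^2 = {d2:.4f}  equiv uni = {d2_to_uniform(d2):.6f}")
```

Because the table prints d^2 to only four decimals, the recomputed uniforms can differ from the printed ones in the fifth decimal place.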
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: THE 3DSPHERES TEST ::
:: Choose 4000 random points in a cube of edge 1000. At each ::
:: point, center a sphere large enough to reach the next closest ::
:: point. Then the volume of the smallest such sphere is (very ::
:: close to) exponentially distributed with mean 120pi/3. Thus ::
:: the radius cubed is exponential with mean 30. (The mean is ::
:: obtained by extensive simulation). The 3DSPHERES test gener- ::
:: ates 4000 such spheres 20 times. Each min radius cubed leads ::
:: to a uniform variable by means of 1-exp(-r^3/30.), then a ::
:: KSTEST is done on the 20 p-values. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
The 3DSPHERES test for file random.bin
sample no: 1 r^3= 2.839 p-value= .09029
sample no: 2 r^3= 49.939 p-value= .81074
sample no: 3 r^3= 1.968 p-value= .06348
sample no: 4 r^3= 7.360 p-value= .21756
sample no: 5 r^3= 22.772 p-value= .53189
sample no: 6 r^3= 24.316 p-value= .55537
sample no: 7 r^3= 45.656 p-value= .78170
sample no: 8 r^3= 30.042 p-value= .63263
sample no: 9 r^3= 5.317 p-value= .16243
sample no: 10 r^3= 6.664 p-value= .19918
sample no: 11 r^3= 41.088 p-value= .74579
sample no: 12 r^3= 5.903 p-value= .17863
sample no: 13 r^3= 48.696 p-value= .80273
sample no: 14 r^3= 22.822 p-value= .53268
sample no: 15 r^3= 6.985 p-value= .20771
sample no: 16 r^3= 29.169 p-value= .62179
sample no: 17 r^3= 19.812 p-value= .48336
sample no: 18 r^3= 9.162 p-value= .26317
sample no: 19 r^3= 26.759 p-value= .59015
sample no: 20 r^3= 47.072 p-value= .79176
A KS test is applied to those 20 p-values.
---------------------------------------------------------
3DSPHERES test for file random.bin p-value= .492809
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
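Each p-value in the 3DSPHERES table above comes from 1-exp(-r^3/30.), and the two means in the description are linked by the sphere volume formula V=(4/3)*pi*r^3, so a mean of 30 for r^3 corresponds to a mean volume of 120pi/3. A small sketch of both relations, with illustrative function names that are not taken from DIEHARD:

```python
import math

MEAN_R3 = 30.0  # mean of the minimum radius cubed, quoted in the description


def r3_to_uniform(r3: float, mean: float = MEAN_R3) -> float:
    """Exponential CDF applied to the minimum radius cubed."""
    return 1.0 - math.exp(-r3 / mean)


def sphere_volume_from_r3(r3: float) -> float:
    """Volume of a sphere whose radius cubed is r3."""
    return 4.0 / 3.0 * math.pi * r3


if __name__ == "__main__":
    # r^3 values taken from samples 1 and 2 of the table above.
    for r3 in (2.839, 49.939):
        print(f"r^3 = {r3:7.3f}  p-value = {r3_to_uniform(r3):.5f}")
    # Mean volume implied by a mean r^3 of 30; equals 120*pi/3.
    print(sphere_volume_from_r3(MEAN_R3), 120.0 * math.pi / 3.0)
```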
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the SQUEEZE test ::
:: Random integers are floated to get uniforms on [0,1). Start- ::
:: ing with k=2^31-1=2147483647, the test finds j, the number of ::
:: iterations necessary to reduce k to 1, using the reduction ::
:: k=ceiling(k*U), with U provided by floating integers from ::
:: the file being tested. Such j's are found 100,000 times, ::
:: then counts for the number of times j was <=6,7,...,47,>=48 ::
:: are used to provide a chi-square test for cell frequencies. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
RESULTS OF SQUEEZE TEST FOR random.bin
Table of standardized frequency counts
(obs-exp)/sqrt(exp)
for j taking values <=6,7,8,...,47,>=48:
-.8 .1 -.8 -.8 .5 .6
1.1 -1.0 .4 .1 -.4 -1.4
1.1 .6 1.2 .3 -1.6 .8
.8 .9 -1.0 -.9 -.3 -2.1
1.8 -1.6 .8 .7 -1.0 1.4
-2.1 2.3 -.3 -1.2 -.2 -.5
.0 .5 .1 -.1 1.6 -1.0
-.1
Chi-square with 42 degrees of freedom: 46.312
z-score= .470 p-value= .701241
______________________________________________________________
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
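The reduction k=ceiling(k*U) described for the squeeze test can be sketched as a short loop. This is an illustration under stated assumptions, not DIEHARD's code: the uniforms here come from Python's `random` module rather than from a file, the function names are hypothetical, and the loop stops at k<=1 so that an exact U=0 cannot hang it. The second helper shows that the printed z-score is consistent with (chi-square - dof)/sqrt(2*dof).

```python
import math
import random
from typing import Iterable

K_START = 2147483647  # starting value of k quoted in the description


def squeeze_iterations(uniforms: Iterable[float], k: int = K_START) -> int:
    """Count the iterations of k = ceil(k*U) needed to bring k down to 1."""
    j = 0
    stream = iter(uniforms)
    while k > 1:
        k = math.ceil(k * next(stream))
        j += 1
    return j


def chi_square_z(chi2: float, dof: int) -> float:
    """Normal approximation z-score for a chi-square statistic."""
    return (chi2 - dof) / math.sqrt(2.0 * dof)


if __name__ == "__main__":
    rng = random.Random(12345)          # arbitrary seed, for repeatability
    stream = iter(rng.random, None)     # endless stream of uniforms on [0,1)
    js = [squeeze_iterations(stream) for _ in range(1000)]
    print("smallest j:", min(js), " largest j:", max(js))
    print("z-score for chi-square 46.312 on 42 dof:",
          round(chi_square_z(46.312, 42), 3))
```

With every U fixed at 0.5 the loop halves k each time and takes 31 iterations, which is a convenient sanity check on the ceiling step.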
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: The OVERLAPPING SUMS test ::
:: Integers are floated to get a sequence U(1),U(2),... of uni- ::
:: form [0,1) variables. Then overlapping sums, ::
:: S(1)=U(1)+...+U(100), S(2)=U(2)+...+U(101),... are formed. ::
:: The S's are virtually normal with a certain covariance mat- ::
:: rix. A linear transformation of the S's converts them to a ::
:: sequence of independent standard normals, which are converted ::
:: to uniform variables for a KSTEST. The p-values from ten ::
:: KSTESTs are given still another KSTEST. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Test no. 1 p-value .652326
Test no. 2 p-value .783967
Test no. 3 p-value .067778
Test no. 4 p-value .649611
Test no. 5 p-value .692736
Test no. 6 p-value .598217
Test no. 7 p-value .268917
Test no. 8 p-value .863340
Test no. 9 p-value .084200
Test no. 10 p-value .532075
Results of the OSUM test for random.bin
KSTEST on the above 10 p-values: .274531
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
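The overlapping sums in the description are a sliding window of 100 consecutive uniforms. A minimal sketch of forming them follows; it covers only that first step, and the linear transformation to independent normals and the KSTESTs are not reproduced. The name `overlapping_sums` and the running-update approach are choices made here for illustration.

```python
from typing import List, Sequence


def overlapping_sums(u: Sequence[float], window: int = 100) -> List[float]:
    """Return S(1), S(2), ... where S(i) is the sum of `window` consecutive values."""
    if window <= 0 or len(u) < window:
        return []
    s = sum(u[:window])
    sums = [s]
    for i in range(window, len(u)):
        # Slide the window by one: add the new value, drop the oldest one.
        s += u[i] - u[i - window]
        sums.append(s)
    return sums


if __name__ == "__main__":
    print(overlapping_sums([1.0, 2.0, 3.0, 4.0, 5.0], window=3))
```

The running update keeps the cost linear in the sequence length, at the price of a little accumulated floating-point error on very long sequences; re-summing each window is slower but avoids that.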
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the RUNS test. It counts runs up, and runs down, ::
:: in a sequence of uniform [0,1) variables, obtained by float- ::
:: ing the 32-bit integers in the specified file. This example ::
:: shows how runs are counted: .123,.357,.789,.425,.224,.416,.95::
:: contains an up-run of length 3, a down-run of length 2 and an ::
:: up-run of (at least) 2, depending on the next values. The ::
:: covariance matrices for the runs-up and runs-down are well ::
:: known, leading to chisquare tests for quadratic forms in the ::
:: weak inverses of the covariance matrices. Runs are counted ::
:: for sequences of length 10,000. This is done ten times. Then ::
:: repeated. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
The RUNS test for file random.bin
Up and down runs in a sample of 10000
_________________________________________________
Run test for random.bin :
runs up; ks test for 10 p's: .074944
runs down; ks test for 10 p's: .396186
Run test for random.bin :
runs up; ks test for 10 p's: .825835
runs down; ks test for 10 p's: .742302
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
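The run-counting example in the description (.123,.357,.789,.425,.224,.416,.95) can be reproduced with a small helper that cuts a sequence into consecutive up-runs and down-runs. This is only a sketch of the counting convention shown in that example; it is an assumption that the same segmentation applies in general, the name `split_runs` is hypothetical, and the covariance-matrix chi-square step is not covered.

```python
from typing import List, Sequence, Tuple


def split_runs(xs: Sequence[float]) -> List[Tuple[str, int]]:
    """Split xs into consecutive runs, returned as (direction, length) pairs."""
    runs: List[Tuple[str, int]] = []
    i, n = 0, len(xs)
    while i < n:
        j = i + 1
        if j >= n:
            # A lone trailing value has no direction yet.
            runs.append(("single", 1))
            break
        up = xs[j] > xs[i]          # direction is set by the first step
        j += 1
        while j < n and (xs[j] > xs[j - 1]) == up:
            j += 1
        runs.append(("up" if up else "down", j - i))
        i = j                        # the value that broke the run starts the next
    return runs


if __name__ == "__main__":
    example = [.123, .357, .789, .425, .224, .416, .95]
    print(split_runs(example))
```

For the example sequence this gives an up-run of 3, a down-run of 2 and an up-run of 2, the last of which could still grow depending on the next values.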
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: This is the CRAPS TEST. It plays 200,000 games of craps, finds::
:: the number of wins and the number of throws necessary to end ::
:: each game. The number of wins should be (very close to) a ::
:: normal with mean 200000p and variance 200000p(1-p), with ::
:: p=244/495. Throws necessary to complete the game can vary ::
:: from 1 to infinity, but counts for all>21 are lumped with 21. ::
:: A chi-square test is made on the no.-of-throws cell counts. ::
:: Each 32-bit integer from the test file provides the value for ::
:: the throw of a die, by floating to [0,1), multiplying by 6 ::
:: and taking 1 plus the integer part of the result. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Results of craps test for random.bin
No. of wins: Observed Expected
98531 98585.86
98531= No. of wins, z-score= -.245 pvalue= .40309
Analysis of Throws-per-Game:
Chisq= 15.90 for 20 degrees of freedom, p= .27739
Throws Observed Expected Chisq Sum
1 66488 66666.7 .479 .479
2 37588 37654.3 .117 .596
3 27256 26954.7 3.367 3.963
4 19320 19313.5 .002 3.965
5 13675 13851.4 2.247 6.212
6 9966 9943.5 .051 6.263
7 7095 7145.0 .350 6.613
8 5224 5139.1 1.404 8.017
9 3726 3699.9 .185 8.201
10 2666 2666.3 .000 8.201
11 1869 1923.3 1.535 9.736
12 1377 1388.7 .099 9.835
13 1026 1003.7 .495 10.330
14 767 726.1 2.299 12.629
15 550 525.8 1.110 13.740
16 382 381.2 .002 13.741
17 276 276.5 .001 13.742
18 210 200.8 .419 14.161
19 149 146.0 .062 14.223
20 94 106.2 1.405 15.628
21 296 287.1 .275 15.903
SUMMARY FOR random.bin
p-value for no. of wins: .403088
p-value for throws/game: .277389
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
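Two pieces of the craps description are easy to check by hand: the normal approximation for the number of wins (mean 200000p, variance 200000p(1-p), p=244/495) and the mapping from a 32-bit integer to a die throw. The sketch below shows both under the assumption that "floating to [0,1)" means dividing the unsigned integer by 2^32; the function names are illustrative, and the chi-square test on throws per game is not reproduced.

```python
import math
from typing import Tuple

GAMES = 200_000
P_WIN = 244 / 495  # probability of winning a game of craps


def wins_z_and_p(wins: int, games: int = GAMES, p: float = P_WIN) -> Tuple[float, float]:
    """Return (z-score, p-value) for the observed number of wins."""
    mean = games * p
    sd = math.sqrt(games * p * (1.0 - p))
    z = (wins - mean) / sd
    return z, 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def die_from_uint32(x: int) -> int:
    """Map an unsigned 32-bit integer to a die face 1..6."""
    u = x / 2**32            # assumed meaning of "floating to [0,1)"
    return 1 + int(6 * u)


if __name__ == "__main__":
    print(f"expected wins: {GAMES * P_WIN:.2f}")
    z, p = wins_z_and_p(98531)   # observed wins from the output above
    print(f"z-score = {z:.3f}  pvalue = {p:.5f}")
```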
Results of DIEHARD battery of tests sent to file random.log