# Dumb Guy's Statistical Analysis of the Diehard RNG Suite's Craps Test
## Test Description
Craps is a series of rolls of 2 dice. You win on the first roll if
the sum of the dice is 7 or 11, and lose if it is 2, 3, or 12. Any
other sum on the first roll becomes the "point" you have to make on
subsequent rolls: you keep rolling until you either hit the point
again (you win) or roll a 7 (you lose).
The total number of wins should be approximately normally
distributed, and the number of rolls it takes to end a game should
follow the probabilities worked out for fair dice. Both are tested
statistically, and the resulting p-values are reported.
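The rules above can be sketched in a few lines of Python. This is only an illustration, not the Diehard implementation: `play_game`, `two_dice`, and the `roll` argument (any callable returning the sum of two dice) are names made up for this sketch, and the dice here come from Python's own `random` module rather than from a file of random bits.

```python
import random


def play_game(roll):
    """Play one game of craps and return (won, number_of_throws).

    `roll` is any zero-argument callable that returns the sum of two dice.
    """
    throws = 1
    first = roll()
    if first in (7, 11):       # immediate win on the first roll
        return True, throws
    if first in (2, 3, 12):    # immediate loss on the first roll
        return False, throws
    point = first
    while True:                # keep rolling until the point or a 7 shows up
        throws += 1
        total = roll()
        if total == point:
            return True, throws
        if total == 7:
            return False, throws


def two_dice(rng=random):
    """Sum of two fair dice, drawn from Python's RNG (assumption of this sketch)."""
    return rng.randint(1, 6) + rng.randint(1, 6)
```

Calling `play_game(two_dice)` plays one game; repeating it 200,000 times and tallying both return values gives the two quantities the test looks at.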
## Example Program Output
```
|-------------------------------------------------------------|
|This the CRAPS TEST. It plays 200,000 games of craps, counts|
|the number of wins and the number of throws necessary to end |
|each game. The number of wins should be (very close to) a |
|normal with mean 200000p and variance 200000p(1-p), and |
|p=244/495. Throws necessary to complete the game can vary |
|from 1 to infinity, but counts for all>21 are lumped with 21.|
|A chi-square test is made on the no.-of-throws cell counts. |
|Each 32-bit integer from the test file provides the value for|
|the throw of a die, by floating to [0,1), multiplying by 6 |
|and taking 1 plus the integer part of the result. |
|-------------------------------------------------------------|
RESULTS OF CRAPS TEST FOR bits.22
No. of wins:  Observed  Expected
              98332     98585.858586
z-score=-1.135, pvalue=0.87190
Analysis of Throws-per-Game:
Throws  Observed  Expected   Chisq  Sum of (O-E)^2/E
     1     66910   66666.7   0.888   0.888
     2     37869   37654.3   1.224   2.112
     3     26834   26954.7   0.541   2.653
     4     19219   19313.5   0.462   3.115
     5     13753   13851.4   0.699   3.814
     6      9788    9943.5   2.433   6.247
     7      7137    7145.0   0.009   6.256
     8      5249    5139.1   2.351   8.608
     9      3604    3699.9   2.484  11.092
    10      2634    2666.3   0.391  11.483
    11      1968    1923.3   1.038  12.520
    12      1399    1388.7   0.076  12.596
    13      1027    1003.7   0.540  13.136
    14       712     726.1   0.275  13.412
    15       515     525.8   0.223  13.635
    16       335     381.2   5.588  19.223
    17       269     276.5   0.206  19.429
    18       216     200.8   1.146  20.575
    19       152     146.0   0.248  20.822
    20       106     106.2   0.000  20.823
    21       304     287.1   0.993  21.816
Chisq= 21.82 for 20 degrees of freedom, p= 0.35058
SUMMARY of craptest on bits.22
p-value for no. of wins: 0.871897
p-value for throws/game: 0.350580
_____________________________________________________________
```
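The last sentence of the program's banner describes how each 32-bit integer becomes a die throw. A minimal sketch of that mapping, assuming the integers are unsigned (the function name `die_from_uint32` is made up here):

```python
def die_from_uint32(value):
    """Map an unsigned 32-bit integer to a die face in 1..6."""
    if not 0 <= value < 2**32:
        raise ValueError("expected an unsigned 32-bit integer")
    u = value / 2**32        # "floating to [0,1)"
    return 1 + int(u * 6)    # "1 plus the integer part" of 6*u
```

For example, `die_from_uint32(0)` gives 1 and `die_from_uint32(2**32 - 1)` gives 6, so all six faces are reachable and nothing outside 1..6 can come out.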
## Testing the Distribution of the Number of Wins
The Binomial Distribution models the total number of wins that should
occur in 200,000 games. The total probability of winning a game of
craps, based on simple probability and assuming true dice, is
`p=244/495` ([source](http://mathforum.org/library/drmath/view/56534.html)).
Since the number of games is large and the probability of winning is
not far from 0.5, the normal distribution should approximate the true
distribution of wins fairly well ([source](https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation)).
With normality assumed, the expected (population) mean is
`200000p = 98585.86` and the standard deviation is
`sqrt(200000p(1-p))`, about 223.6.
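As a sanity check, the value `p=244/495` and the resulting mean and standard deviation can be reproduced with exact fractions. This is a sketch under the fair-dice assumption; the variable names are mine, not Diehard's. It uses the fact that once a point is set, only the point or a 7 ends the game, so the chance of making the point is `P(point) / (P(point) + P(7))`.

```python
import math
from fractions import Fraction

GAMES = 200_000

# Probability of each two-dice sum: 6 - |sum - 7| ways out of 36.
P = {total: Fraction(6 - abs(total - 7), 36) for total in range(2, 13)}

# Win immediately on 7 or 11 ...
p_win = P[7] + P[11]
# ... or set a point and hit it again before rolling a 7.
for point in (4, 5, 6, 8, 9, 10):
    p_win += P[point] * P[point] / (P[point] + P[7])

mean = float(GAMES * p_win)
std_dev = math.sqrt(float(GAMES * p_win * (1 - p_win)))

print(p_win)                      # exact winning probability as a fraction
print(round(mean, 6), round(std_dev, 3))
```

The mean printed here should agree with the "Expected" number of wins in the program output.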
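The z-test measures how far the observed number of wins lies from the
expected mean, in units of the standard deviation. It should stay
small if the RNG is unbiased (random). Here
`z = (98332 - 98585.86) / 223.58`, about -1.135. The reported p-value
of 0.87 matches `1 - Φ(-1.135)`, the probability that a standard
normal value exceeds the observed z. That makes it a one-sided
(upper-tail) value: a p-value very close to 0 would mean far too many
wins, and one very close to 1 far too few. 0.87 is unremarkable, which
is what you want to see. In typical statistical analysis, you would
use a two-sided test and conclude that the outcome is not consistent
with fair dice if the p-value were less than 0.05 (called the
significance level, where p<0.05 is called "statistically
significant"). The equivalent two-sided p-value here is about 0.26.
`p=0.05` means you'd expect to see an outcome this wild or wilder from
a fair game about 5% of the time.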
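The z-score and p-value in the example output can be recomputed with nothing but the standard library. A sketch, assuming the program reports the upper-tail probability `1 - Φ(z)` (that convention is inferred from the numbers in the output, not from the Diehard source); the two-sided value is added for comparison with the usual 0.05 rule.

```python
import math

GAMES = 200_000
P_WIN = 244 / 495
OBSERVED_WINS = 98332          # from the example output

mean = GAMES * P_WIN
std_dev = math.sqrt(GAMES * P_WIN * (1 - P_WIN))
z = (OBSERVED_WINS - mean) / std_dev

# Upper-tail probability 1 - Phi(z), via the complementary error function.
p_upper = 0.5 * math.erfc(z / math.sqrt(2))
# Two-sided probability of a deviation at least this large in either direction.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z={z:.3f}, upper-tail p={p_upper:.5f}, two-sided p={p_two_sided:.3f}")
```

Under this convention both very small and very large reported p-values are suspicious, since they correspond to too many and too few wins respectively.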
## Testing the Distribution of Throws per Game
The number of throws indicates how many rolls it took for the game to
end, *whether the game was won or lost*. Assuming true dice, the
probability of a game ending on throw N can be calculated using
simple probability.
For example, the probability of ending a game on the first roll is
```
P(7 or 11 or 2 or 3 or 12) = 6/36 + 2/36 + 1/36 + 2/36 + 1/36 = 12/36 = 1/3
```
With 200,000 games, we expect `200000/3=66666.7` games to finish on
the first roll. Again, see [here](http://mathforum.org/library/drmath/view/56534.html).
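The same reasoning extends to every row of the "Expected" column. A game ends on throw n (for n of 2 or more) if a point was set on the first roll, the next n-2 rolls were neither the point nor a 7, and roll n was one of the two. The sketch below assumes fair dice and lumps everything from 21 throws upward into the last cell, as the program's banner describes; the function and variable names are mine.

```python
from fractions import Fraction

GAMES = 200_000
LAST_CELL = 21                 # counts for 21 or more throws are lumped together

# Probability of each two-dice sum: 6 - |sum - 7| ways out of 36.
P = {total: Fraction(6 - abs(total - 7), 36) for total in range(2, 13)}
POINTS = (4, 5, 6, 8, 9, 10)


def prob_game_ends_on(n):
    """Probability that a game (won or lost) ends on exactly throw n."""
    if n == 1:
        return P[2] + P[3] + P[7] + P[11] + P[12]
    total = Fraction(0)
    for point in POINTS:
        stop = P[point] + P[7]                 # a roll that ends the game
        total += P[point] * (1 - stop) ** (n - 2) * stop
    return total


probs = [prob_game_ends_on(n) for n in range(1, LAST_CELL)]
probs.append(1 - sum(probs))                   # the lumped ">= 21" cell
expected = [float(GAMES * p) for p in probs]

for throws, count in enumerate(expected, start=1):
    print(f"{throws:2d} {count:10.1f}")
```

The printed values should line up with the "Expected" column of the example output, starting with 66666.7 and 37654.3.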
The Chi-Squared test measures the "goodness of fit" of the actual
results to the expected results based on the above calculated
probabilities ([source](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test)).
Each "Throws" row contributes `(O-E)^2/E` to the test statistic
(e.g. 0.888 for Throws=1), and those contributions are summed to give
one chi-squared statistic for the entire set of 200,000 games
([source](https://en.wikipedia.org/wiki/Normal_distribution#Combination_of_two_or_more_independent_random_variables)).
With 21 rows there are 20 degrees of freedom.
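If doing this by hand, you would take the chi-squared test statistic
and the degrees of freedom, look them up in a chi-squared table, and
get a p-value. The program reports it directly (p=0.35 here), as the
probability of seeing a chi-squared value at least this large from
fair dice. Again, p<0.05 indicates a problem.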
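Instead of a table lookup, the statistic and its p-value can be recomputed from the Observed and Expected columns of the example output. This sketch uses only the standard library: for an even number of degrees of freedom the upper-tail chi-squared probability has a closed form (a finite sum), which is what the hypothetical helper `chi2_upper_tail_even_df` implements. Because the expected counts are copied from the output rounded to one decimal, the result should land close to, but not exactly on, the reported 21.82 and 0.35.

```python
import math

# Observed and Expected columns copied from the example output above.
observed = [66910, 37869, 26834, 19219, 13753, 9788, 7137, 5249, 3604, 2634,
            1968, 1399, 1027, 712, 515, 335, 269, 216, 152, 106, 304]
expected = [66666.7, 37654.3, 26954.7, 19313.5, 13851.4, 9943.5, 7145.0,
            5139.1, 3699.9, 2666.3, 1923.3, 1388.7, 1003.7, 726.1, 525.8,
            381.2, 276.5, 200.8, 146.0, 106.2, 287.1]


def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-squared variable with an even number of df.

    Uses exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term = 1.0      # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1          # 21 cells, so 20 degrees of freedom
p_value = chi2_upper_tail_even_df(chisq, df)

print(f"Chisq={chisq:.2f} for {df} degrees of freedom, p={p_value:.3f}")
```

A library routine such as SciPy's chi-squared survival function would do the same job for any df; the closed form is used here only to keep the example dependency-free.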