drfloob/Analysis of the Craps Test from the Diehard test suite

# Dumb Guy's Statistical Analysis of the Diehard RNG Suite's Craps Test

## Test Description

Craps is a series of rolls of 2 dice, where you win on the first roll
when the sum of the dice is 7 or 11, or lose when it's 2, 3, or 12. If
you roll any other number on the first roll, that number is the point
you have to make on subsequent rolls. You keep rolling until you either
hit the point again, in which case you win, or roll a 7, in which case
you lose.

The total number of wins should be approximately normally
distributed, and the number of games that end after each number of
rolls should match the counts predicted by simple probability. Both
are tested statistically, and the resulting p-values are given.
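
To make the rules concrete, here is a minimal sketch of one game in Python. It is not the Diehard source code: the function names are made up for illustration, the die conversion follows the wording of the program banner in the example output, and Python's `random` module stands in for the file of 32-bit integers.

```python
import random

def die_from_uint32(value):
    """Map a 32-bit unsigned integer to a die face 1..6 (per the banner's description)."""
    u = value / 2**32          # float in [0, 1)
    return 1 + int(u * 6)      # 1 plus the integer part of 6*u

def play_craps(next_uint32):
    """Play one game; next_uint32 is any callable returning a 32-bit integer.

    Returns (won, throws), where throws counts rolls of the pair of dice.
    """
    def throw():
        return die_from_uint32(next_uint32()) + die_from_uint32(next_uint32())

    first = throw()
    throws = 1
    if first in (7, 11):
        return True, throws
    if first in (2, 3, 12):
        return False, throws
    point = first
    while True:
        total = throw()
        throws += 1
        if total == point:
            return True, throws
        if total == 7:
            return False, throws

# Stand-in random source (an assumption; the real test reads integers from a file).
rng = random.Random(0)
results = [play_craps(lambda: rng.getrandbits(32)) for _ in range(20_000)]
wins = sum(won for won, _ in results)
print(wins / len(results))   # should land somewhere near 244/495
```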


## Example Program Output

```
	|-------------------------------------------------------------|
	|This the CRAPS TEST.  It plays 200,000 games of craps, counts|
	|the number of wins and the number of throws necessary to end |
	|each game.  The number of wins should be (very close to) a   |
	|normal with mean 200000p and variance 200000p(1-p), and      |
	|p=244/495.  Throws necessary to complete the game can vary   |
	|from 1 to infinity, but counts for all>21 are lumped with 21.|
	|A chi-square test is made on the no.-of-throws cell counts.  |
	|Each 32-bit integer from the test file provides the value for|
	|the throw of a die, by floating to [0,1), multiplying by 6   |
	|and taking 1 plus the integer part of the result.            |
	|-------------------------------------------------------------|

		RESULTS OF CRAPS TEST FOR bits.22
	No. of wins:  Observed	Expected
	                 98332        98585.858586
		z-score=-1.135, pvalue=0.87190

	Analysis of Throws-per-Game:

	Throws	Observed	Expected	Chisq	 Sum of (O-E)^2/E
	1	66910		66666.7		0.888		0.888
	2	37869		37654.3		1.224		2.112
	3	26834		26954.7		0.541		2.653
	4	19219		19313.5		0.462		3.115
	5	13753		13851.4		0.699		3.814
	6	9788		9943.5		2.433		6.247
	7	7137		7145.0		0.009		6.256
	8	5249		5139.1		2.351		8.608
	9	3604		3699.9		2.484		11.092
	10	2634		2666.3		0.391		11.483
	11	1968		1923.3		1.038		12.520
	12	1399		1388.7		0.076		12.596
	13	1027		1003.7		0.540		13.136
	14	712		726.1		0.275		13.412
	15	515		525.8		0.223		13.635
	16	335		381.2		5.588		19.223
	17	269		276.5		0.206		19.429
	18	216		200.8		1.146		20.575
	19	152		146.0		0.248		20.822
	20	106		106.2		0.000		20.823
	21	304		287.1		0.993		21.816

	Chisq=  21.82 for 20 degrees of freedom, p= 0.35058

		SUMMARY of craptest on bits.22
	 p-value for no. of wins: 0.871897
	 p-value for throws/game: 0.350580
	_____________________________________________________________

```

## Testing the Distribution of the Number of Wins

The binomial distribution models the total number of wins that should
occur in 200,000 games. The probability of winning a single game of
craps, based on simple probability and assuming true dice, is
`p=244/495` ([source](http://mathforum.org/library/drmath/view/56534.html)).
Since the number of games is large and the probability of winning is
not far from 0.5, the normal distribution should approximate the true
distribution of wins very well ([source](https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation)).
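
If you'd rather check `p=244/495` than take it on faith, the sketch below adds up the ways to win using exact fractions. The names `WAYS` and `win_probability` are just illustrative; the only assumption is two fair six-sided dice.

```python
from fractions import Fraction

# Number of ways (out of 36) to roll each total with two fair dice.
WAYS = {total: 6 - abs(total - 7) for total in range(2, 13)}

def win_probability():
    # Win immediately on 7 or 11.
    p = Fraction(WAYS[7] + WAYS[11], 36)
    for point in (4, 5, 6, 8, 9, 10):
        set_point = Fraction(WAYS[point], 36)
        # Once a point is set, only the point or a 7 ends the game,
        # so the chance of making the point is ways(point) / (ways(point) + ways(7)).
        make_point = Fraction(WAYS[point], WAYS[point] + WAYS[7])
        p += set_point * make_point
    return p

print(win_probability())
```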

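With normality assumed, the expected (population) mean and standard
deviation are calculated from the binomial formulas with `n=200000`:
the mean is `np ≈ 98585.86` and the standard deviation is
`sqrt(np(1-p)) ≈ 223.6`.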
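
As a quick check, the following sketch recomputes the expected mean, the standard deviation, and the z-score from the numbers in the example output. The variable names are illustrative only.

```python
import math

N_GAMES = 200_000
P_WIN = 244 / 495

mean = N_GAMES * P_WIN                               # expected number of wins
std_dev = math.sqrt(N_GAMES * P_WIN * (1 - P_WIN))   # binomial standard deviation

observed_wins = 98332                                # from the example output
z = (observed_wins - mean) / std_dev

print(f"mean={mean:.6f} std_dev={std_dev:.3f} z={z:.3f}")
```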

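The z-score measures how many standard deviations the observed number
of wins is from the expected mean. Here, `z=-1.135` means the RNG
produced about 1.1 standard deviations fewer wins than expected, which
is unremarkable. The reported p-value of 0.87 matches the upper-tail
probability `1 - Φ(z)` of the standard normal distribution, that is,
the chance of seeing a z-score at least this high from an unbiased
(random) RNG. It is not a measure of how "strong" the fit is. If the
RNG is good, this p-value should land anywhere between 0 and 1 with
equal likelihood, so a value very close to 0 (far too many wins) or
very close to 1 (far too few wins) is the warning sign. In typical
statistical analysis, you would pick a significance level such as 0.05
and call a result "statistically significant" when a two-sided p-value
falls below it; for this one-sided p-value, that corresponds to
`p<0.025` or `p>0.975`. A significance level of 0.05 means you'd
expect to see an outcome this wild or wilder from a truly random
source about 5% of the time.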
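
To see where the reported p-value comes from, this sketch converts the z-score into tail probabilities using only the standard library. Reading the output's `pvalue` as the upper tail `1 - Φ(z)` is an inference from the numbers in the example output, not something the program states; the two-sided value is shown for comparison with the usual 0.05 rule.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written in terms of the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z = -1.135                              # z-score from the example output

upper_tail = 1 - normal_cdf(z)          # chance of a z-score at least this high
two_sided = 2 * normal_cdf(-abs(z))     # chance of a z-score at least this extreme

print(f"upper tail = {upper_tail:.4f}, two-sided = {two_sided:.4f}")
```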

## Testing the Distribution of Individual Throws

The number of throws indicates how many rolls it took until the game
ended, *whether the game was won or lost*. Assuming true dice, the
probability of a game ending (by winning or losing) at throw N can be
calculated using simple probability.

For example, the probability of ending a game on the first roll is

```
P(7 or 11 or 2 or 3 or 12) = 6/36 +2/36 +1/36 +2/36 +1/36 = 1/3
```

With 200,000 games, we expect `200000/3=66666.7` games to finish on
the first roll. Again, see [here](http://mathforum.org/library/drmath/view/56534.html).
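
The same reasoning extends to later throws: for a game to end on throw N (N of 2 or more), the first roll must set a point, the next N-2 rolls must be neither the point nor 7, and the last roll must be one of the two. The sketch below works that out with exact fractions and lumps everything from throw 21 onward into the last cell, as the program banner describes. The function names are illustrative, and this is a reconstruction of the expected column, not Diehard's own code.

```python
from fractions import Fraction

N_GAMES = 200_000
# Number of ways (out of 36) to roll each total with two fair dice.
WAYS = {total: 6 - abs(total - 7) for total in range(2, 13)}

def prob_game_ends_on_throw(n):
    """Probability that a game ends on exactly throw n (n >= 1)."""
    if n == 1:
        return Fraction(sum(WAYS[t] for t in (2, 3, 7, 11, 12)), 36)
    total = Fraction(0)
    for point in (4, 5, 6, 8, 9, 10):
        set_point = Fraction(WAYS[point], 36)
        stop = Fraction(WAYS[point] + WAYS[7], 36)   # point or 7 ends the game
        total += set_point * (1 - stop) ** (n - 2) * stop
    return total

def expected_counts(max_cell=21):
    """Expected games per throw count, with counts >= max_cell lumped together."""
    probs = [prob_game_ends_on_throw(n) for n in range(1, max_cell)]
    probs.append(1 - sum(probs))
    return [float(N_GAMES * p) for p in probs]

for throws, count in enumerate(expected_counts(), start=1):
    print(throws, round(count, 1))
```

The first few values (66666.7, 37654.3, 26954.7) can be compared directly against the Expected column in the example output.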

The Chi-Squared test measures the "goodness of fit" between the actual
results and the expected results based on the above calculated
probabilities ([source](https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test)).

Each "Throw count" cell contributes `(O-E)^2/E` to the test statistic
(e.g. 0.888 for Throw count=1), where O is the observed count and E is
the expected count. Those contributions are summed to give a single
chi-squared test statistic for the entire set of 200,000 games as a
whole ([source](https://en.wikipedia.org/wiki/Normal_distribution#Combination_of_two_or_more_independent_random_variables)).
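
Here is a small sketch that redoes that sum from the Observed and Expected columns of the example output. Because the Expected values are copied from the table (rounded to one decimal), the total can differ from the program's 21.816 in the second decimal place.

```python
# Observed and Expected columns copied from the example output, throws 1..21.
OBSERVED = [66910, 37869, 26834, 19219, 13753, 9788, 7137, 5249, 3604, 2634,
            1968, 1399, 1027, 712, 515, 335, 269, 216, 152, 106, 304]
EXPECTED = [66666.7, 37654.3, 26954.7, 19313.5, 13851.4, 9943.5, 7145.0,
            5139.1, 3699.9, 2666.3, 1923.3, 1388.7, 1003.7, 726.1, 525.8,
            381.2, 276.5, 200.8, 146.0, 106.2, 287.1]

# Each cell's contribution is (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(OBSERVED, EXPECTED)]
chisq = sum(contributions)
degrees_of_freedom = len(OBSERVED) - 1   # 21 cells, minus 1

print(f"Chisq = {chisq:.2f} for {degrees_of_freedom} degrees of freedom")
```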

If doing this by hand, you would take the chisq test statistic value
and degrees of freedom (21 cells minus 1, so 20), look them up on a
chisq table, and get a p-value. The p-value is given here already.
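
Instead of a table, you can compute the p-value directly. The sketch below uses the closed-form upper-tail probability of the chi-squared distribution, which only exists in this simple form when the degrees of freedom are even (as they are here). The function name is made up for illustration; a statistics library would normally do this for you.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-squared variable with an even number of degrees of freedom.

    For df = 2k this equals exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2
    term = 1.0      # (x/2)^i / i!, starting at i = 0
    total = 0.0
    for i in range(df // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total

# Test statistic and degrees of freedom from the example output.
p_value = chi2_upper_tail_even_df(21.816, 20)
print(f"p = {p_value:.5f}")
```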

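Again, a very small p-value (say, p<0.05) indicates a problem: the
observed counts are further from the expected counts than chance alone
would plausibly explain. A p-value extremely close to 1 is also
suspicious, since it means the fit is better than a random source
would usually manage. Here `p=0.35`, which is unremarkable.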