Skip to content

Instantly share code, notes, and snippets.

@rab
Created October 10, 2011 13:48
Show Gist options
  • Save rab/1275380 to your computer and use it in GitHub Desktop.
Save rab/1275380 to your computer and use it in GitHub Desktop.
Testing a die for bias
ruby randomness.rb rob_d20.txt
500 rolls from 1 to 20
Expected per bucket: 25.0 (σ ≈ 5.00)
10: ( 1)
11:
12:
13:
14:
15: (20)
16: ( 7)
17:
18: (14)
19:
20:
21: ( 8)
22: (19)
23: ( 2)(11)
24: ( 3)(15)(17)
25: ( 4)
26: (13)
27:
28:
29:
30: ( 5)(16)
31: (18)
32: (10)
33:
34: ( 6)
35:
36: ( 9)(12)
Mean (μ) : 10.61
Variance (σ²): 29.08
Std.Dev (σ) : 5.3928
lines at: [0, 9, 11, 40, 42]
Histogram:
1: ********* *
2: ********* ** ************
3: ********* ** *************
4: ********* ** **************
5: ********* ** *******************
6: ********* ** ***********************
7: ********* ** *****
8: ********* ** **********
9: ********* ** *************************
10: ********* ** *********************
11: ********* ** ************
12: ********* ** *************************
13: ********* ** ***************
14: ********* ** *******
15: ********* ** *************
16: ********* ** *******************
17: ********* ** *************
18: ********* ** ********************
19: ********* ** ***********
20: ********* ** ****
For your d20, the χ² value is 38.0000
This exceeds 36.19 and is probably biased
For your d20, the Kolmogorov-Smirnov value is 0.80
This is less than 1.14 and is probably fair
By count:
1: **********
20: ***************
7: ****************
14: ******************
8: *********************
19: **********************
2: ***********************
11: ***********************
3: ************************
15: ************************
17: ************************
4: *************************
13: **************************
5: ******************************
16: ******************************
18: *******************************
10: ********************************
6: **********************************
9: ************************************
12: ************************************
1:10
7:16 19:22 13:26
15:24 17:24 3:24 9:36 11:23 5:30
12:36 10:32 16:30 6:34 4:25 18:31
8:21 14:18 2:23
20:15
# -*- coding: utf-8 -*-
begin
require 'ansi'
def go(color)
print ANSI.send(color)
end
rescue LoadError => e
$stderr.puts "to get colored output: gem install ansi"
def go(color); nil; end
end
# rolls = an array holding the individual results
# low = minimum element value in rolls; for a dN die, this should be 1
# high = maximum element value in rolls; for a dN die, this should be N
# counts = an array indexed by the element values from rolls with the observed count
# mean = actual average roll
# ideal_mean = (low + high) / 2
rolls = []
File.foreach(ARGV[0], 'r') do |line|
rolls.concat line.split.map{|_|_.to_i(10)}
end
low, high = rolls.minmax
puts "#{rolls.size} rolls from #{low} to #{high}"
faces = high - low + 1
counts = Array.new(high + 1)
sum = 0
rolls.each {|_|counts[_] = (counts[_] || 0) + 1; sum += _}
mean = sum.to_f / rolls.size
# mu = (((num_values + 1)*num_values)/2).to_f / num_values
variance = rolls.inject(0.0) {|sum,x| sum + (x.to_f - mean)**2 } / rolls.size
# assuming that all values will have been seen
buckets = counts.compact.size
# The actual number in the cell will be approximately normally distributed
# with mean equal to the expected number in each cell and standard deviation
# approximately the square root of the mean.
expected = rolls.size.to_f / buckets
exp_std_dev = Math.sqrt(expected)
puts "Expected per bucket: %3.1f (σ ≈ %4.2f)"%[expected, exp_std_dev]
low_bucket, high_bucket = counts.compact.minmax
(low_bucket..high_bucket).each do |count|
if count <= expected - exp_std_dev
go :red_on_black
elsif count <= expected
go :cyan_on_black
elsif count <= expected + exp_std_dev
go :yellow_on_black
else
go :red_on_black
end
print "%3d: "%[count]
counts.map.with_index.select{|(c,i)| c==count }.map{|(c,i)| i }.each do |pos|
print "(%2d)"%[pos]
end
go :white_on_black
puts
end
puts "Mean (μ) : %5.2f"%[mean]
puts "Variance (σ²): %5.2f"%[variance]
puts "Std.Dev (σ) : %7.4f"%[Math.sqrt(variance)]
unless buckets == faces
go :black_on_red
puts " bad rolls, not all values from #{low} to #{high} "
missing = (low..high).to_a - counts.map_with_index{|c,i|i if c}.compact
puts " missing: #{missing.inspect} "
go :white_on_black
go :red_on_black
puts "Observed #{buckets} bucket#{'s' unless buckets == 1}, but expected #{faces}"
go :white_on_black
end
# ------------------------------------------------------------------------------
# Histogram tests
Histogram = {
# d# 5% 1%
4 => [ 2.49, 3.02 ], # Due to being in the
6 => [ 2.63, 3.14 ], # extreme tails of the
8 => [ 2.73, 3.22 ], # distribution, combined
10 => [ 2.80, 3.29 ], # with slight asymmetry,
12 => [ 2.86, 3.34 ], # the ranges we get are
20 => [ 3.02, 3.48 ], # sometimes out a bit.
}
if Histogram.has_key?(faces)
hist_factor = Math.sqrt(expected * (faces - 1) / faces)
hist_lines = Histogram[faces].reverse.map{|_|-1*_} + Histogram[faces]
hist_lines.map! {|x|
(expected + x * hist_factor).ceil
}
hist_lines.unshift 0
puts "lines at: #{hist_lines.inspect}"
else
hist_lines = [0,0,0,expected.ceil,rolls.size]
go :magenta_on_black
puts "Don't know how to interpret these results for a d%d"%[faces]
end
go :white_on_black
hist_colors = [ :magenta_on_black, # outside 99%
:blue_on_black, # outside 95%
:green_on_black, # OK!
:yellow_on_black, # outside 95%
:red_on_black, # outside 99%
]
# ------------------------------------------------------------------------------
# The χ² test
ChiSquaredDistribution = {
# d# => 5% 1% df
4 => [ 7.81, 11.34, ], # 3
6 => [ 11.07, 15.09, ], # 5
8 => [ 14.07, 18.48, ], # 7
10 => [ 16.92, 21.67, ], # 9
12 => [ 19.68, 24.72, ], # 11
20 => [ 30.14, 36.19, ], # 19
}
chi_squared = 0.0
go :yellow_on_black
puts "Histogram:"
counts.each.with_index do |count, num|
next if count.nil?
chi_squared += ((count.to_f - expected)**2)/expected
go :white_on_black
print "%3d: "%num
go hist_colors[0]
count.times do |c|
if c < hist_lines[1]
go hist_colors[0]
elsif c < hist_lines[2]
go hist_colors[1]
elsif c < hist_lines[3]
go hist_colors[2]
elsif c < hist_lines[4]
go hist_colors[3]
else
go hist_colors[4]
end
print '*'
end
puts
end
go :white_on_black
if ChiSquaredDistribution.has_key?(faces)
puts "For your d%d, the χ² value is %7.4f"%[faces, chi_squared]
if chi_squared > ChiSquaredDistribution[faces].last
go :red_on_black
puts "This exceeds %5.2f and is probably biased"%[ChiSquaredDistribution[faces].last]
elsif chi_squared > ChiSquaredDistribution[faces].first
go :yellow_on_black
puts "This is between %5.2f and %5.2f and may be biased"%ChiSquaredDistribution[faces]
else
go :green_on_black
puts "This is less than %5.2f and is probably fair"%[ChiSquaredDistribution[faces].first]
end
else
go :magenta_on_black
puts "Don't know how to interpret these results for a d%d"%[faces]
end
go :white_on_black
# ------------------------------------------------------------------------------
# The Kolmogorov-Smirnov test:
sum_counts = 0
differences = Array.new(counts.size)
(low..high).each.with_index do |roll, i|
sum_counts += counts[roll]
differences[roll] = (sum_counts - (i+1)*expected).abs
end
raise "Kolmogorov-Smirnov bad math? #{differences.inspect}" unless differences[high].zero?
diff_max = differences.compact.max
d_stat = diff_max.to_f / Math.sqrt(rolls.size)
KolmogorovSmirnov = {
# d# => 5% 1%
4 => [ 1.08, 1.35 ], # These values apply pretty well
6 => [ 1.10, 1.37 ], # irrespective of the total number of
8 => [ 1.11, 1.38 ], # rolls, but I would use at least 10 rolls
10 => [ 1.12, 1.39 ], # per face. Note also that these values
12 => [ 1.12, 1.40 ], # come from simulation, and are hence not
20 => [ 1.14, 1.42 ], # exact. This doesn't really matter.
}
if KolmogorovSmirnov.has_key?(faces)
puts "For your d%d, the Kolmogorov-Smirnov value is %4.2f"%[faces, d_stat]
if d_stat > KolmogorovSmirnov[faces].last
go :red_on_black
puts "This exceeds %5.2f and is probably biased"%KolmogorovSmirnov[faces].last
elsif d_stat > KolmogorovSmirnov[faces].first
go :yellow_on_black
puts "This is between %5.2f and %5.2f and may be biased"%KolmogorovSmirnov[faces]
else
go :green_on_black
puts "This is less than %5.2f and is probably fair"%[KolmogorovSmirnov[faces].first]
end
else
go :magenta_on_black
puts "Don't know how to interpret these results for a d%d"%[faces]
end
go :white_on_black
# ------------------------------------------------------------------------------
# Histogram sorted by counts
go :cyan_on_black
puts "By count:"
counts.each.with_index.select{|count,_|count}.sort.each do |(count, num)|
puts "%3d: %s"%[num, '*'*count]
end
go :white_on_black
if faces == 20
rab_faces = [ [ 1, ],
[ 7, 19, 13, ],
[ 15,17, 3,9, 11,5, ],
[ 12,10, 16,6, 4,18, ],
[ 8, 14, 2, ],
[ 20, ],
]
rab_faces.each do |layer|
str = layer.map {|value| "#{value}:#{counts[value]}" } * ' '
puts str.center(36)
end
end
__END__
=begin
References:
[1] http://wiretap.area.com/Gopher/Library/Article/Gaming/fairdice.txt
Newsgroups: rec.games.frp.archives
From: barnett@agsm.unsw.oz.au (Glen Barnett)
Subject: PAPER: Testing Dice for bias
Message-ID: <al#23np@rpi.edu>
Organization: The Australian Graduate School of Management
Date: Wed, 2 Dec 1992 04:41:08 GMT
Approved: goldm@rpi.edu
Lines: 749
The following information is intended for distribution over Internet,
and outside of that may be copied for personal use only.
(c) Glen L. Barnett, 1992. All rights reserved.
-------------------------------------------------------------------
Conover, W.J. (1980): Practical nonparametric statistics,
2nd Ed., Wiley, New York.
Neave, H.R. and Worthington, P.L.B. (1988): Distribution-free tests,
Unwin Hyman, London.
=end
02 17 06 11 08 12 02 13 12 12 10 20 17 06 04 06 14 13 19 05 06 10 19 18 04
18 17 18 16 12 09 09 01 10 09 20 12 06 05 04 13 03 09 09 12 11 12 11 05 02
09 05 15 12 13 15 12 12 20 07 07 14 08 03 06 12 16 13 16 14 06 07 09 14 01
02 10 20 11 12 20 07 10 07 14 02 12 06 09 12 12 20 06 13 11 19 10 05 16 06
04 17 08 05 04 09 03 14 17 16 10 10 16 16 06 10 04 17 17 05 06 13 06 11 11
16 08 18 20 06 11 09 01 19 13 06 03 17 09 11 02 14 06 05 05 19 15 17 09 11
13 20 07 03 08 08 12 10 07 13 05 16 17 06 12 16 02 06 03 09 18 17 02 01 05
15 03 05 16 15 02 02 02 19 02 15 10 03 06 06 16 04 06 05 13 18 04 15 05 17
17 04 10 19 01 03 12 10 15 02 20 12 07 05 02 12 05 20 14 08 19 06 11 18 05
05 01 13 04 09 10 17 15 15 19 05 15 16 20 06 09 02 09 06 08 01 12 18 09 11
11 18 15 04 13 04 09 11 16 04 08 12 09 18 20 04 16 19 09 10 09 10 02 17 18
09 07 12 13 16 15 03 02 12 13 06 08 03 13 04 06 13 11 13 09 19 10 06 07 10
14 11 18 04 19 03 07 13 12 08 06 19 09 05 11 08 03 19 03 04 13 16 18 16 13
12 15 09 18 10 18 03 11 08 10 04 15 12 08 18 12 13 03 13 06 19 02 12 17 11
10 04 16 19 18 14 14 09 04 05 03 10 05 19 05 09 16 05 16 20 03 10 12 14 16
12 15 08 18 17 17 08 19 18 15 16 14 18 03 17 15 01 15 03 05 14 17 06 03 04
02 10 18 05 10 04 10 08 19 01 08 09 12 03 19 18 11 03 09 07 19 13 13 11 02
19 09 16 14 18 16 13 04 17 09 05 05 14 07 04 12 02 18 03 10 09 07 16 14 17
08 18 10 06 10 16 15 01 16 16 09 15 09 04 15 06 17 15 16 18 11 15 10 17 18
18 02 12 11 12 06 05 18 06 08 10 18 02 14 20 08 07 09 12 10 07 09 05 18 20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment