ModeratelyAccurateSkill is a simplified variant of Microsoft's TrueSkill algorithm that uses nothing but Bayesian nets with tabular CPDs, as implemented in Programming Assignment 4 of the Coursera PGM course.
Since we are using tables, we cannot use the continuous skill variable and its variance; instead, we must represent our knowledge of a player's skill as a distribution over a fixed set of skill levels. Each player is therefore represented by a factor with a single variable, and whenever we see a new player we do not yet know anything about, we give her a uniform skill distribution:
% Create a new factor for a new player we do not
% yet know anything about
function player = MakePlayer(variable, steps)
  if nargin < 2
    steps = 10;
  end
  player = struct('var', [variable], ...
                  'card', [steps], ...
                  'val', ones(1, steps)/steps);
The next important ingredient is the probability that a player wins against another player, which we assume to depend only on the difference of skill levels. A bit of repmat-magic (yeah, it's an .m-function, so sue me) gives us a nice rectangular matrix of differences:
diff = repmat((1:5)(:), 1, 5) - repmat(1:5, 5, 1)
diff =
0 -1 -2 -3 -4
1 0 -1 -2 -3
2 1 0 -1 -2
3 2 1 0 -1
4 3 2 1 0
which we can transform into a nice winning probability using the logistic distribution, scaled so that a difference of one skill level results in an 80% chance of winning:
logistic_cdf(1.3864*(diff))
ans =
0.5000000 0.1999831 0.0588118 0.0153798 0.0038894
0.8000169 0.5000000 0.1999831 0.0588118 0.0153798
0.9411882 0.8000169 0.5000000 0.1999831 0.0588118
0.9846202 0.9411882 0.8000169 0.5000000 0.1999831
0.9961106 0.9846202 0.9411882 0.8000169 0.5000000
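This construction is easy to check outside of Octave. Here is a small NumPy sketch (the function name win_prob_matrix is mine, not part of the code above) that rebuilds the same matrix and makes the origin of the constant explicit: 1.3864 is just ln(0.8/0.2) = ln 4, the point where the logistic CDF reaches 0.8.

```python
import numpy as np

def win_prob_matrix(steps, scale=np.log(4)):
    # scale = ln(0.8/0.2) = ln 4 ~ 1.3863, so a one-level skill
    # advantage translates into an 80% chance of winning
    levels = np.arange(1, steps + 1)
    diff = levels[:, None] - levels[None, :]    # skill differences
    return 1.0 / (1.0 + np.exp(-scale * diff))  # logistic CDF

probs = win_prob_matrix(5)
```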
The function ExpectedOutcome produces a factor that encodes this probability distribution for the expected outcome of a match between two players. The outcome variable is one if the first player wins and two if the second one wins (for the sake of simplicity we assume that our game never produces a draw):
function outcome = ExpectedOutcome(outcomeVariable, player1, player2)
  steps1 = player1.card(1);
  steps2 = player2.card(1);
  diffs = repmat((1:steps1)(:), 1, steps2) - repmat(1:steps2, steps1, 1);
  probs = logistic_cdf(1.3864*(diffs))(:);
  val = [probs, (1 - probs)]'(:)';
  outcome = struct('var', [outcomeVariable, player1.var(1), player2.var(1)], ...
                   'card', [2, player1.card(1), player2.card(1)], ...
                   'val', val);
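The only tricky line is the construction of val: the Octave idiom [probs, (1 - probs)]'(:)' interleaves the win and loss probabilities so that the outcome variable, which comes first in var and has cardinality two, varies fastest. A NumPy sketch of the same interleaving, with made-up sample probabilities:

```python
import numpy as np

# P(outcome = 1) for three sample skill assignments, already
# flattened column-major as Octave's diffs(:) would produce them
probs = np.array([0.8, 0.5, 0.2])

# Interleave win/loss probabilities so the outcome index varies
# fastest, mirroring the Octave idiom [probs, (1 - probs)]'(:)'
val = np.column_stack([probs, 1 - probs]).ravel()
```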
Now we can make our first skill estimate, based on a match between two players in which the first player wins:
p1 = MakePlayer(1);
p2 = MakePlayer(2);
ex = ExpectedOutcome(3, p1, p2);
m = ComputeExactMarginalsBP([p1, p2, ex], [0 0 1], 0);
[m(p1.var(1)).val; m(p2.var(1)).val]
ans =
Columns 1 through 5:
0.015587 0.031588 0.050411 0.070102 0.090020
0.184413 0.168412 0.149589 0.129898 0.109980
Columns 6 through 10:
0.109980 0.129898 0.149589 0.168412 0.184413
0.090020 0.070102 0.050411 0.031588 0.015587
As we can see, our skill estimate for the first player has been shifted towards higher values, while the one for the second player has been shifted towards lower values. This becomes even more visible if player one wins ten games in a row:
for i = 1:10
  m = ComputeExactMarginalsBP([p1, p2, ex], [0 0 1], 0);
  p1 = m(p1.var(1));
  p2 = m(p2.var(1));
end
[p1.val; p2.val]
ans =
Columns 1 through 4:
7.4135e-09 6.1870e-06 3.8620e-04 5.0852e-03
2.9838e-01 2.5766e-01 2.0248e-01 1.3723e-01
Columns 5 through 8:
2.6033e-02 7.2738e-02 1.3723e-01 2.0248e-01
7.2738e-02 2.6033e-02 5.0852e-03 3.8620e-04
Columns 9 and 10:
2.5766e-01 2.9838e-01
6.1870e-06 7.4135e-09
Now our estimate is quite certain that player one has a skill in the upper half of the distribution:
sum(p1.val(6:10))
ans = 0.96849
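The whole update can also be written down directly: the posterior joint is proportional to p1(i) · p2(j) · P(win | i, j), and the marginals fall out by summation. A NumPy sketch (my own re-derivation, not the assignment code) that reproduces both the single-match marginal and the ten-wins figure above:

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

levels = np.arange(1, 11)
# P(player 1 wins | skill levels i, j), as in ExpectedOutcome
win = logistic(1.3864 * (levels[:, None] - levels[None, :]))

def match_update(p1, p2):
    """Posterior skill marginals after player 1 beats player 2:
    joint ~ p1(i) * p2(j) * P(win | i, j), then marginalize."""
    joint = p1[:, None] * p2[None, :] * win
    joint /= joint.sum()
    return joint.sum(axis=1), joint.sum(axis=0)

p1 = p2 = np.full(10, 0.1)
q1, q2 = match_update(p1, p2)   # one win, as in the first example
for _ in range(10):             # ten wins in a row, from uniform priors
    p1, p2 = match_update(p1, p2)
upper_half = p1[5:].sum()       # mass on skill levels 6..10
```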
In TrueSkill, there is one more factor between the skill and the outcome: the actual performance of the player. This is a random variable that depends on the skill of the player and, among other things, models whether the game is dominated more by skill or by chance. For a moderately chancy game, we can model it for example with a binomial distribution where the total number of balls in the urn is one more than the number of skill levels, and the number of draws is one less:
function strength = Performance(variable, player)
  steps = player.card(1);
  v = zeros(steps, steps);
  for i = 1:steps
    v(i, :) = binopdf(0:(steps - 1), (steps - 1), i / (steps + 1));
  end
  strength = struct('var', [variable, player.var(1)], ...
                    'card', [steps, steps], ...
                    'val', v(:)');
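A quick way to convince yourself that this table is sensible is to rebuild it elsewhere: every row must be a proper distribution over the performance levels, the lowest-skill row should peak at the lowest performance level, and the highest-skill row at the highest. A small Python sketch, with Octave's binopdf replaced by an explicit binomial PMF:

```python
import numpy as np
from math import comb

def binopdf(k, n, p):
    # binomial PMF, standing in for Octave's binopdf
    return comb(n, k) * p**k * (1 - p)**(n - k)

steps = 10
# v[i-1, :] = binomial performance distribution for skill level i
v = np.array([[binopdf(k, steps - 1, i / (steps + 1))
               for k in range(steps)]
              for i in range(1, steps + 1)])
```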
With this choice, a player at the highest skill level has an appreciable probability of actually performing at one of the two highest performance levels:
p = Performance(2, MakePlayer(1));
d = FactorMarginalization(ObserveEvidence(p, [1 10]), 1).val;
d/sum(d)
ans =
Columns 1 through 4:
6.3520e-10 3.2522e-07 1.2503e-05 1.6651e-04
Columns 5 through 8:
1.2406e-03 6.4014e-03 2.5633e-02 8.5255e-02
Columns 9 and 10:
2.4609e-01 6.3520e-01
Now a single match involves the two player factors, the two performance factors, and the expected-outcome factor, which now depends on the performance variables instead of the skills directly. The resulting change from the initial distribution is weaker than in the previous example:
p1 = MakePlayer(1);
p2 = MakePlayer(2);
pf1 = Performance(3, p1);
pf2 = Performance(4, p2);
ex = ExpectedOutcome(5, pf1, pf2);
m = ComputeExactMarginalsBP([p1, p2, pf1, pf2, ex], [0 0 0 0 1], 0);
[m(p1.var(1)).val; m(p2.var(1)).val]
ans =
Columns 1 through 5:
0.016361 0.036312 0.056516 0.077556 0.099135
0.117171 0.170364 0.163175 0.142543 0.120867
Columns 6 through 10:
0.120867 0.142543 0.163175 0.170364 0.117171
0.099135 0.077556 0.056516 0.036312 0.016361
The same damping shows up when we again let player one win ten times in a row:
for i = 1:10
  m = ComputeExactMarginalsBP([p1, p2, pf1, pf2, ex], [0 0 0 0 1], 0);
  p1 = m(p1.var(1));
  p2 = m(p2.var(1));
end
[p1.val; p2.val]
ans =
Columns 1 through 4:
8.2230e-10 2.2340e-06 1.5321e-04 2.5803e-03
4.9292e-03 3.1480e-01 3.8291e-01 2.0059e-01
Columns 5 through 8:
1.8680e-02 7.5345e-02 2.0059e-01 3.8291e-01
7.5345e-02 1.8680e-02 2.5803e-03 1.5321e-04
Columns 9 and 10:
3.1480e-01 4.9292e-03
2.2340e-06 8.2230e-10
The algorithm is still quite sure that the first player is above average:
sum(p1.val(6:10))
ans = 0.97858
but it is far more reluctant to assign any player to one of the extreme values.
Now a tournament can be modelled as a connected chain of matches in which every player has beaten the next player in the final ranking:
p1 = MakePlayer(1);
p2 = MakePlayer(2);
p3 = MakePlayer(3);
pf1 = Performance(4, p1);
pf2 = Performance(5, p2);
pf3 = Performance(6, p3);
ex12 = ExpectedOutcome(7, pf1, pf2);
ex23 = ExpectedOutcome(8, pf2, pf3);
factors = [p1, p2, p3, pf1, pf2, pf3, ex12, ex23];
obs = zeros(1, 8);
obs(7) = 1;
obs(8) = 1;
m = ComputeExactMarginalsBP(factors, obs, 0);
[m(p1.var(1)).val; m(p2.var(1)).val; m(p3.var(1)).val]
ans =
Columns 1 through 5:
0.0069047 0.0180071 0.0333445 0.0538342 0.0797502
0.0392870 0.0794268 0.1095116 0.1304529 0.1413217
0.1519890 0.2109338 0.1868481 0.1473617 0.1110269
Columns 6 through 10:
0.1110269 0.1473617 0.1868481 0.2109338 0.1519890
0.1413217 0.1304529 0.1095116 0.0794268 0.0392870
0.0797502 0.0538342 0.0333445 0.0180071 0.0069047
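If we drop the performance layer for a moment, the same chain can be written down directly: the joint over the three skills is proportional to P(1 beats 2 | i, j) · P(2 beats 3 | j, k) under uniform priors. A NumPy sketch (a simplification of the network above, so the numbers differ from the output just shown) that at least confirms the expected ordering of the three players:

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

levels = np.arange(1, 11)
win = logistic(1.3864 * (levels[:, None] - levels[None, :]))

# Joint over (i, j, k) for the chain "1 beat 2, 2 beat 3",
# with uniform skill priors and no performance layer
joint = win[:, :, None] * win[None, :, :]
joint /= joint.sum()

m1 = joint.sum(axis=(1, 2))   # skill marginal of player 1
m2 = joint.sum(axis=(0, 2))   # ... of player 2
m3 = joint.sum(axis=(0, 1))   # ... of player 3
```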
The final building block is teams, which TrueSkill models by simply adding the player performances. In ModeratelyAccurateSkill, this can be modelled with a deterministic factor that encodes the addition, although you will probably want to add an offset so that a team of four lowest-level players gets a combined performance at the lowest level, and not at level four. Implementing this is left as an exercise for the reader. (If you do so, you should also check that I did not get row and column ordering wrong in ExpectedOutcome.)
Adding teams will also lead to rather huge factors: a team of four players can have 37 different performance levels, and a match between two such teams requires an ExpectedOutcome factor with 37 * 37 * 2 elements in its val array. On the bright side, inference in the resulting net is still exact, and we avoid the iterated inner schedule that TrueSkill needs because its updates on the continuous distributions used there are only approximate.
As far as I can see, this covers all the basic ideas of the TrueSkill algorithm in this simplified, discrete-skill-level reformulation. What's left are some catchy, simple numbers to use in leaderboards and the like. First we have to calculate the expectation and standard deviation of a player's skill:
function [skill, sigma] = Skill(player)
  ll = 1:(player.card(1));
  skill = ll * player.val(:);
  sigma = sqrt((ll .* ll) * player.val(:) - skill*skill);
So an unknown player has an expected skill level of 5.5:
[sk, sd] = Skill(MakePlayer(1))
sk = 5.5000
sd = 2.8723
while the player who has won ten games in a row has an expected skill level of 7.9:
[sk, sd] = Skill(p1)
sk = 7.9062
sd = 1.0203
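The same two numbers are easy to compute in any language; here is a Python sketch of Skill (a direct translation, with the uniform newcomer as the example):

```python
import numpy as np

def skill(p):
    """Expectation and standard deviation of a discrete skill
    distribution over levels 1..len(p)."""
    levels = np.arange(1, len(p) + 1)
    mean = levels @ p
    sigma = np.sqrt((levels ** 2) @ p - mean ** 2)
    return mean, sigma

sk, sd = skill(np.full(10, 0.1))   # the unknown player
```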
The reported skill level used for player ranking is a certain number of standard deviations below the skill estimate. TrueSkill uses a factor of three and adjusts the prior for an unknown player so that a new player enters the leaderboard from the bottom; a factor of 1.75 gives us roughly the same effect for reasonably small numbers of skill levels.
function mas = ModeratelyAccurateSkill(player)
  [sk, sd] = Skill(player);
  mas = sk - 1.75 * sd;
With this choice we obtain, for an unknown player and for the winner and loser of our ten games:
ModeratelyAccurateSkill(MakePlayer(1))
ans = 0.47351
ModeratelyAccurateSkill(p1)
ans = 6.1207
ModeratelyAccurateSkill(p2)
ans = 1.3082
and for the three participants after one round of our tournament:
ModeratelyAccurateSkill(p1)
ans = 3.6928
ModeratelyAccurateSkill(p2)
ans = 1.3722
ModeratelyAccurateSkill(p3)
ans = -0.042808
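For completeness, the conservative leaderboard score in the same Python sketch; for the uniform newcomer it reproduces the 0.47351 shown above:

```python
import numpy as np

def moderately_accurate_skill(p):
    """Conservative leaderboard score: expected skill minus
    1.75 standard deviations."""
    levels = np.arange(1, len(p) + 1)
    mean = levels @ p
    sigma = np.sqrt((levels ** 2) @ p - mean ** 2)
    return mean - 1.75 * sigma

score = moderately_accurate_skill(np.full(10, 0.1))
```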