Heril/statProbabilities.md Secret

## statProbabilities.md

      
    Raw
  

              statProbabilities.md
            
          
    The likelihood of getting or beating an array of stats in D&D

Spencer R Hall

12/15/2020

I got involved in another conversation about Dungeons and Dragons and probabilities again. This time I saw something that I've seen come up in a variety of circumstances. So, let's take this scenario.
A Dungeon Master is running a game where the players roll for their stats, and one player has suspiciously high results. The question of course is, did the player cheat with their stats, either by rolling a different way, or did they just pick the stats they wanted. Given D&D is a cooperative game, this can be especially frustrating to the other players at the table, as this one player has a character that is well above average for doing just about everything.
First, lets talk about the standard way for generating stats for a character. In D&D 5th edition, the default way is actually just to take a standard array of values. 8, 10, 12, 13, 14, 15. The nice thing about this is that it gets rid of inequality from good or bad luck of a player. The downside for many, is that it's just too predictable. They like the chance of having a character with some really high stats and low stats as well. The standard way of rolling for stats is called 4d6, drop lowest. What that means is for each of the six stats, you roll 4 six-sided dice, discard the single lowest die, and then sum the remaining 3 dice. This gives a range of numbers from 3 to 18 that is skewed in a way to make the players have results that are a bit higher than average in that range compared to just rolling 3 dice.
With all my previous posts I've done simulations, so lets start off with that. Lets take an unbelievably high set of stats: 15, 17, 16, 15, 17, 15 and then simulate creating a million characters and count what proportion meet or exceed this array of stats. In R this is pretty straight forward:
library(tidyverse)

rolledStats <- c(15, 17, 16, 15, 17, 15)
rolledStats <- sort(rolledStats)
die_sides <- 1:6
N <- 1000000
results <- vector(mode = "list", length = N)
count <- 0
for (j in 1:N) {
  results[[j]] <- vector(mode = "logical", length = 6)
  for (k in 1:6) {
    results[[j]][k] <- sum(sort(sample(die_sides, 4, replace = T), decreasing = TRUE)[1:3])
  }
  results[[j]] <- sort(results[[j]])
}
countExceeding <- as_tibble(matrix(unlist(results), ncol = 6, byrow = T)) %>%
  filter(V1 >= rolledStats[1] & V2 >= rolledStats[2] & V3 >= rolledStats[3] & V4 >= rolledStats[4] & V5 >= rolledStats[5] & V6 >= rolledStats[6]) %>%
  count()
paste(countExceeding[[1]]/N*100, "%", sep = "")
When running this I got 0.0097%, which I must say, is pretty unlikely.
But what if we could do better and calculate this from first principles. For each roll of 4d6, drop lowest there are only $6^4 = 1296$ outcomes, so counting the probability of each value from 3 to 18 should be easy enough. We can use these probabilities to calculate the odds of each 54,264 unique arrays of stats we can get by calculating the odds of one of the permutations of those arrays as well as the number of permutations of that arrays. We can then filter through the resulting table to find all the arrays that are as good or better than the array we are testing against. It may sound like a lot of computation, but by being smart about the approach I get it run in under 5 seconds from start to finish.
counter <- 0
statCombinations <- vector(mode = "list", length = 54264)
statPermutations <- vector(mode = "logical", length = 54264)
statOdds <- vector(mode = "logical", length = 54264)
maxPermutations <- factorial(6)

diceodds <- function(a) {
  statRoll <- expand.grid(rep(list(1:6), 4))
  statRoll <- apply(statRoll,1,sort)
  dicepdf <- summary(as.factor(colSums(statRoll[2:4,])))/(6^4)
  probability <- 1
  for (i in a) {
    probability <- probability * as.numeric(dicepdf[as.character(i)])
  }
  return(probability)
}

pct_sum <- function(..., na.rm = FALSE) {
  sum(..., na.rm = na.rm) * 100
}

for (i1 in 3:18) {
  pos1 <- i1
  for (i2 in i1:18) {
    pos2 <- i2
    for (i3 in i2:18) {
      pos3 <- i3
      for (i4 in i3:18) {
        pos4 <- i4
        for (i5 in i4:18) {
          pos5 <- i5
          for (i6 in i5:18) {
            pos6 <- i6
            counter <- counter + 1
            statCombinations[[counter]] <- c(pos1, pos2, pos3, pos4, pos5, pos6)
            statOdds[counter] <- diceodds(statCombinations[[counter]])
            statPermutations[counter] <- maxPermutations  / prod(factorial(summary(as.factor(statCombinations[[counter]]))))
          }
        }
      }
    }
  }
}
prob <- as_tibble(cbind(matrix(unlist(statCombinations), ncol = 6, byrow = T), statOdds, statPermutations)) %>%
  mutate(probability = statOdds * statPermutations) %>%
  filter(V1 >= rolledStats[1] & V2 >= rolledStats[2] & V3 >= rolledStats[3] & V4 >= rolledStats[4] & V5 >= rolledStats[5] & V6 >= rolledStats[6]) %>%
  summarize_at(vars(probability),
               list(total_percent = pct_sum))
paste(prob[[1]], "%", sep = "")
Calculating from first principles gives us a probability of 0.0066%, a little bit lower than the result of my simulation, but in neighborhood.
I made a Shiny App so you can put in any given array in to see it's probability. The default is the standard array, and when the odds of beating each stat in the standard array come out to be around 32%, it doesn't look so bad.
Of course, if you run across this situation in a game, there is still that uncertainty of did they really cheat. Given the social nature of D&D you could offend the player by accusing them of something they didn't actually do. Or, even though they cheated, it might have to do with their own insecurities in wanting to not fail or wanting to be an epic hero. No matter the case, what tends to be the most well received advice in the conversations I've seen about this is to without mentioning cheating, talk to the player about the effect on the enjoyment of their fellow players along with the suggestion of them roll their stats again.