@Xom
Last active September 24, 2023 03:00

Beyond Duplicate Bridge: Generalizing Shuffle Duplication to Autochess, Deckbuilders, and other games

From Wikipedia:

Duplicate bridge [is a variant of bridge in which] the same bridge deal (the specific arrangement of the 52 cards into the four hands) is played at each table, and scoring is based on relative performance. In this way, every hand, whether strong or weak, is played in competition with others playing identical cards, and the element of skill is heightened while that of chance is reduced.

For bridge, shuffle duplication is straightforward because player choices don't affect which cards are dealt. In contrast, autochess is a genre of computer games in which players choose how many cards to draw each turn, and also choose cards to discard, causing those cards immediately to be shuffled back into the deck. A general shuffle duplication system must handle such complications.

The goal of shuffle duplication is to deal as similar a set of cards as possible while conforming to the probabilities of the underlying game. Where player choices have effects on probabilities in the underlying game, they must have the same effects in the duplicate variant. That is, the goal is to maximize the correlation across tables using the same shuffle, without breaking the constraint that, from a single table's perspective, the probabilities in the duplicate variant must be identical to the underlying game.

Consider the following toy example. Shuffle a deck of five cards, one labeled with an X, and the other four blank. The player draws a card. If the X was not drawn, the player chooses whether or not to have the X removed from the deck and set aside. Next, the player draws a second card. Then, if the X was set aside, it is added back to the deck and the deck is reshuffled. Finally, the player draws a third card.

Suppose Alice at table A always removes the X if possible, whereas Bob at table B never removes it. If they play with no duplication system, the joint probability table of outcomes is as follows:

| Fig. 1: Underlying Game | X on 1st draw | Remove X; X on 3rd draw | Remove X; X not drawn | Sum |
| --- | --- | --- | --- | --- |
| X on 1st draw | 0.04 | 0.0533 | 0.1066 | 0.2 |
| Keep X; X on 2nd draw | 0.04 | 0.0533 | 0.1066 | 0.2 |
| Keep X; X on 3rd draw | 0.04 | 0.0533 | 0.1066 | 0.2 |
| Keep X; X not drawn | 0.08 | 0.1066 | 0.2133 | 0.4 |
| Sum | *0.2* | *0.2666* | *0.5333* | 1 |

The column sums, shown in italics along the bottom row, are Alice's outcome probabilities, and the row sums in the rightmost column are Bob's. Since Alice's outcomes are independent of Bob's, each joint probability is simply the product of the corresponding pair of individual probabilities.

Consider a duplication system in which the initial shuffle is predetermined and duplicated, as in duplicate bridge, while the removal of the X, if chosen, is simulated by replacing the second draw with a blank if the X is second in the shuffle. This achieves high correlation between Alice's and Bob's outcomes, and Bob experiences the same probabilities as in the underlying game; for Alice, however, the constraint of correctness is violated: she is less likely to draw the X than in the underlying game:

| Fig. 2: Incorrect System | X on 1st draw | Remove X; X on 3rd draw | Remove X; X not drawn | Sum |
| --- | --- | --- | --- | --- |
| X on 1st draw | 0.2 | 0 | 0 | 0.2 |
| Keep X; X on 2nd draw | 0 | 0 | 0.2 | 0.2 |
| Keep X; X on 3rd draw | 0 | 0.2 | 0 | 0.2 |
| Keep X; X not drawn | 0 | 0 | 0.4 | 0.4 |
| Sum | *0.2* | *0.2* (correctness violation) | *0.6* (correctness violation) | 1 |

Consider another system in which the initial shuffle is predetermined and duplicated, and a second shuffle is predetermined for when the X is shuffled back in, if that event is reachable from the initial shuffle. Then there is some correlation, but only due to the first draw, which happens before the player choice:

| Fig. 3: Two-Shuffle System | X on 1st draw | Remove X; X on 3rd draw | Remove X; X not drawn | Sum |
| --- | --- | --- | --- | --- |
| X on 1st draw | 0.2 | 0 | 0 | 0.2 |
| Keep X; X on 2nd draw | 0 | 0.0666 | 0.1333 | 0.2 |
| Keep X; X on 3rd draw | 0 | 0.0666 | 0.1333 | 0.2 |
| Keep X; X not drawn | 0 | 0.1333 | 0.2666 | 0.4 |
| Sum | *0.2* | *0.2666* | *0.5333* | 1 |

Consider an amended version of the incorrect system in which the X, if and only if replaced by a blank, is returned to the deck according to a predetermined second shuffle. Then correctness is restored, yet the correlation remains relatively high:

| Fig. 4: Amended System | X on 1st draw | Remove X; X on 3rd draw | Remove X; X not drawn | Sum |
| --- | --- | --- | --- | --- |
| X on 1st draw | 0.2 | 0 | 0 | 0.2 |
| Keep X; X on 2nd draw | 0 | 0.0666 | 0.1333 | 0.2 |
| Keep X; X on 3rd draw | 0 | 0.2 | 0 | 0.2 |
| Keep X; X not drawn | 0 | 0 | 0.4 | 0.4 |
| Sum | *0.2* | *0.2666* | *0.5333* | 1 |

If you think 'Remove X; X on 3rd draw' is more similar to 'Keep X; X on 3rd draw' than to 'Keep X; X on 2nd draw', then this amended system has the highest correlation possible (without violating correctness). If you think it's more similar to the latter, it's not too hard to come up with a slightly different system in which their respective joint probabilities are the other way around. In a toy example, there's no context to justify a serious opinion either way, so let's stipulate "'Remove X; X on 3rd draw' is more similar to 'Keep X; X on 3rd draw' than to 'Keep X; X on 2nd draw'" as an additional part of the toy example's definition; for instance, imagine a prize for drawing the X exactly on the third draw.

The real issue with the amended system is that it's unclear how to extend it to a more complicated game such as Teamfight Tactics, which has not only a deck of many copies each of 58 different cards, but also shifting rarities, and effects like Loaded Dice that draw from a subset of the deck. The rest of this article shall describe a general system of shuffle duplication applicable to Teamfight Tactics and other games. As it so happens, keeping the new stipulation in mind, the general system doesn't perform as well on the toy example as the ad-hoc amended system already described, but it's better than the two-shuffle system:

| Fig. 5: General System | X on 1st draw | Remove X; X on 3rd draw | Remove X; X not drawn | Sum |
| --- | --- | --- | --- | --- |
| X on 1st draw | 0.2 | 0 | 0 | 0.2 |
| Keep X; X on 2nd draw | 0 | 0.1333 | 0.0666 | 0.2 |
| Keep X; X on 3rd draw | 0 | 0.1333 | 0.0666 | 0.2 |
| Keep X; X not drawn | 0 | 0 | 0.4 | 0.4 |
| Sum | *0.2* | *0.2666* | *0.5333* | 1 |

The basic idea of the general system is to represent the cards in the deck as needles in a continuous haystack, rather than as sequential list elements. Then the occurrence of needles in the haystack can be modeled as a Poisson process, like the arrival of customers at a store. Distances in the haystack are not meaningful to begin with, since whenever we draw a card, we care which needle is closest to the top of the haystack, not how close it is; however, given that the needles are exponentially distributed from the top of the haystack, any change in the occurrence probability of a card in the deck can be represented by a linear transformation of the corresponding needle.

(An infinitely deep haystack with exponentially distributed needles is isomorphic to a fixed-height haystack with uniformly distributed needles, which is what I came up with originally, since it's more analogous to a deck of cards. That version might make one of the reasons for the haystack approach easier to visualize: it lets us generate a random position in the deck without counting how many cards are above or below it. Transforming to the exponential distribution up front saves computation by replacing many later exponential transformations with linear ones.)

We begin by representing each shuffle by a parent seed, which is combined with a unique identifier for each card to produce a sequence of random numbers for that card, exponentially distributed with a default mean. Numbers in this default distribution are termed 'normalized positions'. To draw a card, first we project each card's position by dividing its normalized position by its probability of being drawn, then select the card whose projected position is closest to zero (the top of the haystack). Each card's position is initially represented by the first random number in its sequence. When a card is drawn, its position is consumed and the next number is used to represent its next occurrence in the deck. When a card fails to be drawn, we must update its normalized position by subtracting from its projected position the projected position of the card drawn, then reversing the projection by multiplying by the same probability we previously divided by; the resulting re-normalized position once again belongs to our default distribution. (Without the update, it would be in a different distribution, for the same reason that, given three die rolls, each may be uniformly random among the faces, but after removing the lowest roll, the remaining rolls can no longer be considered uniformly random.)

Note that in the limit case of a card with zero probability to be drawn, such as the X during its removal from the deck, the division by zero can be intuitively interpreted as a projection to infinity, and if we distribute the multiplication by zero to reverse the division by zero of the original normalized position, while still multiplying by zero the projected position of the card drawn, then the net update to the normalized position is zero, which is correct. Indeed, a card with zero probability to be drawn can simply be skipped from the start of the procedure. In deckbuilding games where most of the cards don't start in the deck, even the calculation of their normalized positions from the parent seed can be deferred until they enter the deck.
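To make this concrete, here is a minimal sketch of the drawing procedure in Python, under a few assumptions of my own: seeds are mixed into per-card streams with SHA-256 (any good keyed PRNG would do), each entry stands for a card type rather than an individual copy (with the number of remaining copies folded into its weight, as in the worked example below), and the `Card` and `draw` names are purely illustrative.

```python
import hashlib
import math

def normalized_positions(parent_seed, card_id):
    """Yield an infinite sequence of Exp(1)-distributed 'normalized positions'
    for one card, derived deterministically from the parent seed."""
    k = 0
    while True:
        # Hash the seed, the card id, and an occurrence index into a uniform in (0, 1).
        digest = hashlib.sha256(f"{parent_seed}/{card_id}/{k}".encode()).digest()
        u = (int.from_bytes(digest[:8], "big") + 1) / (2**64 + 2)
        yield -math.log(1.0 - u)  # quantile function of the exponential distribution
        k += 1

class Card:
    """One card type in the haystack; its multiplicity lives in the weight passed to draw()."""
    def __init__(self, parent_seed, card_id):
        self.card_id = card_id
        self.stream = normalized_positions(parent_seed, card_id)
        self.position = next(self.stream)  # normalized position of the next occurrence

def draw(cards, weights):
    """Draw one card.  `weights` maps card_id to its probability weight of being drawn
    (proportional, need not sum to 1; zero or absent means 'not currently in the deck')."""
    # Project each normalized position by dividing by its weight; skip zero-weight cards.
    projected = {c.card_id: c.position / weights[c.card_id]
                 for c in cards if weights.get(c.card_id, 0) > 0}
    drawn_id = min(projected, key=projected.get)
    drawn_pos = projected[drawn_id]
    for c in cards:
        if c.card_id == drawn_id:
            c.position = next(c.stream)  # consumed; the next number represents the next copy
        elif weights.get(c.card_id, 0) > 0:
            # Re-normalize: subtract the drawn projected position, then undo the projection.
            c.position = (projected[c.card_id] - drawn_pos) * weights[c.card_id]
        # Zero-weight cards are left untouched, per the limit-case argument above.
    return drawn_id
```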

Let's work through the toy example. Let x0 be the normalized position of the X, and y0 that of the first blank. (A uniformly distributed random variable between 0 and 1 can be transformed to an exponential distribution by the exponential distribution's quantile function Q(p) = -ln(1-p).) To draw the first card, we project the position of the X by dividing x0 by 1, and that of the first blank by dividing y0 by 4, the number of blanks in the deck; note that these probability-based weights need not sum to 1, but need only be in proportion to each other. Then for each possible value of x0′ (that is, x0/1), the probability density of that value is e^(-x), and the probability of y0′ (that is, y0/4) being less than a given value x is given by y0′'s cumulative distribution function 1-e^(-4x); therefore, the overall probability of y0′ being less than x0′ is given by the integral of (e^(-x)) * (1-e^(-4x)) from 0 to infinity, which equals 0.8, as expected.
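(As a quick sanity check of that integral, here is a few-line simulation of just the first draw, using the same quantile transform; the trial count is arbitrary.)

```python
import math, random

rng = random.Random(0)
trials = 200_000
blank_first = 0
for _ in range(trials):
    # Quantile transform: uniform (0, 1) -> Exp(1) normalized positions.
    x0 = -math.log(1.0 - rng.random())  # the X
    y0 = -math.log(1.0 - rng.random())  # the blanks, treated as one card type
    # Project by the probability weights: the X has weight 1, the blanks weight 4.
    if y0 / 4 < x0 / 1:
        blank_first += 1

print(blank_first / trials)  # approaches 0.8, the value of the integral above
```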

If the X is drawn first, the rest of the deck is blanks, so the system can hardly go wrong thereafter. Otherwise, y0 is consumed, and we must re-normalize x0 so that it once again belongs to our default distribution. True, it was generated from that distribution, but now that we've decided not to consume it on the basis that it isn't low enough, we can no longer consider that to be its distribution. To re-normalize it, we take x0′, subtract y0′ from it, then multiply it by 1. (If the X had been drawn, we'd re-normalize y0 by taking y0′, subtracting x0′ from it, then multiplying it by 4, though it would hardly matter unless some other non-blank card were in the deck then or later.) For Bob, who keeps the X in the deck, the rest of the math is more of the same with y1 and so on. Alice, however, removes the X from the deck for her second draw, so we don't even need to calculate x0′ (for the new value of x0); technically we could add special logic so that, when all of the cards in the deck are the same, we draw without calculating and consuming y1, but I think it's more elegant not to have this special case. Finally, for the third draw, the X is in the deck again, and we repeat the original procedure to select between x0 (re-normalized once) and y2 (with two blanks remaining, y2′ = y2/2). The calculations to produce Figure 5 are left as an exercise to the reader.
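For anyone who would rather check Figure 5 by simulation than by hand, here is a sketch that plays both Alice's and Bob's strategies from the same normalized positions and tallies the joint outcomes; the variable names and trial count are my own choices, and the printed frequencies should approach the values in Figure 5.

```python
import math, random
from collections import Counter

def exp1(rng):
    return -math.log(1.0 - rng.random())

def play(x0, y, remove_x):
    """Play one toy-example game from shared normalized positions.
    x0: position of the X; y: positions for successive blank occurrences;
    remove_x: True for Alice's strategy, False for Bob's.  Returns the outcome label."""
    # First draw: the X has weight 1, the blanks weight 4.
    if x0 / 1 < y[0] / 4:
        return "X on 1st draw"
    x = (x0 / 1 - y[0] / 4) * 1  # a blank was drawn; re-normalize the X
    prefix = "Remove X" if remove_x else "Keep X"
    # Second draw: three blanks remain; the X is removed (Alice) or has weight 1 (Bob).
    if not remove_x:
        if x / 1 < y[1] / 3:
            return "Keep X; X on 2nd draw"
        x = (x / 1 - y[1] / 3) * 1  # re-normalize the X again
    # If the X was removed, it receives no update (the zero-weight limit case); y[1] is consumed.
    # Third draw: the X is back in the deck and two blanks remain.
    if x / 1 < y[2] / 2:
        return prefix + "; X on 3rd draw"
    return prefix + "; X not drawn"

rng = random.Random(0)
joint = Counter()
trials = 200_000
for _ in range(trials):
    x0, y = exp1(rng), [exp1(rng) for _ in range(3)]
    joint[(play(x0, y, remove_x=False), play(x0, y, remove_x=True))] += 1

for (bob, alice), n in sorted(joint.items()):
    print(f"Bob: {bob:24s} Alice: {alice:24s} {n / trials:.4f}")
```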

For multiplayer games such as autochess, we want a parent seed for each seat, and possibly a grandparent seed for the table to generate these parent seeds, along with any non-seat-specific events, such as the Lucky Lantern in TFT Set 4.5. The result is that the set of cards drawn at a given seat across tables using the same seed will be as similar as possible, diverging only to the extent necessitated by the effects of player actions on probabilities, like the choice to remove the X in the toy example. (The discussion of the interpretive stipulation shows that this system is inadequate to the extent that there are discontinuities in the significance of the timing of cards drawn. If there is some game where the effects of cards tend to vary wildly depending on which turn they're drawn, some other system may be more appropriate, but I think the system in this article would be good for most games.)
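One simple way to derive that hierarchy of seeds (grandparent per table, parent per seat, plus a stream per card) is ordinary hash-based derivation; the scheme below is only illustrative, and the labels are hypothetical.

```python
import hashlib

def child_seed(parent: str, label: str) -> str:
    """Derive a deterministic child seed by hashing the parent seed with a label."""
    return hashlib.sha256(f"{parent}/{label}".encode()).hexdigest()

table_seed = "table-0001"  # grandparent seed, shared by every table in the duplicate event
seat_seeds = [child_seed(table_seed, f"seat-{i}") for i in range(8)]  # parent seed per seat
event_seed = child_seed(table_seed, "shared-events")  # non-seat-specific events

# Each (seat, card) pair then gets its own stream of normalized positions, e.g. by
# hashing seat_seeds[i] with the card id and an occurrence index, as in the earlier sketch.
```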

In some games, a card can be taken from the deck by an ability that differs from drawing in that it makes the card unavailable to the player instead of available, such as Millstone in Magic: the Gathering. If our goal is to generate similar draws for Alice, who gets targeted by Millstone, and Bob, who doesn't, we shouldn't generate the milled cards from the same seed that generates draws; indeed, we could mill cards at random, without using any seed, and correctness would not be violated. Or we could have a separate parent seed, or "shuffle", for card-denying effects, if the game had enough of them to make it worth our while. Then we could classify each effect that takes from the deck as a draw-like effect (which would include effects like Mind's Desire), as a card-denying effect, or as neither. One could argue that an effect like Impulse, which forces the player to keep just one of many cards, is sufficiently different in the context of Magic from the usual experience of having to eke use out of all of the cards one is dealt that it should take from neither of the two aforementioned shuffles; it might use no seed, or, if there are enough such effects in the game, the category could have its own shuffle. (In my humble opinion, I would still consider Impulse a draw-like effect, but there's room for argument.)

Thus far, the system considers two cards similar only if they're identical. But drawing a Volcanic Hammer and drawing a Magma Jet are similar outcomes. There are two possible solutions. One is to classify all cards into a hierarchical taxonomy. For example, Magic cards by cost, then type, then color. In addition to the haystack representing cards as needles, maintain three additional haystacks, representing costs, types, and colors as needles, respectively. To draw a card, first calculate the probabilities in the haystack of costs, in order to select a cost, then calculate the probabilities for that cost in the haystack of types, and so on, until you draw a card of the selected cost, type, and color from the haystack of cards. The other solution is slightly more flexible, and if you're interested in it, I'm available for consulting.
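Here is a rough sketch of the first solution with a two-level hierarchy (cost, then card), rather than the full cost/type/color chain, to keep it short; the card names, counts, and seed labels are made up. Each level is its own haystack, a cost's weight is the total number of remaining cards of that cost, and cards outside the selected cost get weight zero at the card level.

```python
import hashlib, math

def stream(seed, key):
    """Infinite Exp(1) stream of normalized positions for one needle."""
    k = 0
    while True:
        d = hashlib.sha256(f"{seed}/{key}/{k}".encode()).digest()
        u = (int.from_bytes(d[:8], "big") + 1) / (2**64 + 2)
        yield -math.log(1.0 - u)
        k += 1

class Haystack:
    """One haystack of needles; draw() picks the needle with the lowest projected
    position and re-normalizes the rest, as in the earlier sketch."""
    def __init__(self, seed, keys):
        self.streams = {k: stream(seed, k) for k in keys}
        self.pos = {k: next(s) for k, s in self.streams.items()}

    def draw(self, weights):
        proj = {k: self.pos[k] / w for k, w in weights.items() if w > 0}
        winner = min(proj, key=proj.get)
        for k in proj:
            if k == winner:
                self.pos[k] = next(self.streams[k])
            else:
                self.pos[k] = (proj[k] - proj[winner]) * weights[k]
        return winner

# Hypothetical deck: remaining copies keyed by (cost, name).
deck = {(1, "Footman"): 3, (1, "Archer"): 2, (2, "Knight"): 4}

seed = "parent-seed"
cost_haystack = Haystack(seed + "/costs", {cost for cost, _ in deck})
card_haystack = Haystack(seed + "/cards", set(deck))

def draw_card():
    # Select a cost first; its weight is the number of remaining cards of that cost.
    cost_weights = {cost: sum(n for (c, _), n in deck.items() if c == cost)
                    for cost, _ in deck}
    cost = cost_haystack.draw(cost_weights)
    # Then select a card of that cost; cards of other costs get weight zero (skipped).
    card_weights = {k: (n if k[0] == cost else 0) for k, n in deck.items()}
    card = card_haystack.draw(card_weights)
    deck[card] -= 1
    return card

print(draw_card(), draw_card())
```

Single-table correctness is preserved because the probability of selecting a cost times the probability of a card given that cost equals the card's share of the remaining deck; what changes is only which random numbers are re-used across tables, so that cards similar in the taxonomy tend to stay correlated.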

A few last points of minutiae. In autochess, at the beginning of each round, every player draws cards simultaneously. Although it would not violate correctness to generate the draws starting with the same player each time, for the sake of symmetry I think it's more elegant to randomize the order, and as long as it's random, the process might as well have its own seed, derived from the grandparent seed. I should also mention a small optimization for drawing consecutively at the same seat: re-normalization can wait until the drawing is done and be computed only once, subtracting the projected position of the last draw. The caveat is that whenever a card is drawn of which there remain more in the deck, the projected position of its next instance in the haystack must have the projected position of the card drawn added to it; for example, after drawing the second of ten Xs, x2′ must be x2/8 + x1′ instead of just x2/8. Indeed, one can imagine more complicated schemes for delaying the computation of re-normalization in even more cases, but they usually would not be worth the additional complexity to maintain.

Q&A

Q: Is this technique relevant for world or level generation in single-player games?

A: For randomizing among discrete alternatives, if you implement it correctly, then there should be no change from the original probabilities, so the only cost to consider is that it's more complicated to code and maintain. Note that this technique only makes a difference if you ever re-use the same seed with different probabilities. I can think of four cases where that's true:

  1. Where the probabilities vary depending on prior calculations or events in the game or simulation. Technically, the use case for autochess and other competitive games that the article was about falls into this category.
  2. Where the probabilities are user-configurable.
  3. For playtesting, as probabilities are of course configurable from a developer's point of view. I'm skeptical of the implementation effort being worthwhile in this case, unless you have an AI to play through many pairs of same-seed-different-config worlds to generate statistics on dependent variables, such as damage taken per encounter. (The independent variable being the configuration.)
  4. For backward-compatibility of seeds, when probabilities change across versions. I think this is mildly interesting, so in this case it might depend on how strenuous you estimate the implementation to be.

You should use a different system if the alternatives among which to randomize have an obvious sort ordering, such as dangerousness. In that case, you should sort the alternatives into a weighted list and interpret the seeded RNG output number as a position in that list. This behavior is preferable and, as it happens, also easier to implement. The same considerations still apply: the (smaller) implementation cost and the four use cases.
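As a sketch of what I mean, assuming a hypothetical encounter table sorted from least to most dangerous: the same seeded number is interpreted as a position in whichever weighted list the current configuration defines.

```python
import random

def pick_by_position(u, weighted_alternatives):
    """Interpret a single number u in [0, 1) as a position in a weighted list that
    has been sorted by some meaningful ordering (here, least to most dangerous)."""
    total = sum(weight for _, weight in weighted_alternatives)
    threshold = u * total
    running = 0.0
    for alternative, weight in weighted_alternatives:
        running += weight
        if threshold < running:
            return alternative
    return weighted_alternatives[-1][0]  # guard against floating-point edge cases

u = random.Random("encounter-seed").random()  # seeded RNG output, reused across configs

# Two configurations of the same hypothetical encounter table.
easy = [("slime", 5), ("wolf", 3), ("dragon", 1)]
hard = [("slime", 2), ("wolf", 3), ("dragon", 4)]

print(pick_by_position(u, easy), pick_by_position(u, hard))
```

With the same u, a configuration that merely shifts weight toward the dangerous end of the sorted list can only move the result toward that end, which is the kind of across-configuration correlation re-using the seed is meant to buy.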

Related reading

2023-05-27 Addendum: Duplicate Mahjong

In Mahjong, players draw tiles in turn order. If that were all, it would suffice to predetermine the shuffle at the start. Alas, occasionally, turns are skipped (when a tile is called) or a player draws an additional tile (when calling kan). Still, it nearly suffices to predetermine the tiles each player will draw, whatever the timing, by dividing the initial shuffle into four lists. Only near the end does it become possible for a player to run out of tiles to draw while others still have tiles left. In kobalab's implementation, a player in this situation draws from the bottom of the longest other list. All this is not only easier to implement than the general system, it also results in better duplication (just as the ad-hoc amended system for the toy example is better than the general system).
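A rough sketch of that four-list scheme, as I read the description (kobalab's actual implementation surely differs in details such as the dead wall and the initial hands, which are omitted here):

```python
import random

NUM_SEATS = 4

def predetermined_lists(shuffled_wall):
    """Divide one predetermined shuffle into a draw list per seat, interleaved as if
    no call ever disturbed the turn order."""
    return [list(shuffled_wall[i::NUM_SEATS]) for i in range(NUM_SEATS)]

def draw(lists, seat):
    """Draw the next tile for `seat`.  If that seat's list is exhausted, take from
    the bottom of the longest other list, as in the scheme described above."""
    if lists[seat]:
        return lists[seat].pop(0)
    longest = max((l for i, l in enumerate(lists) if i != seat and l),
                  key=len, default=None)
    return longest.pop() if longest else None  # None once the whole wall is empty

# Tiny demonstration with an abstract 16-tile wall instead of a real 136-tile one.
rng = random.Random(2023)
wall = list(range(16))
rng.shuffle(wall)
lists = predetermined_lists(wall)
print([draw(lists, seat) for seat in (0, 1, 2, 3, 0)])
```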

It may be marginally better, though, to apply the general system to choose a tile for a player whose list is exhausted. Instead of taking the tile at the bottom of the longest other list, we can consider eligible all tiles that would eventually be drawn from the bottoms of lists in this way, assuming no more calls occur. Occasionally this worsens the duplication, relative to kobalab's implementation, but it should be better on average, a worthwhile tradeoff. (Another possibility is to only consider eligible all tiles that would eventually be drawn from the bottoms of lists in all scenarios where at most one more call occurs, falling back to the no-calls criterion if no tiles are eligible under this stricter criterion; this removes some of the cases where the tradeoff is made. However, my guess is that the tradeoff is still worthwhile in those marginal cases. But it's hard to say.) Edit: On second thought, I think that it's so unlikely to generate a different result that it's not worth complicating the code.
