#PAWNED!
######CMPT317 - KIRK MCCULLOCH – 11146754 ######MARCH 4, 2015
##PROBLEM DESCRIPTION Pawned is a simple game based on chess, which is played on a 5 by 6 board. Each player has five pawns (black or white) placed along either the top or bottom rows. Similar to how a pawn moves in chess, in Pawned a pawn can only move forwards, or attack on the left and right diagonals. Additionally, pawns in Pawned cannot move two spaces on their first move, they are restricted to single space moves. The goal of the game is to get one of your pawns to the opponent's side of the board.
In any given turn, players are required to make a move if a valid move is available. If a player cannot move any of their pawns (e.g. all pawns are stuck with an opponent's pawn in the square directly in front), then the player must pass and the turn changes to the opponent. The game ends when one player reaches the opposite side or all pawns on the board are unable to move, in which case the player with the most pawns wins.
##SOLUTION DESCRIPTION The key to an AI solution for Pawned lies in adversarial search. We have two players, white and black, who are competing against each other to breach the other's defensive line. Like chess, Pawned is a zero-sum game with perfect information. That is, a game in which when one player wins, the other must lose. If we assign values to wins and losses, a zero-sum game always ends with a utility balance of zero. Either one player wins (+1) and one loses (-1) or they tie (+½, -½) and the utility of the game's outcome will always equal zero. Pawned also has perfect information in the sense that no information is hidden from either player. The entire board is visible and it is always possible to determine what moves can be made in any given board state. There is no randomness.
The solution to such a problem lies in a minimax search tree. A minimax search tree is one in which every other level seeks to maximize or minimize utility. Consider such a tree as a tree of max nodes and min nodes. Each max node can only have min nodes as children and each min node can only have max nodes as children. If a max node has two children with utilities of 2 and 5, the node will prefer the node with maximum utility, 5. If a min node has the same two child values, it will prefer the node with minimum utility, 2. The leaf nodes at the bottom are all terminal nodes, which each have inherent utility values and cannot be max or min.
We can think of Pawned as a minimax problem where the white player seeks to maximize utility and the black player seeks to minimize utility. The game can then be represented as a series of alternating moves from each player, with directly opposing goals. The white player's moves all take place on max nodes and the black player's moves all take place on min nodes, in the search tree.
One key drawback of the minimax search is that it has to search through the entire depth and width in the tree down to each leaf node before determining the best option. To make this style of search a viable option we can use Alpha-Beta pruning to eliminate portions of the search tree, which we know will not benefit us. We accomplish this by maintaining two values as we traverse the tree:
- Alpha (α) – The value of the best (highest) choice for max
- Beta (β) – The value of the best (lowest) choice for min
As we traverse the tree, two actions allow us to prune off sections of the tree. Max nodes will check whether the current node's value is greater than or equal (>=) to Beta, and if so we stop searching this branch because we know that the min nodes will ignore values larger than Beta. Similarly, min nodes will check whether the current node's value is less than or equal (<=) to Alpha, and if so we can stop searching this branch because max nodes will ignore values smaller than Alpha.
##IMPLEMENTATION DESCRIPTION For my implementation of Pawned written in C#, I created a base program script and two helper object classes, a Pawn class and BoardState class. The Pawn class is a simple representation of a pawn and allows for a way to associate x and y board coordinates with either the white or black player. The BoardState class is where most of the game takes place.
At the core, each BoardState stores a representation of the board as a 2D string array, as well as a list of white pawns, a list of black pawns, and which player has the next turn. This class performs most of the basic operations needed to make the game functional, such as initializing a fresh game board, printing the game board to the console, and making a deep copy of a BoardState (used to make board states immutable when necessary). The class can also use a given BoardState to calculate all legal moves that a given player can make, move a pawn on the board (this is where immutability is necessary) as defined by a transition string, report whether it is a terminal game state, and calculate a utility value for the state.
The base program script is where the AI search happens. Once I had my base game representation implemented, particularly the move calculation function, I began implementing a simple proprietary minimax algorithm. The minimax player move function takes in a BoardState and a search depth limit. In the case where it is the white player's turn, it will determine which of the available moves has maximum utility and return that transition string. In the case where it is the black player's turn, it will determine which of the available moves has the minimum utility and return that transition string. It will also never search beyond the set search depth. I found that at depths upwards of about 7 or 8 the unpruned minimax search became far too lengthy to expect a human player to wait on it. To improve the speed I added an alternative search with Alpha-Beta pruning. The pruned search produced immediately noticeable improvement and made a depth of 14 possible in less than 1 minute.
There are also a few additional ways I tried to improve the search. The first is in the StateUtility function, which calculates the approximate value for a state. Rather than giving terminal states defined values, I used this function to calculate a value for all states, so that when the search has to stop at the depth limit, it can still determine an intermediate value based on the pawns on the board. The general concept behind the function is that positive outcomes for white (more white pawns, white win) generate positive utility and positive outcomes for black (more black pawns, black win) generate negative utility. When a state really is a terminal state, the function reports a dramatically skewed utility value dependent upon the player who won and how they won. This way, the max (white) player will always prefer a terminal state with a utility of several thousand, and the min (black) player will always prefer a terminal state with negative several thousand utility.
An additional idea I explored for the utility function, but did not completely implement, was to employ a kind of passed-pawn logic to find states that may not be terminal states, but had a guaranteed terminal state within a few moves. The idea is that if a player has a pawn at the middle point and there exist no enemy pawns that can intercept it, then that pawn is guaranteed to win within three moves. If we consider a board, where white has a pawn in the center row and third column up, then the triangle with a height of 3 squares and base of 5 squares directly in front of the white pawn is the area where an enemy pawn could intercept the white pawn. If we check this area relative to all white pawns on the board in the third row or better, we can find cases where a player is certain to win in a minimum number of moves. We can then apply a boost to the utility (either positive or negative, depending on player) for all board states where this occurs.
I also tested out a method of logging board states to build an implicit tree as a way to try to improve performance and avoid recalculating the same board states with each turn. I did this by implementing a key/value list of successor states with each BoardState. During search, if a successor state does not already exist, it gets created and the move string and new state are added to the collection for future reference. However, what I found was that storing those precalculated states used an excessive amount of memory without giving significant speed gain. I will expand on this more in the results section.
Additionally, I had been doing something similar before implementing the key/value collection. For the sake of keeping my code concise, I had created an attribute in the BoardState class to store the calculated set of move strings and mapped it to a C# property, which would only calculate the move set if not already computed. The result is an effective way to log board states into an implicit tree, without storing numerous BoardState objects. By logging only the move transition strings, I was able to nearly double performance speed with very little additional memory usage.
##RESULTS To test my implementation I wanted to see how the AI would perform both against itself, playing both white and black, and against a person. To facilitate this I have two main testing functions: an automated game of Pawned that reports the total running time from beginning to end, and a single player game that reports only the time taken to compute AI moves.
To test the time performance I performed several comparison tests with the automated game, to ensure a consistent move pattern. To play these games myself would make the move pattern inconsistent, because I would not always play the same way. The AI will always stay consistent. First, a comparison of Minimax vs Alpha-Beta search at depth 8.
Figure 1 - Alpha-Beta on the left, Minimax on the right
Here, the Alpha-Beta search as no trouble at all while the Minimax lags behind, but still completes within a reasonably fast period. Next, I want to try a depth that would cause Alpha-Beta to have noticeable slowdown, depth 10.
Figure 2 - Alpha-Beta on the left, Minimax on the right
Here, Alpha-Beta took noticeably longer to complete, but still completed in less than 4 seconds. Minimax on the other hand took over 4 minutes to complete. Clearly, the Alpha-Beta pruning is cutting out a huge portion of the search tree, and they both arrived at the same terminal state. Next, I want to stress the limits of Alpha-Beta by running a game at depth 15.
Figure 3 - Alpha-Beta running at depth 15
It took almost 9 minutes to complete this game. Depth 15 seems to be just beyond the feasibly acceptable limit for real time games. All further tests were ran at depth 14 or less.
Next, I want to talk about the logging of board states. When I tried running, the search with logging on at depth 14, it used so much memory so quickly that it crashed before making a single move. I ran it again at depth 10 and it completed in about 5 seconds.
Figure 4 - Top crash at depth 14, bottom complete at depth 10, both with state logging
For comparison, I ran without logging at depth 14 and depth 10. Both used negligible memory (< 10MB).
Figure 5 - Left depth 14, right depth 10
Finally, I want to compare the logging of move transition strings against logging nothing at all. This test was run at depth 14. By logging the move strings the computation time is roughly cut in half.
Figure 6 - Left logging move strings, right logging nothing
In terms of performance against a person, the AI performs poorly at low depths. At a low depth such as 5 or 8, the search simply does not go deep enough in the tree to find any substantial utility changes. Regardless of what moves the player chooses the AI seems to follow a consistent pattern, rarely adjusting to counteract the player's moves. Small depth searches are fast, but dumb and very easily beaten by a human. When playing a game against a depth 14 search the difficulty of the game becomes more noticeable. The AI will take steps to counteract the player's moves earlier and I found I actually had to actively think about my next move to avoid losing. I lost several times, but depth 14 can still be beaten easily with a little care and attention. Depth 14 is also still quite fast. Over the course of a game, the AI usually only spent about 18 – 20 seconds searching.
##CONCLUSIONS Ultimately, my chief goal is to increase the depth as much as possible. Greater depth directly translates into increased "intelligence", but it also leads to greater processing time and memory usage if state logging is used. State logging is something that could potentially speed up the search at extremely deeper depths, but it requires so much memory to accomplish that the search cannot run effectively beyond shallow depths. Any performance increase that may come from state logging cannot be reliably measured at shallow depths. When the depth gets too deep, the search reaches a hard limit and the program crashes. Considering that tree logging creates a hard depth limit from how much memory is available, I judge no logging to be the better method. It allows me to increase the depth well beyond my memory limitation and the only limiting factor is how long I am willing to wait on the AI.
Since the AI without logging can run at deep depths with low memory impact and comparable speed, it could also run at extremely deeper depths where the AI with logging would have crashed without completion. By not using state logging, an Alpha-Beta search could theoretically be used to solve the game. It would take an extremely long time, but the program would be able to complete with a modest amount of memory.
The deeper I am able to make the depth, the smarter the AI effectively gets. I was able to beat a depth 14 AI with some practice, but deeper depths could stump me. Games ran at depth 15 took roughly 9 minutes to complete when entirely ran by the AI. In practice, the first move took most of that time (~7 minutes) and the rest of the game ran much faster. It could be feasible to play a game against a depth 15 opponent, with enough patience. Beyond depth 15, it is probably not realistic to expect a person to wait on the AI.