Saturday, March 10, 2012

Match equity table: post-Crawford

I'm starting to look at match equity now: the expected points won in a match, including the cube actions (my first foray into doubling cube strategy). A certain win counts as match equity +1, and a certain loss, -1. In general, match equity equals twice the probability of match win less one.
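The conversion between match-winning probability and match equity can be written down directly. This is just a small sketch; the function names are my own for illustration.

```python
def equity_from_win_prob(p_win: float) -> float:
    """Match equity on the [-1, +1] scale from match-winning probability."""
    return 2.0 * p_win - 1.0


def win_prob_from_equity(equity: float) -> float:
    """Inverse conversion: match-winning probability from match equity."""
    return (equity + 1.0) / 2.0


if __name__ == "__main__":
    print(equity_from_win_prob(1.0))   # certain win
    print(equity_from_win_prob(0.0))   # certain loss
    print(win_prob_from_equity(0.0))   # even match
```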

A "match equity table" shows the match equity when two evenly matched players are playing (so each has a 50% chance of winning any single game), for different states of the match. The states are named by how many points each player is from winning the match: for example "2-away, 3-away" means the player is two points from winning the match and the opponent is three points away. That is, a match to 7 where the player has 5 points and the opponent has 4 is no different from a match to 11 where the player has 9 points and the opponent has 8.
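As a quick illustration of the "n-away" naming, here is a minimal sketch that turns a match length and a score into the state label. The function name and argument order are my own assumptions, not part of any standard library.

```python
def away_state(match_length: int, player_score: int, opponent_score: int) -> tuple[int, int]:
    """Return (player n-away, opponent n-away) for a match to match_length points."""
    return match_length - player_score, match_length - opponent_score


if __name__ == "__main__":
    # The two examples from the text map to the same state.
    print(away_state(7, 5, 4))
    print(away_state(11, 9, 8))
```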

Match equities in a match equity table correspond to match equities before the first die is thrown in a game - so before the players even know who plays first.

Match equity tables assume the Crawford rule is in force. The Crawford rule is common to match play and states that the first time one of the players is one point away from a match win, the doubling cube is out of play for that game, so the opponent cannot double. After the "Crawford" game is over, the opponent is free to double again (and indeed will, at first opportunity).

I've found two illuminating references for match equity tables: one that describes the algorithm for generating a match equity table (which I summarize below, with a few additions); and one that shows a modern match equity table based on GNUbg rollouts (though it shows match-winning probabilities instead of equities, which is an easy conversion).

The general approach to building the match equity table is backward induction: start with late-match known states and step backward in the match, weighting later match states by the probability of reaching them.

The first place to start is "post-Crawford" match equities. These correspond to late-match states after the Crawford game has been played: the player is 1-away, the opponent is n-away, and the opponent will double at first opportunity.

Here is my post-Crawford match equity table out to 12-away (the player is 1-away; the columns give how far away the opponent is):


          1-away   2-away   3-away   4-away   5-away   6-away   7-away   8-away   9-away   10-away  11-away  12-away
1-away    0        0.031    0.363    0.391    0.627    0.648    0.775    0.790    0.866    0.876    0.920    0.926

Here's how I calculated it:

Call the post-Crawford match equity Mp(n), where n is the number of points the opponent is away from a win. These can also be built by backward induction, starting from n = 1.

Mp(1) is where both players are 1-away. In this case both players are equally likely to win the game (remember, the equity corresponds to right before the first die is thrown, and we assume both players are equally good). So the match equity in this state is zero: Mp(1) = 0.

Mp(2) is a bit trickier. The opponent will double at first opportunity, and the player will take (accept the doubling cube) if the odds are in his favor, but pass (turn it away and give up the game) if not, since a pass just means they start again at 1-away, 1-away with even odds. If the player takes the cube he cannot double again since he's already 1-away.

This means Mp(2) is close to zero, but not zero exactly, since the opponent can double only after a turn or two, and the first one or two rolls will change the odds. That is, if the player wins the first roll, he makes a move, then the opponent can double. If the opponent wins the first roll, he makes a move, then the player makes a move, and only then can the opponent offer the double.

We can calculate that using our cubeless player (the bot). If the player wins the first roll, there are fifteen possible states, with a probability of 1/30 of reaching each (15 non-double opening rolls, and a 50% chance that he plays first). For each of those fifteen rolls we use the bot to calculate the best move, as well as the odds of winning the game after the move. We care only about the probability of any win, since a regular win, gammon, or backgammon all win the match for either player.

If the opponent wins the first roll we have to step two rolls deep into the game. The probability of each state is either 1/30 * 1/18 = 1/540 or 1/30 * 1/36 = 1/1080, depending on whether the player's reply roll is mixed or a double (the opening roll itself is never a double). Again, we use the cubeless player to both make the optimal moves and tell us the game-winning probability after each move.
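To check the bookkeeping, here is a minimal sketch that enumerates the roll sequences leading up to the opponent's first chance to double, with the weight of each. It only tracks rolls: in the real calculation each sequence maps to a board position chosen by the bot, which is not modeled here. All names are my own for illustration.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

# The 15 non-double rolls possible on the opening roll.
OPENING_ROLLS = list(combinations(range(1, 7), 2))
# The 21 distinct rolls possible on any later turn (15 mixed + 6 doubles).
ALL_ROLLS = list(combinations_with_replacement(range(1, 7), 2))


def roll_probability(roll):
    """Probability of a roll on a normal turn: 1/36 for doubles, 1/18 otherwise."""
    a, b = roll
    return Fraction(1, 36) if a == b else Fraction(1, 18)


def pre_double_states():
    """Return [(key, weight)] for every roll sequence before the first double."""
    states = []
    # Player wins the opening roll: one move, then the opponent can double.
    for opening in OPENING_ROLLS:
        states.append((("player", opening), Fraction(1, 30)))
    # Opponent wins the opening roll: opponent moves, player replies,
    # and only then can the opponent double.
    for opening in OPENING_ROLLS:
        for reply in ALL_ROLLS:
            weight = Fraction(1, 30) * roll_probability(reply)
            states.append((("opponent", opening, reply), weight))
    return states


if __name__ == "__main__":
    states = pre_double_states()
    print(len(states), sum(w for _, w in states))
```

Using exact fractions makes it easy to confirm that the weights add up to one before any bot estimates enter the picture.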

One caveat: the bot's estimates of probabilities are only approximate, so when you use the states defined above and calculate the probability of a player win by weighting the game-winning probability in a state by the probability of reaching that state, you do not get exactly 50% (as you should). Generally it is close, but could be off by a few tenths of a percent. Mp(2) is going to be sensitive to small differences from 50% in probabilities, so this could be a significant bias.

I decided to normalize for this by adding a constant offset to all game-winning probabilities in the two-deep states (where the opponent wins the first roll) such that after the offset, the calculated probability of the player winning is exactly 50%.
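The offset follows from requiring the weighted average win probability to equal 50%. A minimal sketch, assuming each state is given as a (weight, win probability) pair; the function name and the toy numbers are made up for illustration and are not bot output.

```python
def normalization_offset(one_deep, two_deep, target=0.5):
    """Constant to add to every two-deep win probability so that the
    weighted average over all states equals target.

    one_deep, two_deep: lists of (weight, p_win) pairs.
    """
    total = sum(w * p for w, p in one_deep) + sum(w * p for w, p in two_deep)
    two_deep_weight = sum(w for w, _ in two_deep)
    return (target - total) / two_deep_weight


if __name__ == "__main__":
    # Toy inputs only: one one-deep state and two two-deep states.
    one_deep = [(0.5, 0.52)]
    two_deep = [(0.25, 0.47), (0.25, 0.48)]
    offset = normalization_offset(one_deep, two_deep)
    adjusted = [(w, p + offset) for w, p in two_deep]
    print(offset, sum(w * p for w, p in one_deep + adjusted))
```

Since the two-deep states carry half of the total weight, the offset works out to twice the shortfall from 50%.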

Then I calculated Mp(2) as the expected match equity over all those states (15 when the player wins the first roll and 15*21=315 when the opponent wins the first roll, for a total of 330). In each state the match equity if the player takes the cube is 2*(probability of win) - 1, and the match equity if he passes is zero. The state's match equity is the larger of the two, and Mp(2) is the probability-weighted sum of those values over all states.
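That take-or-pass rule can be sketched in a few lines. This assumes the states are supplied as (weight, win probability) pairs with weights summing to one; the function name and the toy inputs are mine, and the win probabilities would really come from the bot.

```python
def post_crawford_2away(states):
    """Mp(2): weighted average of max(take equity, pass equity) over states.

    states: iterable of (weight, p_win) pairs, weights summing to 1.
    """
    pass_equity = 0.0  # a pass leaves the match at 1-away, 1-away
    total = 0.0
    for weight, p_win in states:
        take_equity = 2.0 * p_win - 1.0  # any win wins the match, any loss loses it
        total += weight * max(take_equity, pass_equity)
    return total


if __name__ == "__main__":
    # Toy inputs only: the player takes in the first state and passes in the second.
    print(post_crawford_2away([(0.5, 0.6), (0.5, 0.4)]))
```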

Using 2-ply Player 3.2 to estimate the moves & game-winning probabilities in each state, I get Mp(2) = 0.0310. This ties out well with the value in the second reference (rounded to 0.03).

Mp(3) is easier. When the player is doubled he'll always take the cube because the take match equity is always much higher than the pass match equity (which equals Mp(2), since a pass gives the opponent a game and takes him from 3-away to 2-away). The match equity after the game depends on who wins what. If the player wins any kind of game, he wins the match, for a match equity of +1. That happens 50% of the time since the players are symmetric.

When the player loses a single game, doubled, the match ends up at 1-away, 1-away, with a match equity of zero (Mp(1) above). If the player loses a gammon (or backgammon), the match is over and the match equity is -1.

So Mp(3) = +1 * 0.5 + 0 * ( 0.5 - Pg ) - 1 * Pg = 0.5 - Pg

where Pg is the probability that the player loses a gammon.

To calculate this we need an estimate of the probability of losing (or winning, as it's symmetric before the first die is thrown in a game) a gammon. The standard when generating match equity tables is just to specify this a priori; the first reference uses 20% and the second 26%. We can do a little better, since we already have the bot's estimates of all the network probabilities in the 330 possible states before a double, so we can calculate it. Using 2-ply Player 3.2 I get a gammon probability of 27.47%, which represents the probability that a game ends in a gammon for either side. This is a cubeless estimate, but that's fine, since after the opponent (immediately) doubles and the player takes, the cube is dead.

Half of those gammons are lost by the player, so Pg = 0.2747/2 = 0.13735. Plugging that in gives Mp(3) = 0.363.
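The Mp(3) arithmetic is simple enough to check by hand, but here is a small sketch that spells out the three outcomes. The function name is my own; the input is the total gammon rate (either side), as in the text.

```python
def post_crawford_3away(gammon_rate: float) -> float:
    """Mp(3) from the probability that a game ends in a gammon for either side."""
    p_win = 0.5                            # any win takes the match: equity +1
    p_lose_gammon = gammon_rate / 2.0      # doubled gammon loses the match: equity -1
    p_lose_single = 0.5 - p_lose_gammon    # doubled single loss: 1-away, 1-away, equity 0
    return 1.0 * p_win + 0.0 * p_lose_single - 1.0 * p_lose_gammon


if __name__ == "__main__":
    print(round(post_crawford_3away(0.2747), 3))  # bot-estimated gammon rate
    print(round(post_crawford_3away(0.20), 3))    # a priori rate from the first reference
```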

Mp(4) is calculated like Mp(2), except that the match equity in the "take" states is slightly different. If the player wins any kind of game the match equity is still +1; if he loses a single game the match equity is Mp(2); if he loses a gammon (or backgammon) the match equity is -1. As with Mp(2), the probabilities here (wins, gammon losses, etc) are specific to each state and estimated by the bot. At each point you compare that state-dependent "take" equity with the pass equity (Mp(3) in this case).

A similar algorithm is used for Mp(n) for n>4: for each state calculate the state's match equity as the maximum of the state-dependent "take" equity and the state-independent "pass" equity, and then average over all states.
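Here is a hedged sketch of that general step. The text above only spells out n up to 4; for larger n I am assuming the usual scoring, where a doubled single loss costs 2 points, a doubled gammon 4, and a doubled backgammon 6, so the player lands on Mp(n-2), Mp(n-4) or Mp(n-6), or loses the match if the opponent reaches zero. The function names and the state tuple layout are made up for illustration, and the probabilities would come from the bot.

```python
def equity_after_loss(n, points_lost, mp):
    """Player's match equity after the opponent (n-away) wins points_lost points."""
    remaining = n - points_lost
    if remaining <= 0:
        return -1.0  # opponent has won the match
    return mp[remaining]


def post_crawford_equity(n, states, mp):
    """Mp(n) from per-state outcome probabilities and earlier Mp values.

    states: iterable of (weight, p_win, p_lose_single, p_lose_gammon,
            p_lose_backgammon), with weights summing to 1.
    mp:     dict mapping k -> Mp(k) for every k < n.
    """
    pass_equity = mp[n - 1]  # passing gives the opponent one point
    total = 0.0
    for weight, p_win, p_single, p_gammon, p_backgammon in states:
        take_equity = (
            p_win * 1.0  # any win by the player wins the match
            + p_single * equity_after_loss(n, 2, mp)
            + p_gammon * equity_after_loss(n, 4, mp)
            + p_backgammon * equity_after_loss(n, 6, mp)
        )
        total += weight * max(take_equity, pass_equity)
    return total


if __name__ == "__main__":
    mp = {1: 0.0, 2: 0.031}
    # Single symmetric state reproducing the Mp(3) arithmetic from the text.
    mp[3] = post_crawford_equity(3, [(1.0, 0.5, 0.36265, 0.13735, 0.0)], mp)
    print(round(mp[3], 3))
```

Filling the dictionary in increasing n gives the backward induction: each Mp(n) only needs values already computed.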
