Marlin’s Gottahack
– 1998 Christmas Bonus Page – PREV HOME NEXT
Analysis
of the game of Skunk
Rules of the game
The game of SKUNK is played with 2 people and
a pair of dice. The object is to be the first to accumulate 100 points. When
you have the dice you may roll as often as you like, accumulating the point
total on the dice UNLESS you roll a 1 on one of the dice. If you roll a 1, you
have been skunked and lose all the points you have been accumulating on this
turn and must pass the dice to your opponent. If you think you have accumulated
enough points on this turn and you have not yet rolled a 1, you may choose to
quit your turn and "lock in" the points you have accumulated on this
round. Players take turns with the dice trying to get to 100. Through out the
game a single 1 on the dice eliminates the points you have accumulated on this
round, but snake eyes, two 1s in a single dice throw, eliminates all the points
you have accumulated in the game so far.
The above is a quick paraphrase of the game
as I learned it as a kid. Below is an analysis of the optimal strategy for the
game. It tells you when you should try to go for more points and when you
should lock in your accumulated earnings. First I will tell you how to read the
chart then I will explain how the
chart was derived. The chart is large and gives you all the exact numbers. I also
have a picture that
summarized most but not quite all of the strategy in a single easy to grok
format.
How to read the chart
Each line starts out with two numbers in
parenthesis. These are your score followed by your opponents score. The next
number, the floating point one, is your probability of winning the game, thus
you can see from the first line, (0,0) 0.5195594 the advantage to the player
who goes first. Having the dice in your hands gives you a slight edge over 50-50.
As you can also see going down the list, if your opponent went first,
accumulated only 5 points and locked it in, those 5 points is enough of a lead
to give him the higher probability of winning (your chance is .4973309)
The next number (or numbers) after the dash
represents the strategic points in the game. When you have a single number, as
in the case of (0,0) - 21 this means that if you start your turn at (0,0) you
should keep rolling until you have collected 21 or more points. Note it also
means that if you have 21 or more points, you should stop rolling and lock in
those points.
The meaning of additional numbers is a
strategic toggle of state. You should roll if your accumulation for the turn is
less than the first number. You should not roll if it is less than the second
number but greater than or equal to the first. You should roll if it is less
that the third but greater than or equal to the second, etc.
For example, the strategy at (0,87) is 33 44
100. This means that if you have less than 33 points you should keep rolling.
If you have between 33 and 43 points inclusive, you should stop and lock in
your points. If by random chance you have accumulated exactly 32 points, you
should roll again because it is less than the first stop point, 33. If you roll
anything less than 12 you will stop rolling and lock in your score, but it you
roll 12, your accumulated points are 44 which is a GO numbers (think of the
strategy list as an alternation of STOP numbers and GO numbers) and you should
go for the whole 100 points.
Lastly, I will point out that not all the
"strategic" numbers are achievable. For example, the strategy for
(0,67) is 28 96 100. This says STOP if you have accumulated 28 points and GO if
you have accumulated 96. Numerically this is correct. It says that if you have
made it to within 4 points away from winning, even though all of it is at risk,
you should go for the win. However if you are playing the strategy, you would
have stopped rolling back at 28 (or there abouts) and never gotten to the state
of having 96 accumulated points.
I could have written the code to eliminate
these "unreachable strategic" sections but they are actually
revealing and help explain certain catastrophic events. For example, scan
through the (0,x) portion of the game. You will see that if you are nearly tied
with you opponent you go for 21 accumulated points. As your opponent pulls
ahead of you, you try to go a little higher, 22,23,24 but even when opponent is
50 points ahead you still only go as high as 25 before you lock in. However,
looking at the line (0,67) 28 96 100 and also (67,0) .858472 you can see that
by the time the opponent is 67 points ahead if you give him the dice he has an
85% chance of wining. The basic notion of locking in a score to live to fight another
day is no longer a viable strategy. i.e. if you have accumulated 96 points and
you lock it in, the opponent gets the dice and has a fair chance to beat you on
his next turn. Your better strategic choice is risk losing the entire 96 points
and try to win directly.
The "catastrophic" event occurs
right around (0,87)
(0,86) 33 53 100
(0,87) 33 44 100
(0,88) 100
At 86 the strategy is to lock in 33 points
and live to fight again. The 53 to 100 portion is unreachable, but that
unreachable portion is much wider than it was. Your opponent is now so close to
winning, if you could get to 53 points, you ought to go for it. At 87 the Gap
has narrowed even more to the point that the upper strategic range is
reachable, you can hop from 32 points (a GO number) to 44 points (a GO number).
By the time your opponent has 88 points the gap has narrowed so much that it
has vanished. Your strategy has suddenly gone from "try to lock in 33
points" to "Go for the whole wad". In Mathematics, this sudden
change from one fundamental strategy to a very different one is called a
catastrophe and is what catastrophe theory is all about.
So, now you know enough to browse through the
chart and observe the curious shape of the strategy manifold. All my reasoning
above to describe why the strategy changes is just my rationalization of the
numbers that I see in this chart. Observe the curious retrograde motion from
(0,75) up to (0,82). Weird stuff!
How the chart was derived
Finally a note on how I get to this table and
the final caveat on why this might not be the final nail in the coffin. The
real impetus for this was the realization that these crappy little desktop
machines are really not that crappy and little anymore. I can just haul off in
VB and allocate an array with a million elements and being able to hammer out a
million calculations a second you can actually work on problems of that
magnitude.
I build an array P(M,Y,R) with dimensions 100
x 100 x 100. It represents all the states in the game. M is My Score, Y is Your
Score and R is the amount that I have accumulated so far in my turn and is the
amount at Risk should I happen to get a 1.
The actual array holds the probability that I
will win if I am in that state. At this point we have an enormous million state
Markov process. If I am in state M,Y,R I must make a strategic decision either
to NOT ROLL (in which case I will lock in my earning and move to the state
Y,M+R,0 where my probability of winning is 1 - p(Y,M+R,0) cause the opponent
has the dice) or to ROLL (in which case I have a 1/36 probability of losing
everything; moving to state Y,0,0 or a 10/36 probability of losing the turn
accumulation; moving to state Y,M,0 or various probabilities of getting more
points and moving to states M,Y,R+n )
Strategically my decision in any state is
simply to do that thing that will probabilistically move me to a state with a
greater probability of winning. If I know what I will do in any given
situation, I can calculate the probability of winning from any state. If I know
the probability of winning in any state, I can decide what the strategically
correct thing is to do. How do you get around this chicken and egg problem? I
just load up the probability array with random probabilities and start
converging. The Correct probabilities with the Correct strategic decisions is a
single fixed point is a space of random numbers that transform according to the
rules outlined above. ie. p(M,Y,R) = the maximum of (1-p(Y,M+R,0), the weighted
sum of 1/36*(1-p(Y,0,0)) + 10/36*(1-p(Y,M,0) + 1/36*p(M,Y,R+4) + 2/36*p(M,Y,R+5)+...+
1/36*p(M,Y,R+12))
The leap of faith is that the sucker will
converge. I just pick a random state M,Y,R and calculate its
"correct" value using the formula above. Thus that one number in the
array bears the proper relationship to its neighbors, even though they be
random junk. Then I pick another random state and do it again. I keep track of
the convergence by looking at whether the number I just calculated is close to
the number that was previously in the array. What can I say? It appears to
work. I believe it must work but have no proof. The probabilities start random
and the strategy starts pretty random. It begins to settle, the strategy starts
to freeze, the probabilities start to freeze, it gets to the point that
everything is consistent out to 4 decimals. Do I have the correct result?
I see two possible sources of errors. 1)
since I cruise over the array at random, it is possible that some entries have
had more convergence than others. I it POSSIBLE (but not very likely) that one
of the numbers in the array has never been touched, is still random junk. But
if that were true, there would be downstream ripples from that junk and you
would see funny effects in the strategy as a result of misevaluation of the
probability for a given state. Is that
the reason for the funny retrograde motion? I
don't know yet. This seems an unlikely source of error. I have run for hours
taking about 100,000 convergence steps per second. The cumulative error in
those 100,000 steps is typically less than .01 giving me several digits of
accuracy.
2) the other possible error is that lacking a
proof that there is but a single fixed point, It is possible that I have merely
converged on a single consistent strategy, one of many. I don't believe that
this is possible. I think that like any kind of differential equation
relaxation solution there is a uniqueness theorem - if you find a solution, you
have found THE solution. The basic reason for this is that the edges are nailed
down. The state (100,x,0) is an absolute
winner. So is the state (50,96,50). You have
50 points in the bank and you have accumulated 50 more. There is no strategic
decision at this point, you have won the game. Probability is 1! I believe that
these points FORCE their immediate neighbors into fixed states and once those
neighbors are fixed they start forcing their neighbors. It reminds me of a heat
equation, where you hold the edge of some sheet at various fixed temperatures
and the thing must settle down into a single state that matches the boundary
conditions and matched the local rules of the heat equation.
So it is possible that the strategy array
that follows has a few local defects due to error #1 and it is possible by
error #2 that I have deluded myself and the entire thing is but a castle built
in the air. None the less, I feel that I am now ready to go to Vegas and make a
killing playing SKUNK with the pros.
Enjoy!