Experimental Analysis of
Various Solitaire Games

Jan Wolter
April 10, 2013

Introduction

While programming my Polymorphic Solitaire to play a multitude of different solitaire variations, I became curious about the measurements of the difficulty of various games, and the impact of different methods of card shuffling on the difficulty of the game.

Searching the web, I found estimates of what percentage of solitaire hands are winnable for different games, but those estimates were often wildly inconsistent and there was rarely much explanation of where they came from. Most are probably player's estimates of how many games can be won, which obviously varies widely with the skill of the player. Serious analysis appears only to have been done for a few popular games, notably FreeCell and Klondike, and some of their variations.

I'm not a particularly talented solitaire player. My interest in the subject is mainly that of a programmer. So data on how many games I personally can win would be both uninteresting and embarrassing. But I thought it might be interesting to write programs to solve a variety of solitaire games, and use them to explore what fraction of games it is possible to win.

This doesn't really provide an entirely satisfactory measure of the difficulty of a solitaire game. Some games where nearly all deals can be won are nevertheless very difficult to win. But it is an upper bound on the percentage of games that can be theoretically be won, and we'll be able to more firmly discount the claims of people who say they won 100% of 18,000 games of Canfield.

Natural Shuffling versus Full Randomization

I was also interested in studying the impact of card shuffling on various solitaire games. Computerized solitaire games typically try to completely randomize the order of the cards before each game. Though occasionally done badly, it is not really difficult to do this. (At a philosophical level, there are great difficulties with actually making deterministic machines like computers do anything truly random, but practically speaking, for the purposes of card shuffling, one can come close enough as to make no difference.)

But off the computer, complete randomization is probably the exception rather than the rule. A friend of mine, Mark Conger, is one of several mathematician who have studied card shuffling. It turns out that the classic riffle shuffle really is a good way to randomize a deck of cards, but only if you do a lot of shuffles. The mathematical result that many people have heard is that seven shuffles will do a pretty good job of randomizing a deck of 52 cards. This isn't always really true, many games require less, and some require more, but it's a good rule of thumb.

In any case, it's a good bet that most solitaire players don't shuffle seven times between games. I'd guess that most people only shuffle two to four times. Given that most solitaire games sort the deck a lot while being played, and many are very sensitive to the order of the cards, it seems likely that they may play very differently if the cards are not completely randomized between games.

It seemed to me that many older solitaire games may have been designed to be played this way. It's quite possible that putting them on a computer where the cards are relentlessly randomized before each game makes them play worse, not better. I thought there might be some circumstantial evidence for this in the fact that some of the most popular computer solitaire games, like FreeCell, are games that were little known before the computer age, while classic games like Canfield have become relatively less popular.

Bridge tournaments are often played with computer-shuffled hands now, and there is a lot of controversy over this. Many players think computer shuffling substantially alters the hands, increasing the odds of unusual distributions of cards. Others seem to think it is all in their imaginations. Bridge is a difficult game to analyze. Solitaire is much simpler.

So one of my objectives in these experiments was to try to investigate some of the impact of shuffling techniques on solitaire games. In the end, my overall impression is that people who shuffle three or four times between games aren't going to see spectacular differences between physical and computer solitaire, though the difference is going to be noticeable for some games. Only with really bad shuffling will you see really big differences.

Mark provided me with a nice algorithm for simulating manual shuffles on the computer which is used in all tests, and which can be used when playing games on Politaire.com. It's based on the Gilbert, Shannon and Reeds model of shuffling, often called the GSR model. There is a small amount of experimental evidence suggesting that it isn't a bad model of what people really do when they do manual riffle shuffles. (Though I personally suspect that inexperienced shufflers, or people with unevenly worn decks, may deviate substantially from the model.) I also did a few tests using a simulated overhand shuffle, which mostly just confirmed what everyone already knew - it's not a very good way to shuffle.

Deck Order, Deal Order, and Gather Order

As soon as we start talking about decks that aren't quite random, then the order it is in, and the order we do things in, starts to matter. I needed some terminology, and some standard rules for how to do things, which I'll define here.

A single deck is ordered or in increasing order if, when held face up, the sequence from the bottom card to the top card is:

A♠

2♠

3♠

4♠

5♠

6♠

7♠

8♠

9♠

10♠

J♠

Q♠

K♠

A♥

2♥

3♥

4♥

5♥

6♥

7♥

8♥

9♥

10♥

J♥

Q♥

K♥

A♣

2♣

3♣

4♣

5♣

6♣

7♣

8♣

9♣

10♣

J♣

Q♣

K♣

A♦

2♦

3♦

4♦

5♦

6♦

7♦

8♦

9♦

10♦

J♦

Q♦

K♦

The key thing to remember is that if we flip the deck face down and start dealing cards one-by-one from the top, then they come off in this sequence, first the A♠, followed by the 2♠, and so on.

If we have multiple decks, they are ordered if each deck was separately ordered, and they are just stacked on top of each other.

A reversed deck is in the opposite sequence, in decreasing order so the first card dealt from it would be the K♦.

When we deal out a solitaire games, our normal procedure is to deal the tableau column by column, starting from the left. Note that dealing this way reverses the order of the cards in the columns, so that if we dealt a column of three cards from an ordered deck then the top card would be 3♠, with 2♠ under it, and A♠ on the bottom. Any reserve columns are dealt the same way.

The stock, however, is not flipped over, so the cards are not reversed. If we were dealing an ordered deck out to play Ali Baba, for which we'd need 40 cards on the tableau, leaving 12 cards for the stock, then the first card turned up off the stock would be 2♦ followed by 3♦, and 4♦, and so on.

Sometimes we start games not with a fresh deck, but from the cards gathered up after a previous game. For these tests I used a standard procedure for gathering up the cards:

First we stack up the foundation piles. We do this in ♠♥♣♦ order. If we had two decks, we'd gather them up in ♠♥♣♦♠♥♣♦ order. Because of this, the result of gathering up after a winning one of the many solitaire games where we build the foundations up in suit is an ordered deck.
We gather up any cards remaining on the tableau and/or reserve piles in two passes. First we gather up all face-up cards on the tops of the piles, column-by-column, then we gather up the face-down piles column-by-column. Note that if we are putting them onto a face-up deck, the face-down card piles are flipped, reversing their sequence, but the face-up cards are not.
Finally we gather up the stock and the waste. Again one is face up and one is face-down, so one is reversed and one is not.

Obviously this procedure is somewhat arbitrary, and doesn't represent how all players play. But I had to pick some standard for our tests, and these are the ones I used.

Solvers

My solving programs are written in C, and are built on a shared library of code that is extended as I try new games. My objective is generally to build a solver that is "good enough" to get some statistics, and for many of the simpler solitaire games, this is not hard to do.

The source code is open source and available from the Google Code. You'll need to check it out from the "trunk" subdirectory of the source responsitory.

Most games use a simple depth-first search engine. After cards are dealt, the solver starts a loop where it finds all possible moves, and uses a game-dependent heuristic function to rate the moves. The heuristic function may mark some moves as "safe," moves that can never be wrong. Those are made immediately and will not be choice points if we backtrack in the future. Otherwise, we pick the best rated move and make it, recording it in our move history as a choice point, and saving the rest of the moves on our list with it.

If we reach a point where there are no moves possible, then we start backtracking. We undo moves until we reach a choice point with other saved move alternatives. We then pick the best of those and continue.

After each move, we encode the current game state as a string (the encoding is game dependent), and save that in a large hash table. If we ever return to a state we've been in before, we immediately backtrack. This is common, because often we can reach the same state by removing the same cards in different orders, and in some games it is possible to go in circles. This state hashing is very important to pruning off such redundent searches and getting decent performance.

So the whole thing is actually pretty dim-witted, and for any particular solitaire game it may well be possible to do much, much better. For many games this is good enough, but many of the more complex games would require substantally smarter solvers.

Games Studied

I've run thorough tests on the following games, in the order listed:

Captive Queens: A very simple game, for which, it turns out, seven riffle shuffles are not enough. I discuss that phenomenon a bit.
Trigon: A simple-minded variation of Klondike.
Trigon Left: An even easier version of Trigon.
Canfield: An old casino game. A surprisingly large number of deals are possible to win, given how hard it is to win in practice.
Storehouse: A simple-minded Canfield variation.
Ali Baba: A one-deck variation of Forty Thieves.
Acme: Another old Canfield variant, similar to Storehouse.
Drudgery: A game I invented to give me an excuse to run some tests counting the numbers of inversions/rising sequences in a deck. This includes some experimentation with overhand shuffling.
Agnes Sorel: A Klondike variant without a waste pile.
Indefatigable and Royal Family: Some games where you can build up and down and do redeals which are almost, but not quite, always winnable.
Cruel: A game with unlimited redeals, which is nevertheless fairly hard to win in practice.
Fortress: An old game with simple rules, which is frequently insolvable.
Simple Pairs and Relatives: Various mindless pair removal games are compared.
Golf: A game where you try to build up and down on a single foundation pile.
Black Hole and All-In-A-Row: Two stockless variations of Golf that have previoiusly been analysed by lots of other people.
Thirty Six: A game with a 6x6 tableau that is usually winnable, but still fairly challenging.

Games Partially Studied

Results for a few games where I have more limited tests are summarized below:

Kiev: This game is similar to "Yukon" in that stacks of unsorted cards can be moved, but it resembles "Spider" because there is a stock from which cards are dealt to the tableau and because only complete sequences can be moved to the foundation. A solver was written and the first 10,000 Politaire seeds were tested. 50.7% were found to be solvable.
Dnieper: This game differs from Kiev only in allowing kings to be played on aces in the tableau. A solver was written and the first 10,000 Politaire seeds were tested. 78.2% were found to be solvable.
Sevastopol: This game is like Kiev, but starts with 4 fewer cards dealt to the tableau. A solver was written and the first 10,000 Politaire seeds were tested. 85.6% were found to be solvable.