Experimental Analysis of
Drudgery Solitaire

Jan Wolter
April 22, 2013

[General Introduction]

The Game

After experimenting on the game Captive Queens, I got curious about how shuffling impacted games that are sensitive to the number of rising sequences in a deck. So I invented a game that modifies the rules of Captive Queens to emphasize rising sequences.

Drudgery is played as follows:

Note that we have destroyed the two features of Captive Queens that we said made it a playable in our discussion of that game. We no longer have multiple short foundation piles, Instead we have one really, really long one. And we no longer have a useful waste that can be used to reverse subsequences.

So if we start with a sorted deck, we can win Drudgery in one pass, but even a single card out of place would be enough to force another. If we start with a reversed deck, 52 passes will be needed. In fact, we can predict the number of passes needed by counting the number of pairs of consecutive cards that are out of order in the starting deck. That is, if card c is behind card c+1 in the deck then that's a "consecutive card inversion," and each consecutive card inversion is going to force us to do one more pass.

It turns out that the mathematicians who study card shuffling prefer to think in terms of "rising sequences", but it's really the same thing. A "rising sequence" is a group of consecutive cards which are still in the original order in the shuffled deck. A deck with n consecutive card inversions has n+1 "rising sequences".

In completely randomized decks the number of consecutive card inversions is going to be 25.5 on average. We know this because the reverse of a deck with n inversions always has 51-n inversions, so the average over all possible permutations of the deck must be 51/2. In fact, the number of inversions is almost always fairly close to that value. 95% of the time it is between 22 and 29. So Drudgery players will usually need 23 to 30 passes to finish the game, and will almost never require less than 17 or more than 35 passes.

What if we riffle shuffle instead? It turns out that if we start with a fully sorted deck and do one riffle shuffle, then we always introduce exactly one inversion. That's right, just one. Suppose when we split the deck in half we split it between card c and card c+1. After a riffle card c+1 will (almost) certainly fall before card c, but every other pair of consecutive cards will still be in their original order. (The GSR model allows for the possibility that the shuffle is done so badly that the cards end up in the original order and no inversions are added, but the chance of this is vanishingly small.)

A second shuffle gets us three or four inversions, The graph below shows experimentally determined numbers of inversions for three, five and seven shuffles in comparison to the number of inversions in a fully randomized deck shown in red. Note that we are only shuffling in this experiment, there is no cut.

Numbers of Inversions in One Million Shuffled Decks
Three Shuffles; Five Shuffles; Seven Shuffles; Fully Randomized

So Drudgery, which takes an average of 26.5 passes to win with fully randomized decks, requires only 23.7 passes with decks shuffled seven times.

The inadequacy of seven shuffles in this case can readily understood if we go a bit deeper into the mathematics of riffle shuffling.

Normally you shuffle by splitting the deck in two, but if you happen to be Vishnu, the four-armed God of the Hindus, you might split the deck instead into four roughly equal chunks and riffle them all together at the same time. This is called a "4-shuffle" as opposed to the conventional method used by two-handed mortals, which is called a "2-shuffle".

It turns out that, under the GSR model of shuffling, a 4-shuffle has exactly the same effect as two consecutive 2-shuffles. In general, performing n 2-shuffles is the same as performing one 2n shuffle.

So seven shuffles is like distributing our 52 cards randomly into 128 piles, and then riffling all those piles (or at least the non-empty ones) together randomly. That's pretty good randomization for most purposes, but when distributing 52 cards into 128 piles there are likely to be some piles of two or more cards. The cards will always be consecutive cards, and those cards will never be inverted. To really be able to invert everything, we need enough different piles so that the probability of any piles containing more than 2 cards starts to get much smaller.

Each additional shuffle doubles the number of piles, so you'd think that the odds of getting two cards in the same pile would drop rapidly, but as is seen in the Birthday Paradox this probability is higher than you would expect. So the odds of the ordering between some pairs of cards being "protected" by always falling into the same packet on every 2-shuffle remain significant even after many shuffles.

This aspect of Drudgery and Captive Queens is present to some degree in many solitaire games, but in more complex games that offer more tools for fixing inversions, it tends to be less critical.

Overhand Shuffles

Breaking up consecutive card inversions (or, to look at another way, "rising sequences") seems to be a weak point for the riffle shuffle. I got to wondering if overhand shuffles might do better on this point.

An overhand shuffle consists of removing packets of cards off the top of the deck, and piling them up in reverse order. Various different handgrips are used, and people on the Internet argue about which is the right one, but in the end it doesn't matter because it's a terrible shuffle. Not only does it take many more shuffles than anyone is likely to do to really randomize the deck, but it is also easy to cheat with, allowing magicians and card sharks to relatively easily bring particular cards to the top of the deck.

Mathematicians have studied the overhand shuffle a bit, here and here. I borrowed their model for my tests. They pick a probability, p, that a packet boundary will fall between any particular pair of cards. Reasonably capable overhand shufflers seem to have a p value around 1/4, meaning a 52 card deck will be broken into about 13 or 14 packets. Pemantle observes that some people's p values is more like 1/15, and that this scarcely shuffles at all. I used p=1/4 for my tests.

So I ran this, and the riffle shuffle 100,000 times for each different number of shuffles and got this graph:

Mean Numbers of Inversions in 100,000 Shuffled Decks
Riffle Shuffles; Overhand Shuffles

The first thing I did, of course, was spend a lot of time examining my program to see if that sinusoidal pattern for the overhand shuffle was an artifact of my code, but I don't think it is.

Overhand shuffling is really like a multi-way cut, You don't generally want to cut the deck twice, as there is a good chance that your second cut will undo your first cut. I think the same thing happens with overhand shuffles. You really want to do them in odd numbers.

But it looks like overhand shuffles are actually fairly good if your only interest is consecutive card inversions / rising sequences. Seven overhand shuffles are about as good as seven riffle shuffles. For lower numbers of shuffles, the overhand shuffles seems to be substantially superior on this metric.

But it's easy to come up with other metrics where the overhand shuffle is awful. Suppose we count then number of pairs of adjacent cards in the shuffled deck that were also adjacent in the original deck. In a fully randomized deck of 52 cards, we'd expect to find about 1.96 such pairs. Six riffle shuffles gets the expected number down to 2.07. After 10 overhand shuffles it is still 6.11. It takes over 40 overhand shuffles to get it below 2.1. So the overhand shuffle's success at breaking up rising sequences should not be taken to suggest that it is in general a good shuffle.

I got wondering if maybe we could combine them to get the best of both. If instead of doing n riffle shuffles, we did 1 overhand shuffle followed by n-1 riffle shuffles, would we fix get more inversions while still having the other statistical advantages of the riffle shuffle?

Turns out the combining shuffles this way only helps if n is three or less. One overhand shuffle and two riffles gives you an average of 11.2 inversions, compared to 6.9 inversions for three riffle shuffles. But for higher numbers of shuffles you are always better off doing pure riffle shuffles.