Probability of Drawing Six Spades in a Hand of Thirteen
I've been working through all the exercises in Statistics for Experimenters, and one problem really stumped me. I ended up solving the problem three different ways, and I thought it would be interesting to share what I learned. The problem is stated like this:
Exercise 2.13. Using a standard deck of 52 cards, you are dealt 13. What is the probability of getting six spades?
Binomial Distribution
My first attempt to solve the problem utilized the binomial distribution. The binomial distribution is used when there are exactly two mutually exclusive outcomes of a trial (like drawing a spade or not-spade). These outcomes are labeled a "success" or "failure". The binomial distribution gives the probability of y successes out of a n possible trials where the fixed probability of an individual success is p and of a failure is q = 1 - p as follows:
Obviously, the assumption of a fixed probability of success is not correct for the card drawing problem because the probability of drawing a spade changes with each card drawn. However, for a population N much larger than the sample n (in this case N = 52 and n = 13), the binomial distribution can be a good approximation. I wrote the following Python script to calculate the probability:
from scipy.special import factorial
###########################################
def binomial_probability (yy, nn, pp, qq):
p_q = (pp**yy) * (qq**(nn-yy))
prob = p_q*(factorial(nn, exact=True)/(factorial(yy, exact=True)*factorial(nn-yy, exact=True)))
return (prob)
#########################################
# Exercise 2.13 - Binomial distribution
# As a first approximation, assume the binomial distribution
p = 13./52.
q = 39./52.
n = 13
y = 6
print('Pr(six spades in hand of 13) = ', binomial_probability(y, n, p, q))
The probability of getting six spades in the first 13 cards drawn from a 52-card deck is 0.056 or about 1/18 as determined by the Binomial Distribution. However, given the fact that the probability of success is not fixed, I wanted to find a more accurate result.
Monte-Carlo Simulation
In this approach, I used a Monte-Carlo method to simulate drawing 13 cards from a 52-card deck. The "deck" consisted of 13 one's representing spades and 39 zero's representing not-spades. The Python code below randomly selects 13 cards from the deck (a "hand") and keeps track of how many times there were exactly six spades in the hand.
import numpy as np
# Exercise 2.13 - Monte Carlo method
# fill the card deck with zeros (non-spades)
deck = np.zeros(52, dtype=np.int32)
# change the first 13 values to 1 (spades)
deck[:13] = 1
# randomly sample 13 cards from the deck and make a new list
hand = [] # list of cards in one hand
num_runs = 1000000 # number if simulated hand draws
i = 0
# count the number of times six spades are present in a hand
six_spades = 0
while i <= num_runs:
# draw a hand of 13 cards randomly with no replacement
hand = np.random.choice(deck, size=13, replace = False)
# since spades == 1 and non-spades == 0, sum gives the number
# of spades
spades = np.sum(hand)
if spades == 6:
six_spades = six_spades + 1
i = i + 1
print(six_spades)
print('Pr(six spades in hand of thirteen) = ', six_spades/num_runs)
The probability of getting six spades in the first 13 cards drawn from a 52-card deck is 0.042 or about 1/24 as determined by averaging the results of six 1,000,000 draw simulations. This result is about 1/3 smaller than the binomial result, which is not surprising since the probabilities get smaller initially as more cards are drawn. The Monte-Carlo result was closer to true, but I believed there must be a way to actually calculate the exact probability.
Calculating the Probability of Each Hand
The actual probability of drawing a spade or not-spade changes for every card drawn and whether or not a spade was drawn. For example, the probability of getting a spade in the first draw is 13/52, and the probability of getting a not-spade is 39/52. Assume the first card drawn is not a spade. Then the probability of getting a spade with the second draw is 13/51, and the probability of getting a not-spade is 38/51.
There are 1,716 different ways to get six spades in a hand of 13. Each arrangement has it's own probability, which is the product of the individual probabilities for each card drawn in sequence. The Python code below calculates the actual probabilities for each possible combination.
import numpy as np
from itertools import product
# Exercise 2.13 - Find probability of each combination of hands
# Create a list of hands covering every combination of 6 spades
# in a 13 card hand. Values == 1 are spades,
# and values == 0 are not spades.
cards = [0, 1]
hands = product(cards, repeat=13)
# turn hands into a list
all_hands = list(hands)
# just keep the hands with six spades
six_spades = []
for hand in all_hands:
if np.sum(hand) == 6:
six_spades.append(hand)
# now find the probability of each hand
prob_hand = []
for hand in six_spades: # iterate over each hand
probability = 1. # reset probability of drawing the hand to 1
card_count = 52 # reset the number of cards to a full deck
spade_count = 13 # reset the number of spades to 13
non_count = 39 # reset the number of non-spades to 39
for card in hand: # iterate over the cards in each hand
# if not-spade, calculate probability of drawing that card
if card == 0:
probability = probability * non_count/card_count
non_count = non_count - 1
card_count = card_count - 1
# if spade, calculate probability of drawing that card
else:
probability = probability * spade_count/card_count
spade_count = spade_count - 1
card_count = card_count - 1
prob_hand.append(probability) # create a list of probabilities for each individual hand
# the probability of drawing 6 spades is the sum of the
# probabilities of each hand combination
print('Possible combinations = ', len(prob_hand))
print('Pr(six spades in thirteen cards) = ', np.sum(prob_hand))
print('Probability is one out of ', 1./np.sum(prob_hand))
The true probability of getting six spades in the first 13 cards drawn from a 52-card deck is 0.0416 or about 1/24 . This is approximately equal to the average result from the Monte-Carlo simulation of 1,000,000 draws.
Hypergeometric Distribution
Ben Kendall pointed out to me that this problem could be solved using a hypergeometric distribution probability calculator. The hypergeometric distribution is a discrete probability distribution that describes the probability of y successes in 𝑛 draws, without replacement, from a finite population of size 𝑁 that contains exactly 𝐾 objects with that feature, wherein each draw is either a success or a failure. The classic application of the hypergeometric distribution is sampling without replacement, which is a good description of Exercise 2.13. The results using the hypergeometric distribution are the same as those from the probability calculations above. The hypergeometric distribution can also provide cumulative probabilities.
Probabilities of Drawing y Spades
The table below summarizes the probabilities of drawing one through 13 spades using each of the above three methods.
Note that the Monte-Carlo method fails to draw any hands with 11 to 13 spades. This is because I only simulated 1,000,000 draws. The true odds of drawing 11 spades is about 1/11,000,000, and the odds for 12 and 13 are even lower. Thus I'd have to simulate many more draws (over about 635 million in the case of 13 spades) to get an accurate estimate for these lower probability events.
Note also that the textbook lists the answer as 0.13. I believe the values I calculated above are correct. Perhaps the book mistakenly shows the result for getting five spades, which is equal to 0.13 when calculated using the Binomial Distribution.