
Riddler Express (3/1/19)

Andrew DeMaio
March 3, 2019

Let's call the number of packs that Arthur has to buy $X$ (a random variable). This means
that his expected cost is $5 \cdot E[X]$ dollars. To find $E[X]$, break $X$ down into
finer-grained random variables:

$$X_i = \text{number of packs needed to attain the } i\text{th unique card.}$$


So if there are $n$ cards to collect, we have that
$$X = \sum_{i=1}^{n} X_i$$
And, through the linearity of expected values, we have
$$E[X] = \sum_{i=1}^{n} E[X_i]$$
So all that's left to find is $E[X_i]$. If Arthur already has $i-1$ cards, his next purchase has
a $\frac{i-1}{n}$ chance of being a repeat card, and a $\frac{n-i+1}{n}$ chance of being a new card.
In the case that the card is a repeat, we are back in the same situation, so the number of packs
to buy is now expected to be $1 + E[X_i]$. If we got a fresh card, it only took one purchase.
We can express this relationship recursively:
$$E[X_i] = \frac{i-1}{n} \cdot (1 + E[X_i]) + \frac{n-i+1}{n} \cdot 1$$
Algebraically simplify this by isolating $E[X_i]$ on the left:

$$n E[X_i] = (i-1)(1 + E[X_i]) + n - i + 1$$

$$(n-i+1) E[X_i] = n$$

$$E[X_i] = \frac{n}{n-i+1}$$
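
As a sanity check on the algebra, the closed form can be substituted back into the recursion with exact rational arithmetic. This is only a minimal sketch: the helper names `stage_expectation` and `satisfies_recursion` are made up for illustration and do not appear in the original program.

```python
from fractions import Fraction

def stage_expectation(n, i):
    # Closed form derived above: E[X_i] = n / (n - i + 1)
    return Fraction(n, n - i + 1)

def satisfies_recursion(n, i):
    # Check E[X_i] = (i-1)/n * (1 + E[X_i]) + (n-i+1)/n * 1 exactly
    e = stage_expectation(n, i)
    p_repeat = Fraction(i - 1, n)
    p_new = Fraction(n - i + 1, n)
    return e == p_repeat * (1 + e) + p_new * 1

# Every stage i = 1..144 of Arthur's collection should satisfy the recursion
print(all(satisfies_recursion(144, i) for i in range(1, 145)))  # prints True
```

Using `Fraction` avoids any floating-point rounding, so the equality test is exact rather than approximate.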
Now we are ready to solve for $E[X]$:
$$E[X] = \sum_{i=1}^{n} \frac{n}{n-i+1} = n \sum_{i=1}^{n} \frac{1}{i}$$
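
The formula can also be compared against a direct simulation. The sketch below assumes, as the derivation does, that each pack holds one card drawn uniformly at random from the $n$ possibilities. The names `simulate_purchases` and `average_purchases`, the seed, and the choice of $n = 12$ with 20000 trials are illustrative assumptions, not part of the original solution.

```python
import random
from fractions import Fraction

def expected_purchases(n):
    # Closed form: E[X] = n * (1/1 + 1/2 + ... + 1/n)
    return n * sum(Fraction(1, i) for i in range(1, n + 1))

def simulate_purchases(n, rng):
    # Buy packs until all n distinct cards have been seen; return the count
    seen = set()
    purchases = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))  # one uniformly random card per pack
        purchases += 1
    return purchases

def average_purchases(n, trials, seed=0):
    # Average the number of purchases over many independent collections
    rng = random.Random(seed)
    total = sum(simulate_purchases(n, rng) for _ in range(trials))
    return total / float(trials)

print("closed form: %.3f" % expected_purchases(12))
print("simulated:   %.3f" % average_purchases(12, 20000))
```

The simulated average should land close to the closed-form value, with the gap shrinking as the number of trials grows.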
In Arthur’s case, he has to acquire 12 sets of 12 cards each, for a total of 144 unique cards.
The Python program below outputs Arthur’s expected cost, given $5 packs and 144 cards.

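#!/usr/bin/python

from fractions import Fraction

def expected_purchases(n):
    # E[X] = n * (1/1 + 1/2 + ... + 1/n), computed exactly with rationals
    return n * sum(Fraction(1, i) for i in range(1, n+1))

# $5 per pack, 144 unique cards
print("$%.2f" % (5 * expected_purchases(144)))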

This program outputs the answer of $3996.36, which feels excessive, Arthur.

P.S. Interestingly, the sum $\sum_{i=1}^{n} \frac{1}{i}$ is the $n$th partial sum of the Harmonic Series
which, according to people smarter than me, behaves an awful lot like the natural logarithm
as $n$ increases:
$$\ln(n) < \sum_{i=1}^{n} \frac{1}{i} \le \ln(n) + 1$$
Look at how E[X] compares to n ln(n) and n(ln(n) + 1) for the first 1000 n:
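
The same comparison can be tabulated numerically. This is a rough sketch rather than the code behind the original plot: the helper `bounds_row` and the sample values of $n$ are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import log

def expected_purchases(n):
    # E[X] = n * (1/1 + 1/2 + ... + 1/n)
    return n * sum(Fraction(1, i) for i in range(1, n + 1))

def bounds_row(n):
    # Returns (n ln n, E[X], n (ln n + 1)) as floats
    lower = n * log(n)
    exact = float(expected_purchases(n))
    upper = n * (log(n) + 1)
    return lower, exact, upper

print("%5s  %10s  %10s  %12s" % ("n", "n ln(n)", "E[X]", "n(ln(n)+1)"))
for n in (2, 10, 100, 144, 1000):
    print("%5d  %10.2f  %10.2f  %12.2f" % ((n,) + bounds_row(n)))
```

In each row the middle column should sit between the other two, as the inequality above requires.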

As an aside, the value $n \ln(n)$ recurs quite a bit in computer science, typically in the context
of sorting lists of numbers. In particular, any comparison-based algorithm for sorting a list
with $n$ elements needs at least $c_1 n \ln(n)$ comparisons in the worst case, and the best such
algorithms need at most $c_2 n \ln(n)$, for some constants $c_1, c_2 > 0$. Ultimately, this can
be understood in terms of Stirling's approximation, which tells us that

$$\ln(n!) = n \ln(n) - n + O(\ln(n)) = n(\ln(n) - 1) + O(\ln(n))$$
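
The size of the $O(\ln(n))$ term can be inspected numerically. The sketch below relies on `math.lgamma`, for which `lgamma(n + 1)` equals $\ln(n!)$. The comparison against $\frac{1}{2}\ln(2\pi n)$, the standard leading correction in Stirling's formula, is an addition for illustration and is not stated in the text above; `stirling_gap` is a made-up helper name.

```python
from math import lgamma, log, pi

def stirling_gap(n):
    # ln(n!) minus the leading terms n ln(n) - n; lgamma(n + 1) == ln(n!)
    return lgamma(n + 1) - (n * log(n) - n)

for n in (10, 100, 1000, 10000):
    # The gap should track 0.5 * ln(2 * pi * n), which grows like ln(n)
    print("%6d  %.4f  %.4f" % (n, stirling_gap(n), 0.5 * log(2 * pi * n)))
```

The two printed columns should agree more closely as $n$ grows, which is consistent with the remainder being logarithmic in $n$.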


From an information point of view, then, a mixed-up list of $n$ cigarette cards encodes
$n \log_2(n) - n \log_2(e) + O(\log_2(n))$ bits of information. You could look at each of Arthur's
cigarette pack purchases as giving $\log_2(n)$ bits of information, since we get one random card
out of $n$. After collecting all cards, Arthur will have gathered an expected

$$\frac{n \log_2^2(n)}{\log_2(e)} + c\,n \log_2(n)$$
bits of information, for some $0 < c \le 1$; but Arthur could also have mixed $n$ cards in a
random order. So you could imagine that Arthur found a way to mix up $n$ cards with an
expected overhead factor of around $\ln(n)$.
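
To put numbers on that overhead, the ratio of bits gathered to bits needed can be computed directly. This is a sketch under the same accounting as above ($\log_2(n)$ bits per purchase, $\log_2(n!)$ bits for a random ordering); the function names are hypothetical and the ratio is only defined for $n \ge 2$.

```python
from math import lgamma, log, log2

def harmonic(n):
    # Partial sum 1/1 + 1/2 + ... + 1/n
    return sum(1.0 / i for i in range(1, n + 1))

def gathered_bits(n):
    # Expected purchases n * H_n, each worth log2(n) bits
    return n * harmonic(n) * log2(n)

def shuffle_bits(n):
    # log2(n!) bits encode one ordering of n cards; lgamma(n + 1) == ln(n!)
    return lgamma(n + 1) / log(2)

def overhead(n):
    # Ratio of bits gathered to bits needed; requires n >= 2
    return gathered_bits(n) / shuffle_bits(n)

for n in (12, 144, 1000):
    print("%5d  overhead %.2f  ln(n) %.2f" % (n, overhead(n), log(n)))
```

For moderate $n$ the ratio sits somewhat above $\ln(n)$, since the lower-order terms in both expressions are not yet negligible.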
