Expected maximums and fun with Faulhaber’s formula

A recent Futility Closet post inspired this one. There, Greg Ross mentioned a 2020 paper by P Sullivan titled “Is the Last Banana Game Fair?” in Mathematics Teacher. (BTW, it’s behind a paywall and I haven’t seen that paper).

Suppose Alice and Bob don’t want to share a banana. They each have a fair 6-sided die to throw. To decide who gets the banana, each of them rolls their die. If the largest number rolled is a 1, 2, 3, or 4, then Alice wins the banana. If the largest number rolled is a 5 or 6, then Bob wins. This is the last banana game. In this post, I’m not going to discuss the last banana game specifically, but instead look at a related question.

Let’s define things more generally. Let I_n=\{1,2,...,n\}, let X,Y be two independent, uniform random variables taken from I_n, and let Z=max(X,Y). The last banana game concerns the case n=6. Here, I’m interested in investigating the question: What is E(Z)?

Computing this isn’t hard. By definition of independent and max, we have
P(Z\leq z)=P(X\leq z)P(Y\leq z).
Since P(X\leq z)=P(Y\leq z)={\frac{z}{n}}, we have
P(Z\leq z)={\frac{z^2}{n^2}}.
The expected value of Z is defined as \sum kP(Z=k), but there’s a handy-dandy formula we can use instead:
E(Z)=\sum_{k=0}^{n-1} P(Z>k)=\sum_{k=0}^{n-1}[1-P(Z\leq k)].
Now we use the previous computation to get
This solves the problem as stated. But this method generalizes in a straightforward way to selecting m independent r.v.s in I_n, so let’s keep going.

First, let’s pause for some background and history. Notice how, in the last step above, we needed to know the formula for the sum of the squares of the first n consecutive positive integers? When we generalize this to selecting m integers, we need to know the formula for the sum of the m-th powers of the first n consecutive positive integers. This leads to the following topic.

Faulhaber polynomials are, for this post (apparently the terminology is not standardized) the sequence of polynomials F_m(n) of degree m+1 in the variable n that gives the value of the sum of the m-th powers of the first n consecutive positive integers:

\sum_{k=1}^{n} k^m=F_m(n).

(It is not immediately obvious that they exist for all integers m\geq 1 but they do and Faulhaber’s results establish this existence.) These polynomials were discovered by (German) mathematician Johann Faulhaber in the early 1600s, over 400 years ago. He computed them for “small” values of m and also discovered a sort of recursive formula relating F_{2\ell +1}(n) to F_{2\ell}(n). It was about 100 years later, in the early 1700s, that (Swiss) mathematician Jacob Bernoulli, who referenced Faulhaber, gave an explicit formula for these polynomials in terms of the now-famous Bernoulli numbers. Incidentally, Bernoulli numbers were discovered independently around the same time by (Japanese) mathematician Seki Takakazu. Concerning the Faulhaber polys, we have
and in general,
F_m(n)={\frac{n^{m+1}}{m+1}}+{\frac{n^m}{2}}+ lower order terms.

With this background aside, we return to the main topic of this post. Let I_n=\{1,2,...,n\}, let X_1,X_2,...,x_m be m independent, uniform random variables taken from I_n, and let Z=max(X_1,X_2,...,X_m). Again we ask: What is E(Z)? The above computation in the m=2 case generalizes to:


For m fixed and n “sufficiently large”, we have


I find this to be an intuitively satisfying result. The max of a bunch of independently chosen integers taken from I_n should get closer and closer to n as (the fixed) m gets larger and larger.