This is the first of a series of expository posts on matrix-theoretic sports ranking methods. This post, which owes much to discussions with TS Michael, discusses Massey’s method.

Massey’s method, currently in use by the NCAA (for football, where teams typically play each other once), was developed by Kenneth P. Massey while an undergraduate math major in the late 1990s. We present a possible variation of Massey’s method adapted to baseball, where teams typically play each other multiple times.

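We illustrate the method with six Patriot League baseball teams: Army, Bucknell, Holy Cross, Lafayette, Lehigh, and Navy, numbered $1$ through $6$ in that order. There are exactly $\binom{6}{2} = 15$ pairings between these teams. These pairs are sorted lexicographically, as follows: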

(1,2),(1,3),(1,4), …, (5,6).

In other words, sorted as

Army vs Bucknell, Army vs Holy Cross, Army vs Lafayette, …, Lehigh vs Navy.
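The lexicographic ordering of the pairings can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable names `teams` and `pairs` are ours and do not come from any particular library.

```python
from itertools import combinations

# Teams in the order used throughout this post (team 1 is Army, ..., team 6 is Navy).
teams = ["Army", "Bucknell", "Holy Cross", "Lafayette", "Lehigh", "Navy"]

# combinations() yields index pairs (i, j) with i < j, already in lexicographic order.
pairs = list(combinations(range(1, len(teams) + 1), 2))

print(len(pairs))            # number of pairings
print(pairs[0], pairs[-1])   # first and last pair
for i, j in pairs[:3]:
    print(teams[i - 1], "vs", teams[j - 1])
```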

The cumulative results of the 2016 regular season are given in the table below. We count only the games played in the Patriot League, not including the Patriot League post-season tournament (see, e.g., the Patriot League site for details). In the table, the total score (since the teams play multiple games against each other) of the team in the vertical column on the left is listed first. In other words, “$a - b$” in row $i$ and column $j$ means the total runs scored by team $i$ against team $j$ is $a$, and the total runs allowed by team $i$ against team $j$ is $b$. Here, we order the six teams as above (team $1$ is Army (USMA at West Point), team $2$ is Bucknell, and so on). For instance, if X played Y several times, then the entry in row X and column Y lists the sum of the runs X scored in those games, followed by the sum of the runs Y scored.

| X\Y | Army | Bucknell | Holy Cross | Lafayette | Lehigh | Navy |
|---|---|---|---|---|---|---|
| Army | x | 14-16 | 14-13 | 14-24 | 10-12 | 8-19 |
| Bucknell | 16-14 | x | 27-30 | 18-16 | 23-20 | 10-22 |
| Holy Cross | 13-14 | 30-27 | x | 19-15 | 17-13 | 9-16 |
| Lafayette | 24-14 | 16-18 | 15-19 | x | 12-23 | 17-39 |
| Lehigh | 12-10 | 20-23 | 13-17 | 23-12 | x | 12-18 |
| Navy | 19-8 | 22-10 | 16-9 | 39-17 | 18-12 | x |

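In this ordering, we record their (sum total) win-loss record (a $1$ for a win, $-1$ for a loss) in a $15\times 6$ matrix $M$, with one row per pairing and one column per team:

$$M = \left(\begin{array}{rrrrrr}
-1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & 0 & 0 & 0 \\
-1 & 0 & 0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 & 1 & 0 \\
-1 & 0 & 0 & 0 & 0 & 1 \\
0 & -1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & 0 \\
0 & -1 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & -1 & 1 & 0 \\
0 & 0 & 0 & -1 & 0 & 1 \\
0 & 0 & 0 & 0 & -1 & 1
\end{array}\right)$$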
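As a rough sketch, the win-loss matrix can be built directly from the table by giving $+1$ to the team with more total runs in each pairing and $-1$ to the other. The dictionary `runs` and the helper `win_loss_matrix` are hypothetical names introduced here for illustration, and the code assumes that no pairing ends with tied totals (which holds for the table above).

```python
from itertools import combinations

# Total runs from the table: runs[(i, j)] = (runs scored by team i, runs scored by team j).
# Only pairs with i < j are needed, since the lower triangle of the table mirrors the upper one.
runs = {
    (1, 2): (14, 16), (1, 3): (14, 13), (1, 4): (14, 24), (1, 5): (10, 12), (1, 6): (8, 19),
    (2, 3): (27, 30), (2, 4): (18, 16), (2, 5): (23, 20), (2, 6): (10, 22),
    (3, 4): (19, 15), (3, 5): (17, 13), (3, 6): (9, 16),
    (4, 5): (12, 23), (4, 6): (17, 39),
    (5, 6): (12, 18),
}

def win_loss_matrix(runs, n_teams=6):
    """One row per pairing (i, j), i < j: +1 for the team with more total runs, -1 for the other."""
    rows = []
    for i, j in combinations(range(1, n_teams + 1), 2):
        scored_i, scored_j = runs[(i, j)]
        row = [0] * n_teams
        sign = 1 if scored_i > scored_j else -1  # assumes no tied totals
        row[i - 1], row[j - 1] = sign, -sign
        rows.append(row)
    return rows

M = win_loss_matrix(runs)
for row in M:
    print(row)
```

Each printed row corresponds to one pairing, in the lexicographic order described earlier.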

We also record their total losses in a column vector $b$ (one entry per pairing):

The Massey ranking of these teams is a vector $r$ which best fits the equation

$$Mr = b,$$

where $M$ is the win-loss matrix and $b$ is the column vector above.

While the corresponding linear system is over-determined, we can look for a best (in the least squares sense) approximate solution using the orthogonal projection formula

$$P_V = B(B^tB)^{-1}B^t,$$

valid for matrices $B$ with linearly independent columns, where $V$ denotes the column space of $B$. Unfortunately, in this case $M$ does not have linearly independent columns, so the formula doesn’t apply.
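To see why the columns are dependent, write $M$ for the win-loss matrix and note that, under the sign convention above, each of its rows has exactly one entry $1$, one entry $-1$, and zeros elsewhere. Writing $\mathbf{1}$ for the all-ones column vector (our notation), this gives

$$M\mathbf{1} = 0, \qquad\text{and hence}\qquad M^tM\,\mathbf{1} = 0.$$

So the six columns of $M$ sum to the zero vector and $M^tM$ is singular. In particular, if $r$ is a least squares solution then so is $r + c\,\mathbf{1}$ for every constant $c$, which is why only the differences between ratings carry information.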

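Massey’s clever idea is to solve

$$M^tMr = M^tb$$

by row-reduction and determine the rankings from the parameterized form of the solution. To this end, we compute

$$M^tM = \left(\begin{array}{rrrrrr}
5 & -1 & -1 & -1 & -1 & -1 \\
-1 & 5 & -1 & -1 & -1 & -1 \\
-1 & -1 & 5 & -1 & -1 & -1 \\
-1 & -1 & -1 & 5 & -1 & -1 \\
-1 & -1 & -1 & -1 & 5 & -1 \\
-1 & -1 & -1 & -1 & -1 & 5
\end{array}\right)$$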

and

Then we compute the rref of the augmented matrix $(M^tM \mid M^tb)$,

which is

If $r = (r_1, r_2, r_3, r_4, r_5, r_6)$ denotes the rankings of Army, Bucknell, Holy Cross, Lafayette, Lehigh, Navy, in that order, then

Therefore

Lafayette $<$ Army $=$ Bucknell $=$ Lehigh $<$ Holy Cross $<$ Navy.

If we use this ranking to predict wins and losses over the season, it would fail to correctly predict Army vs Holy Cross (Army won), Bucknell vs Lehigh, and Lafayette vs Army. This gives a prediction failure rate of $3/15 = 20\%$.
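The row-reduction step described above can be sketched in plain Python with exact fractions. The example below is a small hypothetical 3-team league, not the Patriot League data: the right-hand side holds made-up margins of victory (the usual choice in Massey's method, which may differ from the vector used in this post), and the helpers `rref` and `normal_equations` are our own illustrative names.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix (list of rows), using exact fractions."""
    A = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if A[r][col] != 0), None)
        if pivot is None:
            continue
        A[pivot_row], A[pivot] = A[pivot], A[pivot_row]
        # Scale the pivot row so that the pivot entry equals 1.
        scale = A[pivot_row][col]
        A[pivot_row] = [x / scale for x in A[pivot_row]]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return A

def normal_equations(M, b):
    """Return the augmented matrix (M^t M | M^t b) as a list of rows."""
    n = len(M[0])
    MtM = [[sum(row[i] * row[j] for row in M) for j in range(n)] for i in range(n)]
    Mtb = [sum(row[i] * bi for row, bi in zip(M, b)) for i in range(n)]
    return [MtM[i] + [Mtb[i]] for i in range(n)]

# Hypothetical data: team 1 beat team 2 by 3, team 1 beat team 3 by 5, team 2 beat team 3 by 2.
M = [[1, -1, 0], [1, 0, -1], [0, 1, -1]]
b = [3, 5, 2]

R = rref(normal_equations(M, b))
for row in R:
    print([str(x) for x in row])
```

In this toy example the nonzero rows of the result read $r_1 - r_3 = 5$ and $r_2 - r_3 = 2$, with $r_3$ free, so the parameterized solution ranks team 3 below team 2 below team 1 whatever value $r_3$ takes.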