Massey Ratings Description

Introduction

This is a brief description of the Massey Rating model. To keep it short and hopefully understandable, I have omitted many technical details. Since I began this hobby of computer ratings, my model has undergone several revisions. I am labeling the current version 2.1, which was unveiled prior to the 1998 college football season.

All recent versions are based on the popular least squares method. In fact, I once published my ratings under the title "Massey LS." Many variants of this system have been developed, and several have been published in academic journals. You may want to refer to Mike Zenor's description. However, the current Massey Ratings have been modified significantly from this foundational model.

It is my goal to achieve a reasonable balance between rewarding teams for wins, convincing wins, and playing a tough schedule. This is the most argued issue in sports ratings, and you will find systems that are based on each of the extremes.

Inputs

Only the score of the game, the home team, and the date of the game are used to calculate my ratings. Stats such as rushing yards, rebounds, or field-goal percentage are not included, nor are game conditions such as weather, day/night, or grass/artificial turf. Overtime games are not treated any differently. Finally, neither injuries nor psychological factors like motivation are considered.

Features

The major features currently implemented in the Massey model are described in the sections below.

Team Ratings

The basic idea is that each game outcome provides an "observation" of the relative strengths of the participants. We can write an equation that describes the result:

Ra - Rb (+/-) h + e = f(Pa,Pb)

Here Ra and Rb are the ratings of teams A and B, respectively. The global homefield advantage is represented by h. The sign preceding h should be positive if team A was the home team and negative if B was the host. Of course, if the game in question was played at a neutral site, then this term should be omitted. To account for natural variability in the teams' performances and other factors (such as weather or injury), an error term e is included in the equation for each game.

Finally, the right side of the equation is f(Pa,Pb), where f is a function of two variables, namely the points scored by team A (Pa) and the points scored by team B (Pb). This function exhibits the famous diminishing returns principle that "blowout" wins should not be that much better than "solid" wins. Among other properties, f satisfies f(x,y) = -f(y,x) for all x,y.
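The actual f used in the model is not specified here, but a function with the required antisymmetry and diminishing returns could be sketched as follows (this particular choice is purely illustrative):

```python
import math

def f(pa, pb):
    """A hypothetical diminishing-returns outcome function.

    Compresses large margins logarithmically so that blowout wins
    count only slightly more than solid wins, while copysign on the
    margin preserves the antisymmetry f(x, y) = -f(y, x).
    """
    margin = pa - pb
    return math.copysign(math.log1p(abs(margin)), margin)
```

Any odd, concave-for-positive-margin function would serve the same illustrative purpose.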

Basically, these equations establish relative relationships among the teams' ratings. There will typically be enough connections to link each team to every other team by some "chain" of games. This interdependence, coupled with the relative nature of the equations, implicitly makes schedule strength a major contributor to a team's rating.

Taking each game to be an observation in a multilinear regression, we can apply the least squares method. Ratings (and the homefield constant h) are chosen as model parameters that minimize the total squared error, sum(e^2). This is accomplished using some techniques from linear algebra.

Things are not quite this simple. Because the ratings are not uniquely determined, I introduce the further requirement that the average rating be zero. Weighted least squares is implemented to allow momentum to factor into a team's rating. Nonlinearities, both in the weights and f(Pa,Pb), require a more complicated iteration to reach convergence. Finally, preseason information (whose influence decays with time) may be factored in as additional equations if desired. I will not comment further on these issues, or the computational algorithms I employ.
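As a concrete illustration, the basic least squares solve with the zero-mean constraint can be sketched in a few lines. This sketch assumes the simplest possible outcome function, f(Pa,Pb) = Pa - Pb, and omits the homefield term, game weights, and preseason equations; the game data are made up.

```python
import numpy as np

# Hypothetical game data: (team_a, team_b, points_a, points_b).
games = [(0, 1, 28, 14), (1, 2, 21, 17), (2, 0, 10, 24), (0, 2, 31, 7)]
t = 3  # number of teams

# One row per game encoding Ra - Rb = Pa - Pb (our assumed f),
# plus a final row enforcing the zero-mean constraint sum(R) = 0.
X = np.zeros((len(games) + 1, t))
y = np.zeros(len(games) + 1)
for row, (a, b, pa, pb) in enumerate(games):
    X[row, a] = 1.0
    X[row, b] = -1.0
    y[row] = pa - pb
X[-1, :] = 1.0  # average rating is zero

ratings, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Because shifting all ratings by a constant leaves every game residual unchanged, the extra row costs nothing in fit and simply pins down the otherwise free additive constant.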

Off and Def Ratings

A team's Off rating measures its "offensive" power, essentially the ability to score points. This does not distinguish how the team scored, so good defensive play that leads to scoring will be reflected in the Off rating. A team's Def rating measures its "defensive" power, the ability to prevent its opponent from scoring.

It should be emphasized that the Off/Def breakdown is simply a post-processing step, and as such has no bearing on the overall rating. A consequence of this is that the Off/Def ratings may not always match actual production numbers. A team that routinely wins close games may have somewhat inflated Off/Def ratings to reflect the fact that they likely play well when they have to. The Off/Def breakdown is simply an estimate of how much of a team's strength can be attributed to good offensive and defensive play respectively.

Two equations are now assigned to each game played:

Oa - Db (+/-) h/2 + e = Pa
Ob - Da (-/+) h/2 + e = Pb

Oa and Ob are the offensive ratings of teams A and B, while Da and Db are their respective defensive ratings. The homefield constant is divided by two because it is counted twice (once in each equation). Again there is an error term. Pa and Pb are the points scored by teams A and B.

Instead of t unknown model parameters, there are now 2t values that must be estimated (two for each team). Since the overall ratings have already been calculated, I use constrained least squares, requiring that Oi + Di = Ri for each team i. The solution vector consists of the desired offensive and defensive power ratings. An additional constraint must be added, so I set the average defensive rating equal to zero.
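One way to sketch the constrained solve is to substitute Di = Ri - Oi, which reduces the system to the offensive ratings alone. The overall ratings and game data below are hypothetical, and for brevity the sketch assumes neutral sites (no homefield term) and omits the zero-mean defensive constraint, since the pair equations here already determine a unique solution.

```python
import numpy as np

# Hypothetical overall ratings and game data.
R = np.array([11.07, -3.33, -7.73])
games = [(0, 1, 28, 14), (1, 2, 21, 17), (2, 0, 10, 24), (0, 2, 31, 7)]
t = len(R)

rows, rhs = [], []
for a, b, pa, pb in games:
    # Substituting Db = Rb - Ob into  Oa - Db = Pa  gives  Oa + Ob = Pa + Rb;
    # likewise  Ob - Da = Pb  gives  Oa + Ob = Pb + Ra.
    coeff = np.zeros(t)
    coeff[a] += 1.0
    coeff[b] += 1.0
    rows.append(coeff)
    rhs.append(pa + R[b])
    rows.append(coeff.copy())
    rhs.append(pb + R[a])

O, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
D = R - O  # the constraint Oi + Di = Ri holds by construction
```

The substitution enforces Oi + Di = Ri exactly rather than as a soft penalty, which is the essential point of the constrained formulation.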

A team's offensive rating can be interpreted as the number of points that it would be expected to score against an average defense. If the opponent has a positive or negative defensive rating, then this expected score would be lowered or raised accordingly. In general, hypothetical score predictions can be made with the following formulae:

Pa = Oa - Db (+/-) h/2
Pb = Ob - Da (-/+) h/2
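These prediction formulae translate directly into code. The rating values in the usage comment are hypothetical; the sign convention follows the equations above, with half the homefield constant credited to the home team and half debited from the visitor.

```python
def predict_scores(Oa, Da, Ob, Db, h=0.0, a_is_home=True):
    """Predict a hypothetical score for a game between teams A and B.

    Pa = Oa - Db (+/-) h/2  and  Pb = Ob - Da (-/+) h/2, with the
    upper sign applying when team A is the home team.  Pass h=0 for
    a neutral-site game.
    """
    sign = 1.0 if a_is_home else -1.0
    Pa = Oa - Db + sign * h / 2.0
    Pb = Ob - Da - sign * h / 2.0
    return Pa, Pb

# Example with made-up ratings: Oa=25, Da=5, Ob=20, Db=3, h=4.
pa, pb = predict_scores(25, 5, 20, 3, h=4, a_is_home=True)
```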

Schedule

The final column in the listing of Massey ratings is a number representing the schedule strength. It is simply the average opponent rating adjusted for the homefield advantage. Notice that it is the schedule strength through the current date, so games to be played in the future are not included.
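A sketch of that calculation, under assumptions not stated in the original: here a road game counts the opponent as effectively h points stronger and a home game as h points weaker, with neutral sites unadjusted. The exact adjustment convention used in the published column may differ.

```python
def schedule_strength(opponent_ratings, venues, h):
    """Average opponent rating adjusted for homefield (a sketch).

    venues: one entry per game played -- +1 for a road game (the
    opponent had homefield advantage), -1 for a home game, 0 for a
    neutral site.  The size of the adjustment (full h) is an
    assumption for illustration.
    """
    adjusted = [r + v * h for r, v in zip(opponent_ratings, venues)]
    return sum(adjusted) / len(adjusted)
```

Only games already played appear in the lists, matching the through-the-current-date convention above.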

Conference Ratings

Below the team ratings, you will find a listing of the leagues, conferences, and divisions. The win / loss records include only inter-conference games. Within a conference, the aggregate win / loss percentage of intra-conference games is necessarily 50%, so including them would add no information. A conference's rating is the average rating of all of the teams in that conference.

Standard Deviation

As was mentioned before, the Massey model will in some sense minimize the unexplained error. Upsets will occur and it is impossible (and also counter-productive) to get an exact fit to the actual game outcomes. Hence, I publish an estimated standard deviation. About 68% of observed game results will fall within one standard deviation of the expected ("average") result.
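The estimate itself is just the sample standard deviation of the game residuals (observed result minus predicted result). A minimal sketch, using made-up predicted and actual margins:

```python
import math

def residual_std(predicted_margins, actual_margins):
    """Sample standard deviation of the game residuals (a sketch).

    If the residuals are roughly normal, about 68% of game results
    fall within one standard deviation of the predicted result.
    """
    residuals = [a - p for p, a in zip(predicted_margins, actual_margins)]
    n = len(residuals)
    mean = sum(residuals) / n
    var = sum((r - mean) ** 2 for r in residuals) / (n - 1)
    return math.sqrt(var)
```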

Comments

If you have questions, comments, or suggestions about the Massey Ratings, please email me.


Kenneth Massey, June 25, 1999