Maximum Likelihood Teams of the 2000s

Posted by Neil Paine on January 18, 2010

In a few recent posts at Basketball-Reference, I've used the method of Maximum Likelihood Estimation (MLE) to create team rankings based on W-L record and strength of schedule. I won't go into the gory mathematical details of the method here (Doug Drinen wrote a great summary at PFR a few years ago, which is better than anything I could say anyway), but the general idea is that we want to create ratings that best explain a set of past results -- we want to maximize the likelihood that if the ratings say Team A should have beaten Team B, Team A actually beat Team B in real life. These "retrodictions" are made with the following equation:

p(hW) = exp(rH - rA + HF)/(exp(rH - rA + HF) + 1)

where p(hW) = the probability that the home team wins, rH = the home team's rating, rA = the away team's rating, and HF = a home-field advantage term.
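The equation is a logistic function of the rating difference plus the home-field term. Here is a minimal Python sketch of it; the function name `home_win_prob` is my own label, not something from the original analysis, and the second call plugs in the top and bottom ratings and the home-field value from the table further down.

```python
import math

def home_win_prob(r_home, r_away, hf):
    """p(hW) = exp(rH - rA + HF) / (exp(rH - rA + HF) + 1)."""
    x = r_home - r_away + hf
    return math.exp(x) / (math.exp(x) + 1.0)

# Two evenly matched teams with no home-field term are a coin flip.
print(home_win_prob(0.0, 0.0, 0.0))

# Top-rated team hosting the bottom-rated team (ratings and HF from the table).
print(round(home_win_prob(0.43095, -0.37248, 0.17244), 3))
```

Read this way, the ratings say the Yankees hosting the Pirates would be expected to win roughly 73% of the time.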

Anyway, I decided to apply this method to Major League Baseball over the past decade to rank the teams over the entirety of the 2000s. Using Retrosheet's game logs, I threw every game of the decade into a file (including regular-season and postseason games) and ran the MLE ratings; playoff games were given no extra weight, except for the fact that teams receive bonuses in strength of schedule for the postseason, simply because they're playing the best teams. The result basically ranks the teams in an order such that, at any given point over the past 10 seasons, every team was "probably better" than the teams ranked below it. Now, let the arguments begin...
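The post doesn't show the fitting code, so the following is only a sketch of one way to fit ratings like these: plain gradient ascent on the log-likelihood implied by the equation above. The function name `fit_mle`, the step size, the iteration count, and the toy schedule are all assumptions made for illustration; none of it is Retrosheet data or the actual procedure used here.

```python
import math

def fit_mle(games, iters=5000, lr=0.05):
    """Fit team ratings and a home-field term by gradient ascent.

    games: list of (home_team, away_team, home_won) tuples, home_won in {0, 1}.
    Returns (ratings dict, home-field term).
    """
    teams = sorted({t for g in games for t in g[:2]})
    ratings = {t: 0.0 for t in teams}
    hf = 0.0
    for _ in range(iters):
        grad = {t: 0.0 for t in teams}
        grad_hf = 0.0
        for home, away, home_won in games:
            x = ratings[home] - ratings[away] + hf
            p = 1.0 / (1.0 + math.exp(-x))  # same as exp(x) / (exp(x) + 1)
            err = home_won - p              # d(log-likelihood)/dx for this game
            grad[home] += err
            grad[away] -= err
            grad_hf += err
        for t in teams:
            ratings[t] += lr * grad[t]
        hf += lr * grad_hf
        # Only rating differences matter, so pin the average rating at zero.
        mean = sum(ratings.values()) / len(teams)
        for t in teams:
            ratings[t] -= mean
    return ratings, hf

# Made-up toy schedule: A hosts B four times (wins 3), B hosts A four times (wins 2).
toy_games = (
    [("A", "B", 1)] * 3 + [("A", "B", 0)]
    + [("B", "A", 1)] * 2 + [("B", "A", 0)] * 2
)
ratings, hf = fit_mle(toy_games)
print(ratings, hf)
```

On the toy schedule the fit rates A above B and assigns a positive home-field term, because those are the values that make the observed results most likely. A fit needs every team to have at least one win and one loss against connected opponents, otherwise the ratings drift toward infinity.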

Rank  Team                                   Rating     W    L   WPct  WS  Pennants  Div Titles
   1  New York Yankees                      0.43095  1017  692  0.595   2         4           8
   2  Boston Red Sox                        0.33566   954  722  0.569   2         2           1
   3  Anaheim/LA Angels                     0.28842   921  744  0.553   1         1           5
   4  Oakland Athletics                     0.26743   901  744  0.548   0         0           4
   5  Chicago White Sox                     0.14927   869  771  0.530   1         1           3
   6  St. Louis Cardinals                   0.14822   946  737  0.562   1         2           6
   7  Seattle Mariners                      0.14430   846  793  0.516   0         0           1
   8  Minnesota Twins                       0.14299   869  776  0.528   0         0           5
   9  Atlanta Braves                        0.13685   903  745  0.548   0         0           6
  10  Los Angeles Dodgers                   0.07101   871  772  0.530   0         0           3
  11  San Francisco Giants                  0.06250   867  775  0.528   0         1           2
  12  Toronto Blue Jays                     0.05893   805  814  0.497   0         0           0
  13  Philadelphia Phillies                 0.05870   870  781  0.527   1         2           3
  14  Cleveland Indians                     0.05296   824  812  0.504   0         0           2
  15  Texas Rangers                         0.00799   776  844  0.479   0         0           0
  16  New York Mets                        -0.01021   829  813  0.505   0         1           1
  17  Houston Astros                       -0.02893   845  803  0.513   0         1           1
  18  Florida Marlins                      -0.03519   822  813  0.503   1         1           0
  19  Arizona Diamondbacks                 -0.05593   819  828  0.497   1         1           3
  20  Chicago Cubs                         -0.08620   813  823  0.497   0         0           3
  21  Detroit Tigers                       -0.13396   737  896  0.451   0         1           0
  22  Colorado Rockies                     -0.14006   777  859  0.475   0         1           0
  23  San Diego Padres                     -0.14105   770  858  0.473   0         0           2
  24  Baltimore Orioles                    -0.18086   698  920  0.431   0         0           0
  25  Tampa Bay (Devil) Rays               -0.18623   702  931  0.430   0         1           1
  26  Cincinnati Reds                      -0.21235   751  869  0.464   0         0           0
  27  Milwaukee Brewers                    -0.23670   742  881  0.457   0         0           0
  28  Montreal Expos/Washington Nationals  -0.26496   711  908  0.439   0         0           0
  29  Kansas City Royals                   -0.27107   672  948  0.415   0         0           0
  30  Pittsburgh Pirates                   -0.37248   681  936  0.421   0         0           0
Home-Field Advantage 0.17244

What's probably most striking is how clearly this shows the disparity between the AL and the NL over the past decade. If you compare the MLE ranking to the teams' winning percentages, being in the NL is worth roughly 30 points of WPct for the majority of the teams on this list; over a 162-game season, that's essentially 5 extra wins a year the average NL team got above its "true talent level" because of the weaker competition relative to the MLB average.
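One way to sanity-check the 30-point figure is to pair an AL team and an NL team whose ratings are nearly identical and compare their winning percentages. The pairs below are my own picks from the table above, and whether this is how the estimate was originally reached is an assumption.

```python
SEASON_GAMES = 162

# (rating, WPct) copied from the table above: one AL and one NL team per pair,
# chosen because their ratings are nearly identical.
pairs = {
    "White Sox vs Cardinals": ((0.14927, 0.530), (0.14822, 0.562)),
    "Blue Jays vs Phillies": ((0.05893, 0.497), (0.05870, 0.527)),
}

gaps = {}
for label, ((_, al_wpct), (_, nl_wpct)) in pairs.items():
    points = round((nl_wpct - al_wpct) * 1000)  # gap in WPct "points"
    wins = points / 1000 * SEASON_GAMES         # extra wins per 162 games
    gaps[label] = (points, round(wins, 1))
    print(label, gaps[label])
```

Both pairs come out at 30 to 32 points, which is about 5 wins over a 162-game schedule.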

It's also interesting to look at the ratings by division:

Team                                 Div           Rating
Anaheim/LA Angels                    AL West      0.28842
Oakland Athletics                    AL West      0.26743
Seattle Mariners                     AL West      0.14430
Texas Rangers                        AL West      0.00799
Division average                     AL West      0.17805
New York Yankees                     AL East      0.43095
Boston Red Sox                       AL East      0.33566
Toronto Blue Jays                    AL East      0.05893
Baltimore Orioles                    AL East     -0.18086
Tampa Bay (Devil) Rays               AL East     -0.18623
Division average                     AL East      0.09664
Chicago White Sox                    AL Central   0.14927
Minnesota Twins                      AL Central   0.14299
Cleveland Indians                    AL Central   0.05296
Detroit Tigers                       AL Central  -0.13396
Kansas City Royals                   AL Central  -0.27107
Division average                     AL Central  -0.01116
Atlanta Braves                       NL East      0.13685
Philadelphia Phillies                NL East      0.05870
New York Mets                        NL East     -0.01021
Florida Marlins                      NL East     -0.03519
Montreal Expos/Washington Nationals  NL East     -0.26496
Division average                     NL East     -0.02207
Los Angeles Dodgers                  NL West      0.07101
San Francisco Giants                 NL West      0.06250
Arizona Diamondbacks                 NL West     -0.05593
Colorado Rockies                     NL West     -0.14006
San Diego Padres                     NL West     -0.14105
Division average                     NL West     -0.04046
St. Louis Cardinals                  NL Central   0.14822
Houston Astros                       NL Central  -0.02893
Chicago Cubs                         NL Central  -0.08620
Cincinnati Reds                      NL Central  -0.21235
Milwaukee Brewers                    NL Central  -0.23670
Pittsburgh Pirates                   NL Central  -0.37248
Division average                     NL Central  -0.12921
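The post doesn't say how the division-level figures in the table above were computed. They look consistent with an average of team ratings weighted by games played (W + L) rather than a simple mean; that is an inference from the numbers, not something stated here. A small sketch using the AL West rows, with hypothetical helper names:

```python
# (rating, wins, losses) copied from the decade table above
al_west = {
    "Anaheim/LA Angels": (0.28842, 921, 744),
    "Oakland Athletics": (0.26743, 901, 744),
    "Seattle Mariners": (0.14430, 846, 793),
    "Texas Rangers": (0.00799, 776, 844),
}

def games_weighted_rating(division):
    """Average of team ratings, weighted by each team's games played."""
    total_games = sum(w + l for _, w, l in division.values())
    weighted_sum = sum(r * (w + l) for r, w, l in division.values())
    return weighted_sum / total_games

def simple_mean_rating(division):
    """Unweighted average of team ratings, for comparison."""
    return sum(r for r, _, _ in division.values()) / len(division)

print(round(games_weighted_rating(al_west), 5))
print(simple_mean_rating(al_west))
```

The games-weighted version reproduces the listed AL West figure of 0.17805, while the simple mean comes out slightly lower, at about 0.1770. The weighting gives a little extra pull to teams that played more postseason games.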

Most people consider the AL West a weak division whose winner usually takes the crown by default, but the MLE ratings say it was actually the strongest over the past decade, even more so than the AL East, which was dragged down by the struggles of Baltimore and Tampa for most of the 2000s. Meanwhile, the only team in the West to finish with a sub-.500 record for the Aughts was Texas, and at .479, even their mark translates to above the MLB average when accounting for the AL's superiority. In other words, the idea that the Angels have been mopping up in a weak division by default is simply a myth.

In sharp contrast to the AL West was the entire National League, whose strongest division (the East) was actually worse than the AL's weakest (the Central). At the very bottom of the heap was the NL Central, where St. Louis racked up 6 division titles in part by being the only above-average team in a group that "boasted" 3 of the 5 worst teams of the decade by MLE in Cincinnati, Milwaukee, and Pittsburgh (the latter of which ranked dead last in MLB by a mile). It's strange to think that those teams were competitive in the 80s/early 90s, but now they consider a successful season to be the occasional brief, C.C. Sabathia-fueled playoff run or an odd 2nd- or 3rd-place finish here and there.

Shifting our attention back to the other end of the spectrum, the battle for #1 overall wasn't especially close going into 2009, and the Yankees blew it wide open by winning their 2nd title of the decade. They finished the 2000s as the only team with more than 1,000 combined regular-season and playoff wins, and they led MLB with 8 division titles and 4 pennants as well. Many would argue that I forgot to include an additional column for their record-shattering payroll, their advantage in which mirrors their lead in the ratings... But whatever the reason for their success, you have to acknowledge that -- by the numbers, at least -- the Yankees were the decade's most accomplished team (and this is coming from a very committed Red Sox fan).

That said, let me know what you think -- who were your top teams of the 2000s? Why is the AL so much stronger than the NL? And which teams will rise to prominence over the next decade? Give me your opinions in the comments below...

9 Responses to “Maximum Likelihood Teams of the 2000s”

  1. Stat Of The Day: Yanks Team Of The Decade Says:

    [...] Paine, over at B-R Blog & Stat of the Day, takes a look at “Maximum Likelihood Teams of the 2000s.” Doing some heavy math, Neil makes a case for the Yankees being ‘the team of the last [...]

  2. cubbies Says:

    can you run this search again using the last 5 years? i feel like that would be more interesting/ relevant to the clubs of today. for example, i hate putting the 2000 cubs in the same group as the 2009 cubs, as none of the 2000 players were still on the team in '09, and going back a couple years (i dont know how many) the only remaining player was kerry wood.
    also, i would enjoy seeing the cubs jump up in rank when only the last 5 years are counted.

  3. Djibouti Says:

    "Most people consider the AL West a weak division whose winner usually takes the crown by default"

    Really? For most of the last 10 years I, and a lot of people I talked to, considered the AL West to be the most competitive from top to bottom. You could make a good argument though that they wasted the most talent. All 4 teams end the aughts above 0 in MLE, and they only have 1 pennant to show for it.

    Now the AL Central, there's a weak division, and the numbers don't tell the story. The Twins were consistently good throughout the 00's, with 2000 being the only year they had less than 79 wins. The Royals were consistently bad, with 2003 being the only year they were above .500 at 83-79. But the other 3 teams were all over the place. Year in and year out, the crown was a battle between the Twins and whichever other team happened to catch fire. In 6 of the last 10 years, only two teams in the AL Central finished at .500 or above. In 3 of the remaining years the third team was 83-79 (kind of a weird statistical aberration). In '06 they had three teams above 90 wins, and I'm going to have to declare that year a fluke.
    So while the AL Central may have averaged out to be more powerful than the NL East, if you look at it on a year-by-year basis, the NL East probably had a better decade.

  4. Devon Says:

    This is pretty fascinating, but I found myself thinkin the same as Djibouti... who says the AL West is weak? I think of the central divisions as weak, and I regularly read/hear that view from others too. I also wonder how the Devil Rays didn't rank higher, being they regularly faced some very very strong opponents in the Yankees & Red Sox. Maybe I'm missing something about that, or maybe they just didn't play NY or Boston as much as I think.

  5. JohnnyTwisto Says:

    But they lost over 90 games every season until the last two years. Doesn't matter who they were playing, they stunk.

  6. Neil Paine Says:

    Well, Baltimore and Tampa did rank ahead of Cincinnati despite the Reds' WPct being a good 30-35 points higher, which tells you something about the quality of the competition they faced. But like Johnny said, when you lose as much as they did for the majority of the decade, that great SOS can only help you so much. Besides, NY and Boston only made up 22% of their total games for the decade:

    http://www.baseball-reference.com/games/head2head.cgi?teams=TBD&from=2000&to=2009&submit=Submit

    They still went 565-694 (the equivalent of 73-89 in a 162-G season) against everybody else.

    As far as the AL West being weak goes, I certainly don't believe it, but I've gotten the impression over the years that people felt it was easy to win, that Anaheim was winning it "by default" in recent seasons... But maybe it's just me, being a Red Sox fan (and naturally looking down on the Angels' accomplishments -- until last year, that is), because Googling "weak al west" returns fewer hits than "weak nl east", "weak nl central", "weak nl west", and "weak al central". So by the "Google test" I made up just now, the AL West is actually the 2nd-best (least-weak?) division in baseball behind the AL East. For whatever that's worth.

  7. Andy Says:

    In terms of league dominance, my sense (not at all backed up by actual evidence as I have not researched it) is that when the Steroids Era hit and offense went up significantly, it went up even more in the AL due to the presence of the DH. Then, I think that pitchers wanted to go to the NL where it was easier to pitch. The better pitchers, those who had more leverage in free-agency, were able to do so. The hitters wanted to go to the AL where it seemed easier to hit homers, and again the better ones were able to do so. So then the NL got better pitching talent and worse hitting talent. The numbers got more tilted toward offense in the AL. The effect then multiplied by making pitchers want to go to the NL even more and hitters go to the AL more.

    This is an oversimplification because offense alone does not make a league better. But I would argue that with offense between the two leagues now fairly close, it means that the hitters and pitchers are both better in the AL. Whereas I am arguing above that the better pitchers flocked to the NL, it seems that overall their talent level has fallen off over the last 15 years. A starter with a 3.00 ERA in the AL is a better pitcher than one with a 3.00 ERA in the NL because the offensive talent is superior in the AL. A .300 hitter in the AL is a better hitter than a .300 hitter in the NL because the pitching talent in the AL is superior.

  8. Tomepp Says:

    Andy: the main reason a pitcher with a 3.00 ERA in the AL is always better than a pitcher with a 3.00 ERA in the NL (post-1973) is that the AL pitcher has to face a DH, while the NL pitcher gets an "easy out" (either the opposing pitcher or a pinch hitter – usually someone not good enough to be in the starting lineup to begin with). Any variance due to strength of league/division pales in comparison. While the AL pitcher may occasionally face a stronger lineup (sans DH) than his NL counterpart, he will always be facing the extra hitter in the DH (except those few interleague games in an NL park, but we’ll ignore those for the moment).

    Whether a .300 hitter in the AL is better than a .300 hitter in the NL will depend instead on the relative strength of the league/division pitching. (There’s no such thing as a “Designated Pitcher”, though a LOOGY might come close…) By your own comments, you speculated that the NL got better pitching due to natural migration away from the “hitter-friendly league” of the AL. Shouldn’t that mean that a .300 NL hitter is better than a .300 AL hitter, not vice-versa?

    Also, when you say that the offense in both leagues is fairly close, how are you measuring that? If runs/game is the standard, then doesn’t that mean that the offense in the NL is actually better than in the AL, because runs produced in NL games is (typically) generated among 8 players while runs produced in AL games is generated among 9 players?

    I find this thread fascinating, and I’d like to explore how other SOS measurements rank the teams. I also agree with Cubbies, that using only the last 5 years (I’d even argue for only 3 to 4 years) is more relevant for evaluating the current franchises’ prospects.

  9. Andy Says:

    Good points Tom. I admitted in my original comment that I was contradicting myself about strength of the leagues. It's a rich question that I sure don't know the answer to.