WS Prediction: AL over NL in 4 maybe 5 games.

Posted by Sean Forman on September 26, 2008

I was reading on BTF about the Angels' Pythag luck, which led me to run a few quick studies. The fact of the matter is that the AL is so much better than the NL this year that it isn't particularly close.

First, just how lucky are the Angels? Standard Pythagorean W-L has them at 87 wins. That probably underrates them, as they've had some blowout losses.
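For readers who haven't seen it, the standard Pythagorean estimate can be sketched as below. This is a minimal illustration, not the exact calculation behind the 87-win figure: the exponent of 2 is the classic textbook value (other exponents, such as 1.83, are also in common use), and the run totals in the usage line are made-up placeholders rather than the Angels' actual numbers.

```python
def pythag_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from season run totals under the Pythagorean formula.

    exponent=2.0 is the classic value; it is an assumption here, since the
    post does not say which exponent was used.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Hypothetical run totals, for illustration only.
print(round(pythag_wins(750, 700, 159), 1))
```

Because only season totals go in, a handful of blowout losses inflates runs allowed and drags the estimate down, no matter how the rest of the games went.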

Below I've put in a table with my Monte Carlo pythag. This takes the actual scoring distributions (i.e., it treats a 12-run game and a 0-run game differently than two 6-run games, for runs scored and for runs allowed) and then replays the season 1000 times with those scoring distributions randomly jumbled. Percentile is how "lucky" the team has been. Due to run distributions (probably a couple of blowout losses), the Angels are 3 games better by this measure than by standard Pythagorean.
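A rough sketch of the shuffling idea described above follows. It is not the script that produced the table: the function name, the fixed seed, and especially the handling of re-paired games that end up tied (counted here as half a win) are assumptions made for illustration.

```python
import random


def monte_carlo_pythag(runs_scored, runs_allowed, actual_wins, n_sims=1000, seed=0):
    """Replay a season by randomly re-pairing per-game runs scored and allowed.

    runs_scored / runs_allowed: one entry per game actually played.
    Returns (mean simulated wins, percentile of actual record, best, worst).
    """
    rng = random.Random(seed)
    sim_wins = []
    for _ in range(n_sims):
        rs = list(runs_scored)
        ra = list(runs_allowed)
        rng.shuffle(rs)
        rng.shuffle(ra)
        wins = 0.0
        for scored, allowed in zip(rs, ra):
            if scored > allowed:
                wins += 1.0
            elif scored == allowed:
                wins += 0.5  # assumption: a re-paired tie counts as half a win
        sim_wins.append(wins)
    mean_wins = sum(sim_wins) / n_sims
    # Percentage of simulated seasons that the actual record beat.
    percentile = 100.0 * sum(actual_wins > w for w in sim_wins) / n_sims
    return mean_wins, percentile, max(sim_wins), min(sim_wins)


# Toy two-game "seasons": 12 and 0 runs scored versus two 6-run games,
# both against opponents who scored 1 run each time.
print(monte_carlo_pythag([12, 0], [1, 1], actual_wins=1))
print(monte_carlo_pythag([6, 6], [1, 1], actual_wins=2))
```

In the toy example both teams score 12 runs and allow 2, so a formula based on totals would rate them the same, but the first team can never win more than one of its two games while the second always wins both.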

To my eye, the Cubs are the class of the NL by a wide margin and the Red Sox are as well in the AL.

Percentile is the percentage of times their current record was better than their simulated record.

Tm  Year            MC PythW-L Std Dev     %tile of CurRec  Best Run         Worst Run
ARI 2008 Overall :  79.9-79.1  STD: 3.156, Percentile 40.9% BEST:   89-69  , WORST:   69-89
ATL 2008 Overall :  75.8-83.2  STD: 3.149, Percentile  6.9% BEST:   86-73  , WORST:   66-93
BAL 2008 Overall :  71.7-86.3  STD: 3.140, Percentile  7.8% BEST:   82-75  , WORST:   63-95
BOS 2008 Overall :  95.9-63.1  STD: 3.043, Percentile 29.0% BEST:  106-52  , WORST:   85-74
CHC 2008 Overall :  94.3-63.7  STD: 3.078, Percentile 71.4% BEST:  103-55  , WORST:   83-75
CHW 2008 Overall :  84.5-73.5  STD: 3.156, Percentile 71.0% BEST:   95-62  , WORST:   75-83
CIN 2008 Overall :  71.8-87.2  STD: 3.115, Percentile 79.2% BEST:   81-77  , WORST:   62-96
CLE 2008 Overall :  82.8-76.2  STD: 3.276, Percentile 13.7% BEST:   94-64  , WORST:   73-86
COL 2008 Overall :  75.1-83.9  STD: 3.112, Percentile 39.7% BEST:   85-73  , WORST:   64-94
DET 2008 Overall :  74.1-83.9  STD: 3.170, Percentile 26.5% BEST:   84-74  , WORST:   63-95
FLA 2008 Overall :  79.8-78.2  STD: 2.996, Percentile 78.9% BEST:   88-69  , WORST:   70-87
HOU 2008 Overall :  79.7-78.3  STD: 3.108, Percentile 91.9% BEST:   88-69  , WORST:   70-88
KCR 2008 Overall :  72.5-86.5  STD: 2.954, Percentile 59.3% BEST:   83-75  , WORST:   62-97
LAA 2008 Overall :  89.9-69.1  STD: 2.986, Percentile 100.0% BEST:   99-60  , WORST:   80-78
LAD 2008 Overall :  83.9-75.1  STD: 3.023, Percentile 41.8% BEST:   93-66  , WORST:   74-84
MIL 2008 Overall :  86.6-72.4  STD: 2.889, Percentile 71.7% BEST:   97-62  , WORST:   78-81
MIN 2008 Overall :  87.8-71.2  STD: 3.207, Percentile 44.0% BEST:   99-60  , WORST:   77-81
NYM 2008 Overall :  87.9-71.1  STD: 3.014, Percentile 55.4% BEST:   97-61  , WORST:   78-81
NYY 2008 Overall :  83.8-75.2  STD: 3.188, Percentile 86.6% BEST:   95-64  , WORST:   73-85
OAK 2008 Overall :  75.2-82.8  STD: 3.008, Percentile 50.2% BEST:   85-72  , WORST:   65-92
PHI 2008 Overall :  88.6-70.4  STD: 3.009, Percentile 59.2% BEST:  101-58  , WORST:   78-80
PIT 2008 Overall :  65.5-93.5  STD: 3.142, Percentile 47.9% BEST:   75-83  , WORST:   57-102
SDP 2008 Overall :  67.8-91.2  STD: 3.084, Percentile  3.3% BEST:   78-81  , WORST:   56-102
SEA 2008 Overall :  65.6-93.4  STD: 3.084, Percentile  1.0% BEST:   74-84  , WORST:   54-104
SFG 2008 Overall :  70.4-88.6  STD: 3.091, Percentile 49.1% BEST:   80-79  , WORST:   61-98
STL 2008 Overall :  82.9-76.1  STD: 3.023, Percentile 55.7% BEST:   91-67  , WORST:   69-90
TBR 2008 Overall :  88.5-70.5  STD: 3.055, Percentile 99.6% BEST:   98-60  , WORST:   78-81
TEX 2008 Overall :  73.3-85.7  STD: 3.177, Percentile 89.7% BEST:   84-74  , WORST:   62-96
TOR 2008 Overall :  87.1-71.9  STD: 2.924, Percentile 15.4% BEST:   95-64  , WORST:   77-81
WSN 2008 Overall :  61.4-96.6  STD: 3.087, Percentile 23.5% BEST:   70-87  , WORST:   49-109 

To compare the two leagues, here is the Simple Rating System that SR uses on some sites. This takes into account strength of schedule and margins of victory. It starts with your run margin and then iterates to include the quality of your opposition.

PFR's explanation of SRS

These values are in terms of run margin. SRS is the number of runs per game by which a team would beat an average team, and SOS is the number of runs better than average its opponents were. This means that the Red Sox would beat average teams by a margin of 1.8 runs/game. The teams they did play, they've beaten by 1.04 runs per game. SOS tells you the quality of your opposition. It looks insane that the Mariners and Cubs are in the same ballpark, but keep in mind that the average score of an AL vs. NL interleague game was 4.96-4.02. That gap is almost as large as the biggest run differential achieved by any team.

| year_ID | lg_ID | team_ID | franch_ID | srs    | sos    |
+---------+-------+---------+-----------+--------+--------+
|    2008 | AL    | BOS     | BOS       |  1.800 |  0.868 |
|    2008 | AL    | TBR     | TBD       |  1.489 |  0.935 |
|    2008 | AL    | TOR     | TOR       |  1.411 |  0.932 |
|    2008 | AL    | MIN     | MIN       |  1.252 |  0.804 |
|    2008 | AL    | CHW     | CHW       |  1.242 |  0.842 |
|    2008 | AL    | NYY     | NYY       |  1.146 |  0.950 |
|    2008 | AL    | LAA     | ANA       |  1.135 |  0.763 |
|    2008 | AL    | CLE     | CLE       |  0.953 |  0.813 |
|    2008 | AL    | DET     | DET       |  0.551 |  0.859 |
|    2008 | AL    | OAK     | OAK       |  0.504 |  0.844 |
|    2008 | AL    | BAL     | BAL       |  0.439 |  1.026 |
|    2008 | NL    | CHC     | CHC       |  0.327 | -0.943 |
|    2008 | AL    | TEX     | TEX       |  0.257 |  0.828 |
|    2008 | AL    | KCR     | KCR       |  0.218 |  0.909 |
|    2008 | NL    | PHI     | PHI       | -0.094 | -0.872 |
|    2008 | AL    | SEA     | SEA       | -0.156 |  0.875 |
|    2008 | NL    | NYM     | NYM       | -0.252 | -0.904 |
|    2008 | NL    | MIL     | MIL       | -0.427 | -0.891 |
|    2008 | NL    | STL     | STL       | -0.478 | -0.841 |
|    2008 | NL    | LAD     | LAD       | -0.544 | -0.976 |
|    2008 | NL    | FLA     | FLA       | -0.749 | -0.829 |
|    2008 | NL    | ARI     | ARI       | -0.808 | -0.957 |
|    2008 | NL    | HOU     | HOU       | -0.863 | -0.778 |
|    2008 | NL    | ATL     | ATL       | -0.898 | -0.815 |
|    2008 | NL    | CIN     | CIN       | -1.209 | -0.767 |
|    2008 | NL    | COL     | COL       | -1.266 | -0.918 |
|    2008 | NL    | SDP     | SDP       | -1.512 | -0.850 |
|    2008 | NL    | SFG     | SFG       | -1.541 | -0.872 |
|    2008 | NL    | PIT     | PIT       | -1.597 | -0.708 |
|    2008 | NL    | WSN     | WSN       | -1.732 | -0.724 |
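The iteration behind a rating like this can be sketched as follows. This is a simplified illustration of the general idea (rating = average run margin + average opponent rating, repeated until it settles), not the code used to build the table; the function name, the input format, and the fixed number of passes are all assumptions.

```python
from collections import defaultdict


def simple_rating_system(games, n_iter=200):
    """games: list of (team_a, team_b, runs_a, runs_b) tuples.

    Returns (srs, sos) dicts, both in runs per game. A fixed number of
    passes is a simplification; a sparse or oddly structured schedule
    might need a direct linear solve instead.
    """
    run_diff = defaultdict(float)
    opponents = defaultdict(list)
    for team_a, team_b, runs_a, runs_b in games:
        run_diff[team_a] += runs_a - runs_b
        run_diff[team_b] += runs_b - runs_a
        opponents[team_a].append(team_b)
        opponents[team_b].append(team_a)

    # Start from each team's raw average run margin.
    margin = {t: run_diff[t] / len(opponents[t]) for t in opponents}
    srs = dict(margin)
    sos = {t: 0.0 for t in margin}
    for _ in range(n_iter):
        # Schedule strength is the average current rating of the opponents faced.
        sos = {t: sum(srs[o] for o in opponents[t]) / len(opponents[t]) for t in margin}
        srs = {t: margin[t] + sos[t] for t in margin}
    return srs, sos


# Hypothetical three-team round robin, for illustration only.
toy_games = [("A", "B", 5, 3), ("B", "C", 4, 2), ("A", "C", 6, 1)]
srs, sos = simple_rating_system(toy_games)
for team in sorted(srs):
    print(team, round(srs[team], 3), round(sos[team], 3))
```

The same mechanism explains the league split in the table: interleague results are the only games linking the two leagues, so a lopsided interleague margin shifts every AL team's SOS up and every NL team's SOS down.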

The AL East is just ridiculous this year. They are .570 out of division (254-192) (by my quick and possibly erroneous calculations). The Orioles are 3 games over .500 outside the division.

The NL was horrendous in interleague play this year, which is why the Cubs are down around the Orioles. I don't know if that makes any sense or not, but it is how the numbers work out based on all of the games played during the season.

One more. What about the players who have switched leagues this year?

27 pitchers
       AL     NL
IP    1047   950
ERA   4.84   4.07   
WHIP  1.474  1.365
SO/9  6.44   7.72

38 Batters
       AL     NL
AB   3869   3757
H    1017    990
HR    117    147
BA   .262   .264
OBP  .335   .351
SLG  .416   .443 

Note that this doesn't filter out pitchers batting either, so a larger proportion of the NL AB's than of the AL AB's are coming from pitchers. The AL is just far, far, far stronger as a league.

2 Responses to “WS Prediction: AL over NL in 4 maybe 5 games.”

  1. whiz Says:

    Good stuff Sean. I have been posting similar rankings over at dugoutcentral.com for the last month or two (the older articles that describe the system in more detail are available in the archives there).

    My algorithm is a little different (it actually uses individual game scores and not run totals), although it also skews the rankings heavily towards AL teams (the Cubs are the best NL team and do no better than 6th or 7th).

    Actually I have four different ranking systems: one that uses runs scored and runs allowed (RS/RA), only wins and losses (W/L), Runs Created (RC), and a hybrid system that interpolates between W/L and RS/RA. They all incorporate a home field advantage and account for schedule strength.

    The RS/RA and RC rankings calculate both an offensive and defensive rating and then use the Pythagenpat formula to find a hypothetical win percentage against an average team at a neutral site. That's better than a differential since a one-run difference is more important (and leads to a better win pct.) for low-scoring games than for high-scoring games.

    The final regular season rankings should be posted Monday, if there are no tie-breaker games. I will also be using the rankings to make predictions for the division series and beyond; that will hopefully be posted on Tuesday.

    As a teaser, Anaheim will probably host Boston if things stay as they are now; using my current rankings, Boston has a 62% chance of winning the series, 42% of winning in 4 games or less. For more details, see my article on dugoutcentral.com next Tuesday.

  2. David in Toledo Says:

    It's terrific to see the detailed number crunching -- thanks!

    However, it also comes down to who can actually play in October. Will the Red Sox have Lowell, Drew, Ortiz, and a full complement of pitchers?