## Monday, October 27, 2014

### The Statistical Dominance Of The 1927 Yankees

I recently listed The 25 Highest And Lowest Team OPS Differentials From 1914-2014. The 27 Yanks were number 1 by a good margin. The data come from the Baseball Reference Play Index and Retrosheet. I also regressed winning pct against OPS differential and got the following equation:

Pct = 1.396*OPSDIFF + .500
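For anyone who wants to try this kind of fit, here is a minimal sketch of a one-variable least-squares regression in plain Python. The function name `fit_line` and the sample differentials are made up for illustration. The percentages are generated from the equation above, not taken from real teams, so the fit should just hand back the same slope and intercept.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical differentials; the percentages come from the equation itself,
# NOT from real team records.
ops_diffs = [-0.10, -0.05, 0.00, 0.05, 0.10]
pcts = [1.396 * d + 0.500 for d in ops_diffs]

slope, intercept = fit_line(ops_diffs, pcts)
print(f"Pct = {slope:.3f}*OPSDIFF + {intercept:.3f}")
```

With real data you would pass in each team's OPS differential and actual winning pct instead of the generated values.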

Then I estimated every team's pct. Here are the top 10 projected records:

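| Team | Year | DIFF | Pred |
|------|------|-------|-------|
| NYY | 1927 | 0.196 | 0.773 |
| NYY | 1939 | 0.158 | 0.720 |
| ATL | 1998 | 0.139 | 0.694 |
| BAL | 1969 | 0.136 | 0.690 |
| NYY | 1936 | 0.131 | 0.683 |
| STL | 1944 | 0.130 | 0.682 |
| STL | 1942 | 0.127 | 0.677 |
| CLE | 1948 | 0.127 | 0.677 |
| NYY | 1998 | 0.126 | 0.676 |
| SEA | 2001 | 0.126 | 0.676 |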

Now the 27 Yanks actually had a .714 pct (why they did not reach .773 might be a good topic for a future post). But notice how big their lead is and how closely teams bunch up after the 1939 Yanks. The 27 Yanks would have an 8 game advantage over their 1939 counterparts in a 154 game season (although they would play each other so it might be a bit lower).
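The arithmetic behind those numbers can be sketched in a few lines of Python. The function names are mine. The coefficients are the rounded ones printed in the equation, so the predictions may differ from the listed values by about .001.

```python
SLOPE, INTERCEPT = 1.396, 0.500  # rounded coefficients from the equation


def predicted_pct(ops_diff):
    """Projected winning pct from a team's OPS differential."""
    return SLOPE * ops_diff + INTERCEPT


def games_gap(pct_a, pct_b, season_games=154):
    """Difference in expected wins between two winning percentages."""
    return (pct_a - pct_b) * season_games


yanks_1927 = predicted_pct(0.196)
yanks_1939 = predicted_pct(0.158)
print(f"1927: {yanks_1927:.4f}  1939: {yanks_1939:.4f}")

# Using the listed predictions (.773 and .720) for the 154-game comparison.
print(f"Gap in games: {games_gap(0.773, 0.720):.1f}")
```

The gap works out to a little over 8 games, which is where the 8 game figure comes from.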

I also did the regression by decades. See The Relationship Between Team OPS Differential And Winning Percentage, By Decades. In some decades the impact of the differential was greater than others. But the 27 Yanks still dominate. Here is that top 10

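| Team | Year | DIFF | Pred |
|------|------|-------|-------|
| NYY | 1927 | 0.196 | 0.769 |
| NYY | 1939 | 0.158 | 0.728 |
| BAL | 1969 | 0.136 | 0.698 |
| STL | 1944 | 0.130 | 0.697 |
| STL | 1942 | 0.127 | 0.692 |
| CLE | 1948 | 0.127 | 0.692 |
| NYY | 1936 | 0.131 | 0.689 |
| NYY | 1937 | 0.121 | 0.675 |
| ATL | 1998 | 0.139 | 0.673 |
| STL | 1943 | 0.114 | 0.672 |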

Now OPS weights OBP and SLG equally. What if we give more weight to OBP? I used 1.7*OBP + SLG. Then I divided that by 3 since this approximates wOBA, a stat from Tangotiger. The regression equation in this case was

The 27 Yanks had a projected pct of .769, the 39 Yanks had .718, and the 69 Orioles had .694. The percentages slowly fall after that.
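Here is a rough sketch of the reweighted stat, (1.7*OBP + SLG)/3, and a team differential built from it. I am assuming the differential is simply the team's own value minus the value it allowed, the same way OPS differential works. The OBP and SLG inputs are made-up numbers, not any real team's.

```python
def weighted_rate(obp, slg, obp_weight=1.7):
    """(1.7*OBP + SLG) / 3, a rough stand-in for wOBA."""
    return (obp_weight * obp + slg) / 3


def weighted_diff(obp_for, slg_for, obp_against, slg_against):
    """Assumed differential: the team's own rate minus the rate it allowed."""
    return weighted_rate(obp_for, slg_for) - weighted_rate(obp_against, slg_against)


# Hypothetical team: .350 OBP / .450 SLG at the plate, .320 / .400 allowed.
diff = weighted_diff(0.350, 0.450, 0.320, 0.400)
print(f"Weighted differential: {diff:.4f}")
```

That differential would then go into a regression against winning pct in place of OPS differential.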

Now we don't have teams from 1901-13 since we don't know OPS allowed. But I did estimate pct using the differentials for the following 3 stats: HRs, Walks and non-HR hits. I was curious to see where the 1906 Cubs rank.

I also compared the estimated winning percentages for the 1914-19 teams from this method and the OPS differential method to see if they gave similar estimates. If they did, then it might be reasonable to project what the OPS differential would say for the 1901-13 teams based on the projection using these other 3 stats.

The good news is that the correlation between the percentages estimated by the two methods for the 1914-19 teams is .96. But the bad news is that there was one team for which the estimates differed by .048. That is pretty big.
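Both checks, the correlation and the largest single disagreement, are easy to sketch in Python. The two lists below are invented estimates for five imaginary teams, only there to show the mechanics. They are not the actual 1914-19 numbers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def largest_gap(xs, ys):
    """Biggest absolute difference between paired estimates."""
    return max(abs(x - y) for x, y in zip(xs, ys))


# Hypothetical estimates from the two methods (NOT real team data).
ops_method = [0.600, 0.550, 0.500, 0.450, 0.400]
three_stat_method = [0.590, 0.560, 0.505, 0.440, 0.410]

print(f"Correlation: {pearson(ops_method, three_stat_method):.3f}")
print(f"Largest gap: {largest_gap(ops_method, three_stat_method):.3f}")
```

A high correlation can sit alongside a large miss on one team, which is why it helps to look at both numbers.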

But we can still get somewhere. The highest predicted winning pct for the 1901-13 teams was the 1902 Pirates with about .746. The 1906 Cubs were at .690 (why they actually had a .763 pct might make a good post, too).

For the Pirates to reach the .769 of the 27 Yanks, their estimate would have to go up about .023. But only 13 of the 96 teams from 1914-19 had their estimate from the OPS method exceed the 3 stat method by as much as .023. So it seems unlikely that the Pirates would catch the Yankees.
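As a quick check on that reasoning, here is the arithmetic in Python, using only the figures quoted above (variable names are mine):

```python
yankees_1927 = 0.769   # by-decades OPS differential prediction
pirates_1902 = 0.746   # 3 stat prediction

# How much the Pirates' estimate would have to rise to match the Yankees.
needed_gain = yankees_1927 - pirates_1902

# Share of 1914-19 teams whose OPS estimate beat the 3 stat estimate by that much.
share = 13 / 96

print(f"Needed gain: {needed_gain:.3f}")
print(f"Share of teams with a gain that large: {share:.1%}")
```

So roughly 13.5% of the 1914-19 teams saw a jump that big, about 1 team in 7.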

Also, of the 10 best actual winning percentages from 1901-13, only 1 other team had a prediction over .700, the 1905 Giants at .716.

Furthermore, of the 10 best actual 1914-19 teams, only 2 had their OPS differential prediction exceed their 3 stat prediction by at least .023. So it is unusual for a very good team to be off by much.

So it looks like only one team, the 1902 Pirates, MIGHT come close to the 1927 Yankees. And that seems unlikely.