Wednesday, December 31, 2014

Are Defensive Runs Saved Predictive?

Click here to read this post from John Dewan's Stat of the Week

Bill Gilbert: HOF Candidate Ratings by Win Shares

Click here to read it

Here are links to earlier years:

2014
2013
2012
2011
2010
2009
2008
2007
2006
2005

Sunday, December 14, 2014

Do Park Effects Treat DiMaggio Appropriately?

98.7 is the simple average of the park factors for Joe D's years, from Baseball Reference. Yankee Stadium comes out as about an average park only because it favored lefties and killed RHBs.

Click here to see how unusual his home/road splits were and how that affects his estimated value.

Now let's take a look at Yankee Splits from 1947-51 (I don't think the Baseball Reference Play Index has much data on these breakdowns before 1947).


Place Split BA OBP SLG OPS
Home LHB 0.269 0.362 0.448 0.810
Road LHB 0.280 0.359 0.432 0.791
Home RHB 0.270 0.355 0.396 0.752
Road RHB 0.275 0.355 0.402 0.757

Place Split BA OBP SLG OPS
Home OPP LHB 0.237 0.338 0.368 0.706
Road OPP LHB 0.270 0.367 0.386 0.753
Home OPP RHB 0.226 0.308 0.308 0.616
Road OPP RHB 0.255 0.342 0.376 0.718

Yankee RHBs actually did just a bit better on the road (an OPS of .005 better). We see a drop off for their LHBs (OPS falls .019).

But opposition RHBs had an OPS that was .102 higher at their home parks than in Yankee Stadium. For opposition LHBs it was only .047 better.

So there is evidence that Yankee Stadium hurt RHBs and that a simple park effect may not accurately adjust their stats.
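As a rough check on the gaps quoted above, here is a minimal Python sketch that recomputes them from the OPS columns of the two tables. The dictionary layout and the names `splits` and `ops_gap` are my own labels for illustration, and "elsewhere" simply means every park other than Yankee Stadium.

```python
# OPS values copied from the 1947-51 split tables above.
# "stadium" = games played in Yankee Stadium, "elsewhere" = all other parks.
splits = {
    ("Yankees", "LHB"): {"stadium": 0.810, "elsewhere": 0.791},
    ("Yankees", "RHB"): {"stadium": 0.752, "elsewhere": 0.757},
    ("Opponents", "LHB"): {"stadium": 0.706, "elsewhere": 0.753},
    ("Opponents", "RHB"): {"stadium": 0.616, "elsewhere": 0.718},
}


def ops_gap(split):
    """OPS away from Yankee Stadium minus OPS in Yankee Stadium."""
    return round(split["elsewhere"] - split["stadium"], 3)


gaps = {key: ops_gap(value) for key, value in splits.items()}

for (team, hand), gap in gaps.items():
    # A positive gap means the group hit better away from Yankee Stadium.
    print(f"{team:9s} {hand}: {gap:+.3f}")
```

A positive number means the group hit better away from Yankee Stadium, which is the pattern for every group except Yankee LHBs.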

Sunday, December 7, 2014

What does WAR say about Dick Allen and Ken Boyer?

Allen had a career WAR of 58.7, 126th among position players. He had three top 5 finishes, including a 1st. But none back to back. His 4 best were 8.8-8.6-7.5-6.4. Three of those were consecutive (but not the 8.6). His best three consecutive seasons add up to 22.7.

Boyer had a career WAR of 62.8, 104th among position players. He had four top 5 finishes, including three years in a row. But never a first place finish (3rd was his best). His 4 best were 7.9-7.4-6.8-6.4. Those all came in 154 game seasons, which is about 5% shorter than the 162 game seasons Allen's best years came in. His best three consecutive seasons add up to 22.1 (if we increased that by 5% it would be 23.2).

Boyer has the edge in career WAR. Allen has the edge in best three consecutive seasons, although only Boyer had three straight in the top 5.

Allen is hurt by fielding with a -16.5 defensive WAR. Boyer has a +10.6 defensive WAR.

I usually think that significant career value and significant peak value should be enough for the Hall. Three straight top 5 finishes in WAR seems pretty good for Boyer along with being 104th among position players.

But Allen's best three consecutive seasons add up to 22.7 or almost 7.6 per year. Sean Forman says that a WAR of 8.0 is MVP caliber. So he was pretty close to that, on average, for three straight years.

If it were just based on hitting, Allen had two 1st place finishes and four other top 5 finishes in offensive WAR. His career rank is 60th. Very impressive.

He missed 44 games in 1968 and 40 in 1969 while hitting 33 and 32 HRs in those years, respectively. A full season might have meant over 40 HRs in each year. Maybe something like that might have cemented an image of him as a top slugger. He also played only 148 games in 1972 while hitting 37.

Allen also had three 1st place finishes in OPS+ and four other top 5 finishes. He led the NL in both 1966-67 when Aaron, Mays, McCovey and Billy Williams were all still good or near their primes. Allen was also 2nd in 1968.

Both Allen and Boyer seem to have enough career value and enough peak value to be in the Hall.

Sunday, November 30, 2014

Would the “wide arc” of DiMaggio’s swing have made him more vulnerable to strikeouts against the higher velocity of pitchers in today’s game, as John Thorn suggests?

Yankees great Joe DiMaggio was overrated, says MLB historian. Excerpt:
"Perhaps most remarkably, especially when compared to the current era in baseball when hitters strike out more than ever, DiMaggio never struck out more than 39 times in a season. In 1941, the year of his famous 56-game hitting streak, DiMaggio struck out a total of 13 times.

By comparison, 2014 AL MVP Mike Trout struck out 184 times, the highest total in the majors.

Yet Thorn makes the case that the “wide arc” of DiMaggio’s swing would have made him more vulnerable to strikeouts against the higher velocity of pitchers in today’s game."
This might be true, but all players would have to deal with the faster pitch speeds.

It wasn't just that DiMaggio had low strikeout totals. It is that his HR-to-strikeout ratio was astronomical, especially considering that he was a right handed batter in Yankee Stadium. See my post Which Players Had The Best HR-To-Strikeout Ratios?

DiMaggio hit 2.77 HRs for every one that the average player hit while he only struck out 59% as often (for a ratio of 4.69).

In fact, the only player to have a higher HR-to-strikeout ratio relative to the league average was Ken Williams of the St. Louis Browns. His home field, Sportsman's Park, was a great hitter's park.

DiMaggio hit only 41% of his HRs at home in his career while Williams hit 72%. So it is likely the case that DiMaggio would rank first, and probably by a wide margin, if HRs were park adjusted.

DiMaggio faced Bob Feller 138 times. He hit 6 HRs while striking out only 7 times. Feller struck out 16.9% of the batters he faced from 1938-51. He allowed a HR% of 1.3%. DiMaggio struck out much less than average against Feller and hit HRs more frequently. So it looks like he could adapt to fast pitchers.

DiMaggio faced Hal Newhouser 60 times (Newhouser was 3rd behind Feller and Tommy Bridges in strikeouts per 9 IP from 1938-51 in the AL). He had 6 HRs and just one strikeout. He faced Bridges 7 times with 1 HR and no strikeouts.
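To put the head-to-head numbers on the same scale as Feller's overall rates, here is a small Python sketch. It assumes that "faced 138 times" means 138 plate appearances. The names `rate`, `matchups` and `results` are just illustrative.

```python
def rate(events, plate_appearances):
    """Share of plate appearances that ended in the given event."""
    return events / plate_appearances


# Feller's overall rates from 1938-51, as quoted above.
FELLER_K_RATE = 0.169
FELLER_HR_RATE = 0.013

# DiMaggio's head-to-head lines as quoted above: (PA, HR, SO).
# Assumption: "faced N times" is treated as N plate appearances.
matchups = {"Feller": (138, 6, 7), "Newhouser": (60, 6, 1)}

results = {}
for pitcher, (pa, hr, so) in matchups.items():
    results[pitcher] = {"k_rate": rate(so, pa), "hr_rate": rate(hr, pa)}
    print(f"vs {pitcher}: K {results[pitcher]['k_rate']:.1%}, "
          f"HR {results[pitcher]['hr_rate']:.1%}")

print(f"Feller vs everyone: K {FELLER_K_RATE:.1%}, HR {FELLER_HR_RATE:.1%}")
```

Against Feller that works out to a strikeout rate of about 5% and a HR rate of a bit over 4%, compared with 16.9% and 1.3% for all batters Feller faced.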

Data from Baseball Reference and the Baseball Reference Play Index.



Joe DiMaggio Led MLB In Road Slugging Percentage, 1936-51

Minimum 2500 PAs. Here is the top 10

Joe DiMaggio 0.610
Ted Williams 0.607
Stan Musial 0.581
Jimmie Foxx 0.528
Johnny Mize 0.528
Hank Greenberg 0.526
Ralph Kiner 0.525
Jeff Heath 0.514
Walker Cooper 0.511
Charlie Keller 0.510

Here is the top 10 in all games

Ted Williams 0.633
Hank Greenberg 0.619
Stan Musial 0.584
Ralph Kiner 0.582
Joe DiMaggio 0.579
Jimmie Foxx 0.573
Johnny Mize 0.568
Earl Averill 0.526
Hal Trosky 0.518
Charlie Keller 0.518

From 1939-51, here are the AVG-OBP-SLG for both DiMaggio and Williams in neutral parks (with Fenway and Yankee Stadium taken out)

DiMaggio) .335-.417-.605
Williams) .333-.469-.617

Yes, Williams beats DiMaggio in SLG. But it is fairly close, much closer than their career numbers. So under pretty much the same circumstances, DiMaggio slugged close to what Williams slugged. The big edge is OBP for Williams.

Now only looking at neutral parks leaves a lot of PAs out of the analysis. But DiMaggio's stats put him almost on the level of the guy many say was the greatest hitter ever.

Data from Baseball Reference and the Baseball Reference Play Index.

Friday, November 28, 2014

Should Joe DiMaggio's Offensive Value Be Estimated Upwards Because Of Yankee Stadium?

His road stats were much better than his home stats. In those days, it was over 400 feet to left-center field (I think 407). And players normally hit better at home than on the road. So I tried to estimate what his career stats might have been if he had played in a fair park, and then estimate how many runs this would add to an average team.

The table below shows his splits. Data from the Baseball Reference Play Index


DiMaggio BA OBP SLG
Home 0.315 0.391 0.546
Away 0.333 0.405 0.610

Now the league splits from 1936-51


League BA OBP SLG
Home 0.273 0.350 0.394
Away 0.261 0.335 0.373


So players normally had an OBP that was .015 higher at home and a SLG that was .021 higher. What if DiMaggio had played in a fair park his whole career and he had these same differentials?

His home OBP and SLG would be .420 and .631. If those are averaged with his road numbers of .405 and .610, he would have a career OBP of .413 and a career SLG of .621.

That is better than his actual numbers of .398 & .579. So his OBP goes up .015 and his SLG goes up .042. That would raise a team's OBP and SLG by 0.0016 & 0.0046, respectively (assuming he has one ninth of a team's ABs and PAs).

How many extra runs would this mean? I ran a regression with runs per game as the dependent variable and OBP & SLG as the independent variables for all MLB teams from 1936-51. Here is the equation

R/G = 11.19*SLG + 19.20*OBP - 6.17

Plugging in the 0.0016 & 0.0046 changes in team OBP and SLG, we get 0.0825 more runs per game or 12.7 per 154 game season. That is about one extra win per season.

DiMaggio played 1736 games. That is 11.27 154 game seasons. That times 12.7 is 143. That adds about 14 wins to his career value.
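The whole chain of arithmetic above can be sketched in a few lines of Python. This is only a sketch of the steps as described, carrying unrounded values through. The variable names are mine, and the 10 runs per win figure is a common rule of thumb that I am assuming here, not something the regression produced.

```python
# Inputs quoted in the tables and text above.
ROAD_OBP, ROAD_SLG = 0.405, 0.610        # DiMaggio away from Yankee Stadium
ACTUAL_OBP, ACTUAL_SLG = 0.398, 0.579    # DiMaggio's actual career line
HOME_EDGE_OBP = 0.350 - 0.335            # league home minus away, 1936-51
HOME_EDGE_SLG = 0.394 - 0.373

# Step 1: give him a normal home edge on top of his road numbers.
fair_home_obp = ROAD_OBP + HOME_EDGE_OBP
fair_home_slg = ROAD_SLG + HOME_EDGE_SLG

# Step 2: average the fair-park home line with the actual road line.
fair_obp = (fair_home_obp + ROAD_OBP) / 2
fair_slg = (fair_home_slg + ROAD_SLG) / 2

# Step 3: spread his gain over a team (he is one ninth of the lineup).
team_obp_gain = (fair_obp - ACTUAL_OBP) / 9
team_slg_gain = (fair_slg - ACTUAL_SLG) / 9

# Step 4: apply the slopes of R/G = 11.19*SLG + 19.20*OBP - 6.17.
# The intercept cancels out because only the change in runs matters.
runs_per_game = 11.19 * team_slg_gain + 19.20 * team_obp_gain
runs_per_season = runs_per_game * 154

# Step 5: scale to his career length.
seasons = 1736 / 154
career_runs = runs_per_season * seasons

RUNS_PER_WIN = 10  # assumption: rule of thumb, not taken from the post
career_wins = career_runs / RUNS_PER_WIN

print(f"runs per game:   {runs_per_game:.4f}")
print(f"runs per season: {runs_per_season:.1f}")
print(f"career runs:     {career_runs:.0f}")
print(f"career wins:     {career_wins:.1f}")
```

Carrying the unrounded averages (.4125 and .6205) through is what gets to 0.0825 runs per game. Using the rounded 0.0016 and 0.0046 gives a slightly lower figure.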

He has 78.2 career WAR, good for 41st among position players. This adjustment would give him 92.8, putting him at 28th.

Wednesday, November 19, 2014

Is The Run Value Of Stealing A Base Different Than The Run Value Of Allowing A Stolen Base?

Maybe this is just a statistical artifact or something quirky is going on. But I ran regressions with runs scored per game and runs allowed per game as the dependent variables and OBP, SLG, SB, CS, GDP, and ROE (reached on errors) as the independent variables (the last four were all per game). I used all teams from 2005-14 and the data was from the Baseball Reference Play Index.

Here is the regression for runs scored per game

R/G = 9.8*SLG + 17.17*OBP - 0.308*GDP - 0.394*CS + 0.143*SB + 0.54*ROE - 5.09

Now the regression for runs allowed per game

RA/G = 9.4*SLG + 17.57*OBP - 0.188*GDP - 0.446*CS + 0.302*SB + 0.86*ROE - 5.35

So each stolen base is worth .143 runs to the offense, while each one allowed costs the defense .302 runs (it seems like there are big differences in GDP and ROE as well). I can't think of any reason why there would be a big difference here.

I started looking at this because when I added variables like SB differential, etc. to my regressions estimating winning pct based on OPS differential, the value of the SB differential seemed too high.

If we look at a team like the 2010 Red Sox, they allowed 1.04 SBs per game while having 0.259 CSs per game. If I use the coefficient values for RA/G, they allowed about .2 runs per game from stealing. That would be about 32 runs per season or about 3 wins.

If I used the values from the R/G regression, they would have allowed about .047 runs per game from stealing or 7.59 per season. That is not even one win. So the difference between the two methods is about 2.5 wins.
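Here is a minimal Python sketch of that comparison for the 2010 Red Sox, using only the SB and CS coefficients from the two regressions. The function name `steal_runs_per_game` and the dictionary layout are mine. The inputs are the rounded per-game figures quoted above, so the results can differ slightly from the ones in the text.

```python
# SB and CS coefficients from the two regressions above.
RA_COEF = {"SB": 0.302, "CS": -0.446}  # runs allowed per game equation
RS_COEF = {"SB": 0.143, "CS": -0.394}  # runs scored per game equation


def steal_runs_per_game(sb_per_game, cs_per_game, coef):
    """Net runs per game attributed to stolen bases and caught stealing."""
    return coef["SB"] * sb_per_game + coef["CS"] * cs_per_game


# 2010 Red Sox: steals allowed and runners caught, per game.
sb, cs = 1.04, 0.259

ra_based = steal_runs_per_game(sb, cs, RA_COEF)
rs_based = steal_runs_per_game(sb, cs, RS_COEF)

for label, value in (("RA/G coefficients", ra_based),
                     ("R/G coefficients", rs_based)):
    print(f"{label}: {value:.3f} runs per game, {value * 162:.1f} per 162")

print(f"gap between methods: {(ra_based - rs_based) * 162:.1f} runs per 162")
```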

In a recent post, I found that over the last 5 years, the Red Sox seemed to underperform based on their OPS differential. See The Relationship Between OPS Differential And Winning Percentage Using 5 Year Averages. They won 5.23 fewer games on average each season than their OPS differential would predict (only the Rockies were worse at 5.4 fewer wins).

The 2011 Red Sox were similar, with 0.963 SBs per game and 0.309 CSs per game. The Red Sox have 4 teams in the top 21 in SB allowed per game from 2010-14. The Rockies, however, don't seem to have been that bad at allowing SBs, having only one year in the top 50 in SB allowed per game. So their underperformance must be due to something else.

Sunday, November 9, 2014

Which Teams Exceeded Their Win Total By The Most According To OPS Differential?

I have done some related posts recently, looking at the teams with the best OPS differentials since 1914 using data from Baseball Reference's Play Index. I also did a regression and the equation for winning pct was

Pct = 1.396*OPSDIFF + .500

Then I estimated each team's pct and calculated how many more games per 162 that they won than this equation would predict. Here are the top 10:

Team Year DIFF W-L% Pred Diff per 162
BSN 1914 0.002 0.614 0.503 0.111 17.96
STL 1987 -0.017 0.586 0.476 0.110 17.88
LAA 2008 0.014 0.617 0.519 0.098 15.85
CIN 1973 0.010 0.611 0.514 0.097 15.66
STL 1931 0.045 0.656 0.562 0.094 15.21
NYM 1969 0.017 0.617 0.524 0.093 15.07
NYY 2013 -0.048 0.525 0.433 0.092 14.86
MIN 1994 -0.088 0.469 0.378 0.091 14.81
PHA 1948 -0.032 0.545 0.455 0.090 14.59
BAL 1977 0.011 0.602 0.516 0.086 13.97

So the Braves in 1914 should have had a pct of .503 based on their OPS differential of .002. But it was actually .614. That gives them 17.96 more wins per 162 games than we might have expected. Maybe that is why they are called the "Miracle Braves!"
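The table columns can be reproduced with a short Python sketch like the one below. The function names are mine, and it uses the rounded DIFF and W-L% values from the table, so the last column comes out close to, but not exactly, what is listed (presumably because the table was built from unrounded numbers).

```python
def predicted_pct(ops_diff, slope=1.396, intercept=0.500):
    """Winning pct predicted by Pct = 1.396*OPSDIFF + .500."""
    return slope * ops_diff + intercept


def wins_above_prediction(actual_pct, ops_diff, games=162):
    """Extra wins per `games` relative to the OPS differential prediction."""
    return (actual_pct - predicted_pct(ops_diff)) * games


# (team, year, OPS differential, actual W-L%) from the first two table rows.
teams = [("BSN", 1914, 0.002, 0.614), ("STL", 1987, -0.017, 0.586)]

extra_wins = {}
for team, year, diff, pct in teams:
    extra_wins[(team, year)] = wins_above_prediction(pct, diff)
    print(f"{team} {year}: predicted {predicted_pct(diff):.3f}, "
          f"actual {pct:.3f}, {extra_wins[(team, year)]:+.2f} per 162")
```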

I also found a regression equation for each decade but the list of the top over achieving teams was similar to this one.

We don't have splits for things like RISP, runners on base and close and late situations for 1914. So we can't tell if the Braves did especially well in those cases, which would explain a lot. The Braves were 33-20. But that is not much different than their overall pct.

They had what seems to be good fielding. Their fielding pct was .961 (the league average was .958). Their defensive efficiency rating was .701 and the league average was .698. That probably helped a bit, but I don't think that would explain their .614 pct. They stole 139 bases, as did their opponents. They turned 143 DPs and the next highest team had 119.

The 1987 Cardinals hit a lot better when it counted, as this table shows:

Split BA OBP SLG OPS
High Lvrge 0.275 0.344 0.399 0.743
Medium Lvrge 0.267 0.350 0.389 0.739
Low Lvrge 0.254 0.328 0.357 0.686

It looks like their pitchers did a bit better when it counted

Split BA OBP SLG OPS
High Lvrge 0.261 0.344 0.397 0.740
Medium Lvrge 0.260 0.320 0.395 0.715
Low Lvrge 0.274 0.334 0.418 0.753

They stole 248 bases while their opponents stole just 100. The Cards hit into 16 fewer DPs and reached on errors 22 more times. Some of that probably helped them over achieve. They turned 172 DPs, 2nd most in the league. The league average was 146.

Sunday, November 2, 2014

Should The 1927 Yankees Have Won Even More than 110 Games? Like 118?

I recently estimated that their winning percentage could have been around .770 based on their OPS differential. They had a .872 OPS while allowing a .676 OPS, for a differential of .196, easily the highest since 1914. See The Statistical Dominance Of The 1927 Yankees.

A .770 pct would give them 118.5 wins in a 154 game season.

One reason that they did not win more is that they may have scored fewer runs than expected based on their OBP and SLG. Here is the regression generated equation for runs per game during the 1920s for all teams:

R/G = 11.29*SLG + 18.04*OBP - 5.92

The Yanks had a .489 SLG and a .384 OBP. That predicts 6.53 runs per game while they actually had 6.29. Over the whole season, that is about 37 runs fewer than expected. Out of the 160 teams in the decade, the 27 Yanks were 10th in underscoring.

So maybe that accounts for about 3.7 wins. That still leaves about 4.8 wins.

But why did they not score more runs? They stole 90 bases, just a bit below the league average of 99. Their success rate was 58.4%, just a bit below the league average of 60.7%. This probably does not matter much.

Maybe they had too many sacrifice bunts. They had 107 according to Retrosheet. But the other 7 teams averaged 148. So I doubt they lost a lot of big innings bunting too much.

Their pitchers allowed an SLG of .356 and an OBP of .320. The regression equation for runs allowed was:

R/G = 11.25*SLG + 18.68*OBP - 6.12

That predicts they would allow about 3.86 runs per game, their actual total. So they did not win fewer games than expected due to the pitchers giving up more runs than expected.
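Both 1920s equations can be checked with a quick Python sketch. The function names are mine, and the 154 game season length used to turn the per-game gap into a season total is my assumption about how the "about 37 runs" figure was reached.

```python
def predicted_runs_scored(slg, obp):
    """1920s fit for runs scored: R/G = 11.29*SLG + 18.04*OBP - 5.92."""
    return 11.29 * slg + 18.04 * obp - 5.92


def predicted_runs_allowed(slg, obp):
    """1920s fit for runs allowed: R/G = 11.25*SLG + 18.68*OBP - 6.12."""
    return 11.25 * slg + 18.68 * obp - 6.12


GAMES = 154  # assumption: season length used for the season total

scored_pred = predicted_runs_scored(slg=0.489, obp=0.384)
allowed_pred = predicted_runs_allowed(slg=0.356, obp=0.320)

ACTUAL_SCORED = 6.29  # runs per game, as quoted above
shortfall = (scored_pred - ACTUAL_SCORED) * GAMES

print(f"predicted runs scored per game:  {scored_pred:.2f}")
print(f"predicted runs allowed per game: {allowed_pred:.2f}")
print(f"runs scored below prediction:    {shortfall:.0f} over {GAMES} games")
```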

Here are the splits for the Yankee hitters and pitchers followed by the league splits. Nothing jumps out as to why they won fewer games than expected. Maybe that they did not hit better with runners on or with RISP like the league generally did. That might account for them not scoring as many runs as expected.

I don't see anything in their close and late performance that explains anything. It even looks like their pitchers did a very good job then compared to what the league did.

They were also 24-19 in 1-run games for a .558 pct. That means they were .775 in all other games. (If they had done as well in 1-run games as they did at other times, it would mean 9 more wins.)
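The 1-run game arithmetic can be sketched in Python as follows. It assumes an overall record of 110-44, which is what 110 wins and a .714 pct imply. The variable names are mine.

```python
# Assumption: overall record of 110-44 (110 wins, .714 pct).
TOTAL_WINS, TOTAL_LOSSES = 110, 44
ONE_RUN_WINS, ONE_RUN_LOSSES = 24, 19

one_run_games = ONE_RUN_WINS + ONE_RUN_LOSSES
one_run_pct = ONE_RUN_WINS / one_run_games

# Record in every game not decided by one run.
other_wins = TOTAL_WINS - ONE_RUN_WINS
other_losses = TOTAL_LOSSES - ONE_RUN_LOSSES
other_pct = other_wins / (other_wins + other_losses)

# Wins gained if the 1-run games had been won at the same rate as the rest.
extra_wins = other_pct * one_run_games - ONE_RUN_WINS

print(f"1-run games:     {one_run_pct:.3f}")
print(f"all other games: {other_pct:.3f}")
print(f"extra wins:      {extra_wins:.1f}")
```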


Situation-Hit AVG OBP SLG
Total 0.307 0.384 0.489
None On 0.307 0.380 0.487
Men On 0.308 0.376 0.492
RISP 0.301 0.376 0.479
Close & Late 0.287 0.370 0.460

Situation-Pit AVG OBP SLG
Total 0.265 0.320 0.356
None On 0.257 0.312 0.342
Men On 0.275 0.321 0.374
RISP 0.264 0.320 0.356
Close & Late 0.224 0.275 0.288

Situation-Lg AVG OBP SLG
Total 0.286 0.352 0.399
None On 0.276 0.342 0.387
Men On 0.298 0.352 0.413
RISP 0.292 0.353 0.407
Close & Late 0.266 0.332 0.365

The Yankee pitchers had an AVG allowed when it was close and late 41 points below their total AVG allowed. The league as a whole had 20 points. It seems like they would have done well in 1-run games because of that. Their hitters did about what you would expect in close and late situations when you look at the league stats.

So the fewer runs scored explains a good chunk of the missing wins, but it is not clear what explains the rest.

Saturday, November 1, 2014

OPS Wins Baseball Games

A study done by STATS, INC, in their 1998 “Baseball Scoreboard” book showed that the team with the higher OPS at the end of a game had a winning percentage of .852. The study covered the years 1993-1997. Here are the stats they looked at. First the stat, then the winning percentage for the team that had the higher stat in each game:

OPS .852
OBP .824
SLG .820
AVG .804
fewest errors .669
SB per 9 offensive innings .653
HR per plate appearance .653
BB per plate appearance .623
SB% .576
Most strikeouts per 9 defensive innings .543

Now a lot of things could be going on here. But this at least suggests that maybe the most important thing to do is to “out OPS” your opponent. Now you can do this with better hitters or with better pitchers who hold down the opponents (or good fielders who take away hits). OPS comes out higher than AVG. Perhaps there is something to it. Notice it is much higher than any of the SB winning percentages.

Tuesday, October 28, 2014

The Relationship Between OPS Differential And Winning Percentage Using 5 Year Averages

See a recent post called The Relationship Between Team OPS Differential And Winning Percentage, By Decades. I used regression analysis to see how big the impact of OPS differential was on winning.

Here, instead of using individual years, I used the average OPS differential and average winning pct for all 30 teams over the last 5 years.

The regression equation from using individual years was

Pct = 1.325*OPSDIFF + .5

The r-squared was .827 and the standard error was .029. Over 162 games, that is 4.639 wins.

The regression equation from using the 5 year average was

Pct = 1.3465*OPSDIFF + .5

The r-squared was .869 and the standard error was .017. Over 162 games, that is 2.72 wins. That is a big drop from the first regression. In a given year, luck will play a role. But the more seasons and data that are used the more accurate the relationship. By combining the years, some of the good and bad luck evens out.

The table below shows the prediction for each team. It seems strange the 6 most extreme teams are all pretty far from the rest of the pack. The Orioles were predicted to have a .476 pct but it was actually .505. That means they won 4.762 more games per season than their OPS differential would estimate.


Team OPSDIFF W-L% Pred Diff Per 162
BAL  -0.018 0.505 0.476 0.029 4.762
PHI  -0.001 0.526 0.498 0.028 4.532
NYY  0.026 0.563 0.535 0.028 4.473
ATL  0.027 0.554 0.536 0.018 2.949
CLE  -0.021 0.487 0.472 0.014 2.349
MIN  -0.053 0.443 0.429 0.014 2.266
SFG  0.018 0.538 0.525 0.013 2.157
CIN  0.017 0.535 0.523 0.012 1.912
SDP  -0.023 0.481 0.470 0.012 1.899
PIT  -0.018 0.481 0.476 0.006 0.939
NYM  -0.024 0.473 0.467 0.006 0.931
KCR  -0.023 0.475 0.470 0.006 0.894
ARI  -0.020 0.475 0.473 0.002 0.371
STL  0.041 0.557 0.555 0.002 0.360
TOR  -0.009 0.489 0.487 0.002 0.251
LAA  0.023 0.532 0.531 0.001 0.149
WSN  0.024 0.530 0.533 -0.003 -0.415
SEA  -0.037 0.446 0.450 -0.004 -0.640
TBR  0.040 0.550 0.554 -0.004 -0.687
LAD  0.030 0.536 0.541 -0.004 -0.709
OAK  0.029 0.535 0.539 -0.005 -0.738
TEX  0.035 0.539 0.547 -0.008 -1.247
CHW  -0.008 0.479 0.489 -0.010 -1.642
MIL  0.016 0.509 0.522 -0.013 -2.115
HOU  -0.078 0.380 0.395 -0.015 -2.379
DET  0.050 0.552 0.567 -0.015 -2.446
FLA  -0.029 0.444 0.461 -0.016 -2.612
CHC  -0.032 0.427 0.458 -0.030 -4.918
BOS  0.034 0.514 0.546 -0.032 -5.231
COL  -0.017 0.444 0.478 -0.033 -5.404
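For anyone who wants to rebuild the Pred, Diff and Per 162 columns, here is a small Python sketch using the 5 year equation. The function and variable names are mine, and only three rows are included. Because it starts from the rounded OPSDIFF and W-L% values in the table, the results land within about a tenth of a win of the listed ones rather than matching exactly.

```python
def predicted_pct(ops_diff, slope=1.3465, intercept=0.5):
    """Winning pct from the 5 year average fit: Pct = 1.3465*OPSDIFF + .5."""
    return slope * ops_diff + intercept


# (team, 5 year average OPS differential, 5 year average W-L%) from the table.
rows = [("BAL", -0.018, 0.505), ("BOS", 0.034, 0.514), ("COL", -0.017, 0.444)]

per_162 = {}
for team, ops_diff, actual in rows:
    pred = predicted_pct(ops_diff)
    per_162[team] = (actual - pred) * 162
    print(f"{team}: pred {pred:.3f}, actual {actual:.3f}, "
          f"{per_162[team]:+.2f} wins per 162")
```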

Here is a graph of the relationship

Monday, October 27, 2014

The Statistical Dominance Of The 1927 Yankees

I recently listed The 25 Highest And Lowest Team OPS Differentials From 1914-2014. The 27 Yanks were number 1 by a good margin. Data from the Baseball Reference Play Index and Retrosheet. I also regressed winning pct against OPS differential and got the following equation

Pct = 1.396*OPSDIFF + .500

Then I estimated every team's pct. Here are the top 10 projected records


Team Year DIFF Pred
NYY 1927 0.196 0.773
NYY 1939 0.158 0.720
ATL 1998 0.139 0.694
BAL 1969 0.136 0.690
NYY 1936 0.131 0.683
STL 1944 0.130 0.682
STL 1942 0.127 0.677
CLE 1948 0.127 0.677
NYY 1998 0.126 0.676
SEA 2001 0.126 0.676

Now the 27 Yanks actually had a .714 pct (why they did not reach .773 might be a good topic for a future post). But notice how big their lead is and how closely teams bunch up after the 1939 Yanks. The 27 Yanks would have an 8 game advantage over their 1939 counterparts in a 154 game season (although they would play each other so it might be a bit lower).

I also did the regression by decades. See The Relationship Between Team OPS Differential And Winning Percentage, By Decades. In some decades the impact of the differential was greater than others. But the 27 Yanks still dominate. Here is that top 10


Team Year DIFF Pred
NYY 1927 0.196 0.769
NYY 1939 0.158 0.728
BAL 1969 0.136 0.698
STL 1944 0.130 0.697
STL 1942 0.127 0.692
CLE 1948 0.127 0.692
NYY 1936 0.131 0.689
NYY 1937 0.121 0.675
ATL 1998 0.139 0.673
STL 1943 0.114 0.672


Now OPS weights OBP and SLG equally. What if we give more weight to OBP? I used 1.7*OBP + SLG. Then I divided that by 3 since this approximates wOBA, a stat from Tangotiger. The regression equation in this case was

Pct = 3.34*wOBADIFF + 0.5

The 27 Yanks had a projected pct of .769, the 39 Yanks had .718, and the 69 Orioles had .694. The percentages slowly fall after that.
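Here is a minimal Python sketch of the OBP-weighted version for the 27 Yanks. The OBP and SLG figures (.384/.489 for their hitters, .320/.356 allowed) are the ones quoted in the post on whether they should have won 118 games. The function names are mine, and `approx_woba` is only the rough (1.7*OBP + SLG)/3 stand-in described above, not Tangotiger's actual wOBA formula.

```python
def approx_woba(obp, slg):
    """Rough wOBA stand-in used above: (1.7*OBP + SLG) / 3."""
    return (1.7 * obp + slg) / 3


def predicted_pct(woba_diff, slope=3.34, intercept=0.5):
    """Winning pct from Pct = 3.34*wOBADIFF + 0.5."""
    return slope * woba_diff + intercept


# 1927 Yankees: OBP and SLG for their hitters and allowed by their pitchers.
woba_for = approx_woba(obp=0.384, slg=0.489)
woba_against = approx_woba(obp=0.320, slg=0.356)

woba_diff = woba_for - woba_against
pct = predicted_pct(woba_diff)

print(f"approx wOBA for:     {woba_for:.3f}")
print(f"approx wOBA against: {woba_against:.3f}")
print(f"projected pct:       {pct:.3f}")
```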

Now we don't have teams from 1901-13 since we don't know OPS allowed. But I did estimate pct using the differentials for the following 3 stats: HRs, Walks and non-HR hits. I was curious to see where the 1906 Cubs rank.

I also compared the estimated winning percentages for the 1914-19 teams from this method and the OPS differential method to see if they gave similar estimates. If they did, then it might be reasonable to project what the OPS differential would say for the 1901-13 teams based on the projection using these other 3 stats.

The good news is that the correlation between the percentages estimated by the two methods for the 1914-19 teams is .96. But the bad news is that there was one team for which the estimates differed by .048. That is pretty big.

But we can still get somewhere. The highest predicted winning pct for the 1901-13 teams was the 1902 Pirates with about .746. The 1906 Cubs were at .690 (why they actually had a .763 pct might make a good post, too).

For the Pirates to reach the .769 of the 27 Yanks, their estimate would have to go up about .023. But only 13 of the 96 teams from 1914-19 had their estimate from the OPS method exceed the 3 stat method by as much as .023. So it seems unlikely that the Pirates would catch the Yankees.

Also, of the 10 best actual winning percentages from 1901-13, only 1 other team had a prediction over .700, the 1905 Giants at .716.

Furthermore, of the 10 best actual 1914-19 teams, only 2 had their OPS differential prediction exceed their 3 stat prediction by at least .023. So it is unusual for a very good team to be off by much.

So it looks like only one team, the 1902 Pirates, MIGHT come close to the 1927 Yankees. And that seems unlikely.

Tuesday, October 21, 2014

The Relationship Between Team OPS Differential And Winning Percentage, By Decades

I learned on Oct 21 that there are some discrepancies between Baseball Reference and Retrosheet, so I can't be sure of these results. If I learn more, I will report it.


Oct 24. Here are the corrected numbers:


Period DIFF INT r squared Std error Per 162 Games
1914-19 1.866 0.498 0.833 0.038 6.207
1920-29 1.375 0.500 0.866 0.033 5.390
1930-39 1.442 0.500 0.851 0.038 6.157
1940-49 1.515 0.500 0.854 0.036 5.754
1950-59 1.452 0.500 0.874 0.032 5.165
1960-69 1.458 0.500 0.816 0.035 5.590
1970-79 1.361 0.500 0.811 0.032 5.165
1980-89 1.352 0.500 0.745 0.033 5.399
1990-99 1.249 0.500 0.780 0.032 5.109
2000-09 1.293 0.500 0.809 0.032 5.120
2010-14 1.325 0.500 0.827 0.029 4.639

Data from the Baseball Reference Play Index and Retrosheet.

DIFF is the value of the coefficient on OPS differential in the regression. INT is the intercept. Std error is the standard error. Per 162 games is the standard error times 162.
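The last column is simple enough to check with a few lines of Python. The dictionary names are mine, and only three periods are included. Since the table shows standard errors rounded to three places, multiplying them by 162 only gets within a few hundredths of a win of the listed values.

```python
# Standard errors (rounded) for three periods from the corrected table.
std_errors = {"1914-19": 0.038, "1950-59": 0.032, "2010-14": 0.029}

# Per 162 Games = standard error of the winning pct estimate times 162.
per_162 = {period: se * 162 for period, se in std_errors.items()}

for period, wins in per_162.items():
    print(f"{period}: about {wins:.2f} wins per 162 games")
```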

It seems like the relationship has gotten slightly stronger over time if you look at the standard errors, although the DIFF coefficient does not seem to be as strong as it used to be.

Also, for some reason, before the 1960s, the intercept was below .500 (in the original, uncorrected numbers shown in the table below). You might expect a team with a .000 OPS differential to have a .500 record. But that was not the case for some time. Not sure why. Maybe greater imbalance in talent levels across teams (like those great Yankee teams) meant that if you were just "average" you lost a lot more than you would expect when you played those top teams.


Period DIFF INT r squared Std error Per 162 Games
1914-19 1.898 0.429 0.802 0.042 6.759
1920-29 1.366 0.441 0.803 0.040 6.542
1930-39 1.548 0.423 0.822 0.042 6.807
1940-49 1.537 0.467 0.794 0.042 6.837
1950-59 1.486 0.494 0.858 0.034 5.490
1960-69 1.458 0.500 0.816 0.035 5.590
1970-79 1.361 0.500 0.811 0.032 5.165
1980-89 1.352 0.500 0.745 0.033 5.399
1990-99 1.249 0.500 0.780 0.032 5.109
2000-09 1.293 0.500 0.809 0.032 5.120
2010-14 1.325 0.500 0.827 0.029 4.639

Saturday, October 18, 2014

The 25 Highest And Lowest Team OPS Differentials From 1914-2014

I learned on Oct 21 that there are some discrepancies between Baseball Reference and Retrosheet, so I can't be sure of these results. If I learn more, I will report it. 

Compiled from the Baseball Reference Play Index and Retrosheet

Oct. 24. Here are the corrected numbers:


Team Year OPS OPSA DIFF
NYY 1927 0.872 0.676 0.196
NYY 1939 0.825 0.667 0.158
ATL 1998 0.795 0.656 0.139
BAL 1969 0.756 0.620 0.136
NYY 1936 0.864 0.733 0.131
STL 1944 0.745 0.615 0.130
STL 1942 0.717 0.590 0.127
CLE 1948 0.792 0.665 0.127
NYY 1998 0.825 0.699 0.126
SEA 2001 0.805 0.679 0.126
PHA 1929 0.816 0.692 0.124
NYY 1937 0.825 0.704 0.121
CLE 1995 0.839 0.718 0.121
BRO 1953 0.840 0.722 0.118
CLE 1954 0.744 0.626 0.118
NYY 1932 0.830 0.714 0.116
NYY 1931 0.840 0.726 0.114
PHA 1928 0.799 0.685 0.114
ATL 1997 0.769 0.655 0.114
STL 1943 0.725 0.611 0.114
NYY 1921 0.838 0.725 0.113
BRO 1941 0.752 0.641 0.111
LAD 1974 0.743 0.633 0.110
PHA 1931 0.789 0.680 0.109
BOS 2003 0.851 0.742 0.109

Now the lowest


BOS 1927 0.677 0.796 -0.119
PHI 1945 0.633 0.752 -0.119
PIT 2010 0.678 0.797 -0.119
TOR 1979 0.673 0.793 -0.120
PIT 1952 0.631 0.752 -0.121
NYM 1965 0.604 0.728 -0.124
PIT 1953 0.676 0.803 -0.127
BOS 1932 0.665 0.792 -0.127
PHA 1919 0.634 0.761 -0.127
SLB 1937 0.747 0.875 -0.128
PHI 1939 0.669 0.797 -0.128
PHA 1920 0.642 0.771 -0.129
PHA 1915 0.615 0.745 -0.130
SLB 1951 0.674 0.804 -0.130
DET 1996 0.743 0.875 -0.132
PHA 1936 0.711 0.843 -0.132
SDP 1974 0.632 0.764 -0.132
FLA 1998 0.690 0.824 -0.134
DET 2003 0.675 0.813 -0.138
OAK 1979 0.648 0.786 -0.138
NYM 1963 0.600 0.739 -0.139
SLB 1939 0.720 0.860 -0.140
PHI 1928 0.716 0.857 -0.141
BSN 1924 0.633 0.776 -0.143
PHA 1954 0.648 0.804 -0.156
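DIFF in these tables is just team OPS minus OPS allowed (OPSA). Here is a short Python sketch that checks a few rows from the corrected tables. The tuple layout and names are mine, and the tolerance allows for the inputs being rounded to three places.

```python
# (team, year, OPS, OPS allowed, listed DIFF) from the corrected tables above.
rows = [
    ("NYY", 1927, 0.872, 0.676, 0.196),
    ("ATL", 1998, 0.795, 0.656, 0.139),
    ("PIT", 2010, 0.678, 0.797, -0.119),
    ("PHA", 1954, 0.648, 0.804, -0.156),
]

TOLERANCE = 0.0015  # room for rounding in the three-decimal inputs

computed = {}
for team, year, ops, opsa, listed in rows:
    computed[(team, year)] = ops - opsa
    matches = abs(computed[(team, year)] - listed) < TOLERANCE
    print(f"{team} {year}: {computed[(team, year)]:+.3f} "
          f"(listed {listed:+.3f}, match={matches})")
```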


Here are the highest


Team Year OPS OPSA DIFF
NYY 1927 0.872 0.636 0.236
NYY 1939 0.825 0.638 0.187
NYY 1936 0.864 0.691 0.173
NYY 1931 0.840 0.673 0.167
NYY 1937 0.825 0.661 0.164
PHA 1929 0.816 0.655 0.161
PHA 1928 0.799 0.644 0.155
NYY 1921 0.838 0.683 0.155
NYY 1932 0.830 0.675 0.155
NYY 1930 0.872 0.719 0.153
SLB 1922 0.823 0.673 0.150
STL 1944 0.745 0.596 0.149
STL 1942 0.717 0.570 0.147
STL 1939 0.785 0.641 0.144
NYY 1926 0.806 0.663 0.143
NYY 1934 0.782 0.639 0.143
PHA 1931 0.789 0.648 0.141
NYY 1928 0.816 0.677 0.139
PHA 1930 0.821 0.682 0.139
ATL 1998 0.795 0.657 0.138
BAL 1969 0.756 0.620 0.136
NYY 1933 0.809 0.674 0.135
CLE 1920 0.793 0.659 0.134
STL 1943 0.725 0.592 0.133
WSH 1930 0.795 0.663 0.132

Now the lowest



Team Year OPS OPSA DIFF
SEA 1978 0.673 0.778 -0.105
SEA 1980 0.664 0.769 -0.105
KCA 1955 0.703 0.809 -0.106
KCR 2004 0.720 0.828 -0.108
KCR 2005 0.716 0.825 -0.109
MIN 2011 0.666 0.775 -0.109
SLB 1951 0.674 0.784 -0.110
NYM 1966 0.643 0.755 -0.112
KCA 1956 0.686 0.799 -0.113
SDP 1969 0.614 0.730 -0.116
TOR 1978 0.667 0.783 -0.116
TBD 2002 0.704 0.820 -0.116
NYM 1962 0.679 0.797 -0.118
HOU 2013 0.674 0.792 -0.118
DET 2002 0.679 0.798 -0.119
TOR 1979 0.673 0.793 -0.120
PIT 2010 0.678 0.798 -0.120
NYM 1965 0.604 0.728 -0.124
SDP 1974 0.632 0.764 -0.132
DET 1996 0.743 0.875 -0.132
FLA 1998 0.690 0.825 -0.135
DET 2003 0.675 0.813 -0.138
OAK 1979 0.648 0.786 -0.138
NYM 1963 0.600 0.740 -0.140
PHA 1954 0.648 0.803 -0.155