Sunday, December 14, 2014

Do Park Effects Treat DiMaggio Appropriately?

98.7 is the simple average of park factors for Joe DiMaggio's years, from Baseball Reference. Yankee Stadium comes out as about an average park only because it favored lefties and killed RHBs.

Click here to see how unusual his home/road splits were and how they affect his estimated value.

Now let's take a look at Yankee Splits from 1947-51 (I don't think the Baseball Reference Play Index has much data on these breakdowns before 1947).


Place Split BA OBP SLG OPS
Home LHB 0.269 0.362 0.448 0.810
Road LHB 0.280 0.359 0.432 0.791
Home RHB 0.270 0.355 0.396 0.752
Road RHB 0.275 0.355 0.402 0.757






Place Split BA OBP SLG OPS
Home OPP LHB 0.237 0.338 0.368 0.706
Road OPP LHB 0.270 0.367 0.386 0.753
Home OPP RHB 0.226 0.308 0.308 0.616
Road OPP RHB 0.255 0.342 0.376 0.718

Yankee RHBs actually did just a bit better on the road (an OPS of .005 better). We see a drop off for their LHBs (OPS falls .019).

But opposition RHBs had an OPS that was .102 higher at their home parks than in Yankee Stadium. For opposition LHBs it was only .047 better.

So there is evidence that Yankee Stadium hurt RHBs and that a simple park effect may not accurately adjust their stats.
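The gaps quoted above can be reproduced straight from the two tables. Here is a minimal Python sketch. The dictionary layout and the helper name `road_minus_home` are my own illustrative choices, and "home" means games played in Yankee Stadium for both the Yankees and their opponents.

```python
# OPS by batter group, 1947-51, copied from the two tables above.
# "home" = games in Yankee Stadium, "road" = games in the other parks.
ops = {
    "Yankee LHB": {"home": 0.810, "road": 0.791},
    "Yankee RHB": {"home": 0.752, "road": 0.757},
    "Opp LHB": {"home": 0.706, "road": 0.753},
    "Opp RHB": {"home": 0.616, "road": 0.718},
}

def road_minus_home(split):
    """Positive values mean the group hit better away from Yankee Stadium."""
    return round(split["road"] - split["home"], 3)

gaps = {group: road_minus_home(split) for group, split in ops.items()}
for group, gap in gaps.items():
    print(f"{group}: {gap:+.3f}")
```

Three of the four groups hit better away from Yankee Stadium, and the gap is largest for opposition RHBs.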

Sunday, December 7, 2014

What does WAR say about Dick Allen and Ken Boyer?

Allen had a career WAR of 58.7, 126th among position players. He had three top 5 finishes, including a 1st. But none back to back. His 4 best were 8.8-8.6-7.5-6.4. Three of those were consecutive (but not the 8.6). His best three consecutive seasons add up to 22.7.

Boyer had a career WAR of 62.8, 104th among position players. He had four top 5 finishes, including three years in a row. But never a first place finish (3rd was his best). His 4 best were 7.9-7.4-6.8-6.4. Those all came in 154-game seasons, which are about 5% shorter than the 162-game seasons Allen's best years came in. His best three consecutive seasons add up to 22.1 (if we increased that by 5% it would be 23.2).

Boyer has the edge in career WAR. Allen has the edge in best three consecutive seasons, although only Boyer had three straight in the top 5.

Allen is hurt by fielding with a -16.5 defensive WAR. Boyer has a +10.6 defensive WAR.

I usually think that significant career value and significant peak value should be enough for the Hall. Three straight top 5 finishes in WAR seems pretty good for Boyer along with being 104th among position players.

But Allen's best three consecutive seasons add up to 22.7 or almost 7.6 per year. Sean Forman says that a WAR of 8.0 is MVP caliber. So he was pretty close to that, on average, for three straight years.
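For anyone who wants to check the peak arithmetic, here is a small Python sketch. The variable names are mine, the 1.05 bump is the rough 5% schedule adjustment used above (162/154 is closer to 1.052), and the 8.0 threshold is the MVP-caliber figure attributed to Sean Forman.

```python
MVP_LEVEL = 8.0        # WAR described above as MVP caliber
SCHEDULE_BUMP = 1.05   # rough 154-game to 162-game adjustment used in the post

allen_peak_total = 22.7   # best three consecutive seasons (162-game schedule)
boyer_peak_total = 22.1   # best three consecutive seasons (154-game schedule)

allen_per_year = allen_peak_total / 3
boyer_adjusted = boyer_peak_total * SCHEDULE_BUMP
boyer_adjusted_per_year = boyer_adjusted / 3

print(f"Allen: {allen_per_year:.2f} WAR per year")
print(f"Boyer, schedule adjusted: {boyer_adjusted:.1f} total, "
      f"{boyer_adjusted_per_year:.2f} per year")
print(f"Allen's gap to MVP level: {MVP_LEVEL - allen_per_year:.2f}")
```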

If it were just based on hitting, Allen had two 1st place finishes and four other top 5 finishes in offensive WAR. His career rank is 60th. Very impressive.

He missed 44 games in 1968 and 40 in 1969 while hitting 33 and 32 HRs in those years, respectively. A full season might have meant over 40 HRs in each year. Maybe something like that might have cemented an image of him as a top slugger. He also played only 148 games in 1972 while hitting 37.
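The "over 40 HRs" guess is just his HR pace prorated to a full schedule. A rough Python sketch, assuming a 162-game schedule, that games played equals 162 minus games missed, and that his HR rate would have held up over the missed games (the function name `full_season_hr` is only illustrative):

```python
SCHEDULE = 162  # assumed full-season length

def full_season_hr(hr, games_missed, schedule=SCHEDULE):
    """Prorate a HR total to a full schedule at the same per-game rate."""
    games_played = schedule - games_missed
    return hr * schedule / games_played

pace_1968 = full_season_hr(33, 44)
pace_1969 = full_season_hr(32, 40)
print(f"1968 pace: {pace_1968:.1f} HR")
print(f"1969 pace: {pace_1969:.1f} HR")
```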

Allen also had three 1st place finishes in OPS+ and four other top 5 finishes. He led the NL in both 1966-67 when Aaron, Mays, McCovey and Billy Williams were all still good or near their primes. Allen was also 2nd in 1968.

Both Allen and Boyer seem to have enough career value and enough peak value to be in the Hall.

Sunday, November 30, 2014

Would the “wide arc” of DiMaggio’s swing have made him more vulnerable to strikeouts against the higher velocity of pitchers in today’s game, as John Thorn suggests?

Yankees great Joe DiMaggio was overrated, says MLB historian. Excerpt:
"Perhaps most remarkably, especially when compared to the current era in baseball when hitters strike out more than ever, DiMaggio never struck out more than 39 times in a season. In 1941, the year of his famous 56-game hitting streak, DiMaggio struck out a total of 13 times.

By comparison, 2014 AL MVP Mike Trout struck out 184 times, the highest total in the majors.

Yet Thorn makes the case that the “wide arc” of DiMaggio’s swing would have made him more vulnerable to strikeouts against the higher velocity of pitchers in today’s game."
This might be true, but all players would have to deal with the faster pitch speeds.

It wasn't just that DiMaggio had low strikeout totals. His HR-to-strikeout ratio was astronomical, especially considering that he was a right-handed batter in Yankee Stadium. See my post Which Players Had The Best HR-To-Strikeout Ratios?

DiMaggio hit 2.77 HRs for every one that the average player hit while he only struck out 59% as often (for a ratio of 4.69).
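The 4.69 figure is simply the relative HR rate divided by the relative strikeout rate. A tiny Python sketch (the function name is hypothetical, and both inputs are already expressed relative to the league average, as in the paragraph above):

```python
def relative_hr_to_so(hr_vs_league, so_vs_league):
    """HR rate relative to league divided by SO rate relative to league."""
    return hr_vs_league / so_vs_league

dimaggio_ratio = relative_hr_to_so(hr_vs_league=2.77, so_vs_league=0.59)
print(f"DiMaggio relative HR-to-SO ratio: {dimaggio_ratio:.2f}")
```

A league-average hitter would come out at 1.00 on this scale.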

In fact, the only player to have a higher HR-to-strikeout ratio relative to the league average was Ken Williams of the St. Louis Browns. His home field, Sportsman's Park, was a great hitter's park.

DiMaggio hit only 41% of his HRs at home in his career while Williams hit 72%. So it is likely the case that DiMaggio would rank first, and probably by a wide margin, if HRs were park adjusted.

DiMaggio faced Bob Feller 138 times. He hit 6 HRs while striking out only 7 times. Feller struck out 16.9% of the batters he faced from 1938-51. He allowed a HR% of 1.3%. DiMaggio struck out much less than average against Feller and hit HRs more frequently. So it looks like he could adapt to fast pitchers.

DiMaggio faced Hal Newhouser 60 times (Newhouser was 3rd behind Feller and Tommy Bridges in strikeouts per 9 IP from 1938-51 in the AL). He had 6 HRs and just one strikeout. He faced Bridges 7 times with 1 HR and no strikeouts.
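To put those matchup totals on the same footing as Feller's overall rates, here is a short Python sketch. It treats "times faced" as plate appearances, which is my assumption, and the names are illustrative. The Bridges line is only 7 PAs, so it says very little by itself.

```python
# (times faced, HR, SO) for DiMaggio against each pitcher, from the text above.
matchups = {
    "Feller": (138, 6, 7),
    "Newhouser": (60, 6, 1),
    "Bridges": (7, 1, 0),
}

# Feller against all batters, 1938-51.
FELLER_SO_RATE = 0.169
FELLER_HR_RATE = 0.013

rates = {}
for pitcher, (pa, hr, so) in matchups.items():
    rates[pitcher] = {"HR%": hr / pa, "SO%": so / pa}
    print(f"{pitcher}: HR% {hr / pa:.1%}, SO% {so / pa:.1%}")

# How DiMaggio compares with the typical batter Feller faced.
so_vs_norm = rates["Feller"]["SO%"] / FELLER_SO_RATE
hr_vs_norm = rates["Feller"]["HR%"] / FELLER_HR_RATE
print(f"vs. Feller: {so_vs_norm:.2f}x the usual SO rate, "
      f"{hr_vs_norm:.2f}x the usual HR rate")
```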

Data from Baseball Reference and the Baseball Reference Play Index.



Joe DiMaggio Led MLB In Road Slugging Percentage, 1936-51

Minimum 2500 PAs. Here is the top 10

Joe DiMaggio 0.610
Ted Williams 0.607
Stan Musial 0.581
Jimmie Foxx 0.528
Johnny Mize 0.528
Hank Greenberg 0.526
Ralph Kiner 0.525
Jeff Heath 0.514
Walker Cooper 0.511
Charlie Keller 0.510

Here is the top 10 in all games

Ted Williams 0.633
Hank Greenberg 0.619
Stan Musial 0.584
Ralph Kiner 0.582
Joe DiMaggio 0.579
Jimmie Foxx 0.573
Johnny Mize 0.568
Earl Averill 0.526
Hal Trosky 0.518
Charlie Keller 0.518

From 1939-51, here are the AVG-OBP-SLG for both DiMaggio and Williams in neutral parks (with Fenway and Yankee Stadium taken out)

DiMaggio) .335-.417-.605
Williams) .333-.469-.617

Yes, Williams beats DiMaggio in SLG. But it is fairly close, much closer than their career numbers. So under pretty much the same circumstances, DiMaggio slugged close to what Williams slugged. The big edge is OBP for Williams.

Now only looking at neutral parks leaves a lot of PAs out of the analysis. But DiMaggio's stats put him almost on the level of the guy many say was the greatest hitter ever.

Data from Baseball Reference and the Baseball Reference Play Index.

Friday, November 28, 2014

Should Joe DiMaggio's Offensive Value Be Estimated Upwards Because Of Yankee Stadium?

His road stats were much better than his home stats. In those days, it was over 400 feet to left-center field (I think 407). And players normally hit better at home than on the road. So I tried to estimate what his career stats might have been if he had played in a fair park, and then estimate how many runs this would add to an average team.

The table below shows his splits. Data from the Baseball Reference Play Index


DiMaggio BA OBP SLG
Home 0.315 0.391 0.546
Away 0.333 0.405 0.610

Now the league splits from 1936-51


League BA OBP SLG
Home 0.273 0.350 0.394
Away 0.261 0.335 0.373


So players normally had an OBP that was .015 higher at home and a SLG that was .021 higher. What if DiMaggio had played in a fair park his whole career and he had these same differentials?

His home OBP and SLG would be .420 and .631. If those are averaged with his road numbers of .405 and .610, he would have a career OBP of .413 and a career SLG of .621.

That is better than his actual numbers of .398 & .579. So his OBP goes up .015 and his SLG goes up .042. That would raise a team's OBP and SLG by 0.0016 & 0.0046, respectively (assuming he has one ninth of a team's ABs and PAs).
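The fair-park steps can be laid out in a few lines of Python. This is only a sketch of the method described above: the names are mine, the simple average treats home and road playing time as equal, and I carry unrounded values through, so a few intermediate figures differ in the third decimal from the rounded ones in the text.

```python
# League and DiMaggio splits from the tables above (1936-51).
league_home = {"OBP": 0.350, "SLG": 0.394}
league_away = {"OBP": 0.335, "SLG": 0.373}
dimaggio_away = {"OBP": 0.405, "SLG": 0.610}
dimaggio_actual = {"OBP": 0.398, "SLG": 0.579}

LINEUP_SHARE = 1 / 9  # assumes he takes one ninth of the team's PAs and ABs

team_gain = {}
for stat in ("OBP", "SLG"):
    home_edge = league_home[stat] - league_away[stat]    # normal home advantage
    fair_home = dimaggio_away[stat] + home_edge          # his home line in a fair park
    fair_career = (fair_home + dimaggio_away[stat]) / 2  # equal home/road weighting
    gain = fair_career - dimaggio_actual[stat]
    team_gain[stat] = gain * LINEUP_SHARE
    print(f"{stat}: fair home {fair_home:.3f}, fair career {fair_career:.4f}, "
          f"gain {gain:.4f}, team gain {team_gain[stat]:.4f}")
```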

How many extra runs would this mean? I ran a regression with runs per game as the dependent variable and OBP & SLG as the independent variables for all MLB teams from 1936-51. Here is the equation

R/G = 11.19*SLG + 19.20*OBP - 6.17

Plugging in the 0.0016 & 0.0046 changes in team OBP and SLG, we get 0.0825 more runs per game or 12.7 per 154-game season. That is about one extra win per season.

DiMaggio played 1736 games. That is the equivalent of 11.27 seasons of 154 games. That times 12.7 is 143 runs, which adds about 14 wins to his career value.
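Here is a sketch of the runs-to-wins chain in Python. The regression coefficients are the ones from the equation above. The unrounded OBP and SLG gains (.0145 and .0415 before dividing by nine) and the 10 runs per win conversion are my assumptions, the latter being the usual rule of thumb behind "about one extra win per season."

```python
SLG_COEF = 11.19   # from R/G = 11.19*SLG + 19.20*OBP - 6.17
OBP_COEF = 19.20
RUNS_PER_WIN = 10  # assumed rule of thumb, not a fitted value

# Team-level gains: DiMaggio's unrounded gains spread over a nine-man lineup.
d_obp = (0.4125 - 0.398) / 9
d_slg = (0.6205 - 0.579) / 9

# The intercept drops out because we only care about the change in R/G.
extra_runs_per_game = SLG_COEF * d_slg + OBP_COEF * d_obp
runs_per_season = extra_runs_per_game * 154

seasons = 1736 / 154                    # career games in 154-game seasons
career_runs = runs_per_season * seasons
career_wins = career_runs / RUNS_PER_WIN

print(f"Extra runs per game: {extra_runs_per_game:.4f}")
print(f"Extra runs per 154 games: {runs_per_season:.1f}")
print(f"Career: {career_runs:.0f} runs, about {career_wins:.1f} wins")
```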

He has 78.2 career WAR, good for 41st among position players. This adjustment would give him 92.8, putting him at 28th.

Wednesday, November 19, 2014

Is The Run Value Of Stealing A Base Different Than The Run Value Of Allowing A Stolen Base?

Maybe this is just a statistical artifact or something quirky is going on. But I ran regressions with runs scored per game and runs allowed per game as the dependent variables and OBP, SLG, SB, CS, GDP, and ROE (reached on errors) as the independent variables (the last four were all per game). I used all teams from 2005-14 and the data was from the Baseball Reference Play Index.

Here is the regression for runs scored per game

R/G = 9.8*SLG + 17.17*OBP - 0.308*GDP - 0.394*CS + 0.143*SB + 0.54*ROE - 5.09

Now the regression for runs allowed per game

RA/G = 9.4*SLG + 17.57*OBP - 0.188*GDP - 0.446*CS + 0.302*SB + 0.86*ROE - 5.35

So the estimated value of stealing a base is .143 runs, while each stolen base allowed costs .302 runs (it seems like there are big differences in GDP and ROE as well). I can't think of any reason why there would be a big difference here.

I started looking at this because when I added variables like SB differential, etc. to my regressions estimating winning pct based on OPS differential, the value of the SB differential seemed too high.

If we look at a team like the 2010 Red Sox, they allowed 1.04 SBs per game while having 0.259 CSs per game. If I use the coefficient values for RA/G, they allowed about .2 runs per game from stealing. That would be about 32 runs per season or about 3 wins.

If I used the values from the R/G regression, they would have allowed about .047 runs per game from stealing or 7.59 per season. That is not even one win. So the difference between the two methods is about 2.5 wins.
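The two estimates for the 2010 Red Sox can be compared side by side in a short Python sketch. The SB and CS coefficients come from the two regressions above. The function name and the 10 runs per win conversion are my assumptions, and because the per-game inputs are rounded, the results land close to, but not exactly on, the figures quoted in the text.

```python
GAMES = 162
RUNS_PER_WIN = 10  # assumed rule of thumb

# (SB coefficient, CS coefficient) from each regression.
RA_COEFS = (0.302, -0.446)  # runs allowed per game regression
RS_COEFS = (0.143, -0.394)  # runs scored per game regression

def steal_runs_per_game(sb_per_game, cs_per_game, coefs):
    """Net runs per game attributed to the running game."""
    sb_coef, cs_coef = coefs
    return sb_coef * sb_per_game + cs_coef * cs_per_game

# 2010 Red Sox: steals allowed and runners caught, per game.
sb, cs = 1.04, 0.259

ra_based = steal_runs_per_game(sb, cs, RA_COEFS)
rs_based = steal_runs_per_game(sb, cs, RS_COEFS)
gap_wins = (ra_based - rs_based) * GAMES / RUNS_PER_WIN

print(f"RA/G coefficients: {ra_based:.3f} per game, {ra_based * GAMES:.1f} per season")
print(f"R/G coefficients:  {rs_based:.3f} per game, {rs_based * GAMES:.1f} per season")
print(f"Difference: about {gap_wins:.1f} wins")
```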

In a recent post, I found that over the last 5 years, the Red Sox seemed to underperform based on their OPS differential. See The Relationship Between OPS Differential And Winning Percentage Using 5 Year Averages. They won 5.23 fewer games on average each season than their OPS differential would predict (only the Rockies were worse at 5.4 fewer wins).

The 2011 Red Sox were similar, with 0.963 SBs per game and 0.309 CSs per game. They have 4 teams in the top 21 in SB allowed per game from 2010-14. The Rockies, however, don't seem to have been that bad at allowing SBs, having only one year in the top 50 in SB allowed per game. So their underperformance must be due to something else.

Sunday, November 9, 2014

Which Teams Exceeded Their Win Total By The Most According To OPS Differential?

I have done some related posts recently, looking at the teams with the best OPS differentials since 1914 using data from Baseball Reference's Play Index. I also did a regression and the equation for winning pct was

Pct = 1.396*OPSDIFF + .500

Then I estimated each team's pct and calculated how many more games per 162 that they won than this equation would predict. Here are the top 10:

Team Year OPSDIFF W-L% Pred Diff Per162
BSN 1914 0.002 0.614 0.503 0.111 17.96
STL 1987 -0.017 0.586 0.476 0.110 17.88
LAA 2008 0.014 0.617 0.519 0.098 15.85
CIN 1973 0.010 0.611 0.514 0.097 15.66
STL 1931 0.045 0.656 0.562 0.094 15.21
NYM 1969 0.017 0.617 0.524 0.093 15.07
NYY 2013 -0.048 0.525 0.433 0.092 14.86
MIN 1994 -0.088 0.469 0.378 0.091 14.81
PHA 1948 -0.032 0.545 0.455 0.090 14.59
BAL 1977 0.011 0.602 0.516 0.086 13.97

So the Braves in 1914 should have had a pct of .503 based on their OPS differential of .002. But it was actually .614. That gives them 17.96 more wins per 162 games than we might have expected. Maybe that is why they are called the "Miracle Braves!"
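The prediction and the per-162 gap are easy to recompute. In this Python sketch the function names are hypothetical, and the inputs are the rounded OPS differentials and winning percentages from the table, so the results can differ from the listed values by a tenth of a win or so.

```python
def predicted_pct(ops_diff):
    """Winning pct predicted by the regression Pct = 1.396*OPSDIFF + .500."""
    return 1.396 * ops_diff + 0.500

def extra_wins_per_162(actual_pct, ops_diff):
    """Wins per 162 games above what the OPS differential predicts."""
    return (actual_pct - predicted_pct(ops_diff)) * 162

# (team, year, OPS differential, actual W-L%), taken from the table above.
teams = [
    ("BSN", 1914, 0.002, 0.614),
    ("STL", 1987, -0.017, 0.586),
    ("LAA", 2008, 0.014, 0.617),
]

for team, year, ops_diff, pct in teams:
    print(f"{team} {year}: predicted {predicted_pct(ops_diff):.3f}, "
          f"actual {pct:.3f}, "
          f"{extra_wins_per_162(pct, ops_diff):+.1f} wins per 162")
```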

I also found a regression equation for each decade, but the list of the top overachieving teams was similar to this one.

We don't have splits for things like RISP, runners on base and close and late situations for 1914. So we can't tell if the Braves did especially well in those cases, which would explain a lot. The Braves were 33-20. But that is not much different than their overall pct.

They had what seems to be good fielding. Their fielding pct was .961 (the league average was .958). Their defensive efficiency rating was .701 and the league average was .698. That probably helped a bit, but I don't think that would explain their .614 pct. They stole 139 bases, as did their opponents. They turned 143 DPs and the next highest team had 119.

The 1987 Cardinals hit a lot better when it counted, as this table shows:

Split BA OBP SLG OPS
High Lvrge 0.275 0.344 0.399 0.743
Medium Lvrge 0.267 0.350 0.389 0.739
Low Lvrge 0.254 0.328 0.357 0.686

It looks like their pitchers did a bit better when it counted

Split BA OBP SLG OPS
High Lvrge 0.261 0.344 0.397 0.740
Medium Lvrge 0.260 0.320 0.395 0.715
Low Lvrge 0.274 0.334 0.418 0.753

They stole 248 bases while their opponents stole just 100. The Cards hit into 16 fewer DPs and reached on errors 22 more times. Some of that probably helped them overachieve. They turned 172 DPs, 2nd most in the league. The league average was 146.