Thursday, July 31, 2014

The Best Fielding Independent ERA Seasons Relative To The League Average And Adjusted For Park Effects

Using Baseball Reference's Play Index, I found the best 200 seasons in ERA, ERA+ and FIP ERA. BR does not offer the adjusted list that I want, but I figure the best 25 or so seasons probably appear on at least one of these lists of 200.

Here is how I made the adjustments: Take Pedro Martinez in 1999. His ERA+ was 243. That was calculated by comparing his ERA to the league average and adjusting for the run environment of his park.

His ERA that year was 2.07 while the league average was 4.86. 2.07/4.86 = .426. Then we have 1/.426 = 2.3478 (using the unrounded ratio). The park factor for Fenway that year was 103, meaning pitchers gave up 3% more runs than average there. So 1.03*2.3478 = 2.42. That gets multiplied by 100 to give 242 (Baseball Reference has him with 243, probably due to rounding differences).

Then I converted that into an ERA in a league whose average is 4.00. I call that ERA* in the table. Martinez would have 1.65 (4.00/2.43). But that needs to be adjusted based on his FIP ERA (which is an estimate based only on walks, strikeouts and HRs). His FIP ERA that year was 1.39, or .68 lower than his actual ERA of 2.07.

Then I lower his ERA* by .68. That leaves us with 1.65 - .68 = 0.97. And that is the FIPERA* which is the FIPERA relative to the league average and adjusted for park effects.
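
Here is a minimal Python sketch of those steps. The function names and the `baseline` parameter are my own labels for illustration, and the code follows the simplified arithmetic above rather than Baseball Reference's exact method.

```python
def era_plus(era, league_era, park_factor):
    """ERA+ as described above: league ERA over pitcher ERA, scaled by the park factor."""
    return (league_era / era) * (park_factor / 100) * 100


def fip_era_star(era, fip, era_plus_value, baseline=4.00):
    """Turn ERA+ into an ERA in a league averaging `baseline` (ERA*),
    then lower it by the gap between actual ERA and FIP ERA."""
    era_star = baseline / (era_plus_value / 100)
    return era_star - (era - fip)


# Pedro Martinez, 1999, using the numbers quoted in this post
print(round(era_plus(2.07, 4.86, 103)))          # 242 (BR lists 243)
print(round(fip_era_star(2.07, 1.39, 243), 2))   # 0.97
```

The second call uses BR's published 243 rather than the 242 computed here, since that is what the table is built from.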


Pitcher            Year  ERA+  ERA   FIP   ERA*  FIPERA*
Pedro Martinez     1999  243   2.07  1.39  1.65  0.97
Randy Johnson      1995  193   2.48  2.08  2.07  1.67
Randy Johnson      2001  188   2.49  2.13  2.13  1.77
Pedro Martinez     2000  291   1.74  2.17  1.37  1.80
Pedro Martinez     2003  211   2.22  2.21  1.90  1.89
Pedro Martinez     2002  202   2.26  2.24  1.98  1.96
Clayton Kershaw    2014  201   1.76  1.74  1.99  1.97
Randy Johnson      2004  176   2.60  2.30  2.27  1.97
Roger Clemens      1997  222   2.05  2.25  1.80  2.00
Dwight Gooden      1984  137   2.60  1.69  2.92  2.01
Roger Clemens      1988  141   2.93  2.17  2.84  2.08
Randy Johnson      2000  181   2.64  2.53  2.21  2.10
Zack Greinke       2009  205   2.16  2.33  1.95  2.12
Hal Newhouser      1946  190   1.94  1.97  2.11  2.14
Roger Clemens      1990  211   1.93  2.18  1.90  2.15
Greg Maddux        1995  260   1.63  2.26  1.54  2.17
Tim Lincecum       2009  171   2.48  2.34  2.34  2.20
Bob Gibson         1968  258   1.12  1.77  1.55  2.20
Walter Johnson     1910  183   1.36  1.39  2.19  2.22
Tom Seaver         1971  194   1.76  1.93  2.06  2.23
Felix Hernandez    2014  184   2.01  2.07  2.17  2.23
Steve Carlton      1972  182   1.97  2.01  2.20  2.24
Christy Mathewson  1908  168   1.43  1.29  2.38  2.24
Mark Prior         2003  179   2.43  2.47  2.23  2.27
Christy Mathewson  1909  222   1.14  1.62  1.80  2.28
Matt Harvey        2013  157   2.27  2.01  2.55  2.29
Roger Clemens      1998  174   2.65  2.65  2.30  2.30

I listed the top 27 instead of the top 25 because there are two pitchers from this year, Kershaw and Hernandez.

Martinez has 4 of the top 6 seasons: 1999, 2000, 2002 and 2003. In 2001 he was hurt and only pitched 116 innings, but his FIPERA* was 1.35. He also managed to finish 6th in WAR for pitchers that year.

17 of these seasons have come since 1995. There was only 1 season between 1911 and 1967, Newhouser in 1946. But this is a high strikeout era and FIP ERA takes that into account. Maybe there is a way to first find HRs, walks and strikeouts relative to the league average and then adjust for park effects. Things might come out differently.

Missing from the leaders above are Alexander, Grove, Vance, Hubbell, Feller and Koufax, among others. Here are the best FIPERA*s for each of them:

Bob Feller 2.63
Dazzy Vance 2.78
Carl Hubbell 2.94
Lefty Grove 2.71
Sandy Koufax 2.45
Pete Alexander 2.38

Alexander's season ranks 41st.

I also did something similar for career FIPERA.

Wednesday, July 30, 2014

Sale And Kershaw Vs. Non-Pitchers

Here are the stats for this year. They exclude any PAs against pitchers. There is a second table below using career stats, and discussion follows the first table.


Stat          Sale    Kershaw
HR%           0.014   0.015
BB%           0.060   0.040
SO%           0.290   0.320
SO/W          4.88    7.94
BA            0.194   0.197
OBP           0.242   0.230
SLG           0.269   0.303
OPS           0.510   0.532

ERA+          210     201
Park Factor   101     96
Team DER      0.688   0.692
Team Field%   0.983   0.985

The percentages use PAs as the denominator. I included HBP with walks. So Sale is just a bit better on HR% while Kershaw does much better on strikeouts and walks.
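
For anyone who wants to reproduce these rates from raw counts, here is a rough Python sketch. The function name and the input counts are made up for illustration (they are not Sale's or Kershaw's actual totals), and I am assuming SO/W also counts HBP as walks.

```python
def rate_stats(pa, hr, bb, hbp, so):
    """Per-PA rates with HBP lumped in with walks, as in the table above."""
    walks = bb + hbp
    return {
        "HR%": hr / pa,
        "BB%": walks / pa,
        "SO%": so / pa,
        "SO/W": so / walks,  # assumption: HBP counted as walks here too
    }


# Hypothetical counting stats, NOT taken from either pitcher's real line
example = rate_stats(pa=500, hr=7, bb=27, hbp=3, so=145)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```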

But Sale allows a lower OPS while pitching in a tougher park for pitchers. The 101 means that his park allows 1% more runs while Dodger Stadium allows 4% fewer.

Sale's ERA adjusted for the league average and park effects is better (210 vs. 201 in ERA+). Those numbers mean that once you adjust their ERAs for park effects, they both are a little below half the league ERA. (That ratio then gets divided into 1 and multiplied by 100.)
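
A quick check of the "a little below half" statement, sketched in Python. The helper name is just my own label.

```python
def share_of_league_era(era_plus):
    """Park-adjusted ERA as a fraction of the league ERA, which is 100 / ERA+."""
    return 100 / era_plus


for name, ep in [("Sale", 210), ("Kershaw", 201)]:
    print(name, round(share_of_league_era(ep), 3))
```

Both fractions come out just under .5, with Sale's the lower of the two.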

Now fielders could help Sale get a lower OPS and better ERA+. But the White Sox's defensive efficiency rating (DER) is a bit lower than the Dodgers', so Sale has less help behind him from his fielders. DER shows what % of balls in play get turned into outs. It could be affected by the parks somehow, though.

Here are the career numbers. This time PAs had IBBs excluded (as did walks). Neither has any IBBs this year. Kershaw has had a longer career so far. But Sale compares favorably.


Stat   Sale    Kershaw
PA     2426    4767
HR%    0.023   0.017
BB%    0.070   0.081
SO%    0.268   0.247
SO/W   3.80    3.05
BA     0.221   0.218
OBP    0.278   0.284
SLG    0.338   0.324
OPS    0.616   0.608

ERA+   151     149

Sunday, July 27, 2014

Factors That Might Determine A Batter's GIDP Rate

I looked at how some variables affect GIDP rate using regression analysis. All data is from Baseball Reference. The variables were:

Lefty: A dummy variable for being a lefty (1). Righties got 0. Switch hitters got .67, assuming they face righties about 2/3 of the time.

SO%: Strikeouts divided by (PA - IBB). Players who strike out a lot probably won't hit into too many DPs.

IsoCon: Isolated power on contact, so extra bases divided by (AB - SO). I wondered if how hard you hit the ball mattered. Maybe the harder you hit it, the easier it is to turn a DP. But maybe hard-hit balls get through the infield more often or are line drives or long flies.

Speed: Triples divided by (2B + 3B). This is an idea from Voros McCracken. Fast guys turn those balls hit into the gap into 3Bs, slow guys get 2Bs.

GB/FB ratio: Not used in the first regression.
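
The variable definitions above can be sketched in Python like this. The function name, argument names and the sample counts are all hypothetical, and I am assuming "extra bases" means total bases minus hits (2B + 2*3B + 3*HR).

```python
def gidp_predictors(bats, pa, ibb, ab, so, doubles, triples, hr):
    """Build the Lefty, SO%, IsoCon and Speed variables described above."""
    lefty = {"L": 1.0, "R": 0.0, "S": 0.67}[bats]  # switch hitters get .67
    so_pct = so / (pa - ibb)
    extra_bases = doubles + 2 * triples + 3 * hr   # assumed: total bases minus hits
    iso_con = extra_bases / (ab - so)
    gap_hits = doubles + triples
    speed = triples / gap_hits if gap_hits else 0.0  # avoid dividing by zero
    return {"Lefty": lefty, "SO%": so_pct, "IsoCon": iso_con, "Speed": speed}


# Hypothetical switch hitter; these counts are made up for illustration
p = gidp_predictors("S", pa=1000, ibb=10, ab=900, so=180,
                    doubles=45, triples=5, hr=30)
for name, value in p.items():
    print(f"{name}: {value:.3f}")
```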

I looked at all AL batters who had 800+ PAs combined in 2012-13. Here is the first regression equation:

(1) DP Rate = .185 - .029*Lefty - .178*SO% - .031*IsoCon - .168*Speed

Here are the t-values for each variable:

Lefty) -4.49
SO%) -2.45
IsoCon) -1.04
Speed) -3.03

The r-squared was .393 and the standard error was .029. The mean DP rate for the 93 players in the group was .1122.

I wanted to see how much conventional stats could explain the rate first. Then I added in GB/FB ratio. Here is the second regression equation:

(2) DP Rate = .075 - .025*Lefty - .15*SO% + .071*IsoCon - .191*Speed + .096*GB/FB

The r-squared jumped to .599 and the standard error of the regression fell to .024. Here are the t-values:

Lefty) -4.61
SO%) -2.51
IsoCon) 1.37
Speed) -4.21
GB/FB) 6.69

So 4 of the 5 variables look significant at the 5% level, with t-values greater than 1.96 in absolute value; IsoCon is the exception. It is interesting that the sign on IsoCon changed. Adding the GB/FB ratio sure increased the r-squared a lot.

Being a lefty lowers your rate by .025. So if you were otherwise average (.1122), then you fall to .0872. If your strikeout rate goes from .1 to .2, your DP rate falls by .015.

The GB/FB ratio ranged from 1.85 (Jeter) down to .475 (Reddick). A .25 drop in the GB/FB ratio would lower your DP rate by .024. Jeter had the highest DP rate at .245. Kendrick was 2nd with .199. Granderson was lowest with .032.

The Speed variable went from .224 (Austin Jackson) down to 0. An increase of .1 here means a drop of .019 in DP rate.
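
The effects quoted above can be checked by plugging numbers into equation (2). Here is a small Python sketch; the baseline hitter and the helper names are hypothetical, and only the coefficients come from the regression.

```python
# Coefficients from equation (2)
INTERCEPT = 0.075
COEF = {"Lefty": -0.025, "SO%": -0.15, "IsoCon": 0.071,
        "Speed": -0.191, "GB/FB": 0.096}


def dp_rate(x):
    """Predicted DP rate from equation (2) for a dict of variable values."""
    return INTERCEPT + sum(COEF[k] * x[k] for k in COEF)


# Hypothetical right-handed baseline hitter (values made up for illustration)
base = {"Lefty": 0, "SO%": 0.10, "IsoCon": 0.20, "Speed": 0.05, "GB/FB": 1.00}


def effect(var, new_value):
    """Change in predicted DP rate when one variable moves and the rest stay put."""
    changed = {**base, var: new_value}
    return dp_rate(changed) - dp_rate(base)


print(round(effect("Lefty", 1), 3))      # lefty instead of righty
print(round(effect("SO%", 0.20), 3))     # strikeout rate from .1 to .2
print(round(effect("GB/FB", 0.75), 3))   # GB/FB ratio down by .25
print(round(effect("Speed", 0.15), 3))   # Speed up by .1
```

Since the equation is linear, these changes do not depend on which baseline hitter you start from.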

Most of this is not a surprise. It might be worthwhile to add in more years of data. It will also be interesting to see what happens to the sign on the IsoCon variable with more data.