Sunday, March 3, 2024

Factors that might influence the difference between ERA and FIP

My last post mentioned that Aaron Bummer had a 6.79 ERA last year while his FIP ERA was 3.58 for a differential of -3.21 (FIP - ERA). That was the largest absolute differential last year for any pitcher with 50+ IP and it was 0.64 larger than the next largest.

So to look at what might explain why ERA differs from FIP (fielding independent ERA estimated using SOs, BBs and HRs), I ran a regression with FIP - ERA as the dependent variable and the following three independent variables:

SLG Diff (a pitcher's SLG allowed with runners on base minus the SLG they allowed with no runners on)
BAbip (the batting average a pitcher allows on balls in play, so it is dependent on how good his fielders are)
BQS/9 (bequeathed runners that scored per 9 IP)

Bequeathed runners represents the number of runners left on base by a pitcher when that pitcher leaves the game. Any bequeathed runner who scores an earned run after a pitcher has left the game will be counted against that pitcher's ERA (from mlb.com https://www.mlb.com/glossary/advanced-stats/bequeathed-runners).
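As a rough sketch of how a per-9 rate like BQS/9 could be computed, the snippet below converts box-score innings into true innings first. It assumes the Baseball Reference convention that the digit after the decimal point counts outs (so 58.1 means 58 and 1/3 innings). The function names and the sample pitcher are made up for illustration.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert box-score innings ("58.1" = 58 and 1/3) to true innings."""
    whole, _, outs = ip_notation.partition(".")
    outs = int(outs) if outs else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"unexpected outs digit in {ip_notation!r}")
    return int(whole) + outs / 3


def per_nine(count: int, ip_notation: str) -> float:
    """Rate of an event per 9 innings pitched."""
    return 9 * count / innings_to_float(ip_notation)


# Hypothetical pitcher: 6 bequeathed runners scored over 120.1 innings.
print(round(per_nine(6, "120.1"), 3))
```

Keeping the innings as a string until conversion avoids treating 58.1 as the decimal number 58.1.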
 
I used SLG Diff because some pitchers might have gotten hit pretty hard when they had runners on base, making their ERA higher than what we might otherwise expect based on their overall numbers.

I used BAbip because this is not controlled very much by the pitcher. A guy can have a low FIP but if his fielders can't catch the ball, his ERA will be high.

I used BQS/9 because a pitcher cannot control what happens after he leaves the game. Some pitchers get lucky and their bullpen bails them out. For others, it is the opposite.

I looked at all the guys who had 100+ IP last year. All data came from Baseball Reference and Stathead. There were 127 pitchers.

Here is the regression equation:

FIPERADIFF = .789*BQS/9 + 13.63*BAbip + 2.5*SLGDiff - 4.32
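To get a feel for the size of each coefficient, here is a minimal sketch that plugs numbers into the equation exactly as printed. The input values are hypothetical, not a real pitcher's line, and the function name is my own. Since all three coefficients are positive, the fitted value rises with each input; the sketch only reproduces the equation and makes no claim about the sign convention.

```python
# Coefficients copied from the regression equation in the post.
INTERCEPT = -4.32
COEF_BQS9 = 0.789
COEF_BABIP = 13.63
COEF_SLG_DIFF = 2.5


def predicted_fip_era_diff(bqs9: float, babip: float, slg_diff: float) -> float:
    """Fitted FIPERADIFF for one pitcher from the three inputs."""
    return (INTERCEPT + COEF_BQS9 * bqs9
            + COEF_BABIP * babip + COEF_SLG_DIFF * slg_diff)


# Hypothetical inputs, not a real pitcher's line.
baseline = predicted_fip_era_diff(bqs9=0.5, babip=0.300, slg_diff=0.020)
higher_babip = predicted_fip_era_diff(bqs9=0.5, babip=0.310, slg_diff=0.020)

print(round(baseline, 3))
print(round(higher_babip - baseline, 3))  # effect of 10 more points of BAbip
```

With these coefficients, ten extra points of BAbip (.010) move the fitted value by about 0.136, which shows why BAbip carries so much weight in the model.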

r-squared = .639, so 63.9% of the variance in the dependent variable is explained by the model.

standard error = .33

Here are the t-values for the three independent variables:
 
BQS/9)  6.7 (The p-value is < .00001)
BAbip)  11.6 (The p-value is < .00001)
SLG Diff)  5.04 (The p-value is < .00001)

The r-squared seems fairly high but it still means that 36.1% of the variation in FIPERADIFF is not explained.

The standard error seems high, given that the average absolute differential was only about .44. I wish it were lower.

The t-values are all pretty high, so each independent variable is statistically significant. I used a website that converts t-values into p-values.
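The conversion from t-value to p-value can also be checked offline. The sketch below is not what the website does; it is one possible way, integrating the tail of Student's t density numerically with Simpson's rule using only the standard library. It assumes an ordinary least squares fit with an intercept, so 127 pitchers and 3 predictors leave 123 degrees of freedom.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_value: float, df: int,
                width: float = 60.0, steps: int = 20000) -> float:
    """Two-sided p-value via Simpson's rule on the upper tail.

    The tail is cut off at |t| + width, where the density is negligible
    for the degrees of freedom used here. `steps` must be even.
    """
    a = abs(t_value)
    h = width / steps
    total = t_pdf(a, df) + t_pdf(a + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return 2 * total * h / 3


# Assumption: 127 pitchers, 3 predictors and an intercept leave 123 df.
DF = 127 - 3 - 1
for name, t in [("BQS/9", 6.7), ("BAbip", 11.6), ("SLG Diff", 5.04)]:
    print(name, two_sided_p(t, DF) < 0.00001)
```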

There may be some other variable that I should include. Maybe I could find the estimated FIPERADIFF for each guy and look at the 10 or so guys with the biggest differences between the estimated value and the actual (FIP - ERA). Maybe something that would be obvious to include would pop up.
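That follow-up could be sketched roughly like this: compute the fitted value for each pitcher, subtract it from the actual value, and list the biggest misses by absolute size. The rows here are made up for illustration, and the class and function names are my own.

```python
from typing import NamedTuple


class PitcherRow(NamedTuple):
    name: str
    actual_diff: float  # observed FIPERADIFF for the pitcher
    bqs9: float
    babip: float
    slg_diff: float


def fitted(row: PitcherRow) -> float:
    """Regression equation from the post."""
    return 0.789 * row.bqs9 + 13.63 * row.babip + 2.5 * row.slg_diff - 4.32


def biggest_misses(rows: list[PitcherRow], n: int = 10) -> list[tuple[str, float]]:
    """Names and residuals (actual - fitted), largest absolute miss first."""
    residuals = [(r.name, r.actual_diff - fitted(r)) for r in rows]
    return sorted(residuals, key=lambda pair: abs(pair[1]), reverse=True)[:n]


# Made-up rows for illustration only.
sample = [
    PitcherRow("Pitcher A", 0.10, 0.40, 0.295, 0.010),
    PitcherRow("Pitcher B", 1.20, 0.60, 0.310, 0.030),
    PitcherRow("Pitcher C", -0.50, 0.20, 0.280, -0.020),
]
for name, resid in biggest_misses(sample, n=2):
    print(name, round(resid, 2))
```

Sorting on the absolute residual keeps both kinds of miss in the same list, whether the model overshoots or undershoots.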

Monday, February 19, 2024

Which pitchers had the biggest differences last year between their actual ERA & their FIP ERA? (or Bad Luck is an Aaron Bummer)

FIP ERA means fielding independent ERA and it is an estimated ERA based on what the pitcher controls: walks, strikeouts and HRs. See Baseball By The Numbers—Earned Run Average (ERA) and Fielding Independent Pitching (FIP) by Marilyn Green at "Redbird Rants" for more information and the formula.
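For a quick idea of how such an estimate is built, here is a sketch of the commonly published form of FIP. Note that this version also counts hit batters along with walks. The constant changes every season to put FIP on the ERA scale, so the 3.2 default below is only a placeholder assumption, and the sample stat line is hypothetical.

```python
def fip(hr: int, bb: int, hbp: int, so: int, ip: float,
        constant: float = 3.2) -> float:
    """Commonly published FIP formula.

    `ip` is true innings (58 and 1/3, not the box-score 58.1).
    `constant` varies by season; 3.2 is a placeholder, not a real value.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Hypothetical season line, not a real pitcher.
print(round(fip(hr=20, bb=50, hbp=5, so=180, ip=180.0), 2))
```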

I used Stathead from Baseball Reference to call up all the pitchers who had 50+ IP last year. Then I found the difference between their actual ERA and their FIP ERA.
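The differential step is simple enough to sketch in a few lines. The ERA and FIP figures are taken from a few rows of the tables below; the variable names are my own.

```python
# (player, ERA, FIP) rows copied from the tables in this post.
rows = [
    ("Aaron Bummer", 6.79, 3.58),
    ("Héctor Neris", 1.71, 3.83),
    ("Clayton Kershaw", 2.46, 4.03),
    ("Zach Davies", 7.00, 4.58),
]

# Differential is FIP - ERA, so negative means ERA was worse than FIP.
diffs = [(name, round(fip - era, 2)) for name, era, fip in rows]
unluckiest = sorted(diffs, key=lambda pair: pair[1])

for name, diff in unluckiest:
    print(f"{name:<16} {diff:+.2f}")
```

Sorting ascending puts the worst luck (most negative) first; reversing the list gives the best luck first.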

Table 1 shows the guys who had the worst luck, that is, they had the biggest negative differentials for FIP ERA - ERA.

Table 1

Player               IP      ERA    FIP    Diff
Aaron Bummer         58.1    6.79   3.58   -3.21
Shintaro Fujinami    79      7.18   4.61   -2.57
Zach Davies          82.1    7.00   4.58   -2.42
Hogan Harris         63      7.14   5.02   -2.12
Fernando Cruz        66      4.91   2.83   -2.08
Dylan Floro          56.2    4.76   2.96   -1.80
Michael Grove        69      6.13   4.36   -1.77
Connor Seabold       87.1    7.52   5.75   -1.77
Osvaldo Bido         50.2    5.86   4.10   -1.76
Josh Sborz           52.1    5.50   3.75   -1.75

Table 2 shows the guys who had the best luck, that is, they had the biggest positive differentials for FIP ERA - ERA.

Table 2

Player               IP      ERA    FIP    Diff
Wandy Peralta        54      2.83   5.05   2.22
Héctor Neris         68.1    1.71   3.83   2.12
Tom Cosgrove         51.1    1.75   3.70   1.95
Brusdar Graterol     67.1    1.20   3.03   1.83
Kendall Graveman     66.1    3.12   4.88   1.76
Dominic Leone        54      4.67   6.29   1.62
Clayton Kershaw      131.2   2.46   4.03   1.57
Wade Miley           120.1   3.14   4.69   1.55
Bryse Wilson         76.2    2.58   4.13   1.55
Ronel Blanco         52      4.50   5.99   1.49

Bummer's differential is much greater than anyone else's. I thought that maybe he got hit hard with runners on base, but his splits don't reflect that. Did the White Sox have bad fielding? Maybe. Their team FIP was 4.71 while the team ERA was 4.87. Bad, but not too bad, and certainly not close to what happened to Bummer. And if it were the fielders, we would see high numbers for his AVG & SLG allowed with runners on base, but again, that was not the case (that would not be the whole story but at least part of it). His splits can be found easily at Baseball Reference.

Maybe the guys who came in after he had put some runners on just did really badly, so that he got charged for the runs. I am not sure if there is an easy way to tell that.

Others have looked before at what explains the difference between FIP and ERA. I might post some of those links when I get a chance.

Update Feb. 22: Here are Bummer's AVG-OBP-SLG for his splits last year:

None on)  .240-.365-.365
Runners on) .233-.342-.333
 
Looks like he had good numbers with runners on. He did allow a .340 BAbip while his AVG allowed in all situations was .236, a .104 difference. For all of the AL last year those numbers were .295/.245, a .050 difference. For the NL they were .299/.251, a .048 difference. So Bummer's differential here was about twice the MLB average. That might partly explain why his FIP/ERA differential is so large.
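The arithmetic behind that comparison can be checked with a short sketch. The BAbip and AVG figures are the ones quoted above; taking a simple average of the two league gaps as the MLB figure is my own assumption, since the leagues are not weighted by playing time here.

```python
# (BAbip, AVG allowed) pairs quoted in the update above.
bummer = (0.340, 0.236)
al = (0.295, 0.245)
nl = (0.299, 0.251)


def gap(pair: tuple[float, float]) -> float:
    """BAbip minus AVG allowed."""
    babip, avg = pair
    return babip - avg


# Assumption: an unweighted average of the AL and NL gaps stands in for MLB.
league_gap = (gap(al) + gap(nl)) / 2
ratio = gap(bummer) / league_gap

print(round(gap(bummer), 3), round(league_gap, 3), round(ratio, 2))
```

Under that assumption Bummer's gap comes out a bit more than twice the league gap.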
 
Here are some studies on this issue:
 
The FIP/ERA Gap: Historically by Glenn DuPaul of Beyond the Box Score.
 
Evaluating the Gap Between ERA and FIP by Christopher Rinaldi of FanGraphs.