Saturday, March 23, 2024

Norm Cash's 1961 season

He won the AL batting title that year with a .361 AVG. Yet he never hit .300 or higher again and his lifetime avg was just .271 (all data is from Baseball Reference and Stathead).

His OBP that year was .487. His career OBP was .374 and his next best was .402 (in 1960 in only 428 plate appearances).

His SLG was .662. His next best was .531 and it was .488 for his career.

His OPS+ was 201 that year and his next highest was 149. Lifetime it was 139.

I wondered if his flukiness was balanced against both lefties and righties.

This table shows the ratio of his OPS vs. righties to his OPS vs. lefties for each of his 14 full or nearly full seasons:

1960          1.50
1961          1.57
1962          1.43
1963          1.36
1964          1.54
1965          1.15
1966          0.94
1967          1.31
1968          0.97
1969          1.56
1970          1.21
1971          1.27
1972          2.11
1973          2.39

The 1.57 in 1961 is the highest until late in his career when his performance against lefties went down quite a bit. But it was just a bit higher than 1960, 1964 and 1969. So this does not indicate a great imbalance.

But I also calculated his OPS vs. righties relative to the league average of all left-handed batters vs. righties (and the same was also done for vs. lefties).

Here is his year-by-year OPS vs. righties relative to the league average of all left-handed batters vs. righties:

1960          1.21
1961          1.62
1962          1.25
1963          1.26
1964          1.19
1965          1.24
1966          1.15
1967          1.22
1968          1.20
1969          1.24
1970          1.13
1971          1.30
1972          1.24
1973          1.18

The ratio in 1961 is by far the highest at 1.62 with the next best being 1.30. So a great year for him vs. righties.

Now for his year-by-year OPS vs. lefties relative to the league average of all left-handed batters vs. lefties:  

1960          1.02
1961          1.21
1962          1.04
1963          1.03
1964          0.94
1965          1.30
1966          1.39
1967          1.08
1968          1.44
1969          0.92
1970          1.16
1971          1.23
1972          0.65
1973          0.55 

He had 1.21 in 1961, but that is only his 5th highest ratio. He had four that were higher: 1.44, 1.39, 1.30 and 1.23.

So he had, compared to the rest of his career, a fantastic season against righties. But against lefties, it was just good.
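The rankings can be checked with a short script. This is only a sketch: the ratios are copied from the two tables above, and the function name rank_of_season is my own, made up for illustration.

```python
# Ratios copied from the two tables above (Cash's OPS divided by the league
# OPS for left-handed batters), seasons 1960 through 1973 in order.
YEARS = list(range(1960, 1974))
VS_RHP = [1.21, 1.62, 1.25, 1.26, 1.19, 1.24, 1.15,
          1.22, 1.20, 1.24, 1.13, 1.30, 1.24, 1.18]
VS_LHP = [1.02, 1.21, 1.04, 1.03, 0.94, 1.30, 1.39,
          1.08, 1.44, 0.92, 1.16, 1.23, 0.65, 0.55]


def rank_of_season(year, ratios):
    """Return 1 for the best season, 2 for the second best, and so on."""
    target = ratios[YEARS.index(year)]
    # Rank is one more than the number of seasons with a strictly higher ratio.
    return 1 + sum(1 for r in ratios if r > target)


print(rank_of_season(1961, VS_RHP))  # rank of 1961 vs. righties
print(rank_of_season(1961, VS_LHP))  # rank of 1961 vs. lefties
```

By this count, 1961 ranks first of the 14 seasons against righties and fifth against lefties.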

Update March 25: From 1960-73, Cash had an OPS of .918 vs. righties while all left-handed batters had .733. His ratio is 1.25 (.918/.733). So his 1.62 ratio in 1961 was far above this.

Over the same period, his OPS vs. lefties was .696 while all left-handed batters had .625. This ratio is 1.11 (.696/.625). His 1.21 ratio from 1961 was only slightly above this.
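The arithmetic behind the two 1960-73 ratios is simple enough to sketch in a few lines. The OPS figures are the ones quoted above; the function name relative_ops is my own.

```python
def relative_ops(player_ops, league_ops):
    """Ratio of a player's OPS to the league OPS for the same split."""
    return player_ops / league_ops


# 1960-73 totals quoted in the post.
vs_rhp = relative_ops(0.918, 0.733)
vs_lhp = relative_ops(0.696, 0.625)
print(f"vs. righties: {vs_rhp:.2f}")  # 1.25
print(f"vs. lefties:  {vs_lhp:.2f}")  # 1.11

# How far the 1961 ratios sat above those 14-season baselines.
gap_rhp = 1.62 - vs_rhp
gap_lhp = 1.21 - vs_lhp
print(f"1961 gap vs. righties: {gap_rhp:.2f}")
print(f"1961 gap vs. lefties:  {gap_lhp:.2f}")
```

The gap against righties comes out several times larger than the gap against lefties.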

I only looked at 1960-73 since he did not get many PAs in 1958, 1959 & 1974.

There can be some idiosyncratic things going on here. For example, 20 of his 162 PAs vs. lefties in 1961 were against Whitey Ford. He had just a .417 OPS vs. him that year.

Cash only had 52 career PAs against Ford. So it is possible that 1961 was an unusually tough year for him in terms of the quality of the lefties he faced. But to conclude that would require looking at each of his seasons to see who he faced. Also, in 1961, Ford seems to be the only good lefty that he faced fairly often.

Tuesday, March 19, 2024

Interesting new article by Bill James: The Competitive Advantage of the Pitcher’s Park

It is in the latest issue of By the Numbers: The Newsletter of the SABR Statistical Analysis Committee, edited by Phil Birnbaum.

Click here to read it.

Here is a synopsis from Phil:

"Bill James finds that teams who play in pitcher's parks have had better records, historically, than teams who play in hitter's parks. He presents the data showing the effect, and then offers a suggestion for why this may be happening."

Also in the issue, Charlie Pavitt reviews several recent studies from the academic literature.

Sunday, March 3, 2024

Factors that might influence the difference between ERA and FIP

My last post mentioned that Aaron Bummer had a 6.79 ERA last year while his FIP ERA was 3.58 for a differential of -3.21 (FIP - ERA). That was the largest absolute differential last year for any pitcher with 50+ IP and it was 0.64 larger than the next largest.

So to look at what might explain why ERA differs from FIP (fielding independent ERA estimated using SOs, BBs and HRs), I ran a regression with FIP - ERA as the dependent variable and the following three independent variables:

SLG Diff (a pitcher's SLG allowed with runners on base minus the SLG they allowed with no runners on)
BAbip (the batting average a pitcher allows on balls in play, so it is dependent on how good his fielders are)
BQS/9 (bequeathed runners that scored per 9 IP)

Bequeathed runners represents the number of runners left on base by a pitcher when that pitcher leaves the game. Any bequeathed runner who scores an earned run after a pitcher has left the game will be counted against that pitcher's ERA (from mlb.com https://www.mlb.com/glossary/advanced-stats/bequeathed-runners).
 
I used SLG Diff because some pitchers might have gotten hit pretty hard when they had runners on base, making their ERA higher than what we might otherwise expect based on their overall numbers.

I used BAbip because this is not controlled very much by the pitcher. A guy can have a low FIP but if his fielders can't catch the ball, his ERA will be high.

I used BQS/9 because a pitcher cannot control what happens after he leaves the game. Some pitchers get lucky and their bullpen bails them out. For others, it is the opposite.

I looked at all the guys who had 100+ IP last year. All data came from Baseball Reference and Stathead. There were 127 pitchers.

Here is the regression equation:

FIPERADIFF = .789*BQS/9 + 13.63*BAbip + 2.5*SLGDiff - 4.32

r-squared = .639, so 63.9% of the variance in the dependent variable is explained by the model.

standard error = .33
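The equation can also be written as a small function for plugging in numbers. This is just a sketch: the coefficients are the ones reported above, while the function name and the sample inputs are made up and do not belong to any real pitcher.

```python
def predicted_fip_era_diff(bqs_per_9, babip, slg_diff):
    """Evaluate the fitted equation using the coefficients reported in the post."""
    return 0.789 * bqs_per_9 + 13.63 * babip + 2.5 * slg_diff - 4.32


# Hypothetical inputs, not a real pitcher's line.
example = predicted_fip_era_diff(bqs_per_9=0.0, babip=0.300, slg_diff=0.0)
print(round(example, 3))  # -0.231

# Change in the prediction for a 10-point (.010) change in BAbip,
# holding the other two variables fixed.
babip_step = (predicted_fip_era_diff(0.0, 0.310, 0.0)
              - predicted_fip_era_diff(0.0, 0.300, 0.0))
print(round(babip_step, 4))  # 0.1363
```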

Here are the t-values for the three independent variables:
 
BQS/9: 6.7 (p-value < .00001)
BAbip: 11.6 (p-value < .00001)
SLG Diff: 5.04 (p-value < .00001)

The r-squared seems fairly high but it still means that 36.1% of the variation in FIPERADIFF is not explained.

The standard error seems high, and I wish it were lower. For comparison, the average absolute differential was about .44.

The t-values are all pretty high so each independent variable is significant. I used a website that converts t-values into p-values.
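For anyone who wants to check the conversion without a website, here is a rough stand-in. It is not the calculator I used: it swaps in the standard normal distribution as an approximation to the t-distribution, which should be close with 127 pitchers and three predictors plus an intercept (123 degrees of freedom). Exact t-distribution p-values would be somewhat larger. The function name is my own.

```python
import math


def two_sided_p_normal(t_value):
    """Two-sided p-value from the standard normal (an approximation to t)."""
    return math.erfc(abs(t_value) / math.sqrt(2))


# t-values reported in the post.
T_VALUES = {"BQS/9": 6.7, "BAbip": 11.6, "SLG Diff": 5.04}

for name, t in T_VALUES.items():
    p = two_sided_p_normal(t)
    print(f"{name}: p is about {p:.2e}, below .00001: {p < 0.00001}")
```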

There may be some other variable that I should include. Maybe I could find the estimated FIPERADIFF for each guy and look at the 10 or so guys with the biggest differences between the estimated value and the actual (FIP - ERA). Maybe something that would be obvious to include would pop up.
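That check could look something like the sketch below. The rows are placeholders invented for illustration, not real pitchers or real results, and the function name largest_misses is my own. Each row holds a name, the actual FIP - ERA, and the value estimated by the regression.

```python
def largest_misses(rows, n=10):
    """rows: (name, actual_diff, predicted_diff) tuples.

    Returns the n rows with the biggest absolute gap between actual and
    predicted, as (name, actual minus predicted) pairs.
    """
    ranked = sorted(rows, key=lambda r: abs(r[1] - r[2]), reverse=True)
    return [(name, round(actual - predicted, 2))
            for name, actual, predicted in ranked[:n]]


# Placeholder rows, invented for illustration only; not real pitchers.
rows = [
    ("Pitcher A", -0.90, -0.20),
    ("Pitcher B", 0.35, 0.30),
    ("Pitcher C", 0.10, 0.65),
]

for name, miss in largest_misses(rows, n=2):
    print(name, miss)
```

With the real 127 pitchers in rows, the top of this list would be the place to look for a missing variable.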