As the title suggests, many studies have been written
on this. There are some links to previous research listed at the end of this
article. But the question is: at what age do hitters peak? To analyze this, I
found all players who had 15 or more seasons with 400 or more plate appearances
(a minimum for a “full” season). To measure offensive performance, I used RCAA
from Lee Sinins’s Complete Baseball Encyclopedia. It is “Runs created above
average. It's the difference between a player's RC total and the total for an
average player who used the same amount of his team's outs. A negative RCAA
indicates a below average player in this category.” Then I sorted this group of
players by age, found the average RCAA for each age, and graphed it.
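The selection and averaging steps described above could be sketched roughly as follows. This is only an illustration: the `seasons` record layout, the `average_rcaa_by_age` helper and the toy numbers are all made up, and it assumes that only full seasons go into the averages, which the article does not spell out.

```python
from collections import defaultdict

def average_rcaa_by_age(seasons, min_pa=400, min_seasons=15):
    """Return {age: (average RCAA, count)} for players with enough full seasons.

    `seasons` is a list of (player, age, plate_appearances, rcaa) tuples.
    This layout is an assumption for illustration, not the encyclopedia's format.
    """
    # Keep only "full" seasons (at least min_pa plate appearances).
    full = [s for s in seasons if s[2] >= min_pa]

    # Count full seasons per player and keep the players who qualify.
    n_full = defaultdict(int)
    for player, _age, _pa, _rcaa in full:
        n_full[player] += 1
    qualified = {p for p, n in n_full.items() if n >= min_seasons}

    # Collect RCAA values by age for the qualifying players.
    by_age = defaultdict(list)
    for player, age, _pa, rcaa in full:
        if player in qualified:
            by_age[age].append(rcaa)

    return {age: (sum(v) / len(v), len(v)) for age, v in sorted(by_age.items())}

# Made-up toy data; min_seasons is lowered to 2 so the example stays small.
toy = [
    ("A", 25, 600, 30), ("A", 26, 610, 40),
    ("B", 25, 500, 10), ("B", 26, 450, 20),
    ("C", 25, 650, 99),   # only one full season, so C is dropped
    ("B", 27, 120, -5),   # under 400 PA, so not a full season
]
result = average_rcaa_by_age(toy, min_seasons=2)
print(result)
```

With the real data, the defaults (400 plate appearances, 15 seasons) would match the cutoffs used in this article.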
Here are the averages at each age. The Count column shows
how many players are included in the average at a given age.
AGE    RCAA    Count
19      1.25      4
20     23.94     18
21     16.83     42
22     23.04     55
23     22.62     74
24     28.08     77
25     31.15     86
26     33.63     89
27     33.46     87
28     33.74     89
29     34.20     88
30     29.94     88
31     30.98     85
32     32.32     88
33     24.78     86
34     22.80     83
35     22.76     82
36     20.18     77
37     17.61     64
38     14.47     51
39     13.41     41
40      5.19     27
41      3.00     11
42      1.25      4
43      4.00      2
44      5.00      1
The highest average RCAA is 34.20 at age 29, although this
is only slightly higher than at ages 26-28 (since there were so few cases at
certain ages, I only graphed ages 20-40). It is interesting that the average
dips at ages 30-31 but comes back up at age 32. There were a total of 90 players
in this group, and from ages 25-35 the count at each age stays very high. The trend line equation
(1) RCAA = -0.1922*AGE^2 + 10.901*AGE - 122.45
predicts what the average RCAA will be at a given age. The
R-squared of 0.8975 means that 89.75% of the variation in average RCAA across ages is
explained by the equation. To find the peak of the trend line, the derivative
of the equation with respect to age can be found and set equal to zero (this is
a calculus technique). The derivative of equation (1) is
(2) -0.3844*AGE + 10.901 = 0
Solving for AGE gives 10.901/0.3844 = 28.36, the very highest point on the trend
line. That is another way to get the peak age, and it is fairly close to 29, the
age with the highest average RCAA.
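As a rough check on equation (1), the sketch below plugs the age 20-40 averages from the table into the equation, computes R-squared, and finds the peak from the derivative. It assumes the AGE^2 coefficient and the constant in equation (1) are negative (the peak of 28.36 only comes out that way), and the names `predict` and `r_squared` are just for illustration. Because the published coefficients are rounded, the results should land close to, but not necessarily exactly on, the figures quoted above.

```python
# Average RCAA by age for ages 20-40, copied from the table above.
avg_rcaa = {
    20: 23.94, 21: 16.83, 22: 23.04, 23: 22.62, 24: 28.08, 25: 31.15,
    26: 33.63, 27: 33.46, 28: 33.74, 29: 34.20, 30: 29.94, 31: 30.98,
    32: 32.32, 33: 24.78, 34: 22.80, 35: 22.76, 36: 20.18, 37: 17.61,
    38: 14.47, 39: 13.41, 40: 5.19,
}

# Coefficients of equation (1), with the signs assumed as described above.
A, B, C = -0.1922, 10.901, -122.45

def predict(age):
    """Predicted average RCAA at a given age from equation (1)."""
    return A * age**2 + B * age + C

def r_squared(observed, predicted):
    """Share of the variation in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

ages = sorted(avg_rcaa)
observed = [avg_rcaa[a] for a in ages]
predicted = [predict(a) for a in ages]
r2 = r_squared(observed, predicted)

# Setting the derivative 2*A*AGE + B equal to zero gives the top of the parabola.
peak_age = -B / (2 * A)
peak_rcaa = predict(peak_age)

print(f"R-squared: {r2:.4f}")
print(f"peak age: {peak_age:.2f}, predicted RCAA at the peak: {peak_rcaa:.2f}")
```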
Equation (1) is a second-order polynomial. A commenter on
one of JC Bradbury’s articles suggested a higher-order polynomial. I tried
third-, fourth- and fifth-order polynomials (these can be selected very easily
in Excel). I did not like the fifth-order polynomial because the trend line
actually went down, then came up, and declined again. That may be possible, but
I want to stick with a simple rising, then falling trend. The fourth-order
polynomial has an inverted U-shape similar to the one in the graph above and had a
higher R-squared of .9567. In that case,
the equation was
(3) RCAA = -0.0006*AGE^4 + 0.082*AGE^3 - 4.3663*AGE^2 + 103.48*AGE - 876.52
If I wanted to find the peak age using a derivative, I would
end up with a third-order polynomial, or cubic function. Setting that equal to
zero and solving for AGE can be pretty complex.
So I just plugged in every age from 20 to 40 (in steps of one
tenth, like 20.1, 20.2, etc.) into equation (3). Then I sorted the results to get
the highest predicted RCAA. In that case, the peak age was 26.7 (with a
predicted RCAA of 29.58). So that is much younger than the other ages found for
peak value earlier (29 and 28.36). I also only used ages 22-40 for this
fourth-order polynomial case since the average RCAA at age 21 was lower than
at age 20, and the equation for the age 20-40 case gave an unrealistically low
predicted value for the peak age average RCAA of only about 19 (that peak age
was 25.8). That makes little sense if you look at the graph above. If I restricted
the ages to 22-40 and fit a second-order polynomial, the peak age would be
28.54 and the R-squared would be .9361. The predicted RCAA would be 32.3.
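The plug-in-and-sort approach for equation (3) can be sketched in a few lines. This version assumes the AGE^4 coefficient is negative (which the inverted U-shape requires) and uses the rounded coefficients as printed, so small differences from the spreadsheet results are possible. The function name `quartic` is only for illustration.

```python
def quartic(age):
    """Equation (3), with a negative AGE^4 coefficient (an assumption, see above)."""
    return (-0.0006 * age**4 + 0.082 * age**3
            - 4.3663 * age**2 + 103.48 * age - 876.52)

# Every age from 20.0 to 40.0 in steps of 0.1, built from integers so that
# floating-point error does not accumulate across the steps.
ages = [i / 10 for i in range(200, 401)]

# Pair each age with its predicted RCAA and keep the highest prediction,
# which is the same as sorting by predicted RCAA and taking the top row.
peak_age, peak_rcaa = max(((a, quartic(a)) for a in ages), key=lambda t: t[1])

print(peak_age, round(peak_rcaa, 2))
```

A grid like this avoids solving the cubic derivative, at the cost of only locating the peak to the nearest tenth of a year.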
I also found the number of players having their best season
(highest RCAA) at various ages. This is in the next table:
AGE    Count
20       1
21       1
22       4
23       1
24      10
25       9
26      11
27      10
28       6
29       6
30       5
31       7
32       6
33       4
34       3
35       2
36       4
Then I broke the 90 players into five groups of 18 by ranking each
player by his average yearly RCAA. Here are the ages which had the highest
average RCAA for the top 18, the middle 18 and the lowest 18, along with the
average RCAA at that age (shown as age/RCAA).
Top: 28/79
Middle: 32/34.19
Bottom: 30/6.47
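The grouping step can be sketched as below. The helper `split_into_groups` and the player names are hypothetical, and the sketch assumes the number of players divides evenly into the groups, as 90 players into five groups of 18 does.

```python
def split_into_groups(avg_rcaa_by_player, n_groups=5):
    """Rank players by average yearly RCAA (best first) and cut into equal groups.

    Assumes len(avg_rcaa_by_player) is a multiple of n_groups; any leftover
    players at the bottom of the ranking would be dropped.
    """
    ranked = sorted(avg_rcaa_by_player, key=avg_rcaa_by_player.get, reverse=True)
    size = len(ranked) // n_groups
    return [ranked[i * size:(i + 1) * size] for i in range(n_groups)]

# Hypothetical players P01..P10 with made-up average yearly RCAA values.
toy = {f"P{i:02d}": float(i) for i in range(1, 11)}

groups = split_into_groups(toy)
top, middle, bottom = groups[0], groups[2], groups[-1]
print(top, middle, bottom)
```

Each group's seasons can then be averaged by age in the same way as for the full set of players.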
Then I found the trend line for the top 18, the middle 18
and the lowest 18 and what age would be predicted to be the peak age by both
the second-order and fourth-order polynomials for each of the three cases. For
each of the three categories, if there were not at least 8 cases for an age, that age
was not included. I found the peak by using either the derivative or the
sorting method mentioned earlier. For the second-order polynomials, here are
the peak ages and the RCAA predicted by the trend line or equation:
Top: 28.75/71.63
Middle: 30.5/29.86
Bottom: 30.09/3.15
For the fourth-order polynomials, the peak ages and their
RCAAs were
Top: 27.5/78.54
Middle: 29.9/31.92
Bottom: 26/-10.62
The last one is a negative RCAA. None of the others are.
Also, for the fourth-order polynomial for the bottom group, I used all
observations since this gave a curve which did not change directions (the one
that only included ages with at least 8 cases did, and its predicted values did
not make sense since they just kept getting more negative with
age). So, for the bottom group, I had some ages with only 1 or 2 cases.
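The second-order version of the per-group procedure (drop ages with fewer than 8 cases, fit a trend line, take the derivative) might look like the sketch below. It uses a plain least-squares fit written out in Python rather than Excel's trend line feature, and the data are made up so that the right answer is known in advance. The helper names `fit_quadratic` and `peak_from_rows` are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    # Sums of powers of x and of y times powers of x.
    s = [sum(x**k for x in xs) for k in range(5)]
    t = [sum(y * x**k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def peak_from_rows(rows, min_cases=8):
    """rows: (age, average RCAA, count). Returns (peak age, predicted RCAA)."""
    kept = [(age, rcaa) for age, rcaa, n in rows if n >= min_cases]
    # Centre the ages before fitting to keep the arithmetic well behaved.
    center = sum(age for age, _ in kept) / len(kept)
    xs = [age - center for age, _ in kept]
    ys = [rcaa for _, rcaa in kept]
    a, b, c = fit_quadratic(xs, ys)
    x_peak = -b / (2 * a)  # derivative 2*a*x + b set equal to zero
    return center + x_peak, a * x_peak**2 + b * x_peak + c

# Made-up averages that follow an exact parabola peaking at age 30,
# plus one thin age (2 cases) that the filter should ignore.
rows = [(age, 50 - 0.5 * (age - 30) ** 2, 10) for age in range(24, 37)]
rows.append((40, 99.0, 2))

peak_age, peak_rcaa = peak_from_rows(rows)
print(round(peak_age, 2), round(peak_rcaa, 2))
```

For a fourth-order fit, the derivative is a cubic, so the plug-in-and-sort method described earlier is the simpler way to find the peak.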
So, I found lots of different possible ages for peak value.
I can’t say I know for sure which one I would pick. But the ages I found
are generally between 28 and 30 or so.
Links to other research on Bonds and aging patterns in general
"Has Anyone Aged as Well As Barry Bonds?"
"Smoothing Career Trajectories" by Jim Albert, "By the Numbers," August 2002
JC Bradbury has these studies posted at his site:
"Estimated Age Effects in Baseball" by Ray C. Fair