In some ways it looks like he aged significantly better, but I don't think the evidence is completely conclusive. I took all the guys who had 400+ HRs before McGwire and found their SLG relative to the league average (RELSLG) when they were under 31 and from ages 31-37. Then I found the ratio of their old RELSLG (ORATE) to their young RELSLG (YRATE). The data came from the Lee Sinins Complete Baseball Encyclopedia. I ranked them, including McGwire. This is shown in the table below:
From age 31-37, McGwire had a 0.683 SLG while the league average was 0.435. Since .683/.435 = 1.57, his RELSLG is 157 (everything gets multiplied by 100). Under 31, his SLG was .507 while the league average was .397. Since .507/.397 = 1.28, his RELSLG was 128. Then I divided the older RELSLG by the younger. For McGwire it was 157/128 = 1.23.
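As a quick check of the arithmetic above, here is a minimal Python sketch. The function name `rel_slg` is made up for illustration; the SLG figures are the ones quoted in the paragraph.

```python
def rel_slg(player_slg, league_slg):
    """Player SLG as a percentage of league SLG, rounded to a whole number."""
    return round(100 * player_slg / league_slg)

# McGwire's figures as quoted in the text
old = rel_slg(0.683, 0.435)    # ages 31-37
young = rel_slg(0.507, 0.397)  # under 31

# Ratio of older RELSLG to younger RELSLG
aging_ratio = old / young

print(old, young, round(aging_ratio, 2))
```

A ratio above 1 means the hitter slugged better, relative to his league, after turning 31 than before.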
He is well ahead of everyone else. If it were not for Stargell, McGwire would really stick out. One thing that occurred to me was that in moving out of Oakland and to St. Louis, he no longer had to play in a tough HR park. So for all of these hitters I then used their neutralized SLGs from Baseball Reference. They are adjusted for both the league average and park effects. The new ratios are shown below. McGwire is still first.
Mark McGwire 1.17 (actually 1.27, I made a mistake in the calculation)
Willie Stargell 1.12
Mike Schmidt 1.04
Hank Aaron 1.02
Babe Ruth 1.02
Willie Mays 1.01
Darrell Evans 1.01
Andre Dawson 1.00
Willie McCovey 1.00
Billy Williams 0.99
Dave Winfield 0.99
Harmon Killebrew 0.98
Lou Gehrig 0.98
Ted Williams 0.97
Stan Musial 0.97
Frank Robinson 0.96
Eddie Mathews 0.95
Dave Kingman 0.91
Mel Ott 0.89
Reggie Jackson 0.89
Mickey Mantle 0.89
Duke Snider 0.89
Carl Yastrzemski 0.88
Eddie Murray 0.88
Jimmie Foxx 0.86
Ernie Banks 0.84
Again, the only guy who prevents McGwire from really standing out is Stargell. If I take out McGwire, the mean is 0.9576 and the standard deviation is 0.0662. That makes McGwire 3.21 SDs above the mean in terms of his ability to either maintain or improve his SLG as he aged. If I also took out Stargell, the mean and SD were 0.9508 and 0.0584, respectively. That makes Stargell 2.89 SDs above average. If I keep Stargell out and put McGwire back in, the mean and SD are 0.9596 & 0.0716. That would make Stargell 2.24 SDs above the mean.
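The SD comparisons can be sketched in Python from the ratios listed in the table. Two assumptions here: the population standard deviation (`statistics.pstdev`) is used because it reproduces the means and SD counts quoted above, and McGwire is entered at the listed 1.17 rather than the corrected 1.27. The helper `z_score` is a hypothetical name.

```python
from statistics import mean, pstdev

# Neutralized old/young SLG ratios as listed in the table above
ratios = {
    "Mark McGwire": 1.17, "Willie Stargell": 1.12, "Mike Schmidt": 1.04,
    "Hank Aaron": 1.02, "Babe Ruth": 1.02, "Willie Mays": 1.01,
    "Darrell Evans": 1.01, "Andre Dawson": 1.00, "Willie McCovey": 1.00,
    "Billy Williams": 0.99, "Dave Winfield": 0.99, "Harmon Killebrew": 0.98,
    "Lou Gehrig": 0.98, "Ted Williams": 0.97, "Stan Musial": 0.97,
    "Frank Robinson": 0.96, "Eddie Mathews": 0.95, "Dave Kingman": 0.91,
    "Mel Ott": 0.89, "Reggie Jackson": 0.89, "Mickey Mantle": 0.89,
    "Duke Snider": 0.89, "Carl Yastrzemski": 0.88, "Eddie Murray": 0.88,
    "Jimmie Foxx": 0.86, "Ernie Banks": 0.84,
}

def z_score(name, exclude=()):
    """SDs above the mean of everyone else (minus any names in `exclude`)."""
    pool = [v for k, v in ratios.items() if k != name and k not in exclude]
    return (ratios[name] - mean(pool)) / pstdev(pool)

mcgwire_z = z_score("Mark McGwire")
stargell_z_no_mcgwire = z_score("Willie Stargell", exclude=("Mark McGwire",))
stargell_z_with_mcgwire = z_score("Willie Stargell")

print(round(mcgwire_z, 2), round(stargell_z_no_mcgwire, 2),
      round(stargell_z_with_mcgwire, 2))
```

Because each player is measured against a pool that leaves him out, one extreme value cannot inflate the SD it is being compared to.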
I also took all of these guys and found the average age for each of the four best seasons in their neutralized SLGs with 300+ PAs. Here is how they rank from highest to lowest.
Evans 35
McGwire 33.25
Aaron 32.5
Williams 30.75
Stargell 30.5
Dawson 30.25
Kingman 30.25
B. Williams 29.75
Winfield 29.75
McCovey 29.5
Killebrew 29.25
Schmidt 29.25
Jackson 29
Musial 29
Mays 28.5
Ruth 28.5
Robinson 28
Foxx 27.5
Gehrig 27.5
Snider 27.5
Yastrzemski 27.5
Mantle 27
Murray 26.75
Banks 26.25
Ott 24.25
Mathews 23.25
Evans' top 4 ages were 26, 36, 38 and 40. McGwire's best were 34, 36, 31 and 32. He was the only guy to have all 4 at 31 or older.
Sunday, January 31, 2010
Wednesday, January 27, 2010
How Might Integration Have Affected The Lefty Grove/Randy Johnson Debate?
Introduction
I wrote a post comparing these two along with Sandy Koufax a few days ago in response to a piece by Joe Sheehan. A commenter at Baseball Think Factory mentioned that Lefty Grove did not have to face blacks and Hispanics. So here I try to estimate how his value could be affected by pitching before integration. The method I use will be similar to the one I used in this article: How Would Integration Have Affected Ruth and Cobb?
I assume that integration increases the talent pool available to baseball. So the relevant questions seem to be how much better would the hitters have been in Grove's time (AL, 1925-41) if there was no segregation and how many more runs would he have given up by having to face better hitters. But I also tried to take into account how much better the fielders and pitchers would have been. I assumed that the new, non-white talent would be about as good as they are now (relative to whites) and would replace the worst players. Then I tried to calculate how the league OPS and league ERA would change based on how good I assumed the new talent to be. Grove's ERA was also adjusted to account for the better pitchers as well as the better fielders behind him. My rough estimate is that his ERA relative to the league average would fall from 144 (or 44% better than average) to 130. Randy Johnson had 133. I think the difference between the two is still small enough to keep Grove in the debate as the greatest lefty ever.
Analysis
Since the comparison is to Randy Johnson, I first looked to see what percentage of the players were non-white from 1988-2009 and how well they hit compared to the white players. I did this position by position (I try to explain why at the end in "technical notes"). I found the top 100 players in PAs at each position (but used the top 300 for OFers) from 1988-2009 using the Lee Sinins Complete Baseball Encyclopedia. This group of players combined to make up about 72% of all PAs during this time.
If I did not feel sure about a player being white or non-white, I found a picture of his baseball card on eBay. For the most part, anyone with an Hispanic name was considered non-white. There were a few players with Hispanic names before 1947, like Lefty Gomez. But I don't think it was very many. I put someone like Mike Gallego in the white category since he had what is considered an "Anglo" first name and he was born in the U.S.
The table below shows the weighted-average OPS of whites and non-whites along with the percentage of PAs at that position by non-whites from 1988-2009.
I assumed that these would be the percentages in the AL from 1925-41. For example, I assumed that 34.8% of the 1B men would be non-white. So I took all the 1B men who had 100 or more PAs and ranked them from highest to lowest in OPS. Then I removed the bottom 34.8% (approximately) of the PAs. The idea is that teams would get rid of their worst players when adding in the new, better players from the newly used talent pool.
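The trimming step can be sketched as follows. This is only an illustration: the player list is invented, the function names are hypothetical, and the group OPS is approximated as a PA-weighted average of individual OPS rather than recomputed from component totals.

```python
def keep_top_share(players, drop_share):
    """players: list of (name, pa, ops) tuples.
    Drop the lowest-OPS players until roughly `drop_share` of all PAs
    has been removed, then return everyone who is left."""
    budget = drop_share * sum(pa for _, pa, _ in players)
    ranked = sorted(players, key=lambda p: p[2])  # worst OPS first
    dropped_pa = 0
    i = 0
    while i < len(ranked) and dropped_pa + ranked[i][1] <= budget:
        dropped_pa += ranked[i][1]
        i += 1
    return ranked[i:]

def weighted_ops(players):
    """PA-weighted average OPS (an approximation of the group's OPS)."""
    return sum(pa * ops for _, pa, ops in players) / sum(pa for _, pa, _ in players)

# Hypothetical first basemen, NOT real data
first_basemen = [("A", 600, 0.950), ("B", 500, 0.880), ("C", 400, 0.800),
                 ("D", 300, 0.720), ("E", 200, 0.650)]

kept = keep_top_share(first_basemen, 0.25)
print([name for name, _, _ in kept])
print(round(weighted_ops(first_basemen), 3), round(weighted_ops(kept), 3))
```

Since whole players are dropped, the share of PAs removed is only approximately the target, which matches the "(approximately)" in the text.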
The players who comprised the top 65.2% of PAs were designated as white players and they had an OPS of .926. The remaining 34.8% of the PAs were assigned to players who were assumed to be non-white and I assumed they would have a collective OPS of .914 (.012 lower than the whites, which is what I found in the first table). Combining both groups together gives an OPS of .922. Before this adjustment, the OPS of all the 1B men was .857. So with the added talent, the new cumulative OPS for 1B men would be about .065 higher than before.
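The 1B numbers can be checked with a few lines of Python, using the 65.2% / 34.8% split and the OPS figures quoted above (the variable names are just for illustration):

```python
# Shares of PAs and OPS figures for 1B, as quoted in the text
white_share, white_ops = 0.652, 0.926
nonwhite_share = 1 - white_share   # 34.8% of PAs
nonwhite_ops = white_ops - 0.012   # assumed .012 lower than the whites

# Blend the two groups by their share of PAs
combined_ops = white_share * white_ops + nonwhite_share * nonwhite_ops

# Gain over the actual 1B OPS before the adjustment
gain = combined_ops - 0.857

print(round(combined_ops, 3), round(gain, 3))
```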
Here are all of the increases in OPS at each position
1B 0.065
2B 0.080
SS 0.094
3B 0.034
OF 0.095
C 0.045
The weighted average of all these gains is .076. But since I did not include pitchers, I am just going to assume no change for them. So that would bring the change down to .070 (the weighting I used for pitchers was 8.5% based on Retrosheet data).
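The pitcher adjustment is a simple reweighting. A small sketch, assuming pitchers take 8.5% of PAs and see no change in OPS, as described above:

```python
position_gain = 0.076   # weighted-average OPS gain across the non-pitcher positions
pitcher_weight = 0.085  # pitchers' share of PAs, per the text
pitcher_gain = 0.0      # assumed: no change for pitchers as hitters

# Overall league OPS gain once pitchers are folded in
overall_gain = position_gain * (1 - pitcher_weight) + pitcher_gain * pitcher_weight

print(round(overall_gain, 3))
```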
But before I try to recalculate how many runs Lefty Grove might have given up, I think better fielding has to be taken into account. In my article on Cobb and Ruth linked above, I estimated how much better the fielding would have been since integration began (in a manner similar to how I adjust the hitting stats; the details are explained at the link).
I found that OFers would have been better in putouts per game by about 15% and assists per game for SS and 2B would have been better by about 5% (I think those stats for their respective positions made sense). Of course, all fielders cannot raise their putouts and/or assists per game since there are only 27 outs per game. But in the adjustment I made, I will assume that the number of hits on balls in play will have to drop by a certain percentage.
I went with 7.5%, which is in the range of the numbers I found for the OFers, SS and 2B men. Those infielders, of course, would also have more putouts. But I think their big contribution would be in throwing more batters out at first. There might be some improvement at 3B and 1B, but those are still mostly white positions, so any change will be slight.
So I assumed that there would have been a reduction in hits on balls in play of 7.5%. That means 7.5% fewer singles, doubles, and triples. At-bats would fall 7.5%, too. Once those changes were made, I recalculated the AL SLG and OBP from 1925-41. The new OBP would be .339 (down from .350) and the new SLG would be .386 (down from .404). The new OPS would be .725, down from .750. So a decline of .025.
But the improved hitting increased OPS by about .070. So subtracting the .025 leaves a .045 increase in OPS. How much would Grove's ERA increase if the OPS he allowed went up .045? To estimate that, I ran a regression on all the AL teams in his era with runs per game as the dependent variable and team OPS as the independent variable. Here is the equation:
R/G = 14.23*OPS - 5.62
Since 14.23*.045 = .64, I assumed that every pitcher would see his ERA rise by that much. That may not be the case; some might go up more and some less, and I am just not sure how to figure that out. But I also tried raising each pitcher's OPS by the percentage increase, too, as explained below.
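Here is the regression line applied in Python, as a rough sketch. The slope and intercept are the ones from the equation above; the function name `runs_per_game` is hypothetical.

```python
SLOPE, INTERCEPT = 14.23, -5.62  # fitted line from the text: R/G = 14.23*OPS - 5.62

def runs_per_game(ops):
    """Predicted team runs per game for a given team OPS."""
    return SLOPE * ops + INTERCEPT

# Net OPS change: +.070 from better hitters, -.025 from better fielders
delta_ops = 0.070 - 0.025

# Change in predicted runs when league OPS moves from .750 to .795
delta_runs = runs_per_game(0.750 + delta_ops) - runs_per_game(0.750)

print(round(delta_ops, 3), round(delta_runs, 2))
```

Because the line is linear, the change in runs depends only on the slope and the change in OPS, not on the starting OPS.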
So let's say that Grove's ERA goes up by 0.64. His career ERA is now 3.70 (it was actually 3.06). If the league ERA went up the same amount, it would be 5.08. But that 5.08 is too high because it would be brought down by the fact that the new, non-white pitchers would raise the overall quality of the league pitching. How much might that be?
First, I assumed that about 22% of the innings would have been pitched by non-whites. So the worst 22% of the white pitchers would be removed. How good would the non-white pitchers be? Overall, as a group, about as good as the white pitchers who remained.
I figured that out by finding the top 288 pitchers in IP from 1988-2009. They collectively had about half the major league IP in this time. After separating them into white and non-white groups, I found that both had about the same ERA relative to the league average. The whites were 6% better and the non-whites were 7% better. So that is why I assumed in the previous paragraph that the incoming pitchers would be about as good as those that remained.
I ranked every pitcher in the AL in the era from lowest to highest ERA. Then I only kept the guys who combined to have about 78% of the IP (adding from lowest ERA to highest). This group of pitchers was about 5.7% better than the entire group. So I assumed that adding the non-white pitchers, who would be as good as the remaining pitchers, would lower the league ERA by 5.7%.
Earlier I said the hitting and fielding would combine to raise the league ERA to 5.0. If it is lowered 5.7%, then it would be 4.81. How does this affect Grove? Before any adjustments, his relative ERA or ERA+ was 144 since 3.06/4.42 = .692 and 1/.692 = 1.44 (and then it is multiplied by 100). But we had his ERA rising to 3.7 and the league ERA rising to 4.81. Then 3.7/4.81 = .769 and 1/.769 = 1.3. That would give him a relative ERA of 130.
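The relative ERA calculation can be sketched like this, using the ERA figures quoted above (the function name `relative_era` is made up for illustration):

```python
def relative_era(era, league_era):
    """League ERA divided by the pitcher's ERA, times 100 (higher is better)."""
    return 100 * league_era / era

# Grove's actual career ERA against the actual league ERA
before = relative_era(3.06, 4.42)

# Grove's ERA raised by .64 against the adjusted league ERA of 4.81
after = relative_era(3.06 + 0.64, 4.81)

print(round(before), round(after))
```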
What did Randy Johnson have? 133. This still puts the two pitchers very close. What I have done is a very rough estimate. I think a more complete and thorough adjustment could leave Grove a little ahead or a little behind. But either way, Grove still deserves consideration as the greatest lefty ever.
If fielding improved by some other percentages, here is what Grove's ERA+ would be
5% 128
10% 131
15% 133
I also tried to raise Grove's OPS allowed by the same percentage by which the league OPS increased instead of the absolute increase. Since I had the league OPS rising by .045 and it was actually .750, that is a 6% increase. Grove gave up 3.64 runs per 9 IP. To score that many runs per game, a team would have an OPS of .651 (so I assumed that was Grove's OPS allowed). If his OPS allowed went up 6%, it would be .690 or a .039 increase. In that case, his runs allowed would go up by 14.23*.039 = .56. That is a bit lower than the .64 mentioned earlier. In this case, his ERA goes up to 3.62. Then his ERA relative to the league average would be 133 (3.62/4.81 = .753 and 1/.753 = 1.33).
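The percentage version of the adjustment can be traced step by step in Python. This is a sketch that inverts the regression line from earlier in the post to back out Grove's implied OPS allowed; the variable names are illustrative.

```python
SLOPE, INTERCEPT = 14.23, -5.62  # R/G = 14.23*OPS - 5.62

grove_ra9 = 3.64  # runs allowed per 9 IP
grove_era = 3.06  # actual career ERA

# Invert the regression to get the OPS that would produce 3.64 runs per game
ops_allowed = (grove_ra9 - INTERCEPT) / SLOPE

# League OPS rose .045 from .750, a 6% increase
pct_increase = 0.045 / 0.750
new_ops_allowed = ops_allowed * (1 + pct_increase)

# Extra runs implied by the higher OPS allowed
delta_runs = SLOPE * (new_ops_allowed - ops_allowed)

new_era = grove_era + delta_runs
relative_era = 100 * 4.81 / new_era  # against the adjusted league ERA of 4.81

print(round(ops_allowed, 3), round(delta_runs, 2), round(new_era, 2), round(relative_era))
```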
I am not sure if Grove's OPS allowed should be increased by an absolute amount or a percentage. But his adjusted ERA ends up being about the same in each case.
Technical Notes
The reason I adjusted the league OPS by position instead of just looking at all players is that, when I looked at all players together, I ended up eliminating mostly players who were catchers and infielders. Of course, the players eliminated have to be proportionate at each position.
Here is what happened. I initially found that from 1988-2009 the whites and non-whites both had about the same OPS (.771). The non-whites were about 51% of the players in the top 900 in PAs from 1988-2009. So I then found all of the players in the AL from 1925-41 with 100 or more career PAs. Then I ranked them from highest to lowest OPS and dropped enough players from the lower ranks to make up about half the PAs. Then I found that the OPS for the remaining players was about 100 points higher than what it was before. I was about to use that as my increase for the league OPS but then I noticed all the players left at the top of the OPS ranking were OFers and 1B men. I was going to end up eliminating a lot more than half of the 2B, SS and catchers. So then I had to break things down by position in both eras.
In the AL from 1925-41, the players at each position who had 100 or more PAs made up about 90% of all the PAs in the era. This is higher than the 72% share I had from the 1988-2009 era. It would have been nice to have a higher share in the latter period but that would have added a lot of time to figuring out who was white or not.
I also used both leagues from 1988-2009 since Randy Johnson pitched in both while Grove only pitched in the AL.
I also used the improvement in fielding since 1947 that I had calculated a few years ago, not just 1988-2009. This saved a lot of work. It could be that the fielding improvement is greater from 1988-2009 than over the entire period of integration. But I do list how Grove's ERA+ would end up under different fielding scenarios and the differences are not great.
On the fielding adjustments, I know that it should involve more than just putouts by OFers and assists by SS and 2B men, but my guess is that this will cover the bulk of any improvement. Maybe some day I can incorporate DPs, errors, etc.
I will try to post a list of all the players I used from 1988-2009 and whether they were designated as white or non-white. Check back to see when I do that.
Click here to see that list.
Friday, January 22, 2010
Lefty Grove vs. Sandy Koufax & Randy Johnson
Who was the greatest left-handed pitcher in history? My money is on Lefty Grove. This issue came up in a Joe Sheehan piece titled By Any Measure: It's no tall tale: The Big Unit was the greatest lefthander of them all. SABR members were told about it in a recent email and it got mentioned in a Seattle Post-Intelligencer blog. So here is my take on the issue.
In the table below, I summarize each pitcher's career. Here is what the abbreviations mean:
WS = Win Shares (created by Bill James)
WAR = Wins Above Replacement (from Sean Smith)
PW = Pitching Wins (from Pete Palmer via Retrosheet)
ERA+ = ERA adjusted for league average and park effects (from Baseball Reference)
Grove does very well. WS might take into account clutch performance or high leverage situations. Grove pitched in relief in about 25% of the games and had a lot of saves for his era. This could be bumping up his WS. Koufax and Johnson did not pitch much in relief. But Grove has such a big lead, it might not matter.
Grove pitched until he was 41 and Johnson until he was 45. If I drop Johnson's last 4 seasons, his ERA+ is 151, just slightly ahead of Grove. But then Johnson gives up about 5 WAR and over 500 IP.
The next table shows the best 5 consecutive seasons for each pitcher. Peak value should be included in any evaluation along with career value. Again, Grove looks very good.
This might not be fair to Johnson. For whatever reason, modern pitchers don't pitch as many innings as those in the past. Grove's 5 highest IP seasons add to 1,421. For Koufax it is 1,447 and for Johnson it is only 1,285. If we increased Johnson's totals in the above table by about 10%, he would still trail Grove.
Now the best 3 consecutive seasons.
Grove looks very good again. If we gave Johnson a 10% bump, he would pass Grove, but in only one measure, WAR, and not by much.
We should also look at how they each did in the pitcher-controlled stats. In the next table, HR shows how well they each did in preventing HRs. Grove gave up 43% fewer HRs than the average pitcher during his career (that is what the 143 means, from the Lee Sinins Complete Baseball Encyclopedia). The number in parentheses is where each pitcher ranks among lefties with 2,000+ IP since 1920.
SO/BB is strikeout-to-walk ratio relative to the league average. The 211 for Grove means his ratio was 2.11 times the league average.
This looks very good for Grove. If I dropped the last 4 seasons for Johnson, he gets 122 for HR and 180 for SO/BB, still well below Grove.
The next table shows the same stats for each player over what was probably their best 5-year stretch. For Grove it was 1928-32, for Koufax it was 1962-66 and for Johnson it was 1998-2002. Grove again dominates.
During these years, Grove's park allowed about 50% more HRs than average, Koufax's about 40% fewer and Johnson's was about average. Grove had an ERA+ of 174, Koufax 169 and Johnson 178. A slight edge for Johnson, but I don't think enough to overcome Grove's big leads in so many other cases.
If we simply compare the relevant stats to the league average, Grove seems to have performed better than the other two whether we consider career value or peak value.
Friday, January 15, 2010
Bert Blyleven, Jim Palmer and Defensive Efficiency Rating
Bert Blyleven pitched 4,970 innings and had 344 RSAA. That comes from the Lee Sinins Complete Baseball Encyclopedia, which defines it as "Runs saved against average. It's the amount of runs that a pitcher saved vs. what an average pitcher would have allowed." It is park adjusted. So he saved about .62 runs per 9 IP. Jim Palmer pitched 3,948 innings with 314 RSAA, so he saved about .72 runs per 9 IP.
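The per-9-inning rates are just RSAA scaled by innings pitched. Here is a quick sketch using the career totals quoted above (the function name is mine):

```python
def rsaa_per_9(rsaa: float, innings: float) -> float:
    """Runs saved against average per 9 innings pitched."""
    return 9.0 * rsaa / innings


blyleven = rsaa_per_9(344, 4970)  # about .62
palmer = rsaa_per_9(314, 3948)    # about .72
print(f"Blyleven {blyleven:.2f}, Palmer {palmer:.2f}")
```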
But, fielders help pitchers prevent runs and many fans know that the Orioles had great fielders over the years like Brooks Robinson, Paul Blair, Mark Belanger and Bobby Grich. This is where a Bill James stat called Defensive Efficiency Rating (DER) comes in. It tells us what % of balls in play (BIP) were turned into outs by a team's fielders. The higher the rating, the better the defense.
So I looked at how good the DERs were for the Orioles during the years Palmer pitched and also for the teams that Blyleven pitched for. I found that the Orioles had better than average fielding and the teams Blyleven pitched for had below average fielding. So Palmer's fielders added to his RSAA and Blyleven's fielders reduced his RSAA. I tried to estimate how many runs this added up to and then recalculated each pitcher's RSAA per 9 IP.
I calculated DER as 1 - BABIP (batting average on balls in play). I calculated BABIP as
(H - HR)/(BFP - SO - BB - HBP - HR)
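As a sketch, the two formulas above can be written out like this. The stat line in the example is invented purely to show the arithmetic and does not belong to any real team.

```python
def babip(h: int, hr: int, bfp: int, so: int, bb: int, hbp: int) -> float:
    """Batting average on balls in play: (H - HR) / (BFP - SO - BB - HBP - HR)."""
    balls_in_play = bfp - so - bb - hbp - hr
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def der(h: int, hr: int, bfp: int, so: int, bb: int, hbp: int) -> float:
    """Defensive Efficiency Rating, approximated here as 1 - BABIP."""
    return 1.0 - babip(h, hr, bfp, so, bb, hbp)


# Hypothetical team totals: 4,500 balls in play, 1,250 of them falling for hits.
example_der = der(h=1400, hr=150, bfp=6200, so=1000, bb=500, hbp=50)
print(f"{example_der:.3f}")
```

This version ignores things like errors and sacrifice hits, just as the formula in the text does.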
Palmer's team DER over his career (a weighted average as a % of BFP) was .738. The league average was .726, so the Orioles' DER was about .012 higher than the league's. Palmer had 12,248 BIP, and that times about .012 is 145. So the Oriole fielders made about 145 plays that average teams would not. If each of those prevented hits had an average linear weights run value of .55, that is about 80 runs his fielders saved him.* So his career RSAA falls from 314 to 234. Per 9 IP that becomes .53.
Blyleven's teams had a DER of .718 while the league average was .722, for a difference of about .004. He had 14,883 BIP. That times about .004 is 57, which means his fielders allowed about 57 extra hits. At .55 runs each, that is about 32 runs he gave up that he should not have. So his career RSAA should rise from 344 to 376. Per 9 IP that would be .68, well ahead of Palmer.
Now this all assumes that everything that happens on BIP is up to the fielders. Even if I cut the run change in half for each guy, Blyleven is still ahead .65 to .63 in RSAA per 9 IP. I also assumed that .55 is the run value of each event prevented (or not prevented). It could have been a little lower for Palmer if more of the non-HR hits were singles. It could have been higher for Blyleven if singles were a smaller share and 2Bs and 3Bs were a higher %. It is probably not a big deal either way.
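Here is a rough sketch of the adjustment, using the rounded figures from the text (145 extra plays made behind Palmer, 57 extra hits allowed behind Blyleven, .55 runs per play). The function names and the `share` parameter are my own. `share` is the fraction of the BIP result credited to the fielders, so 1.0 is the full adjustment and 0.5 is the cut-in-half version. Because the inputs are rounded, the last digit can differ slightly from the numbers quoted above.

```python
RUN_VALUE = 0.55  # assumed average run value of a non-HR hit


def adjusted_rsaa(rsaa: float, extra_plays: float, share: float = 1.0) -> float:
    """Remove the fielders' contribution from a pitcher's RSAA.

    extra_plays > 0 means the fielders made more plays than average
    (the pitcher was helped); extra_plays < 0 means they made fewer.
    """
    return rsaa - share * extra_plays * RUN_VALUE


def per_9(rsaa: float, innings: float) -> float:
    return 9.0 * rsaa / innings


for share in (1.0, 0.5):
    palmer = per_9(adjusted_rsaa(314, 145, share), 3948)
    blyleven = per_9(adjusted_rsaa(344, -57, share), 4970)
    print(f"share={share}: Palmer {palmer:.2f}, Blyleven {blyleven:.2f}")
```

With either share, Blyleven comes out ahead of Palmer per 9 IP, which is the point of the comparison.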
*The linear weights run value comes from Pete Palmer. Here are the run values for particular events:
1B = .47
2B = .78
3B = 1.09
That is, for example, every additional single adds .47 runs over the course of a season. I once found a weighted average of these three events of .55, weighted by each event's own frequency.
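A frequency-weighted average like that can be sketched as follows. The run values are the ones listed above, but the hit counts are invented for illustration. They are not the frequencies I actually used.

```python
RUN_VALUES = {"1B": 0.47, "2B": 0.78, "3B": 1.09}


def weighted_run_value(counts: dict) -> float:
    """Average run value of a non-HR hit, weighted by how often each type occurs."""
    total = sum(counts[event] for event in RUN_VALUES)
    if total <= 0:
        raise ValueError("need at least one hit")
    return sum(RUN_VALUES[event] * counts[event] for event in RUN_VALUES) / total


# Hypothetical mix of 1,000 non-HR hits (made-up counts).
hypothetical_counts = {"1B": 770, "2B": 190, "3B": 40}
print(f"{weighted_run_value(hypothetical_counts):.3f}")
```

The more singles in the mix, the closer the average gets to .47. More doubles and triples push it up.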
Wednesday, January 13, 2010
Bert Blyleven's Amazing Strikeout-To-Walk Ratio
You might know that he had a 2.8 ratio. It was also 75% better than the league average and that is the 29th best ratio relative to the league average since 1900 for pitchers with 2000+ IP (Greg Maddux is the guy just ahead of him with 75.5%). All data is from the Lee Sinins Complete Baseball Encyclopedia. Hall of Famers he is ahead of include:
Juan Marichal
Jim Bunning
Three Finger Brown
Sandy Koufax
Addie Joss
Don Sutton
Chief Bender
Don Drysdale
Tom Seaver
Rube Marquard
Bob Feller
Gaylord Perry
Hal Newhouser
Lefty Gomez
Eddie Plank
I also found the pitchers since 1900 who had the most seasons in the top 5 in strikeout-to-walk ratio. Here are the leaders:
Walter Johnson 16
Bert Blyleven 13
Mike Mussina 13
Robin Roberts 13
Carl Hubbell 12
Christy Mathewson 12
Greg Maddux 12
Lefty Grove 12
Don Sutton 11
Jim Bunning 11
Randy Johnson 11
Only the great Walter Johnson, one of the first five members of the Hall of Fame, is ahead of Blyleven. Now for the pitchers who had the most seasons in the top 10 in strikeout-to-walk ratio.
Greg Maddux 17
Bert Blyleven 16
Don Sutton 16
Walter Johnson 16
Christy Mathewson 15
Grover C Alexander 15
Mike Mussina 15
Roger Clemens 15
Ferguson Jenkins 13
Jim Bunning 13
Lefty Grove 13
Robin Roberts 13
Only Maddux is ahead of Blyleven. And here is something I posted to the SABR List in early 2006:
"I thought it would be interesting to use a point system to see how well pitchers have done in RSAA (runs saved above average and it is park adjusted). A first place finish would be 10 points, second 9, and so on. Ties would split points. A tie for first would get 9.5. Then I called up the annual top tens for the AL, NL and AA using the Lee Sinins Sabermetric Encyclopedia. Each pitcher got his points then a career total was found for each guy. Here are the top 10
Cy Young-134
Clemens-132
Grove-113.5
W. Johnson-111.5
Mathewson-102
Maddux-101.5
Alexander-98
Nichols-95
R. Johnson-83
Blyleven-74
For Blyleven to crack the top 10, he had to consistently be among the leaders for a long time. There were some years where only 4-5 pitchers had significant RSAA, like the 1870s in the NL. One year several guys tied for 10th with an RSAA of 1. I eliminated anyone with less than 10 RSAA for a given year."
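The point system can be sketched roughly like this. The function names and the sample season are hypothetical. I am also assuming that when a tie runs past 10th place, the tied pitchers split whatever points are left for those spots, which the post does not spell out.

```python
def season_points(rsaa_by_pitcher: dict, min_rsaa: float = 10, places: int = 10) -> dict:
    """Award 10, 9, ..., 1 points to a league's RSAA leaders; ties split points."""
    # Drop anyone with less than min_rsaa, then sort best to worst.
    eligible = sorted(
        ((rsaa, name) for name, rsaa in rsaa_by_pitcher.items() if rsaa >= min_rsaa),
        key=lambda pair: -pair[0],
    )
    points = {}
    i = 0
    while i < len(eligible) and i < places:
        # Find the group of pitchers tied with the one in position i.
        j = i
        while j < len(eligible) and eligible[j][0] == eligible[i][0]:
            j += 1
        # Pool the points for positions i..j-1 and split them evenly.
        pool = sum(max(places - k, 0) for k in range(i, j))
        for _, name in eligible[i:j]:
            points[name] = pool / (j - i)
        i = j
    return points


def career_points(seasons: list) -> dict:
    """Add up each pitcher's points over a list of league-seasons."""
    totals = {}
    for season in seasons:
        for name, pts in season_points(season).items():
            totals[name] = totals.get(name, 0.0) + pts
    return totals


# Hypothetical league-season: A and B tie for first, D falls below the cutoff.
sample = {"A": 40, "B": 40, "C": 25, "D": 5}
print(season_points(sample))
```

In the sample, A and B each get 9.5 points for the first-place tie, C gets 8 for third, and D gets nothing.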
Saturday, January 9, 2010
Was Edgar Martinez An Elite Hitter? Offensive Winning Percentage Says So
Offensive Winning Percentage (OWP) is the Bill James stat that says if all 9 hitters were identical, what would the team's winning percentage be if it gave up an average number of runs. The lists I show below are from the Lee Sinins Complete Baseball Encyclopedia, so the OWP they are based on is park adjusted.
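For readers who want to see the idea in code, here is a minimal sketch. It assumes the classic Pythagorean form with an exponent of 2, with the hitter's runs created per 27 outs on one side and the league's runs per game on the other. The Sinins encyclopedia's park-adjusted version may differ in the details, and the function name is mine.

```python
def offensive_winning_pct(player_runs_per_game: float,
                          league_runs_per_game: float,
                          exponent: float = 2.0) -> float:
    """Pythagorean winning % of a lineup of nine identical hitters
    whose team allows a league-average number of runs."""
    scored = player_runs_per_game ** exponent
    allowed = league_runs_per_game ** exponent
    return scored / (scored + allowed)


# Hypothetical league scoring 4.5 runs per game.
league = 4.5
print(f"{offensive_winning_pct(league, league):.3f}")           # average hitter
print(f"{offensive_winning_pct(1.5275 * league, league):.3f}")  # about 53% above average
```

Under this assumption, a .700 OWP takes a run rate about 53% above the league average, since the ratio has to equal the square root of .7/.3.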
The first list shows the leaders in number of seasons with a .700 OWP or greater. A .700 OWP will generally get you in the top 10 in your league for a given season. Here is everyone who did it 8 or more times. The players in red are not in the Hall of Fame. Everyone else is in or not eligible. There are 37 players on this list, so that puts Edgar Martinez in the top 37.
1 Ty Cobb 16
T2 Barry Bonds 15
T2 Willie Mays 15
T4 Babe Ruth 14
T4 Mel Ott 14
T4 Tris Speaker 14
T7 Ted Williams 13
T7 Mickey Mantle 13
T7 Stan Musial 13
T7 Hank Aaron 13
T11 Rogers Hornsby 12
T11 Roger Connor 12
T11 Honus Wagner 12
T11 Dan Brouthers 12
T11 Lou Gehrig 12
T16 Jim Thome 10
T16 Frank Robinson 10
T16 Manny Ramirez 10
T19 Frank Thomas 9
T19 Nap Lajoie 9
T19 Jimmie Foxx 9
T19 Eddie Mathews 9
T19 Ed Delahanty 9
T19 Albert Pujols 9
T19 Sam Crawford 9
T26 Joe Morgan 8
T26 Mike Schmidt 8
T26 Joe DiMaggio 8
T26 Billy Hamilton 8
T26 Cap Anson 8
T26 Rickey Henderson 8
T26 Eddie Collins 8
T26 Dick Allen 8
T26 Joe Jackson 8
T26 Johnny Mize 8
T26 Jesse Burkett 8
T26 Edgar Martinez 8
Here are some Hall of Famers who are behind Martinez on this list: Harry Heilmann, Rod Carew, Eddie Murray, Reggie Jackson, Wade Boggs, Harmon Killebrew, Willie McCovey, George Brett, and Willie Stargell. Martinez also had one other season with a .700 OWP or better, 2002, when he failed to qualify for the batting title but still had 407 plate appearances. If I used 400 plate appearances as the cutoff, Martinez would be in the top 25 (in a 10-way tie for 21st). In that case he is the only eligible player not in the Hall of Fame.
The next list has the leaders in seasons with a .750 OWP or higher. A .750 OWP will usually get you in the top 5 in your league in any given year.
1 Ty Cobb 15
T2 Barry Bonds 14
T2 Babe Ruth 14
T4 Ted Williams 12
T4 Mickey Mantle 12
T6 Honus Wagner 11
T6 Dan Brouthers 11
T8 Lou Gehrig 10
T8 Tris Speaker 10
T8 Stan Musial 10
T11 Rogers Hornsby 9
T11 Ed Delahanty 9
T11 Mel Ott 9
T11 Willie Mays 9
15 Jimmie Foxx 8
T16 Eddie Collins 7
T16 Frank Thomas 7
T18 Elmer Flick 6
T18 Frank Robinson 6
T18 Edgar Martinez 6
T18 Cap Anson 6
T18 Eddie Mathews 6
T18 Pete Browning 6
T18 Albert Pujols 6
Here are some Hall of Famers who are behind Martinez on the second list:
Nap Lajoie
Billy Hamilton
Jesse Burkett
Wade Boggs
Hank Aaron
Joe Morgan
Harry Heilmann
Joe DiMaggio
Roger Connor
Willie McCovey
Willie Stargell
Johnny Mize
Frank Chance
King Kelly
Reggie Jackson
Wednesday, January 6, 2010
How Accurate Were My Hall Of Fame Predictions?
I posted them on Dec. 17. Click here to see that post. The table shows the predictions I made and the actual vote. My models are aimed at estimating vote % in the first year of eligibility. A few of my predictions were fairly accurate but some were way off. Larkin and Martinez got a lot more support than the model predicted (which I think is a good sign, since I think both of them deserve to get in). The model studied the votes in the 1990-2009 period and was based on things like all-star games, gold gloves, MVP awards and milestones, like 3000 hits.
Update on Jan. 8: Dave Allen of Baseball Analysts has an interesting take on who might make it in the future. Go to Looking at Some BBWAA Vote Trajectories