I wrote a post comparing these two along with Sandy Koufax a few days ago in response to a piece by Joe Sheehan. A commenter at Baseball Think Factory mentioned that Lefty Grove did not have to face blacks and Hispanics. So here I try to estimate how his value could be affected by pitching before integration. The method I use will be similar to the one I used in this article: How Would Integration Have Affected Ruth and Cobb?
I assume that integration increases the talent pool available to baseball. So the relevant questions seem to be how much better would the hitters have been in Grove's time (AL, 1925-41) if there was no segregation and how many more runs would he have given up by having to face better hitters. But I also tried to take into account how much better the fielders and pitchers would have been. I assumed that the new, non-white talent would be about as good as they are now (relative to whites) and would replace the worst players. Then I tried to calculate how the league OPS and league ERA would change based on how good I assumed the new talent to be. Grove's ERA was also adjusted to account for the better pitchers as well as the better fielders behind him. My rough estimate is that his ERA relative to the league average would fall from 144 (or 44% better than average) to 130. Randy Johnson had 133. I think the difference between the two is still small enough to keep Grove in the debate as the greatest lefty ever.
Since the comparison is to Randy Johnson, I first looked to see what percentage of the players were non-white from 1988-2009 and how well they hit compared to the white players. I did this position by position (I try to explain why at the end in "technical notes"). I found the top 100 players in PAs at each position (but used the top 300 for OFers) from 1988-2009 using the Lee Sinins Complete Baseball Encyclopedia. This group of players combined to make up about 72% of all PAs during this time.
If I did not feel sure about a player being white or non-white, I found a picture of his baseball card on eBay. For the most part, anyone with a Hispanic name was considered non-white. There were a few players with Hispanic names before 1947, like Lefty Gomez. But I don't think there were very many. I put someone like Mike Gallego in the white category since he had what is considered an "Anglo" first name and he was born in the U.S.
The table below shows the weighted-average OPS of whites and non-whites along with the percentage of PAs at that position by non-whites from 1988-2009.
I assumed that these would be the percentages in the AL from 1925-41. For example, I assumed that 34.8% of the 1B men would be non-white. So I took all the 1B men who had 100 or more PAs and ranked them from highest to lowest in OPS. Then I removed the bottom 34.8% (approximately) of the PAs. The idea is that teams would get rid of their worst players when adding in the new, better players from the newly used talent pool.
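The ranking-and-trimming step can be sketched in a few lines of Python. This is only an illustration: the five players and their numbers are made up, and the rule for "approximately" (drop whole players from the bottom without going over the target share) is my assumption, not necessarily the exact cutoff used for the real data.

```python
def split_by_pa_share(players, drop_share):
    """Rank players by OPS and drop the worst ones until about
    drop_share of the total PAs is gone.

    players: list of (name, pa, ops) tuples.
    Returns (kept, dropped). Whole players are dropped, never
    exceeding the target (an assumed rule for illustration).
    """
    ranked = sorted(players, key=lambda p: p[2], reverse=True)
    target_pa = drop_share * sum(p[1] for p in ranked)
    kept, dropped, dropped_pa = list(ranked), [], 0
    # Walk up from the lowest OPS.
    while kept and dropped_pa + kept[-1][1] <= target_pa:
        player = kept.pop()
        dropped.append(player)
        dropped_pa += player[1]
    return kept, dropped


def weighted_ops(players):
    """PA-weighted average OPS of a group of players."""
    total_pa = sum(pa for _, pa, _ in players)
    return sum(pa * ops for _, pa, ops in players) / total_pa


# Hypothetical 1B men, not real players.
first_basemen = [
    ("A", 600, 0.950),
    ("B", 500, 0.880),
    ("C", 400, 0.800),
    ("D", 300, 0.720),
    ("E", 200, 0.650),
]

kept, dropped = split_by_pa_share(first_basemen, 0.348)
print([p[0] for p in kept], round(weighted_ops(kept), 3))
print(round(weighted_ops(first_basemen), 3))
```

With the made-up data, the two weakest hitters are removed and the OPS of the remaining group is higher than that of the full group, which is the effect the adjustment relies on.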
The players who comprised the top 65.2% of PAs were designated as white players and they had an OPS of .926. The remaining 34.8% of the PAs were assigned to players who were assumed to be non-white and I assumed they would have a collective OPS of .914 (.012 lower than the whites, which is the gap I found in the first table). Combining both groups together gives an OPS of .922. Before this adjustment, the OPS of all the 1B men was .857. So with the added talent, the new cumulative OPS for 1B men would be about .065 higher than before.
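For anyone who wants to check the 1B arithmetic, here it is as a short Python sketch. All the numbers come from the paragraph above; only the variable names are mine.

```python
# Shares of PAs at 1B after the adjustment.
white_share = 0.652
nonwhite_share = 0.348

# OPS of the retained (white) 1B men and the assumed OPS of the newcomers.
white_ops = 0.926
nonwhite_ops = white_ops - 0.012  # gap taken from the 1988-2009 table

# PA-weighted combination of the two groups.
combined_ops = white_share * white_ops + nonwhite_share * nonwhite_ops

# OPS of all AL 1B men, 1925-41, before any adjustment.
original_ops = 0.857
gain = combined_ops - original_ops

print(round(combined_ops, 3))  # 0.922
print(round(gain, 3))          # 0.065
```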
Here are all of the increases in OPS at each position
The weighted average of all these gains is .076. But since I did not include pitchers, I am just going to assume no change for them. So that would bring the change down to .070 (the weighting I used for pitchers was 8.5% based on Retrosheet data).
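The pitcher adjustment is just another weighted average, sketched here in Python with the figures from the paragraph above (the variable names are mine):

```python
# Weighted-average OPS gain across the non-pitcher positions.
gain_position_players = 0.076

# Share of PAs taken by pitchers (from Retrosheet data), with an
# assumed gain of zero for them.
pitcher_weight = 0.085
gain_pitchers = 0.0

overall_gain = (
    (1 - pitcher_weight) * gain_position_players
    + pitcher_weight * gain_pitchers
)

print(round(overall_gain, 3))  # 0.07
```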
But before I try to recalculate how many runs Lefty Grove might have given up, I think better fielding has to be taken into account. In my article on Cobb and Ruth linked above, I estimated how much better the fielding would have been since integration began (in a manner similar to how I adjust the hitting stats; the details are explained at the link).
I found that OFers would have been better in putouts per game by about 15% and assists per game for SS and 2B would have been better by about 5% (I think those stats for their respective positions made sense). Of course, all fielders cannot raise their putouts and/or assists per game since there are only 27 outs per game. But in the adjustment I made, I will assume that the number of hits on balls in play will have to drop by a certain percentage.
I went with 7.5%, which is in the range of the numbers I found for the OFers, SS and 2B men. Those infielders, of course, would also have more putouts. But I think their big contribution would be in throwing more batters out at first. There might be some improvement at 3B and 1B, but those are still mostly white positions, so any change will be slight.
So I assumed that there would have been a reduction in hits on balls in play of 7.5%. That means 7.5% fewer singles, doubles, and triples. At-bats would fall 7.5%, too. Once those changes were made, I recalculated the AL SLG and OBP from 1925-41. The new OBP would be .339 (down from .350) and the new SLG would be .386 (down from .404). The new OPS would be .725, down from .750. So a decline of .025.
But the improved hitting increased OPS by about .070. So subtracting the .025 leaves a .045 increase in OPS. How much would Grove's ERA increase if the OPS he allowed went up .045? To estimate that, I ran a regression on all the AL teams in his era with runs per game as the dependent variable and team OPS as the independent variable. Here is the equation:
R/G = 14.23*OPS - 5.62
Since 14.23*.045 = .64, I assumed that every pitcher would see his ERA rise by that much. That may not be the case; some might go up more and some might go up less, and I am just not sure how to figure that out. But I also tried raising each pitcher's OPS by the percentage increase, too, as explained below.
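Here is the same step as a small Python sketch. The slope and intercept are the ones from the regression equation above; the function name and variable names are just mine for illustration.

```python
SLOPE = 14.23
INTERCEPT = -5.62


def runs_per_game(ops):
    """Predicted runs per game from OPS, using the regression above."""
    return SLOPE * ops + INTERCEPT


# Net change in league OPS: better hitters minus better fielders.
hitting_gain = 0.070
fielding_loss = 0.025
net_ops_change = hitting_gain - fielding_loss

# Because the equation is linear, the change in runs depends only on
# the slope, not on the starting OPS.
era_increase = SLOPE * net_ops_change

# Grove's actual career ERA was 3.06. Adding the full increase treats
# a run per game as an earned run per 9 IP, as the text does.
grove_adjusted_era = 3.06 + era_increase

print(round(era_increase, 2))        # 0.64
print(round(grove_adjusted_era, 2))  # 3.7
```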
So let's say that Grove's ERA goes up by 0.64. His career ERA is now 3.70 (it was actually 3.06). If the league ERA went up the same amount, it would be 5.08. But that 5.08 is too high because it would be brought down by the fact that the new, non-white pitchers would raise the overall quality of the league pitching. How much might that be?
First, I assumed that about 22% of the innings would have been pitched by non-whites. So the worst 22% of the white pitchers would be removed. How good would the non-white pitchers be? Overall, as a group, about as good as the white pitchers who remained.
I figured that out by finding the top 288 pitchers in IP from 1988-2009. They collectively had about half the major league IP in this time. After separating them into white and non-white groups, I found that both had about the same ERA relative to the league average. The whites were 6% better and the non-whites were 7% better. So that is why I assumed in the previous paragraph that the incoming pitchers would be about as good as those that remained.
I ranked every pitcher in the AL in the era from lowest to highest ERA. Then I kept only the guys who combined to have about 78% of the IP (adding from lowest ERA to highest). This group of pitchers was about 5.7% better than the entire group. So I assumed that adding the non-white pitchers, who would be as good as the remaining pitchers, would lower the league ERA by 5.7%.
Earlier I said the hitting and fielding would combine to raise the league ERA to 5.08. If it is lowered 5.7%, then it would be 4.81. How does this affect Grove? Before any adjustments, his relative ERA or ERA+ was 144 since 3.06/4.42 = .692 and 1/.692 = 1.44 (and then it is multiplied by 100). But we had his ERA rising to 3.70 and the league ERA rising to 4.81. Then 3.70/4.81 = .769 and 1/.769 = 1.30. That would give him a relative ERA of 130.
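The relative ERA calculation is easy to reproduce. The sketch below uses the simple ratio described in the paragraph above (no park adjustment), and it takes the adjusted league ERA of 4.81 as given rather than recomputing it. The function name is mine.

```python
def era_plus(pitcher_era, league_era):
    """Relative ERA: 100 * league ERA / pitcher ERA (no park factor)."""
    return 100 * league_era / pitcher_era


# Actual career figures.
actual = era_plus(3.06, 4.42)

# After the integration adjustments described in the text.
adjusted = era_plus(3.70, 4.81)

print(round(actual))    # 144
print(round(adjusted))  # 130
```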
What did Randy Johnson have? 133. This still puts the two pitchers very close. What I have done is a very rough estimate. I think a more complete and thorough adjustment could leave Grove a little ahead or a little behind. But either way, Grove still deserves consideration as the greatest lefty ever.
If fielding improved by some other percentages, here is what Grove's ERA+ would be
I also tried to raise Grove's OPS allowed by the same percentage by which the league OPS increased instead of the absolute increase. Since I had the league OPS rising by .045 and it was actually .750, that is a 6% increase. Grove gave up 3.64 runs per 9 IP. To score that many runs per game, a team would have an OPS of .651 (so I assumed that was Grove's OPS allowed). If his OPS allowed went up 6%, it would be .690 or a .039 increase. In that case, his runs allowed would go up by 14.23*.039 = .56. That is a bit lower than the .64 mentioned earlier. In this case, his ERA goes up to 3.62. Then his ERA relative to the league average would be 133 (3.62/4.81 = .753 and 1/.753 = 1.33).
I am not sure if Grove's OPS allowed should be increased by an absolute amount or a percentage. But his adjusted ERA ends up being about the same in each case.
The reason I adjusted the league OPS by position instead of just looking at all players is that I ended up eliminating mostly players who were catchers and infielders. Of course, the players eliminated have to be proportionate at each position.
Here is what happened. I initially found that from 1988-2009 the whites and non-whites both had about the same OPS (.771). The non-whites were about 51% of the players in the top 900 in PAs from 1988-2009. So I then found all of the players in the AL from 1925-41 with 100 or more career PAs. Then I ranked them from highest to lowest OPS and dropped enough players from the lower ranks to make up about half the PAs. Then I found that the OPS for the remaining players was about 100 points higher than what it was before. I was about to use that as my increase for the league OPS but then I noticed all the players left at the top of the OPS ranking were OFers and 1B men. I was going to end up eliminating a lot more than half of the 2B, SS and catchers. So then I had to break things down by position in both eras.
In the AL from 1925-41, the players at each position who had 100 or more PAs made up about 90% of all the PAs in the era. This is higher than the 72% share I had from the 1988-2009 era. It would have been nice to have a higher share in the latter period but that would have added a lot of time to figuring out who was white or not.
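The percentage version can be laid out step by step in Python. The inputs all come from the text (the regression coefficients, 3.64 runs per 9 IP, the .045 rise on a .750 league OPS, and the adjusted league ERA of 4.81); the variable names are mine, and the intermediate values are carried at full precision, so they may differ from hand-rounded figures in the last digit.

```python
SLOPE = 14.23
INTERCEPT = -5.62

# Invert the regression: which OPS corresponds to 3.64 runs per game?
grove_runs_per_9 = 3.64
ops_allowed = (grove_runs_per_9 - INTERCEPT) / SLOPE

# League OPS rose .045 from .750, a 6% increase.
pct_increase = 0.045 / 0.750
new_ops_allowed = ops_allowed * (1 + pct_increase)

# Convert the OPS change back to runs with the regression slope.
extra_runs = SLOPE * (new_ops_allowed - ops_allowed)

# As in the text, the extra runs are added directly to his 3.06 ERA.
new_era = 3.06 + extra_runs
relative_era = 100 * 4.81 / new_era

print(round(ops_allowed, 3))      # 0.651
print(round(new_ops_allowed, 3))  # 0.69
print(round(extra_runs, 2))       # 0.56
print(round(new_era, 2))          # 3.62
print(round(relative_era))        # 133
```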
I also used both leagues from 1988-2009 since Randy Johnson pitched in both while Grove only pitched in the AL.
I also used the improvement in fielding since 1947 that I had calculated a few years ago, not just 1988-2009. This saved a lot of work. It could be that the fielding improvement is greater from 1988-2009 than over the entire period of integration. But I do list how Grove's ERA+ would end up under different fielding scenarios and the differences are not great.
On the fielding adjustments, I know that it should involve more than just putouts by OFers and assists by SS and 2B men, but my guess is that this will cover the bulk of any improvement. Maybe some day I can incorporate DPs, errors, etc.
I will try to post a list of all the players I used from 1988-2009 and whether they were designated as white or non-white. Check back to see when I do that.
Click here to see that list.