Monday, April 6, 2009

How Many Home Runs Would Ruth Have Hit If Baseball Had Been Integrated In His Era?

(Note: This is a slightly revised version of an article that was published in 2007 in the now-defunct print periodical "The Chicago Sports Weekly." I also posted something like this at "Beyond the Boxscore" called How Would Integration Have Affected Ruth and Cobb?)

Maybe you have seen the images on TV of fans around the country holding up asterisk signs when Barry Bonds comes to the plate, hinting that his HR record is tainted, due to his alleged steroid use. But others counter that Babe Ruth might deserve an asterisk since he never faced blacks or dark-skinned Hispanics (there were a few players with Hispanic names before 1947 whose skin was generally pretty light).

But this raises the question of how many HRs Ruth would have hit had there not been a color barrier. I know the answer because Clio, the Greek muse of history, whispered it in my ear. You see, my Ph.D. thesis was in the field of economic history, and the application of statistics in that field is called “cliometrics.” What I am about to attempt here is something dangerous that the field calls a “counterfactual.” So don’t try it at home. Leave it to the trained professionals.

Robert Fogel, economic historian at the University of Chicago, won a Nobel Prize, partly for using counterfactuals. He asked what would have happened if railroads had not been built. What other kind of transportation system (like canals) would have emerged? How would this have affected economic growth? He concluded that GDP in 1890 would have been about 5% lower than it actually was.

Not everyone was thrilled with this approach. The historian Fritz Redlich referred to counterfactuals as figments, probably of an imagination gone wild. So maybe you will think this analysis is a figment of my imagination. So. Maybe you’re a figment of my imagination. In any case, here it is.

First, we need an estimate of how many non-white pitchers there might have been. Since 1947, about 15% of all the IP by pitchers with 1,000+ IP in their careers have been by non-whites. All of the 1,000+ IP pitchers made up about 58% of all the IP since 1947, so it is a good sample. Therefore, I assume that in Ruth’s day 15% of the IP were by non-whites.

How good would those pitchers have been? Good enough to replace some white guys, who would be the worst pitchers in the league. You don’t add Satchel Paige to your team and then get rid of Lefty Grove. You dump Grover Lowdermilk (who really was not a bad pitcher but his name sounds funny, unlike mine). The non-whites with 1,000+ IP since 1947 actually had a collective ERA just about the same as the whites. So pre-1947, you dump the worst 15% of the pitchers by ERA and re-calculate the league HR rate using the remaining pitchers, that is, the top 85%.

After getting rid of the bottom 15% of the IP in each season from 1920 to 1934 (when Ruth played with the Yankees and had all of his great seasons) in the AL, I recalculated the HRs allowed per IP and found how much lower than the league average the new figures were. The average fall in HRs per IP for the years 1920-1934 was about 5%. That is, the best 85% of the pitchers had a HR per IP rate that was 5% lower than the league average (which includes all pitchers). So if you improve the pitching quality in a way that is consistent with integration, Ruth would hit 5% fewer HRs than his actual 714, or about 678. Even if we cut him 10%, he still hits 643.
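For readers who want to see the mechanics, here is a minimal Python sketch of the trimming step described above: rank pitchers by ERA, keep the best 85% of the innings, and compare HR per IP before and after. The four-pitcher league is invented purely for illustration (it is not 1920s data), the function names are hypothetical, and counting a pitcher who straddles the cutoff pro rata is an assumption of this sketch rather than something stated in the post.

```python
from typing import List, Tuple

# (name, innings pitched, earned runs, home runs allowed).
# These rows are invented for illustration; they are NOT real league data.
Pitcher = Tuple[str, float, float, float]

TOY_LEAGUE: List[Pitcher] = [
    ("A", 300.0, 90.0, 6.0),
    ("B", 300.0, 110.0, 8.0),
    ("C", 250.0, 110.0, 9.0),
    ("D", 150.0, 85.0, 7.0),
]


def era(p: Pitcher) -> float:
    """Earned run average: earned runs per nine innings."""
    _, ip, er, _ = p
    return 9.0 * er / ip


def hr_rates(pitchers: List[Pitcher], drop_share: float) -> Tuple[float, float]:
    """Return (league HR/IP, HR/IP after dropping the worst drop_share of IP by ERA)."""
    total_ip = sum(p[1] for p in pitchers)
    total_hr = sum(p[3] for p in pitchers)
    keep_ip = (1.0 - drop_share) * total_ip

    kept_ip = 0.0
    kept_hr = 0.0
    for _, ip, _, hr in sorted(pitchers, key=era):  # best ERA first
        if kept_ip >= keep_ip:
            break
        # Assumption: a pitcher who straddles the cutoff is counted pro rata.
        take = min(ip, keep_ip - kept_ip)
        kept_ip += take
        kept_hr += hr * take / ip
    return total_hr / total_ip, kept_hr / kept_ip


def pct_drop(pitchers: List[Pitcher], drop_share: float) -> float:
    """Fractional fall in HR/IP once the worst innings are removed."""
    league, trimmed = hr_rates(pitchers, drop_share)
    return 1.0 - trimmed / league


if __name__ == "__main__":
    drop = pct_drop(TOY_LEAGUE, 0.15)
    print(f"HR/IP falls {drop:.1%} in the toy league")
```

With real season data in place of the toy league, the same fractional drop could be averaged across seasons and applied to a hitter's career total, which is the step that turns a 5% fall into roughly 678 HRs for Ruth.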

Some things I have not considered: when Aaron and Mays were hitting HRs in the 1950s, there still were not that many non-whites pitching. So their totals might need to be reduced. We also don’t know if all batters would be affected in the same way. The best HR hitters might have had their totals reduced more than the average hitter. Also, we don’t know what percentage of pitchers would have been non-white. Probably it is more than 15% today. Suppose it is 25%. I looked at the 1927 AL and if you only count the best 75% of the pitchers, the HR rate falls about 9%.

If we looked only at the best 50% of the pitchers from 1927, HRs would fall about 18.3%. If that happened to Ruth over his whole career, he still hits 583 HRs. The drop is about 17% for 1921. For 1934, it would be 20%. Given that I am only counting the best 50% of the pitchers, we can safely say that integration would have reduced his HRs by no more than 20% (the top 50% of pitchers in 2008 gave up about 20% fewer HRs than average as well). So he ends up with 571 HRs. That would have stood as a record for quite a while. And remember that we would have to reduce Aaron and Mays since they played a good part of their careers when there were not as many non-white pitchers as today.
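The career totals quoted in this post all come from the same one-line calculation: Ruth's actual 714 HRs times one minus the assumed reduction. A small sketch that collects the scenarios in one place (the scenario labels and the function name are mine, added only for illustration):

```python
RUTH_CAREER_HR = 714  # Ruth's actual career home run total

# Reduction scenarios discussed in the post (fraction of HRs lost).
SCENARIOS = {
    "best 85% of pitchers, 1920-34 average": 0.05,
    "double that cut": 0.10,
    "best 50% of pitchers, 1927": 0.183,
    "upper bound": 0.20,
}


def adjusted_total(career_hr: int, reduction: float) -> int:
    """Career HRs after removing the given fraction, rounded to a whole HR."""
    return round(career_hr * (1.0 - reduction))


if __name__ == "__main__":
    for label, cut in SCENARIOS.items():
        print(f"{label}: {adjusted_total(RUTH_CAREER_HR, cut)} HRs")
```

The same function could be pointed at any other hitter's total, though as noted above we don't know that every batter would be affected in the same way.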

Here is the link that shows the white and non-white pitchers since 1947


Anonymous said...

What if the Babe was never a pitcher? How might that have affected the number of home runs he hit?

Cyril Morong said...


Thanks for dropping by and commenting. I have not tried to figure that out, but most likely a lot more. He was pretty much a full-time pitcher at age 20 and did not reach even 150 ABs in a season until age 23. He had only 351 ABs from ages 20-22. If he had been a full-time everyday player, he easily would have had over 1000. Let's say 1400, based on how many he had when he did start to play every day. At this point we would have to assume a HR%. That may not be so easy. HR hitting back then was affected, or may have been affected, by changes in how lively the ball was as well as rule changes, like outlawing the spitball. I think Bill James also mentioned that in 1920, they started putting fresh balls into the game a lot more often. So if Ruth had been a regular player from the beginning of his career, we can't really tell for sure when the "live" ball era would have begun. I think 10% would be too high, since it was the deadball era, and even if Ruth started hitting HRs earlier, and baseball wanted to liven up the ball because they saw that fans liked the HR hitting, we still have to assume a low HR% in his first year or two. Then most likely he would have been improving as he went along, so at age 20 he would have been good, but not great. If we give him an extra 1000 ABs, and assume a HR% of 5-6%, we are looking at an additional 50-60 HRs.
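The back-of-the-envelope arithmetic in this reply can be written out as a short sketch. The 1400 and 351 AB figures and the 5-6% HR rate are the ones given above; the function name is hypothetical, and treating the gap between 1400 and 351 as "roughly 1000 extra ABs" is my reading of the rounding.

```python
ACTUAL_AB_AGES_20_22 = 351      # ABs Ruth actually had from ages 20-22
ASSUMED_FULL_TIME_AB = 1400     # guess at his ABs as an everyday player

# About 1049, which the reply rounds to an extra 1000 ABs.
extra_ab_exact = ASSUMED_FULL_TIME_AB - ACTUAL_AB_AGES_20_22


def extra_home_runs(extra_ab: int, hr_pct: float) -> int:
    """Additional HRs from extra at-bats at an assumed HR-per-AB rate."""
    return round(extra_ab * hr_pct)


if __name__ == "__main__":
    low = extra_home_runs(1000, 0.05)
    high = extra_home_runs(1000, 0.06)
    print(f"Extra ABs (unrounded): {extra_ab_exact}")
    print(f"Additional HRs: {low} to {high}")
```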


Matt said...

Ruth said that pitchers wouldn't pitch to him until he got a good hitter behind him (Gehrig), which led to increased home runs. Would that be a factor to consider, i.e., the better (integrated) pitchers having to pitch to a better (integrated) line-up behind Babe, leading to less room for walks?

Cyril Morong said...


Thanks for dropping by and commenting. Ruth hit a lot of HRs before Gehrig got there and even with Gehrig there, he still got walked a lot. It might have helped a little, maybe a couple extra HRs per year.


nicoxen said...

What if Ruth were playing in a modern stadium with shorter fences?

Cyril Morong said...

How much shorter are the fences now? Remember that in Ruth's first three years with the Yankees, in the Polo Grounds, it was only about 250 feet down the line in right field. And in Yankee Stadium it was only 297 down the line and 344 in the power alley.

We would need to know the distances in all the parks he played in and how far he hit every HR and every fly out.

Thanks for dropping by and commenting.

Anonymous said...

In Ruth's era most of the good athletes played baseball. Today athletes are spread over football, basketball, soccer, etc. Would that, and the fact that MLB has expanded to double the number of teams there were in Ruth's era, make any difference? In Ruth's era you could only get the number of bases needed to win the game in the last inning of the game. If the bases were loaded and the score was 1-0 and Ruth hit a home run he would have only got credited with a double. Lastly, when Ruth played, if a home run was hit inside the foul pole but hooked foul when it landed, it was called foul. How many more home runs would Ruth have had if those foul balls had been called home runs? Babe Ruth, the greatest player of all time!

Cyril Morong said...

Thanks for dropping by. You raise some good issues but I think Ruth only had a few of the kinds of HRs you mention (the foul issue, the game ender). And yes, there are more teams but there are more people around and baseball now draws players from Latin America and the Far East. Also, I don't think every guy in the NFL or NBA could play baseball. You don't see many 7-footers or 300 pounders in baseball.

Anonymous said...

Ok sorry, but you have just shown that you know nothing about logic, probability, statistics, or any math beyond basic algebra for that matter... First, there is your assumption that only the 1000+ IP pitchers are relevant. Second is that based on the fact that the combined efforts from these pitchers add up to 58% of the innings pitched since 1947 and that 15% of these pitchers were not white, which may be true. But, this obviously has no effect on the number of pitchers before that... So, your pseudo-logic that this fact implies the same percentage for Ruth's era is woefully ill-posed, and you leave literally hundreds of variables out. This problem isn't even solvable due to the lack of information about the time in question. And consequently, every deduction that you make after that is not only ill-posed but also follows from other illogical deductions. Please tell me you don't actually believe this nonsense. I am an undergraduate level math major and even I can see straight through the holes in your "logic". This counterfactual argument or whatever you call it is simply an attempt to overload the reader with numbers in a pattern to try to convince them you have actually made a valid point.

Cyril Morong said...

I think capturing 58% of the IP is a great sample and I really doubt getting to 100% would change anything. I would have to figure out the race of hundreds of pitchers. What a waste of time.

Do you deny that if a new talent pool opened up that the quality of pitching would improve? That is all I am saying. That IS very logical. I don't think you have tried at all to understand what I have done.

What if Ruth faced a racial mix of pitchers that was similar to what it was during the post-integration era? Of course he would have hit fewer HRs because he would be facing, on average, better pitchers.

Anonymous said...

There's no mention of the difference in fence distances either. When Ruth played, Yankee Stadium was 520' in center and 425' in right center. Most, if not all, of the stadiums back then had obscenely large dimensions in the outfield, so add that into the mix and you're looking at a whole lot more HRs for Ruth. Imagine how many flyballs he had that traveled 450'+ in dead center.

Cyril Morong said...

Thanks for dropping by and commenting.

Where did you read that it was 520 feet in CF in Yankee Stadium?

But actually, distance is not the issue. If blacks had somehow been let in, no distances would have changed.

And it was only 297 down the line in RF in Yankee Stadium and 344 to the power alley. It was a great park for hitting HRs if you were a lefty. Ruth also played 3 years in the Polo Grounds and it was only about 250 down the line there.