Tuesday, April 29, 2008

How Many HRs Would Ruth Have Hit With Integration?

I have written about this before. But the reason I am doing so again is that a friend emailed me some comments that her friends had made about my research on this. So I answer them below. But first, here is the gist of what I did. I estimated how many non-white pitchers there might have been and how good they might have been if they had been allowed pre-1947. Then I estimated how many fewer HRs there would have been in baseball due to the improved pitching quality. I came up with 5% and assumed that Ruth's total would go down 5% (to 678).

Before I respond to the comments, here are the links.

How Would Integration Have Affected Ruth and Cobb?
How Many HRs Would Babe Ruth Have in Integrated Baseball?

The first link has a link to an article I wrote for the now-defunct Chicago Sports Weekly. That link no longer works. Now to the comments.

COMMENT: "I'm not sure how accurate it is to presume that the "worst" 15% of white pitchers would necessarily be replaced by black pitchers. Without the color barrier, they would have just promoted who was better. Can anyone say with any certainty that black pitchers were significantly better, inherently than whites? The other thing is that in Ruth's time, pitching staffs generally consisted of about 8 or 9 pitchers at any given time. There were 4-man rotations, and those pitchers were conditioned from early on to be able to complete games. Pitchers today could be conditioned that way too....but they're not. Back then, there was less need for relievers, hence the small pitching staffs.

So this means that at any given time, there were approximately 140 pitchers in the game-spread out over 16 teams. Even if we went with the figure of 15%, that means that there might have been about 21 black pitchers, spread out over 16 teams - which means about 1.3 per team-and half of those Ruth would not face because they would be in the National League.

So the 9 or so he would face on the opposing 7 teams-how many of them would have been exceptional enough to really put a significant dent in Ruth's performance? I'm not thinking too many. In fact, he may well have dominated some of them."

Yes, I think the worst 15% would be replaced. Here is how I look at it. Suppose all of a sudden a new talent pool was discovered that had major league quality pitchers in it. You would want some of them on your team, right? So whatever number of pitchers you carry, for every pitcher you add from this new talent pool, you have to send one to the minors or release him. The only logical thing to do would be to release the worst pitcher every time you add one good pitcher. For example, if you add Satchel Paige, you don't dump Bob Feller or Bob Lemon.

I am not saying that any one race is better than any other race. Here is what I wrote in one article:

"I estimate that about 15% of the IP then were by non-whites, blacks, dark-skinned Hispanics and Asians. Using the Lee Sinins Complete Baseball Encyclopedia, I found all the pitchers with 1,000+ IP in this period and then calculated what percent of the IP by these guys was by non-whites. You can see the list here. I checked the race of any pitcher I did not already know by looking at when they played and finding pictures of them in books or online. Any pitchers with Hispanic names were considered non-white. There were pitchers like Lefty Gomez before 1947, whose skin was light enough to play. But I did not want to have to judge who would have been able to play and who would not.

In that list, I have relative ERA listed. That is simply ERA divided by the league ERA. The relative ERA of all the whites combined was 105.75, meaning their ERA’s were about 5.75% better than the league average. For the non-whites it was just a bit higher at 106.8. In the analysis below, I assume that the ERAs of whites and non-whites will be the same. The number of IP by the pitchers with 1,000+ IP since 1947 accounted for 58% of all the IP in this time period."

So, the quality of the whites who have pitched in MLB since 1947 and the non-whites is about the same. Neither is better. But if those non-whites had not been there this whole time, who would have been there in their place? Some white guys, who were not as good (if they were, they would have been there instead). Let's call them the bad-whites. Having the bad-whites instead of the non-whites means that the overall quality of pitching would have been worse, meaning more HRs hit (assuming we make no change in who is batting). Turn that around: having the non-whites there improves the quality of pitching, and the hitters would not do quite as well. In my articles, I had a rough estimate of Ruth going down 5%, leaving him with 678 HRs, still a lot.

It does not matter how many pitchers are on the teams or what kind of rotation they had. 15% is 15%. Of course he would not have faced the guys in the other league. But I make an assumption of 15% in each league, which is done to keep everything fair. Yes, with some of the non-white pitchers, he would have dominated them. A few of them would have been just a hair better than the bottom rung pitchers (the bad-whites). But some were great, like Gibson and Marichal. I am just estimating that collectively, all the non-whites were as good as the top 85% of the whites.


COMMENT: "But think about this now. If 60 years of integration can only produce a handful of solid minority pitchers, why should the 15% percent you are replacing them with be any better than the ones you are deducting?"

I think this is explained above.

COMMENT: "Ruth did not hit a HR every time up but changing 8-9 pitchers (15%) who may be better pitchers is no guarantee that they would dominate Ruth or at least do better than their predecessors."

Think of it this way. Suppose all of a sudden the worst 15% of the pitchers were let go and the best 85% all pitched a little bit more to make up for the lost IP. Suppose that there was some pill they could take that allowed them to add extra innings with no loss of effectiveness. Then all the batters face a higher quality of pitching. HRs would go down. I don't think that losing 5% of your HRs means you were dominated.
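
The logic can be illustrated with a toy calculation. Everything about the pool is hypothetical: 100 pitchers whose HR rates allowed are evenly spread between 1% and 3% of batters faced, chosen only to show the direction of the effect, not to reproduce the 5% estimate. The last line simply applies that 5% estimate to Ruth's actual career total of 714 HRs.

```python
# Hypothetical pool: 100 pitchers, HRs allowed per batter faced,
# evenly spaced from 1% to 3%. These rates are made up for illustration.
rates = [0.010 + 0.020 * i / 99 for i in range(100)]

league_rate = sum(rates) / len(rates)

# Let go of the worst 15% (the highest HR rates) and assume the
# remaining 85% share the lost innings equally.
kept = sorted(rates)[:85]
new_rate = sum(kept) / len(kept)

pct_drop = 100 * (1 - new_rate / league_rate)
print(f"HR rate falls by {pct_drop:.1f}% in this made-up pool")

# The estimate in the post is a 5% drop, applied to Ruth's 714 career HRs.
print(round(714 * 0.95))  # 678
```

The size of the drop in the toy pool depends entirely on how spread out the assumed HR rates are. The point is only that removing the bottom of the pool has to lower the average.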

Some other issues. I think right now more than 15% of the pitchers are non-whites. Probably less than 15% were non-white when Aaron and Mays were starting to rack up their big HR totals. If there had been more non-white pitchers in the 1950s, those two guys would also lose HRs. I think my analysis shows how good Ruth was and how much we should respect his records. I think my analysis is an answer to those who ask "How many HRs would Ruth have hit if he had to face Pedro Martinez?" Well, still quite a few because he would not face Pedro all the time and Pedro does allow some HRs.

Friday, April 18, 2008

Bob Feller's Strikeout Blitz, Part 2

In the graph below, I show the major league leader in strikeouts per 9 IP in each 5-year period, starting with 1901-05 and going up through 2003-2007. The minimum IP for each 5-year period was 750. Feller's 8.15 through 1939 was the highest up to that time and would not be passed until Koufax had 9.48 through 1961 (Feller did not pitch until 1936, so that 1939 figure is over 1936-39, but he had over 750 IP). His 7.92 through 1940 was the 2nd highest up to that time. And even up through 1943 he had 7.19, which was the highest since 1908 not counting his own performance. It is not until the late 1950s that even that is surpassed. You can see that things drop off pretty quickly after Feller went into the military (in 1942). The line does spike up just a bit in 1948. But that is from Feller (that is the first year after he got back from the war when his 5-year IP total reached 750). You can click on each graph to see a bigger version. (Where it says 1939-43 below, it refers to 5 different 5-year periods, with one ending in 1939, another in 1940 and so on).



In the next graph, I show what the line would look like if I replaced Feller with the next highest totals over each 5-year period. The sections in red represent the next best totals and the green lines represent what Feller actually did. You can see that the trend would have been much different without Feller.



The next graph shows the 5-year leaders in strikeouts per 9 IP relative to the league average. A 200 means a guy had twice the league average. Feller is not as big a spike as Vance, but no one has approached what he did since then.

Sunday, April 13, 2008

Bob Feller's Strikeout Blitz

In November 1936, Bob Feller turned 18. That previous summer he pitched 62 innings for the Indians and struck out 76 batters. That ratio may not seem that unusual these days, but in that era, it was incredible. He struck out 27.2% of the batters he faced, 3.33 times the league average. Most of what I do below is graphical. For more statistical details click here.

The graph below shows the major league strikeout leaders from 1901-2007. Feller led from 1938-1941 with at least 240 in each season. In the previous 23 seasons, 240 had only been topped once. The leaders fell off until Feller had 348 in 1946, the highest since Rube Waddell had 349 in 1904. The last time anyone had 300+ was in 1912, by Walter Johnson. You can click on any image to enlarge it.




The next graph shows the major league leaders in strikeouts per 9 IP from 1901-2007 by pitchers who qualified for the ERA lead (not counting the Federal League, which is true for all graphs here). Feller led from 1938-1940. His 7.77 in 1938 was the highest since Waddell had 7.86 in 1905.



The next graph shows the strikeout per 9 IP leaders for those with 50+ IP from 1901-2007. Feller's spikes are noted and you can see they were unprecedented and it took many years for them to be equalled.



The next graph shows the leaders in strikeouts per 9 IP with 300+ IP from 1901-1980 (the last year anyone pitched 300+ IP). Feller's 1946 season would not be passed until Koufax did it in 1963 with 8.86. Feller had the highest in the century until then. His 1940 and 1941 seasons were very good, too. In 1940 he had 7.34, which had been topped only once in the previous 28 years.

Sunday, April 6, 2008

Is David Ortiz A Clutch Hitter?

Many fans are aware of clutch hitting stats like hitting with runners in scoring position and hitting when it is close and late (CL = Situations when the game is in the 7th inning or later and the batting team is leading by one run, tied, or has the potential tying run on base, at bat or on deck). And many fans probably recall David Ortiz having many late inning clutch hits. But statistical expert Bill James has come up with a new way to judge clutch hitting which can be read in the article Mr. Clutch Big Papi, Chipper, Pujols come through when it counts.

James says, "'Clutch' is a complicated concept, containing at least seven elements:

1. The score,
2. The runners on base,
3. The outs,
4. The inning,
5. The opposition,
6. The standings,
7. The calendar."

He then shows what Ortiz did in these situations from 2002-2007 in 394 ABs. He batted .322 with a .413 OBP and a .678 SLG. His overall numbers from 2002-2007 in AVG-OBP-SLG were .298-.394-.597 (thanks to the Lee Sinins Complete Baseball Encyclopedia). So he did somewhat better in the clutch situations than he normally did, but was this difference big enough to be significant?

To answer that, I used a technique of calculating a "Z-score" that Pete Palmer used in his article Clutch Hitting One More Time. It involves how much better or worse a guy does in the clutch compared to how he hits in other situations. It also takes into account what the normal league difference is. For example, from 1991-2000, the normal dropoff in CL situations was about .012 in AVG (probably because the good closers are brought in and the pitcher has the platoon advantage). If a player gets a Z-score of 2 or more, the probability of that happening by chance is under 5%, which makes it statistically significant (the Z-score could be negative, meaning that he does worse in the clutch).

Suppose that in the James clutch situation, AVG also falls .012, like in the CL situations (I have to make an assumption since James did not report what the normal dropoff is in AVG; more on this later, where I assume other dropoffs besides .012). But I found that the average difference between all ABs (which includes CL) and just CL ABs from 1991-2000 was .010. A quick check at Retrosheet has it about .012 from 2002-2007. That is pretty close and so the CL vs. NON-CL differences will be similar in both periods.

Ortiz batted .294 in the non-James clutch situations. If the dropoff is .012, using the Palmer technique, I got a Z-score of 1.59, well below the 2.0 level needed for significance. But as I said above, I don't know the normal dropoff. To get the Z-score up to 2.0, the dropoff would have to be .022. That seems like a lot since it is nearly twice the CL dropoff. Can the James clutch situation be that much harder than the CL? It is possible that CL has gotten tougher since 2000 because so many more relievers are being used. But the dropoff has to be pretty big to make Ortiz clutch in AVG. Also, if Ortiz had batted .294 in the James clutch situations, he would have had 11 fewer hits than he actually did. That means 11 over 6 years, or just two clutch hits a year (if I put in a dropoff of .012, it would be 16 clutch hits, still under 3 a year).
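
Here is a minimal sketch of how a Z-score like this can be computed. It assumes an unpooled standard error for the difference between two proportions, which may not be exactly the Palmer technique, and the function name z_score is made up for illustration. The inputs are the figures in this post: .322 in 394 clutch ABs, .294 in 2,756 other ABs, and an assumed normal dropoff.

```python
import math

def z_score(clutch_rate, clutch_n, other_rate, other_n, dropoff):
    """Hypothetical helper: how many standard errors the clutch rate sits
    above what is expected after subtracting the normal dropoff."""
    expected = other_rate - dropoff
    # Unpooled standard error of the difference between two proportions
    # (an assumption; the exact Palmer formula may differ).
    se = math.sqrt(clutch_rate * (1 - clutch_rate) / clutch_n
                   + other_rate * (1 - other_rate) / other_n)
    return (clutch_rate - expected) / se

# Ortiz's AVG with the assumed .012 dropoff.
z_avg = z_score(0.322, 394, 0.294, 2756, 0.012)
print(round(z_avg, 2))

# Same comparison with a .022 dropoff.
z_big_dropoff = z_score(0.322, 394, 0.294, 2756, 0.022)
print(round(z_big_dropoff, 2))
```

Under these assumptions the first value comes out near 1.59 and the second close to 2.0, in line with the numbers above. The same function can be applied to other rate stats by swapping in the rates and dropoff.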

I did the same thing for HR frequency (HR%), extra-base hit frequency (XB%) and OBP. For HR%, his normal was 7.0% while it was 8.9% in clutch situations, about a 27% increase (8.9/7 = 1.27). I had a normal dropoff in HR% of .003575, so Ortiz would be expected to have a 6.64% rate (.07 - .003575). He gets a Z-score of 1.48. That .003575 is how much lower the average player's HR% is in CL situations than in other situations from 1991-2000. That dropoff might be different now. But how much bigger would it have to be to get Ortiz to a Z-score of 2.0 and be significant? The normal dropoff in HR% in the James clutch situation would need to be .0115, like falling from 3.0% to 1.85%. That is pretty severe and is 3 times what I had for the CL dropoff.

Moving to XB%. He had a rate of 14.8% under normal circumstances while he had a 17.5% rate in the James clutch situations, for an 18% higher rate. The Z-score was 1.96 using the normal dropoff of about .013. So this is very close to being significant. His extra-base hit performance may truly be clutch.

Now to OBP. Just using hits and walks (James does not show HBP, SFs, etc. in Ortiz's clutch stats), Ortiz had a normal OBP of .392 and .417 in the clutch. The Z-score was 1.01. Not even close to significance (my data had no dropoff in OBP in CL situations but if IBBs were taken out, it drops off .012). But even so, it would take a normal dropoff of .024 to get Ortiz up to a Z-score of 2.0. And James does not list the IBBs in his clutch data.

So, in general, Ortiz's performance in the clutch is very good but not significant. I also find it interesting that about 12.5% of his PAs came in the clutch. In my 1991-2000 data, I had the average guy getting about 15% of his PAs in the clutch. So those two are close and I think using the CL differences is reasonable as a benchmark to calculate Z-scores. Bottom line, how surprising is it for a guy who normally bats .294 in 2,756 ABs to hit .322 in some randomly selected 394 ABs? Not very, even if you assume the average player hits a lot lower in those ABs.
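
That closing question can also be framed as a binomial tail probability. The sketch below makes a simplifying assumption: each AB is an independent trial with a .294 chance of a hit, with no clutch dropoff at all. It then asks how often 394 such ABs would produce at least 127 hits, which works out to about .322. The function name prob_at_least is made up for illustration.

```python
from math import comb

def prob_at_least(hits, at_bats, p):
    """P(X >= hits) when X is Binomial(at_bats, p)."""
    return sum(comb(at_bats, k) * p**k * (1 - p)**(at_bats - k)
               for k in range(hits, at_bats + 1))

clutch_abs = 394
clutch_hits = round(0.322 * clutch_abs)  # 127 hits

# Chance that a true .294 hitter gets 127 or more hits in 394 ABs.
tail = prob_at_least(clutch_hits, clutch_abs, 0.294)
print(f"{tail:.3f}")
```

Under these assumptions the probability should land somewhere around one in ten, which fits the "not very" answer. Building in a normal dropoff would lower that probability, since the expected AVG would then be below .294.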

Also, some of Ortiz's hits may have ended games. In some of those cases, had he made an out, the next batter (or the batter after that) might have won the game for the Red Sox. Or even if he did not end the game, taking his hit away might still result in a Red Sox win. I once tried to calculate how many games clutch hitters win, and very few clutch hitters add a lot of wins. See How Many Games Do Clutch Hitters Really Win?.