Saturday, March 27, 2010

The Best 5-Year Hitting Performances By Age

The table below shows the top offensive winning percentages (OWP) for each 5-year age period. The plate appearance (PA) minimum was 2000. The "1 minus 2" column shows how far ahead of the next highest player the leader was. For example, Ted Williams, from ages 20-24, was .001 higher than Ty Cobb.

Now Ted Williams actually did not play at age 25 due to military service during WWII. Setting the PA limit at 2000 let some players like that in, guys who actually did not play all 5 years or only small parts of some other years. So the next table uses a 2750 PA limit.

Interesting that Bonds has the 2 highest 5-year periods but they are at very old ages, 34-38 & 35-39 (and he has the biggest edge over number 2). But one thing that makes his OWPs so high is that his OBPs were so high because he got walked so much. So those OWPs might not mean what they normally mean. So I also looked at the best 5-year hitting performances in slugging percentage (SLG) relative to the league average (RATE). Cobb's 160 from ages 20-24 means that his SLG was 60% higher than the league average (.527/.328 = 1.60). Here are the leaders.

Now the leaders with a 2750 PA minimum.

Bonds is no longer number 1. But the only guy to do better was Ruth from 24-28 & 25-29. The next highest RATE for someone besides Ruth or Bonds is Hornsby at 166 from ages 21-25. Then we have Ted Williams, whose best is from 27-31 at 164. Then Cobb from 22-26, also at 164.

The table below shows the top 30 seasons in relative SLG for players with 2750+ PAs. As you can see, except for Bonds and McGwire, high slugging is a young man's game.

Sunday, March 21, 2010

Ted Williams' Amazing Improvement In 1941

I had a post about this a couple of weeks ago called Ted Williams 1941: May 16th, .333; June 6, .436. This post will go into more detail. It looks like his improvement at the ages of 22-23 (1941-42) might be much greater than anyone else's in history.

The table below shows that he had the biggest gain in offensive winning percentage (OWP) from ages 21-22 (he was 22 in 1941 when he batted .406). OWP is the Bill James stat that says if all 9 hitters were identical, what would the team's winning percentage be if it gave up an average number of runs. The lists I show below are from the Lee Sinins Complete Baseball Encyclopedia, so the OWP they are based on is park adjusted.

But OWP cannot go above 1.000. So Williams could only increase by .236. Other players had a lot more room to improve. So I ranked everyone by what percentage of the possible increase they attained. Williams jumped from .764 to .908, a jump of .144. That is 61% of the possible .236 (shown as .610 in the table). This is by far the greatest increase (I only looked at hitters who had at least 400 plate appearances (PAs) at each age).
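The "share of the possible increase" idea can be written as a small function. This is only a sketch of the arithmetic in the paragraph above; the function name and the `ceiling` parameter are my own labels, not anything from the encyclopedia.

```python
def share_of_possible_gain(owp_before, owp_after, ceiling=1.0):
    """Fraction of the room left below the ceiling that the player gained.

    Hypothetical helper: the name and the ceiling argument are illustrative.
    """
    room = ceiling - owp_before
    if room <= 0:
        raise ValueError("player is already at or above the ceiling")
    return (owp_after - owp_before) / room


# Williams went from .764 at age 21 to .908 at age 22.
williams = share_of_possible_gain(0.764, 0.908)
print(f"{williams:.3f}")  # 0.610
```

A player starting from .500 who gained the same .144 would score only .288, which is why the ranking rewards gains made from an already high base.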

Here are the leaders in the simple gain in OWP.

I also did something similar for the gain in OPS, or on-base percentage (OBP) plus slugging percentage (SLG). But I looked for the biggest gainers relative to the league average, assuming a league average OPS of .750. For example, if a player had a 1.000 OPS when the league average was .800, his relative OPS was 1.25 (1/.8). Multiplying that by .750 leaves .938. With this adjustment having been done, Williams has by far the biggest gain from 21 to 22. The next table shows this.
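Here is a minimal sketch of that league adjustment, using the example numbers from the paragraph above. The function name and the `baseline` parameter are hypothetical labels of mine.

```python
def normalize_ops(ops, league_ops, baseline=0.750):
    """Rescale a player's OPS to a league where the average OPS is `baseline`.

    Hypothetical helper: first take OPS relative to the league,
    then multiply by the assumed common league average.
    """
    relative = ops / league_ops
    return relative * baseline


# The example from the text: 1.000 OPS in a league that averaged .800.
adjusted = normalize_ops(1.000, 0.800)
print(round(adjusted, 4))  # 0.9375
```

The .9375 is what gets rounded to .938. Doing this for both ages and subtracting gives the gain in adjusted OPS.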

The next table shows how he had the biggest improvement in OWP (relative to what was possible, like the first table) for all players who had 800+ PAs at both ages 20-21 & 22-23.

The next table shows the leaders simply based on the absolute gain in OWP.

Now the gain in relative OPS from ages 20-21 to 22-23.

Saturday, March 13, 2010

Is Pedro Feliz A "Good RBI Man?"

Joe Posnanski had a blog entry called Pedro Feliz, Houston. It raises the question of whether or not Feliz is a good RBI man, in the larger context of the "attribution problem" that came up in a Bill James article: to whom do we attribute things like wins and RBIs when several players have a hand in making them happen?

One thing we might expect of a guy if he is a good RBI man is that he hits well with runners on base. The table below shows these stats for Feliz over his career and for all of MLB over the years 2004-2009, which make up the bulk of Feliz's career. Data is from Retrosheet.

I think the two most important stats to look at in this case are batting average with runners in scoring position (RISP) and slugging percentage with runners on base (ROB). Feliz actually is below average in RISP AVG. So that argues against him being a good RBI man. His ROB SLG is above average, but only .011 better. Over the course of 300 ROB ABs in a season, that is only 3.3 extra total bases. Maybe one more triple than average, or an extra double and single.

Another way to look at it is that Feliz's isolated power (ISO, SLG - AVG) is .177 with ROB while the league average is .156. So he is .021 better. Times 300 ROB ABs in a season, we get 6.3 more extra bases. So maybe he gets a double in six cases when the average guy gets a single. Maybe half the time a guy scores from first in those cases (while he would not if it were just a single). So that is an extra 3 runs a year.
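The back-of-the-envelope numbers above can be reproduced in a few lines. This is just a sketch: the 300 ROB at-bats and the 50% chance of scoring from first on a double are the rough assumptions from the text, and the function and variable names are mine.

```python
ROB_AB_PER_SEASON = 300  # assumed at-bats with runners on base in a season


def extra_bases(rate_edge, at_bats=ROB_AB_PER_SEASON):
    """Extra bases per season from a per-at-bat edge over the league."""
    return rate_edge * at_bats


slg_extra = extra_bases(0.011)          # ROB SLG edge over the league
iso_extra = extra_bases(0.177 - 0.156)  # ROB ISO edge over the league

# Treat the ISO edge as about six doubles that would have been singles,
# and assume a runner on first scores on half of those doubles.
extra_doubles = round(iso_extra)
extra_runs = extra_doubles * 0.5

print(round(slg_extra, 1), round(iso_extra, 1), extra_runs)  # 3.3 6.3 3.0
```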

But don't forget, he gets fewer hits with RISP, so it is less than 3 runs (and this all assumes that he actually has this skill with ROB; the table shows that his SLG goes up more than normal). Players usually, over a long career, hit just about the same in all situations. None of this makes up for a very poor career OBP of .293.

His career RBI-to-GIDP ratio is 5.03 (558/111). The average for all right-handed batters with 2500+ PAs from 2000-2009 is 5.54. Feliz ranked 84th out of 139 players. So in terms of GIDPs, his RBIs are more costly than average. Data compiled from the Lee Sinins Complete Baseball Encyclopedia.

Thursday, March 11, 2010

A Note On Willie Davis And Nomar Garciaparra

Beyond The Boxscore has a good article on Davis called Willie Davis, underrated ballplayer (1940-2010). He currently ranks 124th in WAR all-time with 57.1 at Sean Smith's site.

Many are mentioning how fast he was. I have a theory that fast players (or at least guys who run the bases well) have a high triple-to-double ratio. Some fast players might not hit that many triples because they just don't hit the ball hard enough or far enough. And some guys who hit lots of triples might not be that fast. They might just hit the ball well.

But if a player has a lot of triples relative to doubles, then it means that he is able to turn long hits into triples. You have to be fast to do that. If you get thrown out trying to stretch a double into a triple, you get credit for a double.

A few years ago, I did a study where I found the players with the best triple-to-double ratios of all-time (it actually involved standard deviations). It was The Fastest Players Since 1900 According to the Triple-to-Double Ratio. Davis ranked 65th out of 856 players. That puts him in the top 8%. His career ratio of .349 (138/395) was almost double the NL ratio of .189 from 1962-1976 (the bulk of his career).

But he may have been hurt by his home parks in this measure (mostly Dodger Stadium). His career ratio at home was .267 (48/180). On the road, it was .419 (90/215). My guess is that a fair home park would bump him up in the rankings quite a bit. See The Batting Splits for Willie Davis at Retrosheet. His 138 triples is 66th all-time. He finished in the top ten 10 times, the top five 6 times and led the league twice.
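The home/road split is easy to check with the triples and doubles quoted above. A quick sketch (the function name is my own):

```python
def triple_double_ratio(triples, doubles):
    """Triples divided by doubles (hypothetical helper name)."""
    return triples / doubles


home = (48, 180)   # (triples, doubles) at home
road = (90, 215)   # (triples, doubles) on the road
career = (home[0] + road[0], home[1] + road[1])  # (138, 395)

for label, (t, d) in [("home", home), ("road", road), ("career", career)]:
    print(label, f"{triple_double_ratio(t, d):.3f}")
# home 0.267
# road 0.419
# career 0.349
```

The home and road counts add up to the career 138 triples and 395 doubles, so the three ratios are consistent with each other.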

Beyond the Boxscore had a good article about Garciaparra and how he had a great peak value called How Good Was Nomah's Prime?. It looks like it was very good.

Garciaparra's best two years in offensive winning percentage were .767 (2000) and .743 (1999). There have only been 52 seasons by shortstops with a .700+ OWP (400+ PAs qualifier). Both of Garciaparra's seasons are in the top 25. Honus Wagner has the top 2 and 7 of the top 8. Data from the Lee Sinins Complete Baseball Encyclopedia.

Saturday, March 6, 2010

Ted Williams 1941: May 16th, .333; June 6, .436

Yes, he raised his average 100 points in just 21 days. As you can imagine, it took a lot of work to piece all this data together. But I didn't do it, the fantastic people at Retrosheet did it.

Williams started the year going 21 for 63. As of May 16th his SLG was .571 and his OBP was .425. He actually only pinch hit in each of his first 5 games. So I think no one could have expected what he would do over the next 20 games (let alone the whole season, as I explain below), even though he had just come off two excellent seasons (his first two, at ages 20-21).

From May 17th through June 6th, he went 40 for 77 with 17 walks, good for a .519 AVG. His SLG was .857 along with a .600 OBP. The .436 was his peak AVG for the year (not counting when it was .500 after his first 4 ABs). But don't be too impressed because Williams actually struck out once in this stretch. And he played 12 of the 20 games in Fenway, where he had a career AVG of .361 while it was .328 on the road (but that is not the reason for the great average since he batted .531 in the 8 road games).
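The three averages in this post fit together, which is easy to verify from the hit and at-bat totals quoted above. A small sketch (variable and function names are mine):

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats."""
    return hits / at_bats


through_may_16 = (21, 63)     # (hits, at-bats) through May 16th
may_17_to_june_6 = (40, 77)   # the hot stretch
through_june_6 = (
    through_may_16[0] + may_17_to_june_6[0],
    through_may_16[1] + may_17_to_june_6[1],
)  # (61, 140)

for label, (h, ab) in [
    ("through May 16", through_may_16),
    ("May 17 to June 6", may_17_to_june_6),
    ("through June 6", through_june_6),
]:
    print(label, f"{batting_average(h, ab):.3f}")
# through May 16 0.333
# May 17 to June 6 0.519
# through June 6 0.436
```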

The key Retrosheet link is The 1941 BOS A Regular Season Batting Log for Ted Williams.

Williams also saw a big gain in his offensive winning percentage (OWP) over what he did in his first two years. From ages 20-21, it was .749 (.736 at 20 & .764 at 21). At age 22, it was .908. So it went up .159. I looked at all players who had 400+ PAs at age 22 and 800+ PAs from 20-21 and the top gainers are below. Williams had the 8th biggest gain out of the 78 players who fit the criteria (data from the Lee Sinins Complete Baseball Encyclopedia).

But OWP cannot be higher than 1.000. Williams could only go up by .251. So he gained 63.3% of what was possible (.159/.251 = .633). The next highest in these terms is Powell with 50.1%. So this gives Williams a huge edge. The next table shows the leaders by gain in OPS. Baseball had not seen this kind of age 22 improvement in nearly 50 years.

In the NL in 1894, the year for Kelley, the league OPS jumped .078 over the previous year. For 1893, Davis's year, it jumped .092. In the AL in 1941, OPS fell .020 from the previous year (and fell .009 in 1940). So Williams did not even have the help of a general offensive surge.

If we use relative OPS, we can see how incredible his 1941 season was. His OPS relative to the league average from age 20 to 21 was 133 (meaning that his OPS divided by the league average and then times 100 = 133). At age 22 it was 170. I then adjusted every player's OPS to a league average of .750. That gave Williams an OPS of .9975 at ages 20-21. At age 22, he gets a 1.275. So he gained .2775 (round that to .278). Here are the top 10:

Ted Williams 0.278
Boog Powell 0.263
Jimmy Sheckard 0.195
Joe Kelley 0.173
George Davis 0.165
Vic Saier 0.158
Rickey Henderson 0.135
Travis Jackson 0.135
Sherry Magee 0.128
Lou Bierbauer 0.113

Again, until Williams came along, no one had improved quite like this at age 22. No one could have seen his 1941 season or his 100 point gain in batting average in 3 weeks coming.

Thursday, March 4, 2010

Bill Gilbert's Annual Arbitration Wrap-up

Bill is the president of the Rogers Hornsby Chapter of SABR. He has been involved with actual arbitration cases in the past. His co-author this year is Tim Darley. Here is the link: Arbitration Wrap-up – 2010 By Bill Gilbert and Tim Darley. It is a Word file. Here is a link to the main site of the chapter: Rogers Hornsby Chapter.

Tuesday, March 2, 2010

Canada Wins The Olympics! (based on the market value of the medals)

According to a Bloomberg article:

"First-place winners get gold-plated medals that are 92.5 percent silver. The second-place prizes are also 92.5 percent silver, while the third-place bronze medals are mostly copper. The Olympic medals weigh about 500 to 556 grams, meaning the metal in the gold awards are valued at about $537 based on yesterday’s closing prices. The silvers are worth about $300 and the bronzes about $3.40."

See Olympic Champs Wear Old Trinitrons as Teck Turns Junk to Medals. Applying these values to the medals won by all the countries, Canada ranks first. But, if you reverse the outcome of the gold-medal game in hockey, each country changes by $237, the difference between gold and silver. Then the US has 9614 and Canada has 9398.
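Here is a rough sketch of that calculation. The metal values come from the Bloomberg quote. The medal counts are not given in this post, so the gold/silver/bronze counts below are my assumption of the final 2010 medal table for the two countries, and the function names are just illustrative.

```python
# Metal value per medal from the Bloomberg quote (US dollars).
MEDAL_VALUE = {"gold": 537.0, "silver": 300.0, "bronze": 3.40}

# Assumption: final 2010 medal counts, not stated in the post itself.
COUNTS = {
    "Canada": {"gold": 14, "silver": 7, "bronze": 5},
    "US": {"gold": 9, "silver": 15, "bronze": 13},
}


def metal_value(counts):
    """Total metal value of a country's medals."""
    return sum(MEDAL_VALUE[medal] * n for medal, n in counts.items())


def shift_gold(counts, delta):
    """delta=+1 turns one silver into a gold; delta=-1 does the reverse."""
    shifted = dict(counts)
    shifted["gold"] += delta
    shifted["silver"] -= delta
    return shifted


# Reverse the hockey final: Canada loses a gold, the US gains one.
reversed_final = {
    "Canada": metal_value(shift_gold(COUNTS["Canada"], -1)),
    "US": metal_value(shift_gold(COUNTS["US"], +1)),
}
print({country: round(v) for country, v in reversed_final.items()})
# {'Canada': 9398, 'US': 9614}
```

Each shift moves a country's total by 537 - 300 = 237, which is the $237 mentioned above.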

If you count every medal won by a country, including one for each player on a team, like every hockey player, the next table has the results. Carl Bialik of the WSJ forwarded this NBC site which had the data. The table below is different from the one I posted yesterday. For some reason when I copied and pasted the data from the NBC site into Excel, more information got carried over than appears on the screen. Some data was double counted. So here is the revised table.

Monday, March 1, 2010

Does Consistency Help Teams Win More Games?

This issue came up over at "The Book" blog. See Consistency is better than inconsistency?. Here is something I did several years ago. I think the results are mixed and not conclusive in any way.

I tried to estimate the value of consistency once. I ran a regression with team winning pct as the dependent variable and runs per game (RG), runs allowed per game (RAG), the standard deviation of runs per game (STDRG) and the standard deviation of runs allowed per game (STDRAG) as the independent variables. I looked at all teams from 1996. First, the results just using runs and runs allowed:

Pct = .502 + .9*RG - .91*RAG

Then I put in the standard deviations:

Pct = .072*RG - .099*RAG + .04*STDRG + .016*STDRAG

The standard error fell only .0013. That works out to about .21 per 162 games. So bringing in consistency did not help predict winning pct very much. I have no explanation for why the coefficient on runs per game falls so much. The STDRG coefficient is positive, meaning that the less consistent you are, the higher your winning pct. The difference between the highest and lowest in STDRG was about 1.36. That times .04 is about .054. Times 162 games is 8.75 wins. The least consistent team would win a lot more games. The t-value was 1.9. But again, the sign is wrong. The t-value on STDRAG was .72. The difference between the highest and lowest was 1.42. That works out to 3.62 wins per season. I also looked at all teams from 1967-68. The regression equations were:

Pct = .495 + .113*RG - .111*RAG


Pct = .510 + .114*RG - .11*RAG - .004*STDRG - .007*STDRAG

The standard error actually got higher with the 2nd regression. This time the sign is right for STDRG but not for STDRAG. The difference in STDRG between the highest and lowest was 1.41. For a whole season, that works out to about .84 wins. So the most consistent team in scoring won .84 more games than the least consistent. The t-values were -.21 and -.37 for the STDs. For STDRAG, the difference between the highest and lowest was 1.06 which works out to 1.23 wins. The most consistent team in runs allowed would win 1.23 more games than the least consistent.
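The "wins per season" figures in this post all come from the same conversion: take the spread between the highest and lowest team, multiply by the regression coefficient to get a change in winning pct, then multiply by 162 games. Here is a minimal sketch using the 1996 STDRG numbers. The function name is mine, and rounding the intermediate step to three decimals is my assumption about how the 8.75 was reached (without that rounding the product is closer to 8.81).

```python
GAMES = 162


def wins_from_spread(spread, coefficient, games=GAMES, round_pct_to=3):
    """Convert a spread in a regressor into wins over a season.

    Hypothetical helper. The winning-pct change is rounded to
    `round_pct_to` decimals before scaling by games (an assumption).
    """
    pct_change = round(spread * coefficient, round_pct_to)
    return pct_change * games


# 1996: highest-minus-lowest STDRG of 1.36, coefficient of .04.
stdrg_wins_1996 = wins_from_spread(1.36, 0.04)
print(round(stdrg_wins_1996, 2))  # 8.75

# The drop in the standard error, expressed in wins per season.
std_error_wins = 0.0013 * GAMES
print(round(std_error_wins, 2))  # 0.21
```

The other win figures in the post were presumably computed from unrounded coefficients, so plugging the printed coefficients into this function will not reproduce them exactly.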