The stat I used for this was "offensive winning percentage" or OWP. It is a Bill James stat that says what a team's winning percentage would be if it had a lineup of 9 identical players who all hit alike and they gave up an average number of runs. Since I got the data from the Lee Sinins Complete Baseball Encyclopedia, it is park adjusted. I included all players who had at least 300 plate appearances in both 2007 and 2008. The top 25 are below:
I was surprised to see so many players aged 30 or more (11), plus 5 more aged 29. I thought that it would be younger players who improved. Maybe the older guys fluctuate a lot more, so bigger improvements are possible. But the leader and the #6 guy were both 36. One guy was even 38. The number of players aged 33 or more equalled the number aged 24 or less. The next table shows how much these guys improved in more conventional stats.
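For readers who want to see the OWP idea in numbers, here is a rough sketch. It assumes the familiar Pythagorean form with an exponent of 2, where the lineup of identical hitters scores the player's runs per game and allows the league average. The function name and the sample inputs are made up for illustration, and the encyclopedia's park-adjusted version may be computed differently.

```python
def offensive_winning_pct(player_runs_per_game, league_runs_per_game, exponent=2.0):
    """Pythagorean estimate of a team's winning percentage.

    Assumption: the team scores what nine copies of the hitter would
    produce per game and allows the league-average number of runs.
    """
    scored = player_runs_per_game ** exponent
    allowed = league_runs_per_game ** exponent
    return scored / (scored + allowed)


# Hypothetical hitter worth 6.0 runs per game in a 4.5 runs-per-game league.
print(round(offensive_winning_pct(6.0, 4.5), 3))  # 0.64
```

A hitter who produces exactly the league-average number of runs comes out at .500 under this form.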
Sunday, January 25, 2009
Sunday, January 18, 2009
Jim Rice vs. Jose Cruz
I thought this might make an interesting comparison since I belong to a chapter of SABR in Texas (the Austin one or Hornsby chapter). The point is not that Rice does or does not belong in the Hall of Fame, just that Cruz compares so well. Cruz got 2 votes in 1994. Those are the only votes he has ever gotten.
Career PA
Rice-9058
Cruz-8931
Career Offensive Winning Percentage
Rice-.593
Cruz-.611
Highest 3 year OWP
Rice-.698 (1977-79)
Cruz-.687 (1983-85)
Full seasons with .700 OWP or better
Rice-2
Cruz-3
Full season means 400+ PAs.
Full seasons with .600 OWP or better
Rice-5
Cruz-8
Cruz had an additional season with 346 PA
Career Win Shares per 648 PA
Rice-20.17
Cruz-22.71
Since it takes about 3 WS to make 1 win in Bill James' system, Cruz was worth .85 more wins per season. See
http://us.share.geocities.com/cyrilmorong@sbcglobal.net/WSperPA.htm
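The per-season figures and the .85 can be reproduced from the career totals listed in this post (282 and 313 Win Shares, 9058 and 8931 PA). This is just the arithmetic written out; the function and constant names are mine.

```python
PA_PER_SEASON = 648   # season length used for the per-season rate
WS_PER_WIN = 3        # about 3 Win Shares per win in Bill James' system


def ws_per_season(career_ws, career_pa, pa_per_season=PA_PER_SEASON):
    """Career Win Shares scaled to a fixed number of plate appearances."""
    return career_ws / career_pa * pa_per_season


rice = ws_per_season(282, 9058)
cruz = ws_per_season(313, 8931)
extra_wins = (cruz - rice) / WS_PER_WIN

print(round(rice, 2), round(cruz, 2), round(extra_wins, 2))  # 20.17 22.71 0.85
```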
Career Win Shares
Rice-282
Cruz-313
Seasons with 20+ WS (all-star type seasons)
Rice-7
Cruz-8
Seasons with 30+ WS (MVP type seasons)
Rice-1
Cruz-1
Best 3 Consecutive years in WS
Rice-90 (1977-79, 26-36-28)
Cruz-80 (1983-85, 30-29-21)
I have also attempted to rank players by their value above replacement. I did two lists: one with a TPR or BFW (from Pete Palmer) of -2 per 700 PAs as replacement level and one with -3. I divided each guy's career PAs by 700. Then I multiplied that times 2 or 3. That result got added to his career TPR to get career value over replacement. Here is the all-time ranking through 2004.
http://www.geocities.com/cyrilmorong@sbcglobal.net/REP.htm
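The steps just described can be sketched as a small function. The career TPR and PA in the example are hypothetical round numbers, not any real player's, and the function name is mine.

```python
def value_above_replacement(career_tpr, career_pa, replacement_per_700=-2.0):
    """Career TPR plus the credit for playing above replacement level.

    replacement_per_700 is the assumed TPR of a replacement player
    per 700 plate appearances (-2 or -3 in the two lists).
    """
    seasons = career_pa / 700
    return career_tpr - replacement_per_700 * seasons


# Hypothetical player: 10.0 career TPR in 7000 PA (10 "seasons" of 700 PA).
print(value_above_replacement(10.0, 7000, -2.0))  # 30.0
print(value_above_replacement(10.0, 7000, -3.0))  # 40.0
```

One consequence of the method is that the gap between a player's two VAR figures is just his career PA divided by 700. For Cruz that is 8931/700, or about 12.76, which matches 59.48 - 46.72.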
For VAR using -2 TPR per season
Rice-44.8
Cruz-46.72
For VAR using -3 TPR per season
Rice-57.42
Cruz-59.48
Best 3 Consecutive years in TPR or BFW
Rice-10.3 (1977-79, 3.0-4.2-3.1)
Cruz-7.7 (1983-85, 2.8-3.6-1.3)
MVP award shares
Rice-3.15 (tied for 29th, 6 top 5 finishes)
Cruz-.96 (248th, Al Oliver is higher with 1.25, only 1 top 5 finish)
Monday, January 12, 2009
Positional Hitting Over Time
There are four graphs below. Each one shows the slugging percentage (SLG) divided by the league average for all 8 everyday fielding positions. Each data point is a five-year average. The first two graphs are the AL and the next two are the NL.
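Each data point in the graphs can be thought of as built in two steps: divide a position's SLG by the league SLG, then average over five years. Here is a minimal sketch of that. It assumes non-overlapping five-year blocks, which is only a guess about how the averages were grouped (a rolling average is another possibility), and the yearly numbers are made up.

```python
def relative_slg(position_slg, league_slg):
    """Position SLG as a fraction of the league SLG (1.0 = league average)."""
    return position_slg / league_slg


def block_averages(values, block=5):
    """Average consecutive, non-overlapping blocks; a short tail is dropped."""
    return [
        sum(values[i:i + block]) / block
        for i in range(0, len(values) - block + 1, block)
    ]


# Made-up shortstop and league SLG for ten seasons.
ss_slg = [0.340, 0.345, 0.350, 0.355, 0.360, 0.370, 0.375, 0.380, 0.385, 0.390]
lg_slg = [0.400] * 10

yearly = [relative_slg(p, l) for p, l in zip(ss_slg, lg_slg)]
print([round(x, 3) for x in block_averages(yearly)])
```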
Some interesting trends:
-Shortstops have been rising quite a bit in both leagues since the 1970s.
-2B men started declining in the AL in the 1940s, then started rising again in the 1960s. But even now they have not reached their earlier peak. They started declining in the 1920s in the NL but then started back up in the late 1950s.
-3B men started to rise in the AL in the 1920s and in the 1930s in the NL. But they have tailed off in the AL since 1980.
-1B men had a big spike in the AL in the 1920s and 1930s. In both leagues, in general, they have been high but have fluctuated.
-CFers seem to have been in decline since the 1970s in both leagues.
-LFers seem to have been in decline in the AL for some time but it does not seem that way in the NL.
Wednesday, January 7, 2009
Which Players Had The Most Surprising Walk Rates
A few years ago I noticed that Miller Huggins walked quite a bit yet did not seem to be much of a hitter. From 1904-1916 he led the league 4 times in walks and was in the top ten 7 other times. So what kind of fearsome hitter was he that he got walked so much? His career batting average (AVG) was .265, not bad in the dead ball era. The league average was .260 during his career. His career slugging percentage (SLG) was .314 while the league average was .343.
But a better measure of power is isolated power (ISO), or SLG - AVG. It tells us extra bases per AB (after all, if a guy can get a single every time up, his SLG would be 1.000 yet he has no power). Huggins' ISO was .049 while the league average was .083. So I was very impressed that he knew the strike zone well enough and had so much discipline that he could walk so frequently yet not have much hitting ability in general.
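The ISO numbers follow directly from the AVG and SLG figures quoted for Huggins and his league. Here is the arithmetic written out, along with the all-singles case; the function name is mine.

```python
def isolated_power(slg, avg):
    """ISO = SLG - AVG, i.e. extra bases per at-bat."""
    return slg - avg


huggins_iso = isolated_power(0.314, 0.265)
league_iso = isolated_power(0.343, 0.260)

print(round(huggins_iso, 3))               # 0.049
print(round(league_iso, 3))                # 0.083
print(round(huggins_iso / league_iso, 2))  # 0.59, about 59% of league average

# A hitter who singles every time up: AVG 1.000, SLG 1.000, no power at all.
print(isolated_power(1.000, 1.000))        # 0.0
```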
I thought it would be interesting to come up with some kind of measure of this ability. At first I used walk rate divided by ISO. But this turned out to be unfair to many sluggers since ISO can go very high (theoretically as high as 3.000). Their walk rate would end up being divided by a very large number, so their walk rate/ISO would be low.
So I ran a regression. A player's walk rate (relative to the league average) was the dependent variable and his ISO (relative to the league average) was the independent variable. The idea is that power hitters would get walked more than other hitters. I used all players with 5000+ career plate appearances (885 players). The data comes from the Lee Sinins Complete Baseball Encyclopedia. Here is the regression equation:
Walk Rate = 61.5 + .428*ISO
(In the Sinins encyclopedia, a walk rate of 150, for example, means that the player walked 50% more than average.) The r-squared was .159, meaning that only 15.9% of the variation in hitters' walk rates is explained by variation in ISO. But the t-value for the coefficient on ISO was over 12, so it was statistically very significant.
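A low r-squared and a large t-value can sit together when the sample is big. In a regression with one independent variable, the size of the slope's t-value is tied to r-squared and the number of observations by t = sqrt(r² (n - 2) / (1 - r²)). Plugging in the figures above (r-squared of .159, 885 players) is a quick consistency check; the function name is mine.

```python
import math


def t_from_r_squared(r_squared, n):
    """Absolute t-value of the slope in a one-variable least-squares fit."""
    return math.sqrt(r_squared * (n - 2) / (1 - r_squared))


t = t_from_r_squared(0.159, 885)
print(round(t, 1))  # 12.9
```

That agrees with the t-value being "over 12."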
Then each player's walk rate was predicted using the equation and the difference between their actual rate and the predicted rate was found. Then all the players were ranked from highest to lowest by this difference. The table below shows the top 25.
So Roy Thomas did the best. His actual walk rate was 2.49 times the average but his ISO was only .55 or 55% of the average. The equation predicts that he would have a walk rate of 85.06 (or 85.06% of the league average). Since 249 - 85.06 = 163.94, his walk rate was that many points above expected and he had the highest difference. Miller Huggins does very well, coming in at number 7.
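The prediction-and-ranking step can be sketched with the published equation. With the rounded coefficients shown above, the prediction for Roy Thomas comes out to 85.04 rather than the 85.06 in the text, presumably because the original calculation used unrounded coefficients. The two other players below are hypothetical, included only to show the ranking.

```python
INTERCEPT = 61.5   # rounded coefficients from the regression equation
SLOPE = 0.428


def predicted_walk_rate(relative_iso):
    """Expected relative walk rate given relative ISO (100 = league average)."""
    return INTERCEPT + SLOPE * relative_iso


def walk_surplus(actual_walk_rate, relative_iso):
    """Actual minus predicted relative walk rate."""
    return actual_walk_rate - predicted_walk_rate(relative_iso)


# (relative walk rate, relative ISO); only Roy Thomas uses figures from the post.
players = {
    "Roy Thomas": (249, 55),
    "Hypothetical slugger": (120, 180),
    "Hypothetical free swinger": (60, 100),
}

ranked = sorted(players, key=lambda name: walk_surplus(*players[name]), reverse=True)
for name in ranked:
    print(name, round(walk_surplus(*players[name]), 2))
```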
The players who walked the least (based on the equation) are below:
I also did something similar using SLG. Here are the best players followed by the worst.