Friday, December 30, 2011

Who Had A Season With WAR ≥ 10 But Is Not In The Hall Of Fame?

The table below shows all the position players who had at least one season of WAR 10 or higher (from Baseball Reference).


Of all the players who are eligible, the only one not in the Hall is Norm Cash. Now it might seem like his 1961 season, when he batted .361, hit 41 HRs, had 132 RBIs, and put up a .487 OBP and a .662 SLG, was the flukiest season ever. He never batted .300 again (his last year was 1974), nor did he ever reach 40 HRs or 100 RBIs. He never reached a .400 OBP or a .600 SLG again (he reached .500 only 3 more times, and he was pretty much a full-time player until 1973).

But his 1961 season was not the flukiest. I have written about this before. See Which Players Had The Most Uncharacteristically Good Seasons? Cash had the 25th flukiest season.

It will be interesting to see what happens with the voting for some of these other guys. Bonds, Giambi, Sosa and AROD might be penalized by some voters for using PEDs. Pujols should get in easily. It is probably too early to tell for Kemp. But Beltre is interesting. His yearly performance has been inconsistent. But he now has 47.6 WAR and he will be 33 starting next year. He is 189th in career WAR among position players but if he could get up to about 62, he would be in the top 100. His last two years were 6.1 and 5.2, so another 14.4 seems possible. Being in the top 100 would make him a legitimate candidate.

Now a WAR of 10 gives an edge to guys who played after 1961, when the season went to 162 games (in the AL, 1962 in the NL). So I looked for guys who had 9.5 or more before then (covering about the 5% difference in games from a 154 game season). There were a total of 82 seasons of WAR ≥ 10 and another 21 ≥ 9.5 before the 162 game season came in. Of the guys in that group who are not in the Hall of Fame we have George Stone (9.8, 1906), Joe Jackson (9.5, 1912), and Al Rosen (9.7, 1953, the year he missed the triple crown by losing the batting title to Mickey Vernon, .337-.336).

Click here to see George Stone's stats at Baseball Reference. He did not have a long career, ending at 33 with 3600 PAs. His next highest WAR was 5.2. Click here to read his SABR bio by John McMurray. This passage discusses his decline:
"After his great initial success, Stone held out for $5,000 to start the 1907 campaign. In order to make sure that team owner Robert Hedges met his demands, Stone did not report to the team until right before the start of the season. The holdout, as one publication put it, "seems to have been the turning point of his career." On one level, "the papers aired the case and naturally by some Stone was censured for what was termed unreasonable demands." Moreover, "when he was finally granted the amount he asked, the fans figured that a player getting such big money should never fail to deliver the goods. Any time Stone failed, and unfortunately for him he had a rather tough year in 1908, he was roasted to a turn by the fans. Stone began to show signs of slowing up that year."

Stone's statistics fell off in both 1907 and 1908, though he was still an outstanding hitter. One account indicates that he contracted malaria in 1908, and Stone's production plummeted in 1909 when he suffered an injury to his ankle. That injury cost Stone his speed, which had enabled him to beat out many infield hits. He also had problems with his arm, and "any time a ball was hit into his territory the opposing base runners advanced almost at will. The worry over all these things caused Stone's batting to suffer and as a result the sensation of the American League of 1906 was a near joke in 1910." Stone never hit higher than .300 after 1907, and his average fell to .256 in 1910, his last season in the major leagues. Stone returned to the Milwaukee Brewers in 1911, batting .282, but injuries led him to retire from professional baseball just 12 games into the 1912 campaign."

Wednesday, December 28, 2011

Why Didn't The Writers Vote Johnny Mize Into The Hall Of Fame?

He got in, via the Veterans Committee, in 1981. So it might seem a little late and silly to complain about it.

But he never even got 50% of the vote from the writers (he topped out at 43.6% of the vote in 1971 and he got 41.3% in 1973, his last year of eligibility). If we went strictly by WAR, it seems like he should definitely be in. Even now, about 50 years from when he first became eligible, he is 55th in career WAR among position players with 70.2. He had 8 top 5 finishes and one first place. He was in the top 5 each year from 1937-40.

So he had very high career value and peak value. In Win Shares, he also had 8 top 5 finishes among position players, including 3 first-place finishes. He was 104th through 2001 in career Win Shares (338) including pitchers. He also missed 3 seasons due to WW II. Bill James ranked him as the 6th best 1B man in the 2nd Historical Abstract.

We certainly cannot fault the writers in the 1960s and 1970s for not being up on sabermetrics. He did seem to have very good conventional stats, though. He hit 359 HRs and easily would have made 400 if not for the war (he was 10th in career HRs through 1960, his first year of eligibility). He had a .312 career average. He led the NL in HRs 4 times, RBIs 3 times and AVG once. He was selected to 10 all-star teams. The writers seemed to like him when he played. He had 4 top 5 finishes in the MVP voting and even now he is 57th in MVP shares with 2.46 (of course, there was not much MVP voting before 1931, but that is still a pretty good rank). If a guy was the 57th best player since 1931, that would be a good case for being in.

So why didn't the writers vote him in, never reaching 50% of the vote? His bio at SABR by Jerry Grillo has one interesting quote: "Broeg and others have indicated that Mize’s defensive liabilities probably cost him..." Yet his career defensive WAR is a positive 2.6. He also finished in the top 3 in fielding Win Shares 7 times among NL 1B men (and 1st in 1948). Bill James gave him a B as a fielder. The SABR bio mentions that he got his nickname "The Big Cat" due to his fielding, scooping out bad throws.

The Baseball Reference Bullpen article on Mize says this:
"For a player with such notable sabermetric statistics, he was also quite late in being inducted into the Hall of Fame, finally being chosen by the Veteran's Committee 28 years after his retirement, in 1981. There are at least two possible explanations for this. One, during his playing years, he apparently did not enjoy particularly good relations with the baseball sportswriters, from whose ranks are chosen those members who vote on candidates for the Hall of Fame. Two, his power, his fine batting average, and his extremely good On-Base Percentage were not as evident to his contemporaries, who were more impressed by Ted Williams, Joe DiMaggio, and Stan Musial, as they are today in the light of sabermetric analysis.

Another couple reasons are quite powerful, too. First, his lifetime stats are not very impressive compared to most Hall of Famers - he barely had 2,000 hits, he had 359 home runs (currently # 65 on the all-time list as of 2006, one below Gary Gaetti), and he had 1,337 RBI (currently # 76 on the all-time list as of 2006, four below Gary Gaetti). Second, he played in his early years in a ballpark that favored hitters. So he was a top-notch player, but one that didn't put up numbers as large as most Hall of Famers. Since he missed three years to World War II, the Hall of Fame rightfully adjusted for the numbers."

I don't know anything about his relationship with the writers. But as I mentioned earlier, he did well in the MVP voting (a point Bill James has made about Ted Williams).

On the park effects, those are taken into account in WAR and Win Shares. Now it is possible that park effects are not always fair to individual hitters. It was only 250 feet down the line in the Polo Grounds. Yet the park factors for the Giants in the years Mize was on them are pretty neutral. Click here to see them at Baseball Reference. So maybe his hitting stats don't get adjusted downwards as much as they should.

Here are his AVG/OBP/SLG/OPS both home and road:

Home: .320/.406/.598/1.004
Road: .305/.389/.527/.916

So a big split, but not that bad. His road stats are very good. Again, the writers might not have had access to this, but it seems like there should have been a general sense that he hit well everywhere, given that these guys watched him play all the time.

In fact, if we only used his road OPS, it would be the 37th highest of all-time through 2009 (relative to the league average) of players with 5000+ PAs (using the Lee Sinins Complete Baseball Encyclopedia). His road OPS was about 25% higher than the league average. He is 9th among 1B men (guys who played at least 50% of their games there). One of the guys ahead of him is Todd Helton, who benefited even more from his home park.

In his first year of eligibility, 1960, he got only 16.7% of the vote. Click here to see the voting that year at Baseball Reference. Twelve guys got more votes than he did that year and he had more WAR than all of them. He beat 8 of them by 20 or more WAR. Edd Roush, Sam Rice and Eppa Rixey all got over 50% of the vote that year, a level Mize never achieved. None of them had even 52 WAR (Mize had 70.2). All but one of the 12 (Lazzeri) got in before Mize. Most were by the Veterans Committee. So they, too, did not give Mize the credit he deserved.

I think the writers, and to a lesser extent the Veterans Committee, did a poor job in evaluating Mize. I hope the writers have been, and are getting, better. But when I see the voting for guys like Raines and Bagwell, not to mention Lou Whitaker being gone after just one year on the ballot, I am not sure.

More discussion at Baseball Think Factory

Friday, December 23, 2011

Don Mattingly’s Peak vs. Fred McGriff’s

See HOF Story 2: The Holdovers by Joe Posnanski. (Hat Tip: The Book blog)

POZ wrote:
“Don Mattingly’s career was too short, but few would say McGriff was as good a player as Mattingly at their best. I wouldn’t.”

From 1984-86, Mattingly had an OPS+ of 158. From 1988-90, McGriff had an OPS+ of 159. McGriff’s 1988-91 OPS+ of 156 also beats Mattingly’s 1984-87 OPS+ of 155.

But in WAR, Mattingly is ahead over three years 19.6-16 and ahead over 4 years 25.3-20.6. Over the 4 years Mattingly had 0.7 more defensive WAR, so that is only a small part of the difference, meaning Mattingly’s offensive WAR advantage was 4.0. He had about 157 more PAs. It does not seem like that would give him an edge of 4.0. I can’t tell what accounts for it.

I also found Mattingly’s best 4 year period for Win Shares was 14 better than McGriff’s (122-108). Mattingly had 2.3 more fielding Win Shares, so most of the difference is due to hitting. Again, when their OPS+ is so close, this is surprising.

Wednesday, December 21, 2011

How Well Might Pujols Hit After Age 35?

The first thing I tried to do was create a set of comparables. So I found all the 1B men who had 4000+ PAs before the age of 32 since 1920 who also had a relative slugging percentage of 120 or higher (that means their SLG was 20% better than the league average-I used the Lee Sinins Complete Baseball Encyclopedia). These guys had to have played at least half their games at 1B. I also included any DHs that fit (that added in Frank Thomas). So I was trying to set up a group of power-hitting 1B men to use in comparison. I went up through 2009.

This made for a total of 35 players. Three are still active (Thome, Giambi and Helton). But, of the 32 who were not active, just 8 of them had 1000+ PAs after the age of 35. Here they are with their RCAA, or runs created above average (it is park adjusted):


If we assume a full season is 700 PAs, then none of these guys reached even five full seasons after the age of 35 and only two reached four full seasons (although you could argue that McCovey and McGriff are also at 4+ full seasons if you use 500+ PAs). But that means out of the 32 comparable players, only 4 could be said to have played at least 4 full seasons after the age of 35 and Pujols has a contract calling for 6 seasons after 35.

Now the best performer here seems to be Palmeiro. We don't know how much of his performance was due to drugs. Again using the 700 PA for a full season, he gets 4.43 seasons. If we assume a -2 wins below average for a replacement player, that gets him 8.86 WAR. Then if we assume about 10 runs per win, his RCAA gets us about 12 wins. Then we are at about 21 WAR for Palmeiro as the best performer among Pujols' comparables (of course, I am assuming no negative WAR from defense).

Of course, Pujols is better than the average of these guys. This group collectively had only about 38% as many PAs post-35 as pre-32 and their average RCAA per PA fell from about .069 to about .022 (I added 1000 PAs and an RCAA of 100 to Mize for ages 30 and 31, which were missed due to the War). That means that Pujols, if this group is an indicator, will get 2846 PAs after age 35 and get about .0338 RCAA per PA (he has averaged about .10 RCAA per PA so far in his career; I estimated his RCAA for the last two years since I don't have the latest Sinins data yet).

I assumed that Pujols' dropoff in RCAA per PA will be the same as this group's. So he would get about 96 total RCAA. So that is 9.6 wins. Then he gets about 4 full seasons so we add another 8 wins, again assuming -2 wins is replacement. So that would be 17.6 WAR for Pujols after age 35 (again assuming no negative defensive WAR). Remember that he has 6 contract years at $25 million a year. So the Angels will be paying $150 million for 17.6 WAR or $8.5 million per WAR. Maybe with inflation, a championship or two and extra ticket sales that might be worth it.
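The arithmetic in the last two paragraphs can be sketched in a few lines of Python. This is only an illustration of the steps described above, not the spreadsheet I actually used. The function name `post35_war` and the constant names are made up for this sketch; the inputs (2846 PAs, .0338 RCAA per PA, 700 PAs per full season, 10 runs per win, and 2 wins per season between average and replacement) are the ones from the text.

```python
# Rough re-creation of the post-35 WAR estimate described above.
# All names are illustrative; the input values come from the post.

RUNS_PER_WIN = 10          # runs-to-wins conversion assumed in the post
FULL_SEASON_PA = 700       # PAs treated as one full season
REPLACEMENT_GAP_WINS = 2   # wins per full season between average and replacement


def post35_war(pa, rcaa_per_pa):
    """Estimate WAR from PAs and runs created above average per PA."""
    wins_above_average = pa * rcaa_per_pa / RUNS_PER_WIN
    full_seasons = pa / FULL_SEASON_PA
    return wins_above_average + full_seasons * REPLACEMENT_GAP_WINS


pujols_war = post35_war(pa=2846, rcaa_per_pa=0.0338)
cost_per_war = 150 / pujols_war  # $150 million over the 6 contract years after 35

print(round(pujols_war, 1))    # total projected WAR after age 35
print(round(cost_per_war, 2))  # millions of dollars per WAR
```

Carrying the decimals through gives a slightly higher WAR total than rounding the playing time down to 4 full seasons, so the cost per WAR comes out a touch lower, but the conclusion does not change.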

Now the three guys who are still active are Thome, Giambi and Helton. Here are their PAs and OPS+ after the age of 35 so far:

Jim Thome 2236/138
Jason Giambi 1601/113
Todd Helton 964/103

I assume Helton will get past the 1000 PA mark. So combining this group with the others would mean that we have 11 out of 35 of the comparables getting at least 1000 PAs. I don't have their RCAA for the last two years and I don't know how much more they will play but my guess is that if I added them to the other 8 guys and re-did the analysis it would not change things much.

One thing to remember though is that Pujols is a very unusual player. He started playing full-time at age 21 and in 11 seasons has not missed many games. Only 20 players since 1920 had more PAs under the age of 32 than he did and only 3 had more WAR. So maybe looking at other power-hitting 1B men is not the way to go. One interesting thing here is that some other all-time great 1B men did very little after the age of 35. That includes Gehrig, Foxx and Greenberg. But perhaps each of them has highly unique reasons for not aging well that won't necessarily apply to Pujols. Gehrig had ALS, Foxx was a drinker and Greenberg had the chance to move into the front office.

So the last thing I will do here is compare Pujols to all the players who had 6500+ PAs since 1920 under the age of 32. Through this year, there were 65 players in this group (I took out some guys who are still active and are not yet 36-see list below). Of these 65, 23 had at least 1000 PAs after the age of 35 (that includes Jeter and Damon who might keep getting more PAs). So just about one-third of these comparables ended up with at least 1000 post 35 PAs. That is similar to what I found for the power hitting 1B men. They got about 33% as many PAs post 35 as pre-32 and about 17% of the WAR. If Pujols gets 17% as much WAR, he will get about 15.4 WAR after age 35. Maybe 16 if Damon and Jeter are taken out.


Here is that list:

Adam Dunn
Albert Pujols
Adrian Beltre
Jimmy Rollins
Andruw Jones
Carlos Beltran
Edgar Renteria
Alex Rodriguez

Saturday, December 10, 2011

Batters Who Had 7+ Seasons With A 150 Or Higher OPS+

The PA minimum was 400. Here is the list of those 38 players. I highlighted Elmer Flick because he is the answer (or maybe an answer) to a trivia question: name a player who led the NL in RBI's and later went on to lead the AL in SBs. I don't think there are any others but I am not sure.


Flick was finally voted into the Hall of Fame in 1963 by the Veterans Committee. He was 87 but luckily still alive. He only got 0.4% of the vote from the writers in 1938 and that was it. In an 8 year period he had 7 top 10 finishes in WAR among position players including 5 in the top 5. Even now he ranks pretty high in career WAR (133rd) with 56.7. Through 1938, he was 36th. He had 9 top 10 finishes in OPS+ including 6 top 5 finishes and a first.

Health problems cut his career short. He played only 99 games after the age of 31 (when he had a 153 OPS+). See his SABR Bio by Angelo Louisa. Here is an excerpt:
"Despite his short but highly productive career in the majors, Flick remained largely forgotten by the baseball community in general and the Hall of Fame voters in particular until Ty Cobb's death in 1961. Some articles written about the Georgia Peach mentioned the aborted 1907 trade and thus revived interest in who Flick was and what made him worthy of being suggested in a trade for Cobb. The renewed attention, in turn, led to Flick being voted into the Hall of Fame by the Veterans Committee in 1963, an honor he treasured until he died from congestive heart failure at 8:25 A.M. on January 9, 1971, only two days before his 95th birthday. Flick also suffered from mycosis fungoides, a malignant lymphoma, which contributed to his death."

Those who just missed with 6 were

Alex Rodriguez
Billy Hamilton
Gary Sheffield
Harmon Killebrew
Harry Heilmann
Harry Stovey
Jeff Bagwell
Larry Walker
Mike Piazza
Pete Browning
Reggie Jackson
Vladimir Guerrero

Monday, December 5, 2011

Ron Santo Finally Makes It To The Hall of Fame, As He Should Have

He got 15 votes from the 16 member "Golden Era" committee. I read a couple of stories about it but I did not see any statements from the committee or its members on why they picked Santo and no one else. I think Santo definitely deserves it. It might have helped that a former teammate, Billy Williams, was on the committee along with long time Santo supporter Brooks Robinson. I have written some posts on Santo in the past. Here are their links:

What Might Explain Ron Santo's Low Hall Of Fame Voting Percentages?

I used some voting formulas I had come up with and it seemed like Santo just did not have the kinds of stats and accomplishments that the voters (writers) liked.

Did Santo Play In An Era Of Poor Third Basemen?

I showed that he didn't. Sometimes our advanced sabermetric stats might be a little misleading since we compare a player to the league average at his position during his time period. If Santo had been up against many weak 3B men, that could make his value look greater. But I don't think that is what happened.

Santo Was Valuable Outside Of Wrigley Field

I showed that although he hit much better in Wrigley Field than elsewhere he was still very valuable in road games (his career OPS in home games was .905 while it was .748 on the road-this includes the 117 game season for the White Sox in 1974). This was based on the run environment of his era, which was generally low. So although his road stats don't look like much, they were highly valuable.

Was Ron Santo The Best Player In the National League From 1964-68?

I looked at various sabermetric measures. If he was not the best, he was close.

Ron Santo vs. Brooks Robinson And Hall Of Fame Voting

I showed that Santo's performance compared favorably to Robinson even though Robinson did much better in the voting.

Peak Value And Hall Of Fame Worthiness

I showed that Santo's performance from 1965-67 might have been tied for the 24th best 3-year run in baseball history.

Thursday, November 24, 2011

Teams That Won The Most Post-Season Games Over A Two-Year Period Yet Failed To Win The World Series In Either Year

The Rangers won 18 post-season games over the last two years but did not win the World Series in either year. In fact, they won more post-season games than 11 of the World Series winners since 1995 (using both the year before and after). The Rangers' 18 wins is the 6th highest two year total and all the teams with more did win the series in at least one of the years. Here are the leaders for each two year period and the teams that failed to win the series in either year are in red. There were some ties so all of those teams are listed. I think I counted and entered everything right. Let me know if you spot any mistakes.

Saturday, November 12, 2011

Does Yahoo Sports Have Mistakes In Its Baseball Data?

The table below shows the discrepancies between Yahoo and Baseball Reference.



It seems like SLGA is supposed to be slugging percentage allowed. Yahoo may have divided TB by TBF instead of AB. Buehrle has allowed 4011 TB in his career and 4011/10317 = .389. But if they used that for SLGA, what is OBSA? I thought it might be OPS allowed, but then Buehrle should have .704 using Yahoo's numbers. But they show only .689. So it is not clear what is going on.

Here are the links to the Yahoo pages for these three pitchers.

Buehrle

Halladay

Lincecum

Now for Baseball Reference

Buehrle

Halladay

Lincecum

Monday, November 7, 2011

The Pirates were lucky to win the 1971 World Series, but how lucky?

This came up on one of the SABR bulletin boards.


The Pirates' OPS differential for the whole season was .073. That translates into a winning pct of .592 using my equation Pct = .5 + 1.26*OPSDIFF. The Orioles had a differential of .096, good for a pct of .621. That means that the Orioles would have a 53% chance of winning any given game using Bill James' Log 5 method. I came up with the Orioles having about a 56.5% chance of winning the series, taking into account all the different ways they could win a series of a given length. They also had home field advantage, which should have increased things about 2% (2% more than 56.5%, so about 58%). Here is the breakdown before the home field adjustment:



7.9% of the time it is an Orioles sweep
14.8% of the time the O's win in 5
17.4% of the time they win in 6
16.4% of the time they win in 7


So the Pirates had a 42% chance of winning
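
For anyone who wants to check the series breakdown, here is a minimal Python sketch. It assumes the chance of winning each game is a constant 53% and ignores home field, and the function name `series_win_probs` is just something made up for this example. The idea is that the clinching game must be a win, and the earlier games can come in any order.

```python
from math import comb


def series_win_probs(p, wins_needed=4):
    """Chance of clinching a series in each possible number of games.

    p is the (constant) chance of winning any single game.
    Returns {series_length: probability of winning in exactly that many games}.
    """
    q = 1 - p
    probs = {}
    for losses in range(wins_needed):
        # The last game is the clinching win; the games before it hold
        # wins_needed - 1 wins and `losses` losses in any order.
        orderings = comb(wins_needed - 1 + losses, losses)
        probs[wins_needed + losses] = orderings * p**wins_needed * q**losses
    return probs


orioles = series_win_probs(0.53)
for length, prob in orioles.items():
    print(f"win in {length}: {prob:.1%}")
print(f"total: {sum(orioles.values()):.1%}")
```

With p = 0.53 this reproduces the four percentages listed above and their 56.5% total.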

Friday, November 4, 2011

Did The 2002 A's Of "Moneyball" Fame Win More Games Than Their Stats Might Predict?

Maybe. I plugged their OPS differential into the following equation for winning percentage:

Pct = .5 + 1.26*OPSDifferential

The A's had an OPS of .771 while their pitchers allowed an OPS of .699. So their differential was .072. The equation predicts a pct of .591 or about 95.7 wins. They actually won 103. So they won about 7.3 more games than expected. The standard error of the regression that generated the above equation was 5.04, so the A's were about 1.45 standard deviations above their expectation. Not huge, but not small either.

They did have a 32-14 record in 1-run games for a pct. of .696. They had a .612 pct. in all other games. If they had that for all games, they would have won 99.16 games, a lot closer to what their OPS differential predicts.
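
The numbers in the last two paragraphs can be re-derived with a short Python sketch. The variable and function names are made up for illustration, and I am treating the 5.04 standard error as being in wins over a 162 game season, which is how it is used above.

```python
GAMES = 162


def expected_pct(ops_diff, slope=1.26):
    """Winning percentage implied by OPS differential (equation from the post)."""
    return 0.5 + slope * ops_diff


ops_diff = 0.771 - 0.699
expected_wins = expected_pct(ops_diff) * GAMES
surplus = 103 - expected_wins          # actual wins minus expected wins
z_score = surplus / 5.04               # 5.04 = standard error, treated as wins

# Set aside the 32-14 record in one-run games and look at everything else.
other_wins = 103 - 32
other_games = GAMES - (32 + 14)
other_pct = other_wins / other_games
wins_at_other_pct = other_pct * GAMES  # wins if they played .612 ball all year

print(round(expected_wins, 1), round(surplus, 1), round(z_score, 2))
print(round(other_pct, 3), round(wins_at_other_pct, 2))
```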

I don't see anything in particular that explains why they outperformed their projection. The table below shows how both their hitters and pitchers performed in various situations followed by their differentials in those situations (data from Retrosheet).


Nothing really jumps out. Their differentials with runners in scoring position (RISP) are a bit higher than with none on and to a lesser extent with bases loaded. Their OBP differential looks good in close and late situations but their SLG differential is much lower than normal. Their overall OBP differential was .024 while for SLG it was .048.

The A's grounded into 128 DPs, just one more than their opponents, whom they out OBPed .339 to .315. Their GIDP rate was 10% while the league average was 11%, the rate the A's allowed (some data also from Baseball Reference).

The A's only had 20 sacrifice hits while their opponents had 50. So they saved 30 outs that way. The A's were 46-20 stealing while their opponents were 68-46. So they saved 26 outs there. All of that is about two games' worth of outs.

The A's out homered their opponents 205-135, by 70. They only hit 10 more 2B's than their opponents and had the same number of 3B's. So their advantage in SLG was almost entirely determined by this big HR advantage and HRs have one additional edge over other hits in that they guarantee at least one run.

Tuesday, November 1, 2011

Has Starlin Castro done things offensively that guys who go on to be ten-time All-Stars and Hall of Famers did?

New Cub GM Theo Epstein recently said of Castro:
"Offensively, the things he has done is what guys who go on to be ten-time All-Stars and Hall of Famers do."

See Theo Epstein Loves Starlin Castro and Other Bullets at "Bleacher Nation."

The list below shows all 2B-SS who had 500+ PAs through the age of 21 and their OPS+.



He is close to two Hall of Famers, Alomar and Ripken. He has done better so far than two frequent All-Stars, Whitaker and Randolph. But Delino DeShields and Gregg Jefferies did better and are not in the Hall, and Jefferies made just two All-Star teams.

Here are three SS he is ahead of who made the Hall:

Travis Jackson 92
Robin Yount 86
Rabbit Maranville 78

Wednesday, October 26, 2011

Teams Overdue For A World Series Appearance

A team is overdue if the number of years since their last appearance is greater than the number of teams in their league. The table below lists all such teams as of now in no particular order. Once we get into 2012, the Indians will be 15 years from their last appearance.



Some are very overdue. Some not only have not been making it to the World Series, but they have been doing poorly otherwise, too. This may seem like a lot of teams, but at the conclusion of the 1958 season, 11 of the 16 teams were overdue.

We have always had hapless teams. The Phillies went 35 years without a pennant before finally getting one in 1950 but then went another 30 years. The original Senators did not win one in their last 27 years in Washington. The St. Louis Browns went only once to the Series in over 50 years before moving to Baltimore. The White Sox went 40 years without one until 1959 and then went another 46 years.

It is probably hard to quantify "overdueness" over baseball's history to accurately assess how bad things are now. In the pre-1960 years, you could be overdue in just 8 years so it may not have felt so bad to the fans. Now it means a longer time period. Right now, it would take 7 years to clear out the backlog of overdue teams and it would require all of the hapless teams to get in over this stretch (which does not seem likely).
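
The definition at the top of this post is simple enough to write down as code. This is just a sketch: the function name `is_overdue` is made up, the 1997 date for the Indians is inferred from the "15 years in 2012" remark above, and the 14 team AL is the league size as of this writing.

```python
def is_overdue(last_appearance, current_year, teams_in_league):
    """A team is overdue when the years since its last World Series
    appearance exceed the number of teams in its league."""
    years_since = current_year - last_appearance
    return years_since > teams_in_league


AL_TEAMS = 14  # assumption: AL size in 2011 and 2012

print(is_overdue(1997, 2011, AL_TEAMS))  # 14 years since, not yet overdue
print(is_overdue(1997, 2012, AL_TEAMS))  # 15 years since, now overdue
```

The strict "greater than" is the reason the Indians only join the list once we get into 2012.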

The following table shows the teams that are at least half way to being overdue.

Wednesday, October 19, 2011

OPS Differential Gives Big Edge To Rangers

The Rangers hitters had an OPS of .800 during the season and their pitchers allowed a .698 OPS. So that is a .102 differential.

The comparable numbers for the Cardinals are .766 - .717 = .049.

The following equation gives a good estimate of winning percentage.

Pct = .5 + 1.3*OPSDIFF

That gives the Rangers a winning percentage of .633 and the Cards .564. Using the Log5 method for predicting the probability of winning by Bill James and Dallas Adams (and posted by Tangotiger)

W%(A v. B) = W%(A)*(1 - W%(B))/(W%(A)*(1 - W%(B)) + (1 - W%(A))*W%(B))

we get the Rangers having a 57% chance of winning any given game (this leaves out home field advantage).
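
Here is a minimal Python sketch of that calculation, going from OPS differential to winning percentage to the Log5 single-game probability. The function names are made up for this example; the 1.3 slope and the OPS figures are the ones given above, and home field is left out.

```python
def pct_from_ops_diff(ops_diff, slope=1.3):
    """Estimated winning percentage from OPS differential (equation above)."""
    return 0.5 + slope * ops_diff


def log5(pct_a, pct_b):
    """Log5 probability that team A beats team B in a single game."""
    a_wins = pct_a * (1 - pct_b)
    b_wins = (1 - pct_a) * pct_b
    return a_wins / (a_wins + b_wins)


rangers = pct_from_ops_diff(0.800 - 0.698)
cardinals = pct_from_ops_diff(0.766 - 0.717)

print(round(rangers, 3), round(cardinals, 3))
print(round(log5(rangers, cardinals), 2))
```

A quick sanity check on the formula: two teams with the same winning percentage come out at exactly 50% each.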

The Rangers were even better in September, with a differential of .298 (.916 - .618). The Cards had .125 (.807 - .682).

The Rangers, however, have a negative differential in the playoffs so far of -.017 (.764 - .781) while the Cards have .096 (.793 - .699). Combining the September differential and the playoff differential in a weighted average by games gives the Rangers a differential of .208 and the Cards .116.

Sunday, October 16, 2011

Not The Year Of The Pitcher In The Playoffs

In all playoff games, the AL teams averaged 4.87 runs per game. The AL runs per game this year was 4.46 during the regular season.

The NL average so far is 4.63 (not counting tonight's game with the Cards leading 11-6 in the bottom of the 7th). If it ends up with that score, the NL average will be 4.88. If the Brewers could win 12-11 and tomorrow's game ends up 1-0, the NL average will be 4.82. The NL runs per game this year was 4.13 during the regular season.

Only 10 of the 30 games so far had both teams scoring 4 or fewer runs. Only 5 games had both teams scoring 3 or fewer runs.

Saturday, October 15, 2011

Carl Yastrzemski's GDP Rate In 1967? 3%

That was by far the lowest of his career. So another reason why that season was so great. I did not see very much on this. See The Great Forgotten Season: Carl Yastrzemski, 1967 by Cody Swartz of "Bleacher Report."

Yastrzemski's only season when he had a higher AVG against lefties was 1967. He also hit especially well after August 31. He batted .417 (40 for 96). He ended August with a .3085 AVG. He slugged .760 after Aug. 31. Through that date it was .594. See Was The Left Hand Of God Responsible For The Red Sox Miracle In 1967?.

That season was also one of the most indispensable seasons ever, meaning his team really needed him to have a great year. See Indispensable Seasons Go To WAR!

The table below shows his GDPs and GDP rate for each year of his career. 1967 was much lower than any other year. I also show his SO rate for each year (using PA - IBB - SH). Data from Baseball Reference. I thought maybe if his strikeout rate had been a lot higher that year it would account for the lower DP rate. But it does not look that way. The one thing Yaz did that year that he had never done before was hit a lot of HRs, 44. His previous high had been 20. So maybe putting the ball in the air more helped. Baseball Reference does not have his FB/GB ratios for any year. But in other years when he hit 40+ HRs, the GDP rate was not so low (1969 & 1970). Maybe he was just faster that year.

How many runs did he save by hitting into fewer DPs? Averaging his rate over 1966 and 1968, I get about 9 DPs saved. What was it worth to not hit into a DP? Using Tangotiger's Run Expectancy Matrix, 1950-2010, my guess is that it would be worth between .122 and .198 runs in each case. Using the 1950-68 matrix, with a man on 1st and 2 outs, the run expectancy is .264. With 2 outs and none on, it is .066. So if there is a man on 1st and no outs and Yaz strikes out or beats the throw to first, you save .198 runs. Doing something similar with a man on 1st and 1 out gets us .122 runs saved.

The average of those two is .16 and over 9 DPs avoided, it is just 1.44 runs. That may not be much but when the Red Sox and Twins played the last game of the season tied for first, the lower DP total might have mattered.

If I used run values from the 1999 Big Bad Baseball Annual, a GDP was -.37 runs and other outs were worth -.09. So not hitting into a DP and making another kind of out saves .28 runs for a total of 2.52 runs over 9 DPs avoided.
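
Both estimates are easy to lay side by side in Python. This is only a sketch of the arithmetic above, with made-up variable names; the run values are the ones quoted from the run expectancy matrix and the Big Bad Baseball Annual, and the .122 figure is taken as given.

```python
DPS_AVOIDED = 9  # estimated from his 1966 and 1968 GDP rates

# Run expectancy approach: runs saved per DP avoided.
saved_no_outs = 0.264 - 0.066   # the .198 figure
saved_one_out = 0.122           # taken as given from the post
avg_saved = (saved_no_outs + saved_one_out) / 2
re_total = avg_saved * DPS_AVOIDED

# Run value approach: a GDP versus some other kind of out.
GDP_RUNS = -0.37
OTHER_OUT_RUNS = -0.09
lw_saved = OTHER_OUT_RUNS - GDP_RUNS
lw_total = lw_saved * DPS_AVOIDED

print(round(avg_saved, 2), round(re_total, 2))
print(round(lw_saved, 2), round(lw_total, 2))
```

Either way the answer is somewhere between about one and a half and two and a half runs.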

Now maybe Yaz got hits instead of hitting into DPs. But the two other years when he hit over .320 he had about a 10% GDP rate, still much higher. So we can't simply say he hit better that year and that caused the lower GDP rate.

Friday, October 7, 2011

"Data guys" More Important In Business Due To "Moneyball"

See When Data Guys Triumph by CADE MASSEY and BOB TEDESCHI, NY Times business section, 10-2-11.

Oakland A's general manager Billy Beane and author Bill James are entrepreneurs who created a whole new way of running baseball teams based on statistics and this creative spirit is starting to have an impact in the business world.

The Nobel prize winning physicist Richard Feynman said that "science is the belief in the ignorance of the experts." By challenging the experts in baseball, Beane and James were true scientists, asking questions and looking at data in new ways. Excerpts:
"JOSHUA MILBERG has plenty of business cred: an M.B.A. from Yale, experience in the mayor’s office in Chicago, a job as a vice president for an energy consulting firm. But all of that, Mr. Milberg says, matters less than his reputation as “the data guy” — someone who can offer insights through statistical analysis. And for that, he and a growing number of young executives can credit none other than “Moneyball: The Art of Winning an Unfair Game,” by Michael Lewis."

The book "...examines how the Oakland Athletics achieved an amazing winning streak while having the smallest player payroll in Major League Baseball. (Short answer: creative use of data.)

These managers are savvier with data and more welcomed in business circles in part because of the book."

"At its heart, of course, “Moneyball” isn’t about baseball. It’s not even about statistics. Rather, it’s about challenging conventional wisdom with data."

"This evangelism has created opportunities for the analytically minded."

The article calls this work "creative empiricism."
"But “Moneyball” dramatized the principles behind these forces: a reliance on data to exploit inefficiencies, allocate resources and challenge conventional wisdom — and thus broadened their appeal.

“Moneyball” traces Billy Beane’s use of unorthodox analytics to the work of Bill James. Working as a baseball outsider, Mr. James began self-publishing his analysis and commentary in 1977 and built a passionate following."

"Once people see the value of a batter’s O.P.S. — on-base plus slugging percentage, a key measure in the book — it’s a short step to applying similar principles in their own organizations."

"Generation Moneyball isn’t yet in charge. But as the Nobel laureate Max Planck once said, “A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.”"

Monday, October 3, 2011

OPS vs. 1.8*OBP + SLG

And people say I shy away from the controversial topics.

This analysis is at the team level. Let's start with runs scored. The following regression equation shows the relationship between team runs per game and team OPS:

R/G = 13.02*OPS - 5.04

I used all teams from 1996-2009. For OBP I used (H + BB)/(AB + BB). The r-squared was .897 and the standard error was .1608. That works out to 26.05 runs per season.

Now when I used 1.8*OBP + SLG (call it adjusted OPS or ADJOPS) instead of OPS, the equation was

R/G = 10.26*ADJOPS - 5.68

The r-squared was .907 and the standard error was .153. That works out to 24.77 runs per season. So using 1.8*OBP + SLG is a bit better. The standard error is 5% lower. It is also 1.28 runs lower. That would be worth about a tenth of a win.
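Here is a rough Python sketch of the two runs equations and the per-game to per-season conversion. The function names and the sample team line (.340 OBP, .430 SLG) are hypothetical and only there to show how the equations are applied.

```python
GAMES_PER_SEASON = 162


def runs_per_game_ops(ops):
    """R/G = 13.02*OPS - 5.04 (the 1996-2009 team regression above)."""
    return 13.02 * ops - 5.04


def runs_per_game_adjops(obp, slg):
    """R/G = 10.26*ADJOPS - 5.68, where ADJOPS = 1.8*OBP + SLG."""
    return 10.26 * (1.8 * obp + slg) - 5.68


def per_season(se_per_game, games=GAMES_PER_SEASON):
    """Scale a per-game standard error up to a full season."""
    return se_per_game * games


# Hypothetical team line, not taken from the data set.
obp, slg = 0.340, 0.430
print(f"OPS model:    {runs_per_game_ops(obp + slg):.3f} R/G")
print(f"ADJOPS model: {runs_per_game_adjops(obp, slg):.3f} R/G")

# Standard errors from the two regressions, converted to runs per season.
print(f"OPS SE:    {per_season(0.1608):.2f} runs per season")
print(f"ADJOPS SE: {per_season(0.153):.2f} runs per season")
```

Using the rounded .153 gives a per-season figure a hair above 24.77, so the 24.77 presumably came from an unrounded standard error.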

I have also used a team's OPS differential to predict winning percentage. That is, its hitting OPS - the OPS it allows its opponents. Here is the regression equation:

Pct = .5 + 1.26*OPSDIFF

The r-squared was .798 and the standard error was .0311. That works out to 5.04 wins per season. I used all teams from 1989-2002.

Now the same analysis with the differential using 1.8*OBP + SLG (call it the ADJOPS differential):

Pct = .5 + .986*ADJOPSDIFF

The r-squared was .815 and the standard error was .0298. That works out to 4.83 wins per season. So again, as in the analysis of runs scored, 1.8*OBP + SLG does just a bit better.

One reason why 1.8*OBP + SLG only does slightly better is probably that the range of OBP and SLG across teams is not that great. But for individual players, the range varies much more. So it might make more sense to use 1.8*OBP + SLG instead of OPS in those cases.

Saturday, October 1, 2011

Do Diamondbacks have stats to match Brewers stars?

Click here to read the AP article by Chris Jenkins. He writes:

"The Diamondbacks might not have the Brewers’ marquee names. But the numbers, and the results, show two teams that are surprisingly similar going into Saturday’s Game 1 of the NL division series."

This sounds like an interesting question, yet very few numbers are presented in the article. Actually, there are no numbers that statistically compare the two teams. It turns out that, by luck, the Diamondbacks overachieved with runners on base, a trend that no team can keep up.

It looks like the Brewers are much better. Their hitters had an OPS of .750 while their pitchers allowed a .689. That gives them a differential of .061.

The Diamondbacks hitters had an OPS of .736 while their pitchers allowed a .725 OPS. That gives them a differential of .011. Well below the Brewers.

Based on some regression analysis I have done, we can project winning pct with the following equation:

Pct = .5 + 1.3*OPSDIFF

This gives the Brewers a pct of .579 or 93.85 wins. The Diamondbacks get .514 or about 83.32 wins. This seems like a very big difference.
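The projection is easy to reproduce. Here is a minimal Python sketch; the function name is my own, and the OPS figures are the ones given above.

```python
GAMES = 162


def projected_pct(ops_for, ops_against, slope=1.3):
    """Pct = .5 + 1.3*OPSDIFF, with OPSDIFF = hitting OPS - OPS allowed."""
    return 0.5 + slope * (ops_for - ops_against)


teams = {
    "Brewers": (0.750, 0.689),
    "Diamondbacks": (0.736, 0.725),
}

projected_wins = {}
for name, (ops_for, ops_against) in teams.items():
    pct = projected_pct(ops_for, ops_against)
    projected_wins[name] = pct * GAMES
    print(f"{name}: pct {pct:.3f}, about {projected_wins[name]:.2f} wins")
```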

Yet the Brewers actually won only 2 more games (96 vs. 94). And the Brewers' run differential (721 - 638 = 83) is only slightly higher than the Diamondbacks' run differential of 69 (731 - 662).

What allowed the Diamondbacks to match the Brewers, at least on the surface? Some very good luck. They overachieved with runners on base. The Diamondbacks hitters had an OPS of .770 with runners on base (ROB) while their pitchers allowed .731. That gives them a differential of .039.

The Brewers hitters had an OPS with ROB of .745 while their pitchers allowed .719 for a differential of .026.

This seems to be something in favor of the Diamondbacks, but it really isn't.

Doing better with runners on base will improve your chances of winning. You will score more runs than expected and give up fewer runs than expected. Yet, in the long-run, individual players and pitchers (and teams) end up performing about the same in clutch situations (like ROB) as they do overall. Even the best clutch hitters and pitchers do just a little bit better in any clutch situation than they normally do given a large enough number of games.

For the Diamondbacks to even look like they are as good as the Brewers, they had to be lucky. We cannot expect them to continue to overachieve so much with runners on base.

Thursday, September 29, 2011

September Second Best Month For OPS

Each league had its second highest monthly OPS. The numbers for each league are in the table below. The NL's scoring was not that great but the AL had its best month.

Tuesday, September 27, 2011

Did Branch Rickey Subscribe To The Same S**t As Billy Beane And Bill James?

I am reposting this entry from April 2010 since the Moneyball movie just came out.

Some sportswriters still like to make fun of the statheads or sabermetricians who never played the game and still live in their mom's basement. But to those writers I say "read the 1954 LIFE magazine article where Branch Rickey discusses some very modern looking formulas." This article is online and was called GOODBY TO SOME OLD BASEBALL IDEAS: The 'Brain' of the game unveils formula that statistically disproves cherished myths and demonstrates what really wins. Some of the new stats he proposed were "on-base average" and "isolated power." The article even shows many formulas, some of which are complex.

Rickey is in the Hall of Fame for his work as an executive. But he also played and managed. I think if you ridicule statheads, you would probably ridicule Rickey. Here is the introductory paragraph:

"Baseball people generally are allergic to new ideas. We are slow to change. For 51 years I have judged baseball by personal observation, by considered opinion and by accepted statistical methods. But recently I have come upon a device for measuring baseball which has compelled me to put different values on some of my oldest and most cherished theories. It reveals some new and startling truths about the nature of the game. It is a means of gauging with a high degree of accuracy important factors which contribute to winning and losing baseball games. It is most disconcerting and at the same time the most constructive thing to come into baseball in my memory."
That is followed by a fairly complex formula. Then Rickey asks "Can this bizarre mathematical device be put to any practical use?" And his answer? "It can indeed! It can be applied to any major league club for any season or part of a season to diagnose points of weakness and strength."

So Rickey, perhaps one of the most influential men ever involved in baseball, saw the need for new and complex ways of analyzing the game. How can some writers, and some GMs, not see this today?

But what about intangibles? Rickey says:

"But somehow baseball's intangibles balance out. They reflect themselves in other ways. Over an entire season, or many seasons, individuals and teams build an accumulation of mathematical constants. A man can work with them. He can measure results and establish values. He can then construct a formula which expresses something tangible, and that is why this formula was devised."
After compiling many stats and data, what did Rickey do? "We took the figures to mathematicians at a famous research institute. Did they know baseball? No, but that was not essential."

Did RBIs figure in Rickey's formula? No. "As a statistic, RBIs were not only misleading but dishonest."

There is much more to read in this article that is of interest. Near the end of the article he mentions getting his scouts involved in finding players with power, guys who will improve the ability of his team (the Pirates) to bring runners home. But that is based on the formula. Imagine that. Rickey was going to tell his scouts what to look for based on a formula.

If you have never read the article, I think you are in for a treat since it is so well written and it was written so long ago.

Monday, September 26, 2011

Has Mariano Rivera statistically separated himself from his peers?

That is the issue raised by an article in yesterday's NY Times, "Mr. Young, Mr. Ryan, Mr. Rivera Will Be Joining You" by DAVE ANDERSON.

He compares Rivera to Cy Young and Nolan Ryan. Young won 511 games and the next highest is Walter Johnson at 417. Ryan struck out 5,714 batters, 839 more than Randy Johnson. Does Rivera dominate his peers in a similar way?

It is hard to argue against his 0.71 ERA in 139 post-season IP. I doubt any other closer can come close to that. Probably no other has had a chance to pitch so much in the playoffs. But the article offered no comparisons. No other pitcher's post-season performance was discussed.

Rivera now has 603 saves and the only other guy with 500+ saves is Trevor Hoffman with 601. So let's see how those two compare in their regular season stats (Hoffman only had 13 post-season IP). The table below summarizes some of their stats.



Their save percentages are almost identical. I thought that maybe Rivera got into tougher situations because he has such a low HR% and his ERA is so much better than Hoffman's (2.22 vs. 2.87) that it seemed like maybe he came in when the score was closer. But they each have about the same average leverage index. The 1.92 for Hoffman, for example, indicates that his typical game had nearly twice the pressure of an average situation in terms of inning and score.

So these two guys pitched a similar number of innings under the same pressure and were equally successful in what they were asked to do. Now maybe Rivera was in more pennant races or playoff chases but the Yankees usually made it in fairly easily. Maybe the competition was a little tougher in the AL.

ERA+ is adjusted for league average and park effects. Rivera's means that his ERA so adjusted is a little less than half the league average while Hoffman is about 30% lower. So big edge for Rivera (another closer, Billy Wagner, in about 900 IP had a 187, not too far from Rivera). Wagner had a career save % of 86%.

They are close in WHIP. FIP or fielding independent ERA takes walks, HBP, SO, and HR into account (with IBBs taken out). Hoffman is not too far off. This comes from Fangraphs and I don't think it is park adjusted. A rough estimate is that Hoffman's parks were about 7.2% better than average for pitchers while for Rivera just about average. If I raised Hoffman's FIP ERA by half of that 7.2% I get 3.19.

When I do walk% and SO/BB ratio, IBBs are taken out and HBP are in. IBBs are also taken out when doing HR%. This edge for Rivera seems very large. Maybe it is even bigger being a righty in Yankee Stadium and other teams would have tended to send lefties up to face him.

Rivera is not too far ahead of Hoffman in AVG and OBP allowed. But the edge in SLG seems big, probably due to the low HR%. The edge is even bigger in road games.

One other closer, Dennis Eckersley, is worth looking at. I found his best 6 seasons in FIP ERA and the simple average of them was 2.13. For Rivera it was 2.15. If I park adjust Eckersley here, he goes up to 2.19. Still not too far from Rivera. But that has to be adjusted for the league average. Those years for Eck were around 1987-93. It looks like the league average was about .44 lower in those years than it has been over Rivera's years. Rivera's FIP ERA is about .474 of the league average while for Eck it is .535. If Eck pitched in those years of Rivera he would get 2.42. Not too far off (also recall that for Rivera, those are his best six years but I just used his entire career as an approximation). Eckersley came to the closer role well after the age of 30 (and maybe before that we did not really have closers). Eckersley had a career save % of 85%.

What is probably the amazing thing is for how long Rivera has been consistently great. He is almost 42 and is having another outstanding season.

Update: David Pinto points out that Rivera had 116 "long saves" (4 outs or more) while Hoffman only had 55. I left a response.

Saturday, September 24, 2011

Bill James vs. Richard Feynman

Cage match. Who would win? I think Bill James would have the size advantage, but I really don't know for sure.

Anyway, Nobel prize winning physicist Richard Feynman said “Science is the belief in the ignorance of the experts.” If that is true, then I think Bill James must be a great scientist.

I thought of this watching "Moneyball" today. It was good to see them mention James and show his picture. It is very entertaining, enough so that, in my opinion, people with just a passing knowledge of baseball can enjoy it.

I don't have anything insightful to say about it. Joe Posnanski has written two great articles.

Moneyball The Movie

Moneyball and the Ballad of Bill James

He gives the movie 3 stars out of 5. I think it is at least a 4. So far on IMDB it has an 8.0 rating out of 10. But that is on only 729 votes. It will probably come down, but my guess is that it will end up with at least a 7.0.

Update 9-25: The IMDB rating is now at 8.2 with 1395 votes.

Update 9-26: Art Howe isn’t happy about his portrayal in ‘Moneyball’

Update 9-26:
The IMDB rating is now at 8.3 with 2212 votes.

Update 9-27:
The IMDB rating is now at 8.3 with 2898 votes.

Update 9-30:
The IMDB rating is now at 8.2 with 3599 votes.

Update 10-1:
The IMDB rating is now at 8.2 with 3845 votes.

Update 10-2:
The IMDB rating is now at 8.2 with 4382 votes.

Update 10-10:
The IMDB rating is now at 8.2 with 6113 votes.

Wednesday, September 21, 2011

Verlander And Kershaw Are Leading All Three Pitching Triple Crown Categories

American League

ERA
1. J. Verlander DET 2.29
2. J. Weaver LAA 2.41
3. J. Beckett BOS 2.50

Wins
1. J. Verlander DET 24
2. C. Sabathia NYY 19
3. J. Weaver LAA 18

Strikeouts
1. J. Verlander DET 244
2. C. Sabathia NYY 224
3. F. Hernandez SEA 220

National League

ERA
1. C. Kershaw LAD 2.27
2. J. Cueto CIN 2.31
3. C. Lee PHI 2.38

Wins
1. C. Kershaw LAD 20
2. I. Kennedy ARI 20
3. R. Halladay PHI 18

Strikeouts
1. C. Kershaw LAD 242
2. C. Lee PHI 232
3. T. Lincecum SF 217

Click here to see all of the pitchers who won the triple crown at Baseball Almanac.

Only twice has a pitcher in each league won the pitching triple crown in the same year: 1918 (Walter Johnson & Hippo Vaughn) and 1924 (Walter Johnson & Dazzy Vance).

Wednesday, September 14, 2011

Best/Worst Month for a Team's Pitchers (and the 1969 Mets allowed only 3 HRs in September)

Tom Ruane of Retrosheet looked into this after I asked him about it. Here is the link:

Best/Worst Month for a Team's Pitchers

Tom mentions "Since 1935, only the 1954 Orioles had a lower HRA rate than those 1969 Mets." Tom also looks at the best months in hits allowed and strikeouts.

Prior to Sept. that year, the Mets' HR rate was .023842 (using batters faced - IBBs - SH). In Sept. it was .002841. So the Sept. rate was only about 1/8 of what it had been earlier that year.

Using the cumulative binomial distribution and assuming the following:

Number of occurrences: 3
Trials: 1056
Rate: .023845

The probability of getting 3 or fewer HRs was about 1 in 36 million (I welcome any comments or corrections on this). Just cutting the HR rate in half or less that month (about 13 HRs allowed) has a chance of only 1 in 183. 6 HRs or fewer is 1 in 225,000.

The Mets did allow 3 HRs in the last two days of the season in Oct. But by then they had clinched the division. They allowed no HRs from Aug. 30 to Sept. 18 in a total of 22 games (that is what a quick check of Retrosheet shows).

On Sept. 19, game 2, Stargell of Pit. hit a 2-run shot in an 8-0 Pirate victory. But by then, the Mets already had a 5 game lead with 13 games to play (the Cubs had 11 left).

On Sept. 21, game 1, Pagan of Pit. hit a solo shot in the 4th inning. It made the score 4-3 but the Mets won 5-3.

On Sept. 21, game 2, Stargell hit a solo shot in the 4th inning to make it 4-1 but the Mets won 6-1.

It seems like none of these HRs was significant. Only one even had a man on.

The Mets also allowed only about 31% of their HRs for the whole year with runners on while the rest of the league was about 43%. The Mets turned 30 DPs in Sept. Their next highest month was 25 and the next after that was 18. Don Cardwell had an ERA under 1.00 in Sept. It was 0.39 in 23 IP. It was 3.25 going into Sept.

Sunday, September 11, 2011

Report On Hitting So Far In September

The table below shows some basic stats for each month in both leagues. Hard to sum up September so far. The AL so far has had a big drop in OPS but runs per game is up. The NL is about the same as it was in August.

Monday, September 5, 2011

Can Justin Verlander Win The Pitching Triple Crown?

He now leads the AL in wins, strikeouts and ERA.

Wins
1. J. Verlander DET 21
2. C. Sabathia NYY 19
3. J. Weaver LAA 16

Strikeouts
1. J. Verlander DET 224
2. C. Sabathia NYY 211
3. F. Hernandez SEA 204

ERA
1. J. Verlander DET 2.34
2. J. Weaver LAA 2.49
3. J. Beckett BOS 2.54

Weaver has seen his ERA rise quite a bit over his last two starts. After Aug. 24, Weaver's ERA was 2.03. He gave up 13 earned runs (ER) in 11 IP over those two starts. In fact, after his Aug. 5 shutout of Seattle, his ERA was 1.78. For all of August, his ERA was 4.28. He also gave up 8 ER in 4.2 IP against the Blue Jays on Aug. 13. Before the All-Star break his ERA was 1.86. Since then it has been 3.82.

Verlander seems to have been a little more steady. His ERA before the All-Star break was 2.15. Since then it is 2.75. Since June 15, his ERA has been no lower than 2.15 and no higher than 2.38. Verlander has only once given up more than 4 ER (a 6 ER game) and has pitched at least 6 innings in every start. He has given up 4 ER only 3 times and has not given up exactly 5 runs even once. 22 of his 30 starts had both 3 ER or less and 7 IP or more.

Click here to see all of the pitchers who won the triple crown at Baseball Almanac


Saturday, September 3, 2011

Hitting And Scoring Picked Up In August

So far this year, the AL is averaging 4.41 runs per game with an AVG-OBP-SLG of .257-.322-.405 and an OPS of .727. 1992 was the last year the AL had a lower runs per game with 4.32. Same for OPS, .713. Last year the AL R/G and OPS were 4.45 and .734. So not too big of a drop off. Of course, things can change with Sept.

So far this year, the NL is averaging 4.15 runs per game with an AVG-OBP-SLG of .253-.319-.392 and an OPS of .710. The last time the NL had a runs per game lower was in 1992 at 3.88. That was also the last time they had a lower OPS, at .684. Last year the NL R/G and OPS were 4.33 and .723.

The table below shows how each league has done, month-by-month, this year. April includes March data. Both leagues had their best power month measured by SLG and ISO.

Tuesday, August 30, 2011

Players Who Led The League In Triples And Home Runs At Least Once Each

With Curtis Granderson currently tied for the AL lead in HRs while leading in triples, I thought I would find all of the players who had led the league in both in the same year as well as the others who did it but not in the same year. Some guys led one league in HRs one year and another league in 3Bs another year, like Dick Allen. They are all below, linked to Baseball Reference. If I missed anybody, let me know. They are in no order except the guys who did it in the same year are listed first. There are a lot of Hall of Famers here. There are 33 in all.

Harry Stovey (twice in the same year)
Tip O'Neill (once in the same year and led in 2Bs, too)
Harry Lumley (once in the same year)
Jim Bottomley (once in the same year)
Tommy Leach (once in the same year)
Willie Mays (once in the same year)
Mickey Mantle (once in the same year)
Jim Rice (once in the same year)
Roger Connor
John Reilly
Bid McPhee
Oyster Burns
Sam Crawford
Harry Davis
Home Run Baker
Johnny Mize
Ty Cobb
Lou Gehrig
Rogers Hornsby
Buck Freeman
Dan Brouthers
Dick Allen
Ed Delahanty
Joe DiMaggio
Sam Thompson
Wally Pipp
Frank Schulte
Buck Ewing
Jim O'Rourke
Jimmy Sheckard
Joe Medwick
Ryne Sandberg
Walt Wilmot

Saturday, August 20, 2011

The record for most consecutive games versus over and under .500 teams

A guest post by Tom Ruane of SABR and Retrosheet

I thought it might be interesting to look at four groups:

1) under .400,
2) under .500,
3) over .500 and
4) over .600.

Here's what I found:

1) Starting on August 11, 1885, the Chicago White Stockings
played a record 23 straight games against opponents with
a winning percentage under .400. During the streak, they
played only Buffalo, Detroit and St. Louis. It ended when
they faced Boston (which entered the game with a
none-too-impressive winning percentage of .406). Chicago
went 20-2-1 during the streak.

The longest such streak since 1900 is nineteen and it was
done four times:


2) The Chicago White Sox played 51 straight games against
losing teams from May 27 to July 10, 1966. The streak ended
when they hosted the third-place Indians in the first game
following the All-Star break. Ironically, the Sox went only
22-28-1 while playing losing teams, and 45-32 afterwards.
No teams are close to their streak, the second longest
being a run of forty straight games by the San Francisco
Giants in 1986. It ran from July 3rd to August 17th and,
like the White Sox, the Giants had a losing mark (19-21)
while it lasted.

3) The top five teams with the most consecutive games
played against winning teams:


At the end of the streak by the 1916 Senators, the only
other team in the AL with a losing record was the 27-94
Philadelphia Athletics. And the three streaks from 1908 are
due to the practice of scheduling long road and home trips
between the eastern and western teams. In 1908, all of the
western teams had winning records.

4) The longest stretch of games against teams with a
winning percentage higher than .600 was 27 by the
Philadelphia Quakers in 1884. From May 20th to June 19th,
they went 6-21 against Boston, Providence and New York.
Entering the games of May 20th, those three teams had a
combined record of 37-5. Philadelphia had been in fourth
place at the start but was in seventh place at the end of
the run.

The record since 1900 is 21, by the 1998 Tigers from
April 3rd (which was the first game of the second series
of the year) to the 30th. Given how early in the season
it was, a few more wins here and there might have
short-circuited the streak, but Detroit went 5-16 in the
games.

Thursday, August 18, 2011

Hitting Has Picked Up A Bit So Far In August

The table below shows the OPS and runs per game for each league by month for 2011.





Tuesday, August 9, 2011

Should Thurman Munson Be In The Hall Of Fame?

I'm looking at this because it was discussed at Baseball Think Factory. That discussion was inspired by a new website that is trying to get Munson in called VoteThurmanIn.com.

Munson is 14th in career WAR among catchers. Here is the top 22 from Baseball Reference (I went to 22 so I could include Campanella). It also shows their best three consecutive seasons (which I just eyeballed to find) and the number of times they finished in the top 5 and 10. These guys played at least 50% of their games at catcher.



Now Munson is only 240th among position players in career WAR. That probably is not enough for the Hall. But catchers don't have the longevity of other players, so being in the top 15 is impressive. If he had not died at age 32 in 1979, he might have made the top 10. He was on a pace to get about 3.6 WAR in 1979. He also had over 3 the previous year.

His best 3 straight years is not near the elite like Bench, Carter, Piazza or Mauer (does he have the record for best 3 straight by a catcher?-although some games at DH may have helped). But Munson's best 3 straight years beat Lombardi, Bresnahan and Hartnett. He also has more top 10 finishes than Lombardi and Hartnett.

Tenace is ahead of him in career WAR but only about 58% of Tenace's innings were at catcher (Munson is about 97%).

Munson also is 144th in MVP shares (at 1.50). That is good considering that catchers don't do that great in the voting. See my post called MVP Awards And Award Shares By Position. So his contemporary observers liked him. He also won three Gold Gloves.

Fangraphs has him at 21st in career WAR among catchers. But some of the guys ahead of him played less than half their games at catcher: King Kelly, Joe Torre, Brian Downing, Buck Ewing. Tenace and Bresnahan are both ahead of him, too. Bresnahan played about 70% of his games at catcher (this all points out something problematic with the position regarding the Hall of Fame-many guys who played there got moved to other positions). And again, don't forget that Munson died young. He could have moved up in the rankings.

Just what I have done so far is probably not enough. But I think we should be looking at him as a serious candidate.



Tuesday, August 2, 2011

July Hitting Picked Up A Bit In The AL, Still Sluggish In the NL

Here are the monthly OPS figures for the AL starting in April:

April .713
May .720
June .719
July .731

For the whole season, it is .721. But just last year, for the whole season, it was .734. July should be one of the strongest months. From 1994-2009, the AL had an OPS of at least .750 every year with a high of .795 in 1996.

Here are the monthly OPS figures for the NL starting in April:

April .709
May .702
June .699
July .709

For the whole season, it is .705. In 2009, for the whole season, it was .739. From 1994-2008, the NL had an OPS of at least .740 every year with a high of .773 in 1999.

Tuesday, July 26, 2011

Is Ryan Howard A Clutch Homerun Hitter?

This was inspired by a discussion at Baseball Think Factory about an article at Fangraphs called Ryan Howard and the RBI by Steve Slowinski (there is no shortage of great Polish baseball bloggers).

Some people say that Howard is a good "RBI guy" and that he is paid to drive in runs so maybe his overall stats and/or his OBP don't matter that much. But a good "RBI guy" would tend to hit better with runners on. In my opinion that means a higher AVG with runners in scoring position (RISP) than he normally gets and a higher SLG with runners on base (ROB) than normal.

But this year Howard is only 47th in SLG (.463) with runners on base in all of baseball with guys with 150+ PA in that situation. Teixeira leads with .667.

Howard is 20th in AVG (.304) with runners in scoring position this year in all of baseball with guys with 100+ PA in that situation. Votto leads with .424.

So overall, nothing special on Howard's part.

But somehow Howard over his career has managed to hit better with ROB and RISP. His AVG/SLG with none on is .266/.528. With ROB it is .285/.593 and .281/.561 with RISP. The one difference that seems pretty big is the SLG with ROB.

From 1991-2000, ROB SLG was .422 in all of baseball and .411 with none on. Since 2006, it looks like SLG with ROB is about .009 higher than with none on. So Howard is way above that being .063 higher. Some of that might be because he is a lefty. But whether it is luck or some real clutch talent is hard to say.

So I thought I would try a statistical test on his HR% with none on (NONE) vs. his HR% with ROB. Here we calculate something called a "Z-score." To be significant at the 5% level (meaning there is less than a 5% chance of getting a difference this large between the two HR%s by luck alone) the Z-score has to be at least 1.96 (plus or minus, since a guy could do worse with ROB, the clutch situation I am looking at). HR% will be highly correlated with SLG (also, it does not look like Howard hits very many more 2Bs or 3Bs with ROB than NONE). Some technical details on this are below.

Howard's HR% with NONE in his career is 6.857% while with ROB it is 8.221%. That may not seem like a big difference, but the ROB HR% is 19.9% higher (8.221/6.857 = 1.199). The question is whether or not it is statistically significant. This is where the Z-score comes in.

The Z-score takes into account the number of ABs and HR% in each situation (ROB, NONE). It also takes into account the normal major league difference (I used +.0009, the approximate difference from 2007-10). This is important because if the normal difference in HR% was .01364, then Howard would not be clutch at all since this is the difference between his two percentages (.08221 - .06857). When I did this calculation for Howard, I got a Z-score of 1.655. That is significant at about the 10% level. I guess I would like to see it at the 5% level before considering that he is clutch in hitting HRs with ROB. But even if he had reached 1.96 or more in his Z-score, it is possible he got there by luck and not skill because 2.5% of the hitters will have a Z-score of +1.96 or more.

Z = (CLUTCH AVG – NONCLUTCH AVG + EXPECTED DIFFERENCE)/SD

NORMAL DIFFERENCE FOR CL = +.0009 (which is added in this case)

SD = STANDARD DEVIATION = SQUARE ROOT OF

{[CLUTCH AVG*(1-CLUTCH AVG)]/CLUTCH AB + [NONCLUTCH AVG*(1-NONCLUTCH AVG)]/NONCLUTCH AB}

(FROM PETE PALMER, BY THE NUMBERS 3/90. He called it a “pooled” standard deviation)

In a normal distribution, 5% of the players would have a Z-score of at least +1.96 or less than –1.96.
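As a rough illustration, here is the Z-score calculation as a Python sketch. It follows the formula as written above, including adding the expected difference, and it takes the square root of the sum of the two variance terms, which is how a standard deviation is normally formed. The HR rates are Howard's from above, but the AB totals of 1800 in each situation are placeholders I made up for the example, so the result will not exactly match the 1.655 reported. The function name is also my own.

```python
from math import sqrt


def clutch_z_score(clutch_rate, nonclutch_rate, clutch_ab, nonclutch_ab,
                   expected_diff):
    """Z-score per the formula above (expected difference is added)."""
    # Standard deviation of the difference between two proportions.
    sd = sqrt(
        clutch_rate * (1 - clutch_rate) / clutch_ab
        + nonclutch_rate * (1 - nonclutch_rate) / nonclutch_ab
    )
    return (clutch_rate - nonclutch_rate + expected_diff) / sd


# HR rates from the post; the AB totals are hypothetical placeholders.
z = clutch_z_score(
    clutch_rate=0.08221,      # HR% with runners on base
    nonclutch_rate=0.06857,   # HR% with none on
    clutch_ab=1800,           # assumed, not Howard's actual AB
    nonclutch_ab=1800,        # assumed, not Howard's actual AB
    expected_diff=0.0009,     # normal major league difference
)
print(f"Z = {z:.3f}; significant at the 5% level: {abs(z) >= 1.96}")
```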

Sunday, July 24, 2011

Ubaldo Jimenez Home And Away, 2009-2011

I guess he is the subject of trade rumors. Here are some of his numbers.

Home:

IP-265.67
H-254
HR-20
BB-103
SO-233
ERA-3.79
H/9IP-8.60

Away:

IP-291
H-201
HR-13
BB-119
SO-287
ERA-2.94
H/9IP-6.22

So his ERA has been about .85 better on the road. But, of course, home ERA is usually better. Here are the differentials the last three years for all of baseball: .46-.54-.24.
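The per-nine-inning rates can be worked out from the raw totals with a few lines of Python. This is just a sketch with names of my own choosing. It reproduces the H/9IP figures listed above and applies the same formula to HR, BB and SO.

```python
def per_nine(count, innings):
    """Rate per nine innings pitched."""
    return 9 * count / innings


# Totals from the lists above (265.67 IP is 265 and two-thirds innings).
home = {"IP": 265.67, "H": 254, "HR": 20, "BB": 103, "SO": 233}
away = {"IP": 291.0, "H": 201, "HR": 13, "BB": 119, "SO": 287}

rates = {}
for label, line in (("Home", home), ("Away", away)):
    rates[label] = {
        stat: per_nine(line[stat], line["IP"])
        for stat in ("H", "HR", "BB", "SO")
    }
    summary = ", ".join(f"{s}/9 {v:.2f}" for s, v in rates[label].items())
    print(f"{label}: {summary}")
```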

Friday, July 22, 2011

Not all no-decisions are created equal: Evaluating a little-examined pseudo statistic

That was the title of Gilbert Martinez's presentation at the recent SABR convention. Gilbert is president of the Rogers Hornsby Chapter of SABR in Austin, TX.

Click here to get the PowerPoint slides. It has a lot of neat graphs.

Here is the abstract:

"In 2009, two pitchers recorded 16 no-decisions. The Houston Astros’ Roy Oswalt set a franchise record for no-decisions and was 8-6 in 30 starts. The L.A. Dodgers’ Randy Wolf was 11-7 in 34 starts.

Research shows that 2011 Hall of Fame Inductee Bert Blyleven holds the record for most no-decisions in a season with 20 after a 12-5 record in 37 starts with the Pittsburgh Pirates in 1979.

While Oswalt and Wolf didn’t set a record, they were among the most in a single season and they did receive some attention by the media and fans. Little analysis has been done to understand the nature of these no-decisions. Statistics abound in baseball, and especially with pitchers. Statistics capture games started, wins, losses, saves, earned run average, pitch counts, walks, hit batsmen, ground-ball outs, fly-ball outs, innings pitched, homeruns given up, strikeouts per nine innings, and so on. This project will focus on starting pitchers and no-decisions recorded as a result of starting a game.

The purpose of this project is twofold: 1) to evaluate the most no-decisions by a starting pitcher in a single season and in a career and 2) to determine which pitchers with the most no-decisions were unlucky or lucky.

A pitcher would be considered unlucky if he leaves the game with a lead, only to watch his team’s bullpen lose the game in the late innings. This would be considered a positive no-decision (positive because it suggests an effective pitcher). Likewise, a lucky pitcher would be one who leaves the game with his team trailing, only to be bailed out by a potent offense, thereby taking him off the hook. This would be considered a negative no-decision. If he leaves with the game tied, this would be a neutral no-decision.

These no-decisions are not equal, and some pitchers are more valuable to their teams than others. In other words, some pitchers saddled with numerous no-decisions but who have more positive no-decisions than negative no-decisions are more worthy of our sympathy than those with more negative no-decisions compared to positive no-decisions.

Initial review of the literature shows little on this subject, so it appears this research project would contribute to our understanding of lucky and unlucky pitchers in the context of no-decisions. In fact, this statistic is not readily recorded on Baseball-Reference.com.

The methodology includes running a formula in a pitchers database to determine no-decisions: (Games started) minus (wins + losses) = no-decisions. However, those results would need further review to eliminate no-decisions that come from relief appearances. The remaining results would be analyzed to determine the circumstances in which the pitcher was given the no-decision: Was his team leading or losing when he was removed? Was he lucky in receiving the no-decision or was he unlucky?

The result would be the number of positive, negative and neutral no-decisions, shedding new light on the pitcher’s effectiveness."
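The counting rule and the three categories in the abstract can be sketched in a few lines of Python. This is my own illustration, not Gilbert's actual code. The function names and the idea of passing in the score at the time the starter left are assumptions.

```python
def no_decisions(games_started, wins, losses):
    """No-decisions = games started - (wins + losses).

    Caveat from the abstract: this is only right if every win and loss
    came in a start, so relief decisions have to be removed first.
    """
    return games_started - (wins + losses)

def classify_no_decision(team_runs_at_exit, opp_runs_at_exit):
    """Label a no-decision by the score when the starter left the game."""
    if team_runs_at_exit > opp_runs_at_exit:
        return "positive"   # left with a lead: unlucky
    if team_runs_at_exit < opp_runs_at_exit:
        return "negative"   # left trailing: lucky
    return "neutral"        # left with the game tied

# Seasons mentioned in the abstract.
print(no_decisions(30, 8, 6))    # Oswalt, 2009
print(no_decisions(34, 11, 7))   # Wolf, 2009
print(no_decisions(37, 12, 5))   # Blyleven, 1979
print(classify_no_decision(3, 2))
```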

Wednesday, July 20, 2011

Has Jake Peavy Had Bad Luck This Year?

His ERA is 5.19. The league average is 3.87. Yet here are his AVG-OBP-SLG allowed:

.262-.281-.341

The league averages are .252-.319-.392

So he appears to be much better than the league average in OBP and SLG allowed.

He is getting hit pretty hard with runners on base (ROB). Here are his AVG-OBP-SLG allowed in those cases:

.344-.342-.439

Now with runners in scoring position (RISP):

.370-.347-.508

It is not the case that he can't pitch in the clutch. For his entire career, his AVG-OBP-SLG are .233-.271-.336. With ROB, they are .235-.275-.313 and with RISP they are .240-.283-.303.

FanGraphs shows his fielding independent ERA (FIP) at 3.06. That is a stat that tries to predict a pitcher's ERA based only on strikeouts, walks and HRs. His xFIP is 3.49. That makes a further correction by giving all pitchers the same HR rate on fly balls (or something like that). They have an even more sophisticated stat called SIERA, which projects him at 3.57. Any way you look at it, his actual ERA of 5.19 is very unlucky.
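For anyone curious how a fielding independent ERA is put together, here is a minimal sketch of the commonly published FIP formula. The inputs below are made-up numbers, not Peavy's, and the constant of 3.10 is an assumption (it is set each season so that league FIP matches league ERA). FanGraphs' own calculation may differ in details.

```python
def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding independent pitching, using the commonly published weights.

    ASSUMPTION: constant=3.10 is a placeholder; the real value varies by season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant

# HYPOTHETICAL pitcher line, for illustration only.
example = fip(hr=10, bb=30, hbp=5, so=100, ip=100.0)
print(round(example, 2))
```

Only HRs, walks, hit batsmen and strikeouts go in, which is why a pitcher who is hit hard with runners on can have an ERA far above his FIP.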

Monday, July 18, 2011

Pitchers Duels That Turned Into Slugfests (and Vice Versa)

This is a guest post by Tom Ruane, based on a post to the SABR list.

While watching the two teams fail to score for most of the evening during the recent SABR outing to Dodger Stadium, I got to wondering about the highest scoring games that began with scoreless inning streaks. Here is the list since 1918:



And here is the flipside, the most runs scored in a game that ended with the longest scoreless streak:



Tom Ruane, a computer programmer in Poughkeepsie, N.Y., is a member of Retrosheet's board of directors. He has published articles in "The Baseball Research Journal" and "By The Numbers." He won SABR's highest honor, the Bob Davids Award, in 2009.

Saturday, July 16, 2011

What Is Konerko Doing Differently? His OPS+ Is Up, His Strikeout Rate Is Down And So Is His GDP Rate

The table below shows these stats over the course of his career. Data from Baseball Reference.


These last two years his OPS+ has been well above anything he ever did before. He has his lowest strikeout rate in 8 years and his lowest GDP rate in 6 years (strikeout rate is K/PA with IBBs taken out; GDP rate is GDPs divided by opportunities). He is making more contact, which should make more GDPs a possibility, but he is avoiding them more than in the past (see what I say below as to how his low 2005 GDP rate is another example of how lucky the Sox were that year).
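The two rates can be written out as a short Python sketch. I am assuming "IBBs taken out" means they are subtracted from plate appearances. The season line below is a made-up placeholder, not Konerko's actual totals.

```python
def strikeout_rate(strikeouts, plate_appearances, intentional_walks):
    """K per PA, with intentional walks removed from the PA total (assumed)."""
    return strikeouts / (plate_appearances - intentional_walks)

def gdp_rate(gdp, opportunities):
    """Double plays grounded into per double play opportunity."""
    return gdp / opportunities

# HYPOTHETICAL season line, for illustration only.
print(round(strikeout_rate(strikeouts=100, plate_appearances=650,
                           intentional_walks=10), 4))
print(round(gdp_rate(gdp=9, opportunities=150), 4))
```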

The chart below shows Konerko's OPS+ over time.



A regression with OPS+ as the dependent variable and year as the independent variable yields an r-squared of .4, and the trend line is positive. He did not peak before 30 like most guys and then trend downwards.
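A simple regression like this can be reproduced without any special software. The sketch below fits an ordinary least squares line in plain Python and reports the slope and r-squared. The year and OPS+ values are placeholders I invented, not Konerko's numbers, so the output will not match the .4 above.

```python
def linear_fit(xs, ys):
    """Ordinary least squares fit of y on x: returns slope, intercept, r-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# HYPOTHETICAL data, for illustration only.
years = [2001, 2002, 2003, 2004, 2005, 2006]
ops_plus = [100, 110, 95, 120, 115, 130]

slope, intercept, r_squared = linear_fit(years, ops_plus)
print("trend per year:", round(slope, 2))
print("r-squared:", round(r_squared, 2))
```

A positive slope means OPS+ is trending upward with age, which is the unusual part of his career.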

Now for the strikeout rate and the GDP rate.



Except for that wild fluctuation in the middle of his career, his GDP rate seems to be trending downward. He has always been slow but his GDP rate is not rising as he ages.

I think it is interesting that his GDP rate took such a big dip in 2005. He only grounded into 9 double plays that year. The year before it was 23 and the next year it was 25. Hitting into 10-15 fewer DPs must have really helped the Sox that year. This may be another way they were lucky that year. The Sox had to go down to the last week of the season before they clinched a playoff spot. It was not a sure thing. They beat their main competitor, the Indians, something like 14 out of 19 that year and won a bunch of 1-run games against them. Konerko not hitting into DPs must have helped quite a bit. Click here to read about a bunch of other ways I have written about that the Sox were lucky that year. For one, they got great years out of relievers Politte and Cotts, who did not do much before or after that season.