Sunday, October 31, 2010

Have The Rangers And Giants Discovered A New (Old) Way To Win?

That is what a recent Wall Street Journal article says. See Hitting Baseballs, Just Not as Far: Giants and Rangers Win With Contact Hitting, Bunts and Baserunning; the 'Lost Arts'. But I don't think that they are doing anything so different from other teams that it helps them score extra runs. Here is an excerpt:

"San Francisco was 17th in runs scored and 13th in slugging percentage this season. But they ranked fifth in strikeouts and third in sacrifice bunts in the National League and fourth in all of baseball in sacrifice hits.

Texas was only ninth in slugging percentage, but the team had the most sacrifice bunts in the American League, the second-most sacrifice flies and the fourth fewest strikeouts. The Rangers were also seventh in the majors in stolen bases."

Both teams, however, are actually scoring just about the number of runs you would expect based on their OBP and SLG. From 2007-2009, the relationship between runs per game and those stats in MLB was:

R/G = 16.04*OBP + 11.595*SLG - 5.52

The Rangers had an OBP & SLG of .338 & .419. The equation predicts they would score 4.76 runs per game while it actually was 4.86. So just about what you would expect, meaning all those sacrifices and SBs are not making much difference.

The Giants had an OBP & SLG of .321 & .408, projecting to 4.36 runs per game while it was actually 4.3. Just like the Rangers, all these "small ball" strategies are not making much difference. (The equation comes from a linear regression analysis of all 90 teams from 2007-09; the r-squared was .904 and the standard error of the regression was .137 runs per game.)
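The arithmetic behind those two projections can be checked with a short sketch. The coefficients and team figures are the ones quoted above; the function and variable names are only illustrative.

```python
# Coefficients from the 2007-09 regression quoted in the post.
OBP_COEF = 16.04
SLG_COEF = 11.595
INTERCEPT = -5.52


def predicted_runs_per_game(obp, slg):
    """Predict team runs per game from OBP and SLG using the linear fit."""
    return OBP_COEF * obp + SLG_COEF * slg + INTERCEPT


# (OBP, SLG, actual runs per game) as given in the post.
teams = {
    "Rangers": (0.338, 0.419, 4.86),
    "Giants": (0.321, 0.408, 4.30),
}

for name, (obp, slg, actual) in teams.items():
    predicted = predicted_runs_per_game(obp, slg)
    print(f"{name}: predicted {predicted:.2f}, actual {actual:.2f}, "
          f"difference {actual - predicted:+.2f}")
```

Both differences come out smaller than the regression's standard error of .137 runs per game.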

Another regression, based on all teams from 1989-2002, shows the relationship between team winning pct and OPS differential. Here it is:

Pct = .5 + 1.25*OPSDIFF

The Rangers' hitters had an OPS (OBP + SLG) this year of .757 while they allowed an OPS of .709. The Giants had .729 & .683. So the two teams' differentials, respectively, were .048 & .046. The numbers below show each team's predicted pct, and predicted wins, followed by their actual wins in parentheses:

Rangers) .560-90.72 (90)
Giants) .558-90.32 (92)

Each team won just about the number of games expected (each within two of the prediction). There are no extra wins due to using "lost arts." In fact, they have done well by some combination of hitting for power and getting on base and generally preventing their opponents from doing so. This is a time-honored way of winning, as Branch Rickey explained back in 1954. I posted something about that earlier this year. See Scouts vs. Statheads: What Might Branch Rickey Say?.
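Here is a minimal sketch of the winning-percentage calculation, using the OPS figures quoted above. The 162-game season length is an assumption (the post does not state it, but it is consistent with the predicted win totals), and the function names are illustrative.

```python
GAMES = 162  # assumed regular-season length


def predicted_pct(ops_for, ops_against):
    """Winning pct from OPS differential: Pct = .5 + 1.25 * OPSDIFF."""
    return 0.5 + 1.25 * (ops_for - ops_against)


def predicted_wins(ops_for, ops_against, games=GAMES):
    """Convert the predicted winning pct into wins over a season."""
    return predicted_pct(ops_for, ops_against) * games


# (OPS, OPS allowed, actual wins) as given in the post.
teams = {
    "Rangers": (0.757, 0.709, 90),
    "Giants": (0.729, 0.683, 92),
}

for name, (ops, ops_allowed, actual_wins) in teams.items():
    wins = predicted_wins(ops, ops_allowed)
    print(f"{name}: predicted wins {wins:.1f}, actual {actual_wins}")
```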

Friday, October 29, 2010

Great ERAs as season ends

This is a guest post by Clem Comly and Tom Ruane, based on recent posts to SABR-L.

The 2010 Giants' ERA for the Sept./Oct. regular season was 1.91. Cy Morong asked me off-list how unusual a monthly ERA that low is.

So I asked baseball-reference.com's Play Index. Unfortunately, I couldn't specifically ask about the best team ERAs for calendar months.

So I changed the question (perhaps inspired by the Kobayashi Maru). I decided to look ONLY at September combined with October. Baseball-reference.com couldn't answer that question directly, but I asked it to round up the usual suspects. That is, I asked for the lowest OR (opponents' runs) over exactly 26 games for a team at the end of its season.

For those teams after 1919, I could manually look up the monthly splits on the Retrosheet site (and, where necessary, combine the Sept. and October splits to calculate ERA for Sept./Oct.).

It turns out the Giants finished with an extremely good but not record-setting autumn. Based on the results of the query, the odds are very good that the record holder for 1920-2010 is the 1965 Dodgers at 1.50. I suspect that if one worked through 1901-1919, a team from that era, with its easier unearned run rules and more errors, would have the record. I include the 26-game stats (date range, game sequence number range, and OR in the span) below to indicate how much of the Sept./Oct. period the query covered. Also, the speed with which the OR zoomed up indicates the actual record holder is the 1965 Dodgers (a team that didn't make the list would have at least 65 OR in 26 games while '65 LA only had 47, only 72% of 65).



B.Retro indicates a season earlier than the first season for which Retrosheet has monthly splits. Retrosheet has splits for 1920-2009. The 2010 data are from baseball-reference.com.

Here is what Tom Ruane came up with after Clem raised the issue:



[Editor's note: I added the relative ERA figures and they are not necessarily the 10 lowest ever, but all of them are probably near the top. The NL ERA in Sept/Oct this year was 3.72. That gives the Giants a relative ERA of .513, which looks very good by historical standards]
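The relative ERA in the editor's note is simply the team ERA divided by the league ERA for the same period. A minimal sketch with the two figures quoted above (the function name is illustrative):

```python
def relative_era(team_era, league_era):
    """Team ERA as a fraction of league ERA; lower is better."""
    return team_era / league_era


giants_sept_oct = relative_era(1.91, 3.72)
print(f"Giants relative ERA, Sept/Oct 2010: {giants_sept_oct:.3f}")
```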

Clem Comly is the vice-president of Retrosheet and co-chair of SABR's statistical analysis committee while being a member of several other committees. He is also a Phillies fan.

Tom Ruane, a computer programmer in Poughkeepsie, N.Y., is a member of Retrosheet's board of directors. He has published articles in "The Baseball Research Journal" and "By The Numbers." He won SABR's highest honor, the Bob Davids Award, in 2009.

Tuesday, October 26, 2010

Neither Giants Nor Rangers Have Clear Edge According To OBP, SLG

I rated each team using the formula 1.7*OBP + SLG. I did that for their opponents as well. Then I found their differential for the 1st half of the season, the 2nd, Sept/Oct and the post season (each series was weighted by the number of games). The table below has the results (clicking on it will enlarge it).

The Rangers were clearly the superior team in the 1st half, with a .064 differential vs. the Giants' .038. But in the 2nd half the Giants pulled slightly ahead, .069 vs. .058. That is largely the result of incredibly great pitching. They held their opponents to a .297 OBP and a .373 SLG. But in Sept/Oct those numbers were .251 & .292 (their ERA was 1.91, over 29 games!). They almost kept it up in the playoffs, so far with .274 & .298.
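As a rough sketch of how the rating and differential work, the snippet below applies 1.7*OBP + SLG to the opponents' figures quoted above. The Giants' own OBP and SLG for each split are not given in the text, so only the opponents' side is computed; the function names are illustrative.

```python
OBP_WEIGHT = 1.7  # weight on OBP in the post's rating formula


def rating(obp, slg):
    """Offensive rating: 1.7*OBP + SLG."""
    return OBP_WEIGHT * obp + slg


def differential(team_obp, team_slg, opp_obp, opp_slg):
    """A team's rating minus the rating of its opponents."""
    return rating(team_obp, team_slg) - rating(opp_obp, opp_slg)


# Opponents' OBP and SLG against Giants pitching, as quoted in the post.
opponents = {
    "2nd half": (0.297, 0.373),
    "Sept/Oct": (0.251, 0.292),
    "Postseason so far": (0.274, 0.298),
}

for split, (obp, slg) in opponents.items():
    print(f"{split}: opponents' rating {rating(obp, slg):.3f}")
```

The lower the opponents' rating, the larger the differential for any given level of Giants hitting.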

In Sept/Oct, the Giants have a huge edge in the differential, .193 to .052. The Rangers have the big lead so far in the post season, .200 to .069. So no clear winner. It would have been nice if one team had a bigger differential in all cases. I think the Rangers will win, however, because I just don't see how any team can keep up the superhuman pitching the Giants have displayed. One other reason is that the Rangers have probably faced tougher competition, based on the fact that the AL once again won the majority of inter-league games.

Sunday, October 24, 2010

ALCS & NLCS Both End The Same Way

Both the ALCS and the NLCS ended on a called third strike to a batter who has at least one MVP award and one 50 HR season (AROD and Ryan Howard).

Saturday, October 23, 2010

Were The Rangers Actually Better Than The Yankees By The End Of The Season?

That is sort of the question raised by Rob Neyer at Rangers had 'em all the way. One comment caught my interest. It was from the peerless yet eccentric and reclusive "maxbentley." He said:

"For the season, the Yankees had an OPS differential of .065 while the Rangers had .048. In the 2nd half, those were .033 and .043. So the Rangers passed the Yanks. In Sept/Oct, those #'s were .009 & .035. The last month or so the Rangers were playing alot better. Could be the opponents. But it is interesting."

I thought I would break things down just a little differently and use 1.7*OBP + SLG. The table below shows the results. As you can see, the Yankees were much better in the 1st half, but as the season wore on, the Rangers were clearly better. That could possibly be due to the Rangers getting Cliff Lee and getting more playing time from Moreland, as Neyer suggests. The Yankees' differentials were .122, .042 and .017. The Rangers did a better job of maintaining their performance. Their differentials were .064, .058 and .052. It is true that the Yankees probably played a tougher schedule, especially the last 18 games. But even considering that, the Rangers seem to have been a better team in the 2nd half and the last month. A .052-.017 edge the last month looks very big. Another thing occurs to me: Josh Hamilton only played 5 games in Sept./Oct. That held down the team OBP and SLG. So the Rangers might have been better than their differential indicates.

Tuesday, October 19, 2010

Simpsons Throw A Lot Of Great Pitches In The "MoneyBart" Episode, But It's Not A Perfect Game

Spoiler Alert!

It was on about a week and a half ago, so most fans have seen it or heard about it. The episode was hysterical and I am not a Simpsons fan. Maybe it was too funny. The funny bits and one-liners came pretty fast. Bill James and Mike Scioscia are in it.

The premise is that Lisa takes over managing Bart's little league team. She then learns all she can about sabermetrics and decides to teach the players all she can about it. All team strategy and decisions from then on are made according to sabermetrics.

The team starts to climb out of the cellar and closes in on first place. But in a crucial game, Bart hits a home run (a grand slam, I think) to win it for the Isotots. But Lisa had ordered all the players to take pitches to wait for walks (because she learned how important OBP was). So she kicks Bart off the team.

But I think this ignores the fact that sabermetricians like power hitting (not as much as OBP, but we extol the virtues of SLG, too). I don't think there was anything in the episode up to that point that had said how many HRs he hit. Having him get kicked off the team paints a pretty black and white picture, with statsy tactics (taking pitches) only mattering while old-fashioned skills (like power) don't matter at all. That is not what sabermetrics says. In fact, all the great research has shown that HRs are the most valuable event (and people say we have not made a real contribution).

The other thing that was interesting was that one of the books showed a page with a formula for OBP. When the video was paused, that page also showed a graph or chart that had the heading "OPS vs. RISP." I could see if it had been OPS vs. lefties or righties. Or even OPS with RISP. But that made no sense. Maybe they were just trying to see if geeks like me would catch it.

Wednesday, October 13, 2010

The Phillies Roy-Al Pitchers

We might consider both Halladay and Oswalt to be among the royalty in pitching. Some of the data I present below suggests that.

The first thing I did was to find all the pitchers through age 32 with 1500+ IP since 1900 and rank them by RSAA/IP (there were 445 pitchers). RSAA means "runs saved above average." It is from the Lee Sinins Complete Baseball Encyclopedia. It is also park adjusted. This is through 2009. The table below shows the top 10:



The next table simply shows total RSAA.



Then I found the leaders in pitching Wins Above Replacement (WAR) from Baseball Reference. Here are the leaders through age 33 including 2010:



I also constructed a crude fielding independent ERA. I ran a regression with these pitchers (through 2009) where their relative ERA depended on their relative HRs, SO, and BBs. Then I used that regression equation to predict their relative ERA (if I get a chance I will add the results; the r-squared was about .58). Here are the leaders:



So Walter Johnson had an ERA that was 69% better than the league average and he was 69% better at preventing HRs. He was predicted to have an ERA that was 48% better than average (but none of these numbers are park adjusted). Both Roys do well again.

Notice how they are a lot better at preventing HRs and BBs than average but just slightly above average at striking out batters. That is similar to Maddux. In fact, I have created a HR/BB index for pitchers that Maddux did very well on. See Who Was More "Magical" Than Greg Maddux? (Or Pitcher's HR/BB/SO Rating).

I also created a WAR ranking using this crude fielding independent ERA. I divided IP by 9 to get games. Then adjusted every pitcher to a league average of 4 runs per game. That gave Walter Johnson an ERA of 2.74. Those numbers were used to calculate a predicted winning pct using Bill James' "Pythagorean formula." To compare that to a replacement level pitcher, I assumed that would be a .400 pct. So if a pitcher had 200 games and a predicted pct of .600, he would get 120 wins. The replacement would get 80. So the pitcher in question would have a WAR of 40.
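The steps in that paragraph can be sketched as follows. This is only an illustration under stated assumptions: the exponent of 2 is the classic form of the Pythagorean formula (the post does not say which exponent was used), the 1800-inning input is a made-up example, and all function names are illustrative.

```python
LEAGUE_RUNS_PER_GAME = 4.0  # every pitcher is scaled to this league average
REPLACEMENT_PCT = 0.400     # replacement-level winning pct assumed in the post


def games_from_innings(innings_pitched):
    """Convert innings pitched to nine-inning games."""
    return innings_pitched / 9


def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning pct; exponent=2 is an assumption."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return scored / (scored + allowed)


def wins_above_replacement(games, pct, replacement_pct=REPLACEMENT_PCT):
    """Wins at the predicted pct minus wins at the replacement pct."""
    return games * pct - games * replacement_pct


# Worked example from the post: 200 games at a .600 predicted pct.
print(wins_above_replacement(200, 0.600))

# Hypothetical pitcher: 1800 IP with an adjusted ERA of 3.00.
games = games_from_innings(1800)
pct = pythagorean_pct(LEAGUE_RUNS_PER_GAME, 3.00)
print(f"{games:.0f} games, pct {pct:.3f}, "
      f"WAR {wins_above_replacement(games, pct):.1f}")
```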

The two Roys did not rank as highly as they did in the above tables, but they were still pretty good. Halladay was 50th and Oswalt was 68th. That still puts both in the top 15%. Walter Johnson had a predicted pct. of about .685 and a WAR of about 134 to lead all pitchers.

Saturday, October 9, 2010

Lincecum's Amazing Feat On Swinging Strikes

This is a guest post by Dave Smith, based on a message he sent to the SABR list. His research on this was mentioned in the Washington Post.

As most of you know, Tim Lincecum had a remarkable second inning last night in the first game of the NL Division Series. He struck out Alex Gonzalez, Matt Diaz, and Brooks Conrad, all on three swinging strikes. An NPR reporter contacted Lyle Spatz, chair of the SABR record committee, this morning to ask about previous occurrences of such an event. Lyle referred him to me. Here is what I did and what I found:

I checked all games from 1988 through last night since this is the part of our database with nearly complete pitch coverage (there are a small number of games without pitch data early in that period). I also looked at all Dodger games from 1947 through 1964, since we have pitch data for those games as well.

Lincecum is the second man to do this that I found in that sample, as follows:

Tim Lincecum, 10-07-2010, 14 total pitches (5 balls mixed in with the 9 swinging strikes)

Armando Benitez, 8-21-1999, 16 total pitches (7 balls mixed in with the 9 swinging strikes)

As an honorable mention, I found that Jeff Parrett of the Phillies struck out all three batters in one inning on 8-03-1989. He threw 13 pitches: 9 swinging strikes, 1 foul ball and 3 balls.
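To make the criterion concrete, here is a small sketch of the test applied to one half-inning. It is not the query actually run on the Retrosheet data. It assumes each plate appearance is a string of single-letter pitch codes in the style of Retrosheet pitch sequences (S = swinging strike, B = ball, F = foul). The sample sequences are invented to match only the pitch totals reported above, not the real pitch-by-pitch order.

```python
SWINGING = "S"  # swinging strike (assumed encoding)
BALL = "B"      # ball (assumed encoding)


def struck_out_swinging_only(pitches):
    """True if the batter saw exactly three strikes, all swinging,
    with nothing else but balls, and the last pitch was a swinging strike."""
    return (
        pitches.count(SWINGING) == 3
        and set(pitches) <= {SWINGING, BALL}
        and pitches.endswith(SWINGING)
    )


def all_swinging_inning(plate_appearances):
    """True if the inning was three batters, each struck out
    on three swinging strikes."""
    return len(plate_appearances) == 3 and all(
        struck_out_swinging_only(pa) for pa in plate_appearances
    )


# Invented sequences: 14 pitches, 5 balls, 9 swinging strikes.
qualifying = ["BSSS", "BBSBSS", "SBSS"]
# Invented sequences: 13 pitches, 3 balls, 1 foul, 9 swinging strikes.
with_foul = ["SSS", "BSSFS", "BBSSS"]

print(all_swinging_inning(qualifying))
print(all_swinging_inning(with_foul))
```

The single foul ball is what keeps the second inning, like Parrett's, off the list.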

Dave Smith is president of Retrosheet. In 2005, he won SABR's highest honor, the Bob Davids Award. He is also a professor of biology at the University of Delaware.

Thursday, October 7, 2010

And the 2010 Nobel Prize in physics goes to...

for his applied research demonstrating the properties of spherical objects, Roy Halladay!

Wednesday, October 6, 2010

Pitchers Who Had 30 Or More Quality Starts In A Season

This is a guest post by Clem Comly which was originally posted to SABR-L last week.

Felix Hernandez notched quality start #30 for the season yesterday. Mlb.com mentioned he was first to reach 30 since Randy Johnson in 2002. ESPN listed a handful who "recently" (since 1980?) reached 30. Using baseball-reference.com's play index, I was able to look at the period 1920-2010. Hernandez' 2010 season is the fifty-third 30+ QS season.

Looking at 1920-2010, the records for the sum of the QS games but excluding non-QS games and relief appearances:

Most QS: 37 1971 Wilbur Wood (21-10) (honorable mention 36 for 1946 Feller (26-9) and 1966 Koufax (27-6))

Most wins 28 1968 Denny McLain (28-3)

Most wins w/o loss 23 1980 Steve Stone in 24 QS

Most losses 13 1940 "Losing Pitcher" Mulcahy (12-13) and 1920 Rollie Naylor (6-13) [both pitching for a Phila. team]

Looking at just 30+QS seasons, the 1920-2010 records for the sum of the QS games but excluding non-QS games and relief appearances:

Best ERA 0.90 1922 Urban Shocker (19-10) in 30 QS

Worst ERA 2.09 1952 Robin Roberts in 31 QS

Best Winning% .960 1963 Koufax (24-1) in 31 QS

Worst Winning% .556 1972 Blyleven (15-12) in 31 QS

Most wins 28 1968 Denny McLain (28-3)

Fewest wins 13 2010 Felix Hernandez (perhaps 1 more start), 1920-2009 15 1965 Osteen, 1967 Bunning, and 1972 Blyleven.

Most losses 12 1972 Blyleven (15-12) in 31 QS
Fewest losses 1 1963 Koufax (24-1) in 31 QS

Most no decisions: 9 2010 Felix Hernandez (perhaps 1 more start), 1920-2009 8 1986 Mike Scott (17-7 in his 32 QSs)

Looking at 1920-2010 30+QS seasons, the records for all games including non-QS games and relief appearances:

Fewest wins 14 2010 Felix Hernandez (perhaps 1 more start), 1920-2009 15 1965 Osteen (went 0-5 in non-QS games to finish overall at 15-15).

Other comments:

Is Felix's 59 QS in consecutive seasons (2009-10) a record? No. Koufax had 71 in 1965-66 (honorable mention: Wilbur Wood with 70 in 1971-72).

Looking at the distribution of 30+ QS seasons 1920-2009, 1960-9 had 20 while 1970-9 had 15.

Four decades had a single pitcher reach 30 QS in a season:

1930s was 1939 Bucky Walters
1950s was 1952 Robin Roberts
1990s was 1992 Maddux
2000s was 2002 Randy Johnson.

Clem Comly is the vice-president of Retrosheet and co-chair of SABR's statistical analysis committee while being a member of several other committees. He is also a Phillies fan.

Sunday, October 3, 2010

Blue Jays Set Isolated Power Record

Including the last game of the year, their team AVG is .248 while their SLG is .454. That gives them an ISO of .206, beating the record of .204 by the 1997 Mariners. The Blue Jays finish with 257 HRs, tied for 3rd best ever. They are the 14th team to have 240+ HRs and one of 7 of those teams to have 300+ 2Bs.

Their ISO was about 40% better than the league average since .206/.147 = 1.40. That is 6th best ever and 2nd best since 1920, trailing only the fabled 1927 Yankees (don't call the cliche police on me; fabled is the only word I could think of). See my post from May 31 called Blue Jays On Record Power Pace.

They did hit better at home, with an ISO of .233 there and .182 on the road.

If we just looked at their road ISO relative to the league average, it would be .182/.147 = 1.238. So that was 23.8% above the league average and it would have been 34th best between 1920 and 2009. Pretty impressive.
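The isolated power figures in this post follow from two small formulas: ISO is SLG minus AVG, and relative ISO is the team's ISO divided by the league's. A minimal sketch using the numbers quoted above (function names are illustrative):

```python
LEAGUE_ISO = 0.147  # league-average ISO quoted in the post


def isolated_power(avg, slg):
    """ISO = SLG - AVG."""
    return slg - avg


def relative_iso(team_iso, league_iso=LEAGUE_ISO):
    """Team ISO as a multiple of the league-average ISO."""
    return team_iso / league_iso


season_iso = isolated_power(0.248, 0.454)
print(f"Season ISO: {season_iso:.3f}")
print(f"Relative to league: {relative_iso(0.206):.2f}")
print(f"Road ISO relative to league: {relative_iso(0.182):.3f}")
```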

Friday, October 1, 2010

Tim Lincecum's Fluctuating Strikeout-To-Hit Ratio

About a month ago I posted Tim Lincecum's Falling Strikeout-To-Hit Ratio.

Here are his ratios each month this year starting with April

43/22 = 1.95
40/33 = 1.21
34/33 = 1.03
35/42 = .833
27/33 = .818
52/31 = 1.68

In 2009 it was

261/168 = 1.55

In 2008 it was

265/182 = 1.46

For all of this year it is 1.19 = 231/194

The NL average in 2010 is .833
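As a quick check, the monthly strikeout and hit totals can be summed to reproduce the season ratio. The counts are the ones listed above; the month labels are inferred from "starting with April" and the variable names are illustrative.

```python
NL_AVERAGE = 0.833  # 2010 NL strikeout-to-hit ratio quoted in the post

# (month, strikeouts, hits allowed) for Lincecum in 2010.
monthly = [
    ("April", 43, 22),
    ("May", 40, 33),
    ("June", 34, 33),
    ("July", 35, 42),
    ("August", 27, 33),
    ("Sept.", 52, 31),
]

for month, strikeouts, hits in monthly:
    ratio = strikeouts / hits
    side = "above" if ratio > NL_AVERAGE else "at or below"
    print(f"{month}: {ratio:.2f} ({side} the NL average)")

total_strikeouts = sum(k for _, k, _ in monthly)
total_hits = sum(h for _, _, h in monthly)
season_ratio = total_strikeouts / total_hits
print(f"Season: {total_strikeouts}/{total_hits} = {season_ratio:.2f}")
```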

I don't know how important this ratio is. A lot of young flame throwers see their strikeouts fall as they get older. But Lincecum turned things around in Sept., just in time for the Giants. I don't know how he did it, though.