Tuesday, July 26, 2011

Is Ryan Howard A Clutch Homerun Hitter?

This was inspired by a discussion at Baseball Think Factory about an article at Fangraphs called Ryan Howard and the RBI by Steve Slowinski (there is no shortage of great Polish baseball bloggers).

Some people say that Howard is a good "RBI guy" and that he is paid to drive in runs, so maybe his overall stats and/or his OBP don't matter that much. But a good "RBI guy" would tend to hit better with runners on. In my opinion that means a higher AVG with runners in scoring position (RISP) than he normally gets and a higher SLG with runners on base (ROB) than he normally gets.

But this year Howard ranks only 47th in all of baseball in SLG with runners on base (.463), among hitters with 150+ PA in that situation. Teixeira leads with .667.

Howard ranks 20th in all of baseball in AVG with runners in scoring position (.304), among hitters with 100+ PA in that situation. Votto leads with .424.

So overall, nothing special on Howard's part.

But somehow Howard over his career has managed to hit better with ROB and RISP. His AVG/SLG with none on is .266/.528. With ROB it is .285/.593 and .281/.561 with RISP. The one difference that seems pretty big is the SLG with ROB.

From 1991-2000, ROB SLG was .422 in all of baseball and .411 with none on. Since 2006, it looks like SLG with ROB is about .009 higher than with none on. So Howard is way above that, at .065 higher (.593 vs. .528). Some of that might be because he is a lefty. But whether it is luck or some real clutch talent is hard to say.

So I thought I would try a statistical test on his HR% with none on (NONE) vs. his HR% with ROB. Here we calculate something called a "Z-score." To be significant at the 5% level (meaning there is less than a 5% chance of getting a difference this large between the two HR%s by luck alone) the Z-score has to be at least 1.96 in absolute value (plus or minus, since a guy could do worse with ROB, the clutch situation I am looking at). HR% will be highly correlated with SLG (also, it does not look like Howard hits very many more 2Bs or 3Bs with ROB than with NONE). Some technical details on this are below.

Howard's HR% with NONE in his career is 6.857% while with ROB it is 8.221%. That may not seem like a big difference, but the ROB HR% is 19.9% higher (8.221/6.857 = 1.199). The question is whether or not it is statistically significant. This is where the Z-score comes in.

The Z-score takes into account the number of ABs and the HR% in each situation (ROB, NONE). It also takes into account the normal major league difference (I used +.0009, the approximate difference from 2007-10). This is important because if the normal difference in HR% were .01364, then Howard would not be clutch at all, since that is the difference between his two percentages (.08221 - .06857). When I did this calculation for Howard, I got a Z-score of 1.655. That is significant at about the 10% level. I guess I would like to see it at the 5% level before considering him clutch in hitting HRs with ROB. But even if his Z-score had reached 1.96 or more, it is possible he got there by luck and not skill, because 2.5% of hitters will have a Z-score of +1.96 or more by chance.


NORMAL DIFFERENCE FOR CL = +.0009 (which is added in this case)



(FROM PETE PALMER, BY THE NUMBERS 3/90. He called it a “pooled” standard deviation)

In a normal distribution, 5% of the players would have a Z-score of at least +1.96 or less than –1.96.
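The test described above can be sketched in Python as a two-proportion Z-test with a pooled standard deviation. This is only a sketch under assumptions: the pooled form below is the textbook version and may differ in detail from the formula credited to Palmer, the normal difference is subtracted from the ROB-minus-NONE gap, and the AB and HR counts are hypothetical placeholders chosen to land near the 6.857% and 8.221% rates, not Howard's actual career totals.

```python
from math import sqrt

def clutch_z_score(hr_none, ab_none, hr_rob, ab_rob, normal_diff=0.0009):
    """Z-score for the HR% gap (ROB minus NONE) beyond the normal league gap."""
    p_none = hr_none / ab_none
    p_rob = hr_rob / ab_rob
    # Pooled HR% across both situations (textbook two-proportion form).
    pooled = (hr_none + hr_rob) / (ab_none + ab_rob)
    # Standard deviation of the difference between the two HR%s.
    sd = sqrt(pooled * (1 - pooled) * (1 / ab_none + 1 / ab_rob))
    return (p_rob - p_none - normal_diff) / sd

# Hypothetical counts, NOT Howard's actual career totals.
z = clutch_z_score(hr_none=137, ab_none=2000, hr_rob=148, ab_rob=1800)
print(f"Z = {z:.3f}, significant at 5%: {abs(z) >= 1.96}")
```

With real AB and HR splits plugged in, the same function would show whether a hitter's gap clears the 1.96 threshold.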

Sunday, July 24, 2011

Ubaldo Jimenez Home And Away, 2009-2011

I guess he is the subject of trade rumors. Here are some of his numbers





So his ERA has been about .85 better on the road. But, of course, home ERA is usually better. Here are the differentials the last three years for all of baseball: .46-.54-.24.

Friday, July 22, 2011

Not all no-decisions are created equal: Evaluating a little-examined pseudo statistic

That was the title of Gilbert Martinez's presentation at the recent SABR convention. Gilbert is president of the Rogers Hornsby Chapter of SABR in Austin, TX.

Click here to get the PowerPoint slides. They have a lot of neat graphs.

Here is the abstract:

"In 2009, two pitchers recorded 16 no-decisions. The Houston Astros’ Roy Oswalt set a franchise record for no-decisions and was 8-6 in 30 starts. The L.A. Dodgers’ Randy Wolf was 11-7 in 34 starts.

Research shows that 2011 Hall of Fame Inductee Bert Blyleven holds the record for most no-decisions in a season with 20 after a 12-5 record in 37 starts with the Pittsburgh Pirates in 1979.

While Oswalt and Wolf didn’t set a record, they were among the most in a single season and they did receive some attention by the media and fans. Little analysis has been done to understand the nature of these no-decisions. Statistics abound in baseball, and especially with pitchers. Statistics capture games started, wins, losses, saves, earned run average, pitch counts, walks, hit batsmen, ground-ball outs, fly-ball outs, innings pitched, homeruns given up, strikeouts per nine innings, and so on. This project will focus on starting pitchers and no-decisions recorded as a result of starting a game.

The purpose of this project is twofold: 1) to evaluate the most no-decisions by a starting pitcher in a single season and in a career and 2) to determine which pitchers with the most no-decisions were unlucky or lucky.

A pitcher would be considered unlucky if he leaves the game with a lead, only to watch his team’s bullpen lose the game in the late innings. This would be considered a positive no-decision (positive because it suggests an effective pitcher). Likewise, a lucky pitcher would be one who leaves the game with his team trailing, only to be bailed out by a potent offense, thereby taking him off the hook. This would be considered a negative no-decision. If he leaves with the game tied, this would be a neutral no-decision.

These no-decisions are not equal, and some pitchers are more valuable to their teams than others. In other words, some pitchers saddled with numerous no-decisions but who have more positive no-decisions than negative no-decisions are more worthy of our sympathy than those with more negative no-decisions compared to positive no-decisions.

Initial review of the literature shows little on this subject, so it appears this research project would contribute to our understanding of lucky and unlucky pitchers in the context of no-decisions. In fact, this statistic is not readily recorded on Baseball-Reference.com.

The methodology includes running a formula in a pitchers database to determine no-decisions: (Games started) minus (wins + losses) = no-decisions. However, those results would need further review to eliminate no-decisions that come from relief appearances. The remaining results would be analyzed to determine the circumstances in which the pitcher was given the no-decision: Was his team leading or losing when he was removed? Was he lucky in receiving the no-decision or was he unlucky?

The result would be the number of positive, negative and neutral no-decisions, shedding new light on the pitcher’s effectiveness."
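The formula and the three categories in the abstract can be sketched in Python. This is an illustration, not Martinez's actual database query: the function names are made up, and the count assumes every win and loss came in a start, which the abstract itself says needs further review.

```python
def no_decisions(games_started, wins, losses):
    """Starts that ended without a win or a loss for the starter."""
    return games_started - (wins + losses)

def classify_no_decision(team_runs, opp_runs):
    """Label a no-decision by the score when the starter left the game."""
    if team_runs > opp_runs:
        return "positive"   # left with a lead (unlucky)
    if team_runs < opp_runs:
        return "negative"   # left trailing (lucky)
    return "neutral"        # left with the game tied

# Season lines quoted in the abstract: (starts, wins, losses).
seasons = {
    "Oswalt 2009": (30, 8, 6),
    "Wolf 2009": (34, 11, 7),
    "Blyleven 1979": (37, 12, 5),
}
for name, (gs, w, l) in seasons.items():
    print(name, no_decisions(gs, w, l))
```

The three season lines give 16, 16 and 20 no-decisions, matching the totals in the abstract.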

Wednesday, July 20, 2011

Has Jake Peavy Had Bad Luck This Year?

His ERA is 5.19. The league average is 3.87. Yet here are his AVG-OBP-SLG allowed


The league averages are .252-.319-.392

So he appears to be much better than the league average in OBP and SLG allowed.

He is getting hit pretty hard with runners on base (ROB). Here are his AVG-OBP-SLG allowed in those cases


Now with runners in scoring position (RISP)


It is not the case that he can't pitch in the clutch. For his entire career, his AVG-OBP-SLG are .233-.271-.336. With ROB, they are .235-.275-.313 and with RISP they are .240-.283-.303.

Fangraphs shows his fielding independent ERA at 3.06. That is a stat that tries to predict a pitcher's ERA just based on strikeouts, walks and HRs. His xFIP ERA is 3.49. That makes a further correction by giving all pitchers the same HR rate on flyballs (or something like that). They have an even more sophisticated stat called SIERA, which projects him to have 3.57. Any way you look at it, his actual ERA of 5.19 is very unlucky.

Monday, July 18, 2011

Pitchers Duels That Turned Into Slugfests (and Vice Versa)

This is a guest post by Tom Ruane, based on a post to the SABR list.

While watching the two teams fail to score for most of the evening
during the recent SABR outing to Dodger Stadium, I got to wondering
about the highest scoring games that began with scoreless inning
streaks. Here is the list since 1918:

And here is the flipside, the most runs scored in a game that ended
with the longest scoreless streak:

Tom Ruane, a computer programmer in Poughkeepsie, N.Y., is a member of Retrosheet's board of directors. He has published articles in "The Baseball Research Journal" and "By The Numbers." He won SABR's highest honor, the Bob Davids Award, in 2009.

Saturday, July 16, 2011

What Is Konerko Doing Differently? His OPS+ Is Up, His Strikeout Rate Is Down And So Is His GDP Rate

The table below shows these stats over the course of his career. Data from Baseball Reference.

These last two years his OPS+ has been well above anything he ever did before. He has his lowest strikeout rate in 8 years and his lowest GDP rate in 6 years (strikeout rate is K/PA with IBBs taken out; GDP rate is GDPs divided by opportunities). He is making more contact, which should make more GDPs a possibility, but he is avoiding them more than in the past (see what I say below as to how his low 2005 GDP rate is another example of how lucky the Sox were that year).

The chart below shows Konerko's OPS+ over time.

A regression with OPS+ as the dependent variable and year as the independent variable yields an r-squared of .4, and the trend line is positive. He did not peak before 30 like most guys and then trend downwards.
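A regression like this can be sketched in plain Python with ordinary least squares. The function name is made up for illustration, and the data points are hypothetical placeholders, NOT Konerko's actual OPS+ figures (the table did not survive here), so the printed slope and r-squared will not match the .4 reported above.

```python
def linear_trend(xs, ys):
    """Return (slope, intercept, r_squared) for a least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable regression, r-squared is the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical placeholder values, NOT Konerko's actual OPS+ figures.
years = [1, 2, 3, 4, 5, 6]
ops_plus = [105, 118, 110, 125, 116, 132]
slope, intercept, r2 = linear_trend(years, ops_plus)
print(f"slope = {slope:.2f} OPS+ points per year, r-squared = {r2:.2f}")
```

A positive slope means OPS+ is trending up with age, and r-squared says how much of the year-to-year variation the trend line accounts for.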

Now for the strikeout rate and the GDP rate.

Except for that wild fluctuation in the middle of his career, his GDP rate seems to be trending downward. He has always been slow but his GDP rate is not rising as he ages.

I think it is interesting that his GDP rate took such a big dip in 2005. He only grounded into 9 double plays that year. The year before it was 23 and the next year it was 25. Hitting into 10-15 fewer DPs must have really helped the Sox that year. This may be another way they were lucky that year. The Sox had to go down to the last week of the season before they clinched a playoff spot. It was not a sure thing. They beat their main competitor, the Indians, something like 14 out of 19 times and won a bunch of 1-run games against them. Konerko not hitting into DPs must have helped quite a bit. Click here to read about a bunch of other ways the Sox were lucky that year that I wrote about. For one, they got great years out of relievers Politte and Cotts, who did not do much before or after that season.

Thursday, July 14, 2011

All-Star Game Oddity?

I wonder if this is not that strange. The first 3 NL pitchers used in the game are former ALers (Halladay, Lee, Clippard). The first 4 hits by the AL were by former NLers (Adrian Gonzalez, Bautista, Josh Hamilton, and Adrian Beltre). The first two hits by the NL were by former ALers (Beltran and Berkman).

With free agency, maybe something like this has happened before. Here are the former NLers who played on the AL team:

Adrian Gonzalez
Miguel Cabrera
Jose Bautista
Carlos Quentin
Josh Hamilton
Adrian Beltre
Paul Konerko
Chris Perez

Here are the former ALers who played on the NL team:

Brandon Phillips
Carlos Beltran
Lance Berkman
Matt Holliday
Scott Rolen
Roy Halladay
Cliff Lee
Tyler Clippard
Jair Jurrjens

Tuesday, July 12, 2011

If It's The Year Of The Pitcher It Is Even More The Year Of The Phillies Pitcher

You might have heard of these guys. Halladay. Hamels. Lee. Here is where they rank in WAR among NL pitchers this year:

1. Halladay (PHI) 4.9
2. Hamels (PHI) 4.4
3. Jurrjens (ATL) 4.2
4. Lee (PHI) 3.8
5. Kershaw (LAD) 3.2
Chacin (COL) 3.2
7. Cueto (CIN) 2.9
8. Cain (SFG) 2.8
Vogelsong (SFG) 2.8
Kennedy (ARI) 2.8
Hanson (ATL) 2.8

(data from Baseball Reference).

But even more impressive is that these three are in the top 5 of all players in the NL:

1. Kemp (LAD) 5.7
2. McCutchen (PIT) 5.1
3. Halladay (PHI) 4.6
4. Hamels (PHI) 4.5
5. Lee (PHI) 4.3
6. Braun (MIL) 4.2
7. Reyes (NYM) 4.1
8. Jurrjens (ATL) 3.9
9. Votto (CIN) 3.7
10. Kershaw (LAD) 3.6

The last NL team to have 3 pitchers in the top 5 in WAR among all players was the 1925 Reds. (Sean Forman found and sent me the list of all teams that had 3 or more in the top 10 just among pitchers and I went through those teams to see how they did among all players). Going through Baseball Reference year by year, it seems like 3 pitchers from one team in the top 10 in WAR among all players is very rare.

The 1966 Indians and the 1942 Tigers each had 5 in the top 10 among pitchers but none among all players.

The table below shows where these three Phillies pitchers rank in the NL in key stats so far this year.

Lee's rank of 32nd in HR per 9 IP puts him in the top half of the 65 pitchers with 80+ IP. He gives up .79 HR per 9 IP while the league average is .877. Also note that the Phillies' park gives up about 4% more HRs than average over the years 2008-10 (from the Bill James Handbook).

Saturday, July 9, 2011

The art of fiction is dead-Again

Derek Jeter got his 3000th hit today, only the 28th player to do so. It was a HR, at home, on what looked like a beautiful afternoon in NY, 84 degrees according to Yahoo Sports. He also went 5-for-5, the first player ever to do so in the game in which he got his 3000th hit (according to mlb.com). Then he drove in the go-ahead run, which proved to be the winning run, in the 8th inning. And it was against one of the other contenders in the AL East, the Tampa Bay Rays.

Here is what the great sports writer Red Smith said about Bobby Thomson's pennant winning HR in 1951:
"Now it is done. Now the story ends. And there is no way to tell it. The art of fiction is dead. Reality has strangled invention. Only the utterly impossible, the inexpressibly fantastic, can ever be plausible again"
See Red Smith on Baseball at the excellent Baseball Almanac site.

Friday, July 8, 2011

Angels Call Up Mike Trout

A lot of people are mentioning this on blogs. David Pinto mentioned that he is right up there with Bryce Harper as the top prospect in baseball.

Click here to see his minor league stats

Click here to see the Texas League batting leaders

You can sort them by different stats. He is 7th in the league in OPS. He is 2nd in SBs with 28 (just 3 behind the leader) and has only been caught 8 times. He is 5th in AVG and 4th in OBP. 8th in SLG. 2nd in triples (he bats right-handed). He will turn 20 on Aug. 7. Pretty impressive showing at AA.

If you look at all of his minor league stats (including 2009 and 2010) you see lots of 3Bs, SBs (with good percentages) and high OBPs. He hits a lot better in away games this year. Does anyone know if he plays in a tough hitter's park?

Wikipedia says "...Trout was named 2010 J.G. Taylor Spink Award as the Topps/Minor League Player of the Year. At just 19 years and two months, Trout is the youngest player ever to win this award."

Hitting Picks Up In July, But Only Slightly

Here are the OPS levels in the NL by month starting in April:


For the whole season, the NL has .704. For all of last year it was .723. In the first half of last year it was .729.

Now the AL:


For the whole season, the AL has .719. For all of last year it was .734. In the first half of last year it was .741.

So quite a drop off for both leagues this year.

Thursday, July 7, 2011

Yes, It Is Easy To Explain Why McCutchen Did Not Make The All-Star Team

It seems like a lot of people are wondering how a guy who is 2nd in WAR in the NL at both Baseball Reference and Fangraphs is not on the team. Here is why:

1. His value comes from non-traditional stats. He has a .389 OBP and 1.3 in defensive WAR (2nd in the league). The sabermetric revolution has only had so much influence, and a lot of people in baseball, including players and managers, don't know about these stats or don't think they are useful.

2. He has done poorly against the Giants, who are managed by the guy who picks the reserves, Bruce Bochy. The Giants have only played the Pirates 3 times this year, back in late April and McCutchen went 0 for 13. He has only a .180 career AVG with just 1 HR in 61 ABs against the Giants. So maybe that is why Bochy does not recognize how good he is.

3. He is not a standout in the traditional stats. McCutchen is not in the top 10 in HRs, RBIs, AVG or SBs. He has not won a Gold Glove and is only in his 3rd year. He has never batted .300, hit 30 HRs or had 100 RBIs. So I guess he just does not have a great reputation (even though he should).

4. He also got off to a slow start, batting just .219 in April. Then he hit .275 in May and .347 in June (not witnessed by Bochy). So his name has not been on people's minds the whole season.

I could throw in that he plays in a small market for a team that has not had a winning season since 1992 and has finished last in its division the last 4 years.

Wednesday, July 6, 2011

Is AROD In Serious Decline This Year?

Last week on the ESPN2 show "First Take," one of the commentators said something like "AROD is in serious decline this year." Those may not have been the exact words, but they were words to that effect.

Well, AROD has a 131 OPS+ this year, while last year it was 123. Even in 2009, it was 138, so this year is not too much worse than that. He did have 176 in 2007 and 150 in 2008. So any decline he had was before this year. And he is 35 (and turns 36 on 7-27). So this is not surprising.

He is also 7th in the league in WAR and the last year he was in the top 10 was in 2008. So the commentators of "First Take" could just as easily have said he was having a good comeback season. But they never mentioned WAR or OPS or any stats like that.

This brings me to my point: There is still not much use of sabermetrics in the mainstream media. Some complain about the new statistics ruining baseball. But the mass media, which a lot more people are exposed to than stats blogs, keeps saying things that don't make sabermetric sense. So no, the new stats are not taking over and ruining baseball. They have made only a slight dent.

When the show discussed who the Yankees MVP was so far this year, they said Mark Teixeira. But here are Yankee leaders in WAR:

Curtis Granderson 3.8
Alex Rodriguez 3.3
Brett Gardner 2.8
Robinson Cano 2.5
Nick Swisher 2.5
Mark Teixeira 1.8

Teixeira is 2nd on the team in OPS+. But Granderson has 157.

Another example of this is Hawk Harrelson on the game today talking about how the leadoff walk always scores. But no one on national TV or a super station like WGN ever mentions the work of Retrosheet head Dave Smith who showed that leadoff walks only score about as often as leadoff singles. See Leadoff walks.

Great Post By Rich Lederer At Baseball Analysts

Rich has been cited for showing "you can move around the traditional gatekeepers and centers of power" because of his great campaign to get Bert Blyleven elected to the Hall of Fame. See The Declaration of Independents by Rich Lederer.

Tuesday, July 5, 2011

Phillies Send Starting Pitcher To Minors-His ERA Was Too High At 2.20

See Worley, Phils beat Marlins 1-0.

One anonymous source said "The guy clearly was not pulling his weight." Another said "We couldn't allow him to put our season in jeopardy."

If you look at the numbers I posted yesterday, 3 of the top 7 in WAR in the NL were Phillies pitchers (that is among everyone, position players included). This has got to be very rare.

Monday, July 4, 2011

Player Tied For NL Lead In WAR Not On The All-Star Team

Bill Gilbert pointed this out to me. Here are the leaders from Baseball Reference:

1. McCutchen (PIT) 5.1
Kemp (LAD) 5.1
3. Halladay (PHI) 4.6
4. Braun (MIL) 4.3
5. Reyes (NYM) 4.2
6. Hamels (PHI) 4.0
7. Lee (PHI) 3.8
8. Jurrjens (ATL) 3.7
9. Bourn (HOU) 3.5
10. Kershaw (LAD) 3.4

At Fangraphs, McCutchen is 2nd to Jose Reyes. Anyone know how often this happens, that the leader in WAR does not make the All-star team?

The Pirates have one all-star, Hanrahan, a reliever. Maybe Bochy thought that is what the NL needed. But McCutchen is having a better year than Beltran (not to pick on him) at least according to WAR, 5.1-2.7. And McCutchen has been good the last two years and had more WAR last year than Beltran. The Mets have at least one all-star anyway, Reyes. So McCutchen could have easily been picked within the current rules.

McCutchen gets a lot of his value from a high OBP (.393) and his defensive WAR (1.4), which is 2nd in the league. The Giants have only played the Pirates 3 times this year, back in late April, and McCutchen went 0 for 13. He has only a .180 career AVG with just 1 HR in 61 ABs against the Giants. So maybe that is why Bochy does not recognize how good he is.

McCutchen is not in the top 10 in HRs, RBIs or AVG. He has not won a Gold Glove and is only in his 3rd year. He has never batted .300, hit 30 HRs or had 100 RBIs. So I guess he just does not have a great reputation.

His Baseball Reference page shows him with 4.0 WAR last year, which is very good. Yet when I used the Play Index to find the leaders in WAR, it shows him with just 3.7 (good for 26th) among position players.

He also got off to a slow start, batting just .219 in April. Then he hit .275 in May and .347 in June. He is 7th in OPS+ at 150.

Saturday, July 2, 2011

White Sox 31-20 Since Starting 11-22

They had the worst record in baseball on May 6 at 11-22. They were the only team below .400 at that point.

In April, they had a batting OPS of .664 while allowing .739. In May it was .754/.695 and June was .721/.659. So after a bad -.075 in April, they were a positive .059 and .062 the last two months.

My rough formula for winning pct is

Pct = .5 + 1.25*OPSDIFF

So they should have been .575 over May and June while they were actually 30-24 or .556. A .575 pct would give them 31 wins.
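The arithmetic behind that estimate can be sketched in Python using the monthly OPS figures above. The function name is made up for illustration, and averaging May and June with equal weight is an assumption, since the post does not say how the two months were combined.

```python
def expected_win_pct(ops_for, ops_against):
    """Rough winning pct from the OPS differential, per the formula above."""
    return 0.5 + 1.25 * (ops_for - ops_against)

# Monthly (batting OPS, OPS allowed) as given in the post.
monthly_ops = {
    "April": (0.664, 0.739),
    "May": (0.754, 0.695),
    "June": (0.721, 0.659),
}

# Simple (unweighted) average of the May and June differentials.
may_june_diff = sum(
    monthly_ops[m][0] - monthly_ops[m][1] for m in ("May", "June")
) / 2
may_june_pct = 0.5 + 1.25 * may_june_diff
games = 30 + 24  # actual May-June record was 30-24
print(f"expected pct = {may_june_pct:.3f}, expected wins = {may_june_pct * games:.1f}")
```

Over 54 games the expected win total rounds to 31, one more than the 30 they actually won.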

Through May 6, they were outscored 158-123, slightly more than 1 run per game. Since then, including beating the Cubs today 1-0, they have outscored their opponents 219-186. That gives them a Pythagorean winning pct of .581 while 31-20 is .608.
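The Pythagorean figure can be checked with a short Python sketch. The post does not say which exponent was used; the classic exponent of 2 is assumed here, and it reproduces the .581 figure.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2):
    """Pythagorean winning pct; exponent 2 is the classic Bill James form."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Runs since May 6 and the record over that stretch, as given in the post.
expected = pythagorean_pct(219, 186)
actual = 31 / (31 + 20)
print(f"Pythagorean = {expected:.3f}, actual = {actual:.3f}")
```

The gap between the two numbers is a little over one win across 51 games.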

So Far, Phil Humber Has Resembled Greg Maddux

A pleasant surprise for the White Sox, although it is only for half a season. Humber is not striking out a lot of guys, but he is not walking many and he is not giving up many HRs. That is what Maddux did. Humber's stats are close to Maddux's career averages.

The table below shows how Humber this year compares to Maddux's career numbers. Of course, Maddux pitched until he was 42. But even if I only went up to age 37, Humber would still be close and Maddux had a 143 ERA+, just a bit better than what Humber has this year.

Friday, July 1, 2011

This Year's Low Scoring In Recent Historical Perspective

The NL OPS in each of the first 3 months has been: .709-.702-.699. The declines seem as surprising as the low levels. And in June the NL has had some games with the DH. So it might have been lower without the inter-league games. The AL has gone .713-.720-.719.

David Pinto of "Baseball Musings" had a good post a couple of days ago about hitting this year vs. last year. See Halfway Point. He shows how HRs, hits, etc. have fallen.

The graph below shows AL runs per game from 1960-2011

It is pretty clear that this year's level (4.26) is low. The last time the AL was lower than 4.26 was in 1981, when it was 4.07. That was a strike year. Taking that out, we need to go back to 1978, when the average was 4.2.

The graph below shows NL runs per game from 1960-2011

The average this year is 4.1. The last time it was lower was in 1992 when it was 3.88.