## Sunday, August 31, 2008

### More On The Changing Historical Relationship Between Walks, HBPs and HRs

Last week's post was something I originally posted on the SABR list last year. At that time, someone raised a question about it. Below is the question and how I responded, with a little more research. My basic finding is that the higher HBP rates these days do not seem to be due to pitchers throwing faster.

"Cyril mentioned that current pitchers seem to be more willing to hit batters than pitchers in the past. How about since a lot more pitchers now pitch the ball around 90 MPH, it's harder for batters to get out of the way. Historically, have the pitchers leading the leagues in HB been hard throwers (more Ks) or poor control pitchers (more BBs)?"

I did some analysis on this although it is not exactly what John Lewis suggests. I took the top 500 pitchers in batters faced (seasonal data) from 1960-69 and 1997-2006. I ran a regression in each case in which the HBP rate was the dependent variable and the strikeout rate and the walk rate were the independent variables. Intentional walks were removed.

Here is the regression equation for the 1960s:

HBP = .00387 + .0177*BB + .00186*SO

For the 1997-2006 period it was:

HBP = .005 + .0031*BB + .00486*SO

The r-squared in the first case was just .013 and in the second it was .025. The r-squared tells us what percent of the variation in the dependent variable is explained by the model, so these models explain only about 1 to 3 percent of the variation in HBP rate. That is pretty weak. But the T-values for BBs and SOs in the first case were 2.44 and .44. So the walk rate is statistically significant and the strikeout rate is not. For the second period they were 3.32 and 1.13, so again only the walk rate is significant.
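For anyone who wants to try this kind of regression, here is a rough Python sketch of how the coefficients, T-values and r-squared come out of ordinary least squares. The data is randomly generated just to make the code run (it is not the pitcher data used above), and the function name `ols` and all of the made-up rates are assumptions for illustration.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, t-values, r-squared); the intercept comes first.
    """
    X1 = np.column_stack([np.ones(len(y)), X])      # add the intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    resid = y - X1 @ beta
    n, k = X1.shape
    sigma2 = resid @ resid / (n - k)                # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X1.T @ X1)))  # standard errors
    t_values = beta / se
    r_squared = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, t_values, r_squared

# Hypothetical pitcher seasons: walk, strikeout and HBP rates per batter faced.
rng = np.random.default_rng(0)
n = 500
bb = rng.normal(0.08, 0.02, n)
so = rng.normal(0.15, 0.04, n)
hbp = 0.004 + 0.02 * bb + 0.002 * so + rng.normal(0, 0.003, n)

beta, t_values, r_squared = ols(np.column_stack([bb, so]), hbp)
print("intercept, BB, SO coefficients:", beta)
print("T-values:", t_values)
print("r-squared:", r_squared)
```

A T-value is just the coefficient divided by its standard error, which is why a variable can be significant even when the r-squared for the whole model is tiny.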

In the first period, a one standard deviation increase in BB rate increased the HBP rate by .000392. For the strikeout rate it was .00007. So if a pitcher increases his walk rate by one standard deviation, he increases his HBP rate more than if he increases his SO rate by one standard deviation. For the second period these numbers were .00065 and .00022. So again, the walk rate has a bigger impact.

So all of this suggests that it is worse control in general that increases the HBP rate.

*********************

Now another response to that question

The other day I discussed a regression relating HBP, BBs and SOs. I did that again but I added in HRs, with the idea that a pitcher might be more likely to hit a guy who hit a HR last time up (or the next guy). I again looked at both the 1960s and the last 10 years. Skipping the regression details (except to say the coefficient values and the r-squared values did not change much), the interesting thing I found was that HRs had a negative relationship with HBP in the 1960s but a positive one in the last 10 years. So in the 1960s, a pitcher who gave up more HRs hit fewer batters, but today a pitcher who gives up more HRs hits more batters.

In the 1960s, an increase in HR% of .01 reduced HBP by about .23 per 1000 batters faced. In the last 10 years, the same increase raised HBP by about .33. A 1 standard deviation increase in HR% in the 1960s decreased HBP by .15. In the last 10 years it increased HBP by .24 (again, over 1000 batters). The standard deviation of HR% in the 1960s was .0066. In the last 10 years it was .0075.
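The arithmetic connecting these numbers can be sketched in a few lines of Python. The HR coefficients here are backed out from the rounded per-1000 figures above, so they are approximations rather than the actual regression output, and the function names are just for illustration.

```python
def implied_coef(hbp_change, hr_pct_change=0.01, batters_faced=1000):
    """Back out the HR coefficient from a stated change in HBP."""
    return hbp_change / (hr_pct_change * batters_faced)

def hbp_change(coef, hr_pct_change, batters_faced=1000):
    """Change in HBP over a number of batters faced for a change in HR%."""
    return coef * hr_pct_change * batters_faced

# Approximate coefficients implied by the -.23 and +.33 figures.
coef_1960s = implied_coef(-0.23)
coef_recent = implied_coef(0.33)

# Effect of a 1 standard deviation increase in HR% over 1000 batters faced.
sd_effect_1960s = hbp_change(coef_1960s, 0.0066)
sd_effect_recent = hbp_change(coef_recent, 0.0075)

print(round(sd_effect_1960s, 2), round(sd_effect_recent, 2))
```

Because the inputs are rounded, the results land close to the .15 and .24 figures but need not match them to the last digit.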

The T-value on HRs was not significant for either time period. But maybe the difference in their coefficients could be. Anyone know if you can look at two different regressions and run some kind of a test to see if the difference between coefficients from the regressions is significant?
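One approach that is sometimes used for this, assuming the two regressions come from independent samples, is a z-test: divide the difference between the two coefficients by the square root of the sum of their squared standard errors. Below is a minimal Python sketch. The standard errors are hypothetical placeholders (the HR standard errors are not reported here), the coefficients are rough values backed out from the per-1000 figures above, and the function name `coef_diff_z` is made up for this example.

```python
import math

def coef_diff_z(b1, se1, b2, se2):
    """z-statistic and two-sided p-value for b1 - b2.

    Assumes the two coefficients come from independent samples and that
    the samples are large enough for a normal approximation.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return z, p

# Approximate HR coefficients; the 0.03 standard errors are made up.
z, p = coef_diff_z(0.033, 0.03, -0.023, 0.03)
print("z =", round(z, 2), "p =", round(p, 3))
```

With real standard errors in place of the placeholders, a z of about 2 or more in absolute value would suggest the two coefficients really differ.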

I ran a regression which combined the two periods. There was a dummy variable for time period. It indicates that pitching in the last 10 years instead of the 1960s, holding everything else constant, means 2.5 more HBP per 1000 batters faced. The T-value was 8.98. In other words, highly significant.

I also ran a regression that included the time period dummy variable along with interaction terms, meaning the dummy multiplied by each of the other variables (HRs, BBs, SOs). In this case the dummy for time period was just about zero and not significant. The value of the HR*dummy coefficient was .055 (although the T-value was just 1.53 and about 2 is usually needed for significance). So I think the .055 value means that any given increase in HR% in the last 10 years would make the HBP rate go up by .055 more than in the 1960s. So over 1000 batters faced, if your HR% goes up by .01 (say you give up 10 more HRs) you would hit .55 more batters in the last 10 years than you would have in the 1960s.
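Here is a rough Python sketch of that kind of combined regression, using randomly generated data rather than the actual pitcher seasons. It shows one useful property: when the dummy is interacted with every variable, the HR*dummy coefficient equals the later period's HR coefficient minus the earlier period's. All of the names and simulated rates below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_period(n, hr_slope):
    """Made-up BB, SO, HR and HBP rates for one period (not real data)."""
    bb = rng.normal(0.08, 0.02, n)
    so = rng.normal(0.15, 0.04, n)
    hr = rng.normal(0.025, 0.007, n)
    hbp = 0.004 + 0.02 * bb + 0.002 * so + hr_slope * hr + rng.normal(0, 0.003, n)
    return np.column_stack([bb, so, hr]), hbp

def fit(X, y):
    """Least-squares coefficients, intercept first."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

X_old, y_old = make_period(500, -0.023)   # stands in for the 1960s
X_new, y_new = make_period(500, 0.033)    # stands in for the last 10 years

beta_old = fit(X_old, y_old)              # separate regression, early period
beta_new = fit(X_new, y_new)              # separate regression, later period

# Combined regression: dummy = 1 for the later period, plus dummy * each variable.
X = np.vstack([X_old, X_new])
y = np.concatenate([y_old, y_new])
dummy = np.concatenate([np.zeros(500), np.ones(500)])
X_pooled = np.column_stack([X, dummy, X * dummy[:, None]])
beta_pooled = fit(X_pooled, y)

# Column order: intercept, BB, SO, HR, dummy, dummy*BB, dummy*SO, dummy*HR
hr_interaction = beta_pooled[7]
print("HR*dummy coefficient:", hr_interaction)
print("difference of separate HR coefficients:", beta_new[3] - beta_old[3])
```

The advantage of doing it in one regression is that the T-value on the HR*dummy term directly tests whether the HR relationship changed between the two periods.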