To read the second part go to Which Players Had The Most Surprising Walk Rates? (Part 2).
In Part 1, I looked at walk rates relative to the league average as a function of isolated power (also relative to the league average), the idea being that it is harder to walk a lot if you are not a power hitter.
In Part 2, I also included a variable for height and one for stealing. Height was in inches, and stealing was stolen bases divided by singles + walks + HBP, a sort of frequency. That was also relative to the league average. The idea is that shorter guys have an easier time walking, and guys who steal a lot won't get walked too much if the pitcher can help it. My data source is the Lee Sinins Complete Baseball Encyclopedia. I used all players with 5000+ PAs.
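As a rough illustration of the stealing variable described above, here is a minimal Python sketch. The function names, the career totals, and the league-average figure are all hypothetical, chosen only to show the arithmetic; they are not taken from the data set.

```python
def sb_frequency(sb, singles, walks, hbp):
    """Stolen bases per time reaching first (singles + walks + HBP)."""
    return sb / (singles + walks + hbp)


def relative_to_league(player_value, league_value):
    """Ratio of a player's value to the league average (1.0 = average)."""
    return player_value / league_value


# Hypothetical career totals and league average, for illustration only.
player_freq = sb_frequency(sb=300, singles=1500, walks=1000, hbp=50)
league_freq = 0.05
relative_sb = relative_to_league(player_freq, league_freq)

print(round(relative_sb, 2))  # 2.35, i.e. steals about 2.35 times as often as average
```

A value above 1.0 marks a player who runs more often than the league, which is the kind of hitter a pitcher would rather not put on base.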
But "By The Numbers," the newsletter of SABR's statistical analysis committee, recently published an article by Tom Hanrahan called Which Batter Had the Greatest “Eye”?. He used a different method than I did. One variable we both had was ISO, but he squared it. I had tried taking logs of the variables, but that did not improve the results. So I thought I would redo the regression with all the same variables, except that I would square ISO.
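For readers who want to try this kind of fit themselves, here is a minimal sketch of an ordinary least squares regression with a squared ISO term, using NumPy. Everything in it is an assumption for illustration: the data are synthetic, the generating coefficients are made up, and the software actually used for the regression in this post is not specified.

```python
import numpy as np

# Synthetic, hypothetical inputs: none of these values come from the real data.
rng = np.random.default_rng(0)
n = 200
sb = rng.uniform(0.1, 4.0, n)    # relative stolen-base frequency (1.0 = average)
ht = rng.uniform(66, 78, n)      # height in inches
iso = rng.uniform(40, 200, n)    # relative ISO (100 = average)

# Design matrix: intercept, SB, height, and ISO squared.
X = np.column_stack([np.ones(n), sb, ht, iso ** 2])

# Made-up coefficients and a noise-free response, so the fit can be checked.
true_coef = np.array([200.0, -1.5, -1.8, 0.002])
walks = X @ true_coef

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, walks, rcond=None)
print(np.round(coef, 4))
```

Because the synthetic response has no noise, the fitted coefficients should match the made-up ones; with real player data they would instead carry the usual sampling error.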
Here is the regression equation. Everything is relative to the league average except height.
Walks = 214.25 - 1.51*SB - 1.79*HT + .0016*ISOSQD
The r-squared went up a bit, from .149 to .166. The standard error fell from 30.66 to 30.35 (the variable "Walks" is actually a player's relative walk rate: if you walked 100 times while the average player walked 50 times, your rate is 200, and 100 is average). The table below shows the top 25 in terms of having a walk rate greater than that predicted by the regression equation.
So the model predicted that Thomas would have a walk rate of about 91 while it was actually 219. So he was about 128 above expectations, which gives him the most surprising walk rate. He was not very tall or short at 71 inches, so he had a reasonably large strike zone. Pitchers would not have walked him out of fear, since he had a relative ISO or "isolated power" of 57 (that means he had only 57% of the average level of power). Pitchers might not have minded walking him since his SB rate was only .68 (he only stole 68% as often as the average runner). That would help him get walks. But with the coefficient on SB rate at 1.51, even if he had been Rickey Henderson (whose SB rate was 4.6 times the league average), Thomas' walk rate would only go down about 5.9 (since 1.51*(4.6 - .68) is about 5.9). That would not change things much; he would still be near the top in surprising walk rates.
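The arithmetic for Thomas can be checked with a short Python sketch that plugs his numbers into the regression equation. The function name is hypothetical, and the scales are an assumption inferred from the figures quoted above: SB rate as a ratio (.68), ISO as a percentage of league average (57), and height in inches.

```python
# Coefficients from the regression equation in this post.
INTERCEPT, B_SB, B_HT, B_ISOSQ = 214.25, -1.51, -1.79, 0.0016


def predicted_walk_rate(sb_rel, height_in, iso_rel):
    """Predicted relative walk rate (100 = league average).

    Assumed scales: sb_rel is a ratio (1.0 = average), iso_rel is a
    percentage (100 = average), height_in is in inches.
    """
    return INTERCEPT + B_SB * sb_rel + B_HT * height_in + B_ISOSQ * iso_rel ** 2


thomas = predicted_walk_rate(sb_rel=0.68, height_in=71, iso_rel=57)
surprise = 219 - thomas  # actual rate minus predicted rate

# Same player, but with a Rickey Henderson-like SB rate of 4.6.
henderson_like = predicted_walk_rate(sb_rel=4.6, height_in=71, iso_rel=57)
drop = thomas - henderson_like

print(round(thomas, 1), round(surprise, 1), round(drop, 1))  # 91.3 127.7 5.9
```

Under those assumed scales the sketch reproduces the figures in the text: a prediction of about 91, a surprise of about 128, and a drop of about 5.9 for the Henderson-level stealing rate.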
The overall top 25 did not change much from Part 2. But Ted Williams did fall all the way to 74th. He is now predicted to have a rate of 159 instead of 144. John Olerud rose from 26th to 24th.
Here are the players who led in getting fewer walks than predicted.
This looks pretty close to the list from Part 2.
Here are the leaders in best eye that Hanrahan found.
They are on my list except for McGraw and Fain, who did not get 5000 or more PAs.