Cybermetrics: How Important Was Team OPS Differential In The 1920s?

I took the data from Retrosheet. The regression equation was

Pct = .5 + 1.37*OPSDIFF

where Pct is team winning percentage and the differential is the team's OPS minus the OPS its pitchers allowed. The r-squared was .866 (meaning that 86.6% of the variation in winning pct across teams is explained by the equation) and the standard error was .033. Over 154 games, that is about 5.12 wins. There were 160 observations, one for each team in each season.

The 1927 Yankees had the highest differential, .196. Their hitters had an OPS of .872 while their pitchers allowed only a .676 OPS. The .872 was the highest of the decade, with the 1929 A's 2nd at .844. The .676 they allowed was the 7th lowest of the decade. The next highest differential, which belonged to the 1929 A's, was .123.

The equation predicts the 1927 Yanks to have a pct of .769, far higher than their actual .714. Could they have been even better than we thought?
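As a quick check of that arithmetic, here is a minimal Python sketch that plugs the 1927 Yankees' OPS figures into the single-season equation. The function and variable names are only illustrative labels, not part of the original analysis, and the coefficients are the rounded ones quoted above.

```python
def predicted_pct(ops_diff, slope=1.37, intercept=0.5):
    """Predicted winning pct from the single-season 1920s equation."""
    return intercept + slope * ops_diff

# 1927 Yankees: OPS by their hitters minus OPS allowed by their pitchers
yankees_diff = 0.872 - 0.676
yankees_pred = predicted_pct(yankees_diff)

# Actual pct minus predicted pct (negative means they won less than predicted)
yankees_gap = 0.714 - yankees_pred

print(round(yankees_diff, 3), round(yankees_pred, 3), round(yankees_gap, 3))
```

The gap is negative here, which is just another way of saying the equation expected even more wins than the Yankees actually got.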

I also found the simple average of every team's yearly differential and pct and then ran another regression. Here was the equation:

Pct = .5 + 1.46*OPSDIFF

In this case there were only 16 observations, one for each team. The r-squared was .969 and the standard error was .012, or about 1.87 wins per 154 games. The better r-squared and standard error are probably due to averaging each team's yearly results. That helps smooth out some of the randomness.
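For anyone who wants to try the averaging step on their own data, the sketch below shows one way it could be done in plain Python: group the season rows by team, average each team's differential and pct, then fit an ordinary least-squares line to the averages. This is an assumption about the procedure, not the exact code behind the numbers above. The rows are made-up placeholders (teams "A", "B", "C" placed exactly on a line with slope 1.4), not Retrosheet data, and the helper names are hypothetical.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def average_by_team(rows):
    """rows are (team, ops_diff, pct); returns {team: (avg_diff, avg_pct)}."""
    grouped = defaultdict(list)
    for team, ops_diff, pct in rows:
        grouped[team].append((ops_diff, pct))
    averaged = {}
    for team, seasons in grouped.items():
        avg_diff = sum(d for d, _ in seasons) / len(seasons)
        avg_pct = sum(p for _, p in seasons) / len(seasons)
        averaged[team] = (avg_diff, avg_pct)
    return averaged

# PLACEHOLDER rows for illustration only (not real team seasons)
rows = [
    ("A", 0.10, 0.640), ("A", 0.06, 0.584),
    ("B", -0.02, 0.472), ("B", 0.02, 0.528),
    ("C", -0.08, 0.388), ("C", -0.08, 0.388),
]

averaged = average_by_team(rows)
intercept, slope = fit_line(
    [d for d, _ in averaged.values()],
    [p for _, p in averaged.values()],
)
print(round(intercept, 3), round(slope, 3))
```

Because every league's differentials and winning percentages balance out around zero and .500, a fit like this should land on an intercept very close to .5 without forcing it.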

I did this same analysis a few years ago on the 1989-2002 seasons (using only walks, hits and ABs to calculate OBP). For all 394 teams, the equation was

Pct = .5 + 1.26*OPSDIFF

But when I averaged each team, the equation was

Pct = .5 + 1.21*OPSDIFF

So the averaging method lowered the value of the OPS differential slightly. But in the 20s, the averaging method raised the value of the OPS differential. I don't know why things would go in different directions in the two cases. I also don't know why the OPS differential was more valuable in the 20s in either case. Maybe it has to do with relief pitching.
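To put the four slopes side by side, here is a small Python sketch that asks what a hypothetical .050 OPS differential would be worth in winning percentage under each equation. The .050 figure is only an example value I picked for the comparison, and the dictionary labels are my own.

```python
# Slopes quoted above, keyed by (era, method)
slopes = {
    ("1920s", "single-season"): 1.37,
    ("1920s", "team averages"): 1.46,
    ("1989-2002", "single-season"): 1.26,
    ("1989-2002", "team averages"): 1.21,
}

example_diff = 0.050  # hypothetical differential, for illustration only

# Winning pct points above .500 that each equation assigns to that differential
gains = {key: slope * example_diff for key, slope in slopes.items()}

for (era, method), gain in gains.items():
    print(f"{era:10s} {method:14s} +{gain:.4f}")
```

The results are left in winning percentage rather than wins, since the two eras did not play the same schedule length.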

Using the averaging method, the 1920s team that most exceeded its predicted pct was the Senators. Their average OPSDIFF was -.0036, which projects to about a .495 pct, while their actual pct was about .519. So they were about .024 better than predicted. The Senators did have one of the early relief specialists for most of the decade, Firpo Marberry.

Using the single season equation (.5 + 1.37*OPSDIFF), the Senators had 5 of the 20 best seasons in terms of winning more than expected, including the 3rd & 4th best. Marberry was on them for four of those seasons. I don't know if he had anything to do with them winning more than expected, but it is possible.

The next best overachievers were the Cubs, who had a pct of .526 while they were predicted to have a .507. So they were .019 better than expected.

Wednesday, May 26, 2010
