Sunday, May 25, 2008

Another look at salaries and wins

Alot of people have looked at this. But I started thinking about it again after I came across some data at JC Bradbury's site. You can view that data here. The data shows how many games, on average, that teams won each year from 1986-2005. It also shows how much above or below the league average in total salary each team paid in percentage terms. Again, it shows yearly averages. Suppose a team was 10% above average one year and 30% above average another year, they would get 20 (if were just over two years).

What I did was to run a regression with average wins per year as the dependent variable and the average salary (SAL, the % above or below the league average) as the independent variable.

Here is the regression equation

Wins = 0.157*SAL + 80.22

The r-squared was .489 and the standard error was 3.89 wins. The T-value for SAL was 5.17. The .157 means that if you spent 10% more on salaries than the average team, you win 1.57 more games than the average team. A zero for SAL would mean that a team spent the average amount on salaries. A negative number means the team spent below the average salary level. The table below summarizes each team.

Tampa Bay, for example, on average, had a payroll that was 38.87% below the league average. They were predicted to win 74.12 but only 64.33 wins per game. If a team were to spend 100% more than average, it should win about 96-97 games a year. The Yankees had the highest payroll above average. They spent about 70% more than the average team. They were predicted to win 91.26 games a year but actually only won 90.24.

I think the results are fairly strong. 16 of the 30 teams were predicted to within 3 or fewer wins. Only 3 were off by 6 or more wins. I think what I did differently than JC Bradbury was to use the average annual values for each team, instead of each team's data for each year. By using the averages, I think the randomness from year-to-year is eliminated. A team can sign a big free agent and maybe one year he does not do well. Or you get lucky and some non-arbitration eligible young players do very well. So by averaging, some of the good and bad luck gets flushed out.

The graph below also summarizes the results. You can see that the relationship is strong.