Comments on Cybermetrics: Have Second Basemen Been Underpaid?
Cyril Morong (updated 2014-04-14T08:52:39.614-07:00)

Will Dwinnell, 2008-06-14T04:15:00.000-07:00

You could test with the data you already have, by holding out, for instance, year 2005, or instead a random sample. Train on one portion of the data, and test on the other.

By testing on the data used for model development, your results are optimistically biased.

Cyril Morong, 2008-06-13T14:01:00.000-07:00

It might be a good idea to do it on another data set, but it took me a lot of time to get just the 5 years set up. Here are the average absolute prediction errors for 2000 and 2005, first for the linear regression and then for the log regression:

2000: 2.1 million, 1.2 million
2005: 2.3 million, 2 million

Will Dwinnell, 2008-06-13T13:05:00.000-07:00

Yes, although it would be even more interesting to see that sort of performance measure on a holdout data set.

Cyril Morong, 2008-06-13T10:58:00.000-07:00

Will

Thanks for dropping by and reading my blog.

When you say mean absolute error, or mean absolute percent error, do you mean that I plug each guy's data into the regression equation, get a fitted or predicted value, and then see how much it differs from his actual salary? Then take the average of that over everyone?

Will Dwinnell, 2008-06-13T10:43:00.000-07:00

Do you have other performance metrics, such as mean absolute error, or mean absolute percent error?
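For readers who want to try the measures discussed in this thread, here is a minimal Python sketch of mean absolute error and mean absolute percent error, computed on a holdout year for both a linear and a log regression. Everything in it is an assumption made for illustration: the rows are made-up numbers, not the salary data from the post, the single "performance score" predictor stands in for whatever variables the actual regressions used, and the helper names (`fit_line`, `mean_absolute_error`, `mean_absolute_percent_error`) are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all players."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percent_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed in percent."""
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Made-up illustrative rows: (year, performance score, salary in millions).
rows = [
    (2001, 1.0, 0.8), (2001, 3.0, 2.1), (2001, 5.0, 5.5),
    (2003, 2.0, 1.4), (2003, 4.0, 3.6), (2003, 6.0, 8.9),
    (2005, 1.5, 1.0), (2005, 3.5, 3.2), (2005, 5.5, 7.0),
]

# Hold out one year: fit on the other years, evaluate only on 2005.
train = [r for r in rows if r[0] != 2005]
test = [r for r in rows if r[0] == 2005]
train_x = [r[1] for r in train]
train_y = [r[2] for r in train]
test_x = [r[1] for r in test]
test_y = [r[2] for r in test]

# Linear regression: salary = a + b * score.
a, b = fit_line(train_x, train_y)
linear_pred = [a + b * x for x in test_x]

# Log regression: log(salary) = a + b * score, converted back to salary.
a_log, b_log = fit_line(train_x, [math.log(y) for y in train_y])
log_pred = [math.exp(a_log + b_log * x) for x in test_x]

results = {
    "linear": (mean_absolute_error(test_y, linear_pred),
               mean_absolute_percent_error(test_y, linear_pred)),
    "log": (mean_absolute_error(test_y, log_pred),
            mean_absolute_percent_error(test_y, log_pred)),
}
for name, (mae, mape) in results.items():
    print(f"{name}: MAE = {mae:.2f} million, MAPE = {mape:.1f}%")
```

Because the 2005 rows never enter the fit, the errors printed here are holdout errors rather than errors on the data used to build the model. A random sample could be held out instead by shuffling `rows` and slicing off a portion.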