How useful is the function if it does a terrible job at explaining the observations, let alone predicting?

]]>The variable which drives Rodgers’s salary is “Top QB salary vs time the contract was signed”. It grows smoothly with time. You’ll find that Rodgers is a little higher than the previous record holder, and a little lower than the next guy to set the record.

NFL salaries are tiered. The overall wage level is set by the salary cap, money migrates slowly from one position to another, and most salaries are set by analogy to the existing salary structure.

]]>1) EPA_i = β_0 + β_1·Salary_i + ε_i

2) Salary_i = γ_0 + γ_1·EPA_i + μ_i

The key difference for this question is between ε_i and μ_i. ε_i, the error term in the first regression, represents all factors that affect performance besides salary. For any given QB, the residual in the first regression represents how that player’s performance compares to the expected performance of a player with an equal salary. This regression may be a starting point for an “efficiency wage” style analysis of quarterbacks, where higher pay leads to better performance, perhaps by increasing morale. The reverse causality is difficult to get around, though.

μ_i, the error term in the second regression, represents all factors that affect salary besides performance. An individual’s residual in the second regression represents how that player’s salary compares to the expected salary of a player with the same performance. Aaron Rodgers’ μ value tells us how much Rodgers’ salary exceeds the salary we would expect him to earn if he were paid solely based on performance; it tells us exactly how overpaid Rodgers is. The second regression actually gives us the answer we’re interested in.
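
To make the ε vs μ distinction concrete, here is a minimal sketch that fits both regressions and prints each player’s two residuals. The six (salary, EPA/G) pairs and the player labels are made up purely for illustration, they are not real NFL data, and the `ols` helper is just a plain least-squares fit written out by hand.

```python
# Hypothetical (salary in $M, EPA per game) pairs; NOT real NFL data.
qbs = {
    "QB A": (2.0, 1.0),
    "QB B": (5.0, 6.5),
    "QB C": (8.0, 4.0),
    "QB D": (12.0, 9.0),
    "QB E": (15.0, 11.0),
    "QB F": (18.0, 7.5),
}


def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


salary = [s for s, _ in qbs.values()]
epa = [e for _, e in qbs.values()]

b0, b1 = ols(salary, epa)  # regression 1: EPA on salary
g0, g1 = ols(epa, salary)  # regression 2: salary on EPA

# eps: performance relative to what the salary predicts (vertical gap, line 1)
eps = {name: e - (b0 + b1 * s) for name, (s, e) in qbs.items()}
# mu: salary relative to what the performance predicts (vertical gap, line 2)
mu = {name: s - (g0 + g1 * e) for name, (s, e) in qbs.items()}

for name in qbs:
    print(f"{name}: eps = {eps[name]:+.2f} EPA/G, mu = {mu[name]:+.2f} $M")
```

A positive μ means the player is paid more than the second regression predicts for his EPA/G; a positive ε means he performs better than the first regression predicts for his salary. The two are measured in different units and answer different questions.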

PS: The “horizontal distance” that Brian 1 uses is inappropriate; it will be equal to ε_i/β_1 for any data point i. Proof here: https://imgur.com/a/dJfuxst
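
In case the link doesn’t load, the algebra can be sketched in a few lines. S_i^* is a symbol introduced here (not in the comment above) for the salary at which the first regression line reaches player i’s observed EPA; the sketch assumes β_1 ≠ 0.

```latex
\begin{align*}
\mathrm{EPA}_i &= \beta_0 + \beta_1\,\mathrm{Salary}_i + \varepsilon_i \\
S_i^{*} &= \frac{\mathrm{EPA}_i - \beta_0}{\beta_1}
         = \mathrm{Salary}_i + \frac{\varepsilon_i}{\beta_1} \\
S_i^{*} - \mathrm{Salary}_i &= \frac{\varepsilon_i}{\beta_1}
\end{align*}
```

So the horizontal gap is just the performance residual rescaled by 1/β_1; it carries no information about salary beyond what ε_i already contains.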

]]>Interestingly, none of the negative-EPA/G players are really “good deals” (Brian2 graph). Even R. Fitzpatrick is only “cheap” by about $2M, not a whole lot of money to try to buy more EPA/G at other positions (unless it’s easy to find cheap EPA/G at other positions).

There seem to be really “good deals” in the 6-8 EPA/G range: 16-M. Cassel and S. Hill are a lot cheaper than other QBs with similar EPA/G (even ignoring the regression line and looking just at the data points).

The “elite” QBs in the 10-12 EPA/G range seem mostly fairly priced (T. Brady and A. Rodgers), except for P. Manning.

B. Favre might be the most surprising “overpriced” QB. He might be the counter-example for why EPA/G isn’t necessarily a great metric? Would you rather have B. Favre or J. Kitna+$12M?

]]>Not really a problem for Brian2’s graph, which is the correct one. As r gets smaller, the regression line gets more horizontal, indicating that it’s not really worth paying for EPA/G, as expected for small r. Conversely, we see the problem in Brian1’s graph: as r gets smaller, its regression line also gets more horizontal, but because EPA/G is on the vertical axis there, reading salary off that line leads to the incorrect conclusion that the more uncorrelated EPA/G and salary are, the *more* one should pay for EPA/G.
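
A small sketch of why this happens, using the standard identity that a least-squares slope equals r times (sd of y) / (sd of x). The two standard deviations below are hypothetical round numbers, not estimates from the actual QB data.

```python
# Hypothetical spreads (not estimated from any real data).
SD_SALARY = 5.0  # $M
SD_EPA = 3.0     # EPA per game


def slopes(r):
    """Return (beta1, gamma1) implied by correlation r via slope = r * sd_y / sd_x."""
    beta1 = r * SD_EPA / SD_SALARY   # regression 1: EPA on salary
    gamma1 = r * SD_SALARY / SD_EPA  # regression 2: salary on EPA
    return beta1, gamma1


for r in (0.9, 0.5, 0.1):
    beta1, gamma1 = slopes(r)
    inverted = 1.0 / beta1  # $M per EPA/G if line 1 is read horizontally
    print(f"r={r}: gamma1 = {gamma1:.2f} $M per EPA/G, "
          f"1/beta1 = {inverted:.2f} $M per EPA/G")
```

As r shrinks, γ_1 (the price of EPA/G implied by regressing salary on EPA/G) goes to zero, while 1/β_1 (the price implied by reading the EPA-on-salary line sideways) blows up.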

]]>Confidence intervals are for the conditional expectation of the function, not its observations. Why do you think CIs get narrower as more data from the same DGP is observed?
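
A rough sketch of the distinction, using the textbook standard-error formulas for simple linear regression. The design (n evenly spaced x values on [0, 1]), the evaluation point x0, and treating the residual standard deviation s as known are all simplifying assumptions made for illustration.

```python
import math


def standard_errors(n, s=1.0, x0=0.5):
    """SEs at x0 for a simple linear regression on n evenly spaced x in [0, 1].

    s is the residual standard deviation, treated as known here (an assumption).
    """
    xs = [i / (n - 1) for i in range(n)]
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    leverage = 1.0 / n + (x0 - xbar) ** 2 / sxx
    se_mean = s * math.sqrt(leverage)        # for the fitted line E[y | x0]
    se_pred = s * math.sqrt(1.0 + leverage)  # for a new observation at x0
    return se_mean, se_pred


for n in (10, 100, 1000):
    se_mean, se_pred = standard_errors(n)
    print(f"n={n}: SE(mean) = {se_mean:.3f}, SE(new obs) = {se_pred:.3f}")
```

The standard error for the conditional mean shrinks toward zero as n grows, while the one for a new observation never drops below s. That is why a narrow band around the line says little about how far individual points should sit from it.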

]]>This.

Regardless of axes, the model is garbage. Throw it out, start again. And those CIs in particular are so narrow that a large number of observations fall outside the ranges, even before you hit the tails, in either set of axes.

]]>(You would never say players perform better just because you pay them more, or at least you would never say that this is a very important consideration in the NFL QB context, where players generally have strong incentives to perform better regardless of their current contract.)

But graph 2 doesn’t tell us anything about whether Rodgers is underpaid or not since you need some measure of the trade-off between “PA’s” (points added) and salary, with all the attendant complications others have mentioned, to determine that.

But then graph 1 can be interpreted as “if NFL teams pay their QB x, they can expect y level of performance,” so of course we can see that by this metric, Rodgers is (or appears to be) underpaid.

(Here I think the main assumption, besides linearity, is that NFL teams actually have some idea of what they’re doing when they sign QBs to contracts.)

]]>In other words, Phil was basically correct in his argument against Brian 1, but he made a mistake in then siding with Brian 2.

]]>Both charts show that he is “underpaid.”

If the links worked (they don’t, for me), I might be able to see the actual regression equations, e.g., y = a + bx, which would, I think, confirm this.

The linear regression prediction is on the regression line, reached not by moving laterally but by moving vertically: we are trying to predict y from x. If the observed y value is above the line, then we would have expected a lower y value, straight down. If the y value is below the line, then we would have expected a higher y value, straight up.

There is no discordance here.

What a bunch of BS ]]>

agreed. and i’m not sure why everything needs to be linear. i think a polynomial of order 32 would fit better

]]>