Interpreting Statistical Evidence

Betsey Stevenson & Justin Wolfers offer six principles to separate lies from statistics:

1. Focus on how robust a finding is, meaning that different ways of looking at the evidence point to the same conclusion.

In “Why Most Published Research Findings are False” I offered a slightly different version of the same idea:

Evaluate literatures not individual papers.

SW’s second principle:

2. Data mavens often make a big deal of their results being statistically significant, which is a statement that it’s unlikely their findings simply reflect chance. Don’t confuse this with something actually mattering. With huge data sets, almost everything is statistically significant. On the flip side, tests of statistical significance sometimes tell us that the evidence is weak, rather than that an effect is nonexistent.
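To see why statistical significance is not the same as mattering, here is a minimal Python sketch. The effect size of 0.01 standard deviations and the sample sizes are made-up numbers chosen for illustration, and the helper names are mine, not SW’s. It computes the z-statistic one would expect on average for a fixed, tiny effect as the sample grows.

```python
import math

def expected_z(effect_in_sd, n):
    """Typical z-statistic for a one-sample test of a mean when the true
    effect is `effect_in_sd` standard deviations and there are n observations."""
    return effect_in_sd * math.sqrt(n)

def two_sided_p(z):
    """Two-sided p-value for a standard normal z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

tiny_effect = 0.01  # hypothetical: one hundredth of a standard deviation
for n in (100, 10_000, 1_000_000):
    z = expected_z(tiny_effect, n)
    print(f"n={n:>9,}  z={z:5.1f}  p={two_sided_p(z):.3g}")
```

The effect never changes, yet it goes from invisible to overwhelmingly “significant” purely because n grows. Whether an effect of 0.01 standard deviations matters is a separate question that the p-value cannot answer.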

SW’s second principle is correct, but there is another point worth making. Tests of statistical significance are all conditional on the estimated model being the correct model. Results that should happen only 5% of the time by chance can happen much more often once we take into account model uncertainty, not just parameter uncertainty.
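One way to make this concrete is a small simulation, sketched here in Python under assumptions of my own choosing. The true mean is zero, so every rejection is a false positive. The test assumes independent observations; the data are either independent (the model is right) or follow an AR(1) process with autocorrelation 0.8 (the model is wrong). Autocorrelation is only one of many ways a model can be wrong, and the function name, sample size, and parameter values are all illustrative.

```python
import math
import random
import statistics

def false_positive_rate(rho, n=50, trials=2000, critical=2.01, seed=0):
    """Share of trials in which a naive t-test rejects a true null (mean = 0).

    Data follow an AR(1) process with autocorrelation `rho`. The t-test
    assumes independent observations, which is wrong whenever rho != 0.
    `critical` is the approximate two-sided 5% cutoff for 49 degrees of
    freedom, so it matches the default n=50 only.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = rng.gauss(0, 1)
        sample = []
        for _ in range(n):
            x = rho * x + rng.gauss(0, 1)  # next observation depends on the last
            sample.append(x)
        mean = statistics.fmean(sample)
        naive_se = statistics.stdev(sample) / math.sqrt(n)  # assumes independence
        if abs(mean / naive_se) > critical:
            rejections += 1
    return rejections / trials

print("independent data (model correct):", false_positive_rate(rho=0.0))
print("autocorrelated data (model wrong):", false_positive_rate(rho=0.8))
```

With independent data the rejection rate should land near the nominal 5%. With autocorrelated data the same “5% test” rejects far more often, because the standard error it relies on is too small.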

3. Be wary of scholars using high-powered statistical techniques as a bludgeon to silence critics who are not specialists. If the author can’t explain what they’re doing in terms you can understand, then you shouldn’t be convinced.

I am mostly in agreement. SW and I are partial to natural experiments and similar methods, which can generally be explained to the lay public. Other econometricians (say, of the Heckman school) do work that is much more difficult to follow without significant background. While I am wary of such work, I wouldn’t reject it out of hand.

4. Don’t fall into the trap of thinking about an empirical finding as “right” or “wrong.” At best, data provide an imperfect guide. Evidence should always shift your thinking on an issue; the question is how far.

Yes, be Bayesian. See Bryan Caplan’s post on the Card-Krueger minimum wage study for a nice example.
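As a rough sketch of what “how far” means, here is Bayes’ rule in a few lines of Python. The probabilities are hypothetical numbers of my own, not taken from any study: I assume the evidence would show up 80% of the time if the hypothesis were true and 30% of the time if it were false.

```python
def update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# Same evidence, three readers with different starting beliefs (hypothetical).
for prior in (0.1, 0.5, 0.9):
    posterior = update(prior, p_evidence_if_true=0.8, p_evidence_if_false=0.3)
    print(f"prior {prior:.2f} -> posterior {posterior:.3f}")
```

Under these assumptions the skeptic moves from 0.10 to about 0.23, the agnostic from 0.50 to about 0.73, and the believer from 0.90 to about 0.96. Everyone shifts in the same direction, but no one should treat a single study as settling the matter.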

5. Don’t mistake correlation for causation.

Does anyone still do this? I know the answer is yes. I often find, however, that the opposite problem is more common among relatively sophisticated readers: they know that correlation isn’t causation, but they don’t always appreciate that economists know this too and have developed sophisticated approaches to disentangling the two. Most of the effort in a typical empirical paper in economics is spent on this issue.

6. Always ask “so what?” …The “so what” question is about moving beyond the internal validity of a finding to asking about its external usefulness.

Good advice, although I also frequently run across the opposite problem: thinking that a study done in 2001 doesn’t tell us anything about 2013, for example.

Here, from my earlier post, are my rules for evaluating statistical studies:

1) In evaluating any study, try to take into account the amount of background noise. That is, remember that the more hypotheses that are tested and the less selection that goes into choosing them, the more likely it is that you are looking at noise.

2) Bigger samples are better. (But note that even big samples won’t solve the problems of observational studies, which are a whole other issue.)

3) Small effects are to be distrusted.

4) Multiple sources and types of evidence are desirable.

5) Evaluate literatures not individual papers.

6) Trust empirical papers which test other people’s theories more than empirical papers which test the author’s theory.

7) As an editor or referee, don’t reject papers that fail to reject the null.
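Rule 1 can be illustrated with a short simulation, offered as a sketch rather than a definitive treatment. In this hypothetical setup every hypothesis is false (the data are pure noise), each is tested with a z-test at the 5% level, and a “study” counts as a success if any one of its tests comes up significant. The function name and all parameter values are my own choices.

```python
import math
import random

def share_with_false_discovery(num_hypotheses, n=30, studies=1000, seed=1):
    """Share of studies reporting at least one 'significant' result
    when every tested hypothesis is in fact pure noise."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        for _ in range(num_hypotheses):
            # Sample of n standard normal draws: the true mean is zero.
            mean = sum(rng.gauss(0, 1) for _ in range(n)) / n
            z = mean * math.sqrt(n)  # z-test with known standard deviation 1
            if abs(z) > 1.96:        # nominal two-sided 5% test
                hits += 1
                break                # one spurious finding is enough to publish
    return hits / studies

for m in (1, 5, 20):
    simulated = share_with_false_discovery(m)
    theory = 1 - 0.95 ** m
    print(f"{m:>2} hypotheses: simulated {simulated:.2f}, theory {theory:.2f}")
```

With one pre-chosen hypothesis the false discovery rate stays near 5%. With 20 unselected hypotheses, the chance of finding at least one significant result in pure noise is roughly 64%.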