Can we trust published results in finance?

Maybe not.  So says the new working paper by Campbell R. Harvey, Yan Liu, and Heqing Zhu. The abstract is here:

Hundreds of papers and hundreds of factors attempt to explain the cross-section of expected returns. Given this extensive data mining, it does not make any economic or statistical sense to use the usual significance criteria for a newly discovered factor, e.g., a t-ratio greater than 2.0. However, what hurdle should be used for current research? Our paper introduces a multiple testing framework and provides a time series of historical significance cutoffs from the first empirical tests in 1967 to today. We develop a new framework that allows for correlation among the tests as well as publication bias. We also project forward 20 years assuming the rate of factor production remains similar to the experience of the last few years. The estimation of our model suggests that today a newly discovered factor needs to clear a much higher hurdle, with a t-ratio greater than 3.0. Echoing a recent disturbing conclusion in the medical literature, we argue that most claimed research findings in financial economics are likely false.
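To get a rough feel for why the hurdle rises with the number of factors tried, here is a minimal sketch using a plain Bonferroni correction and a normal approximation to the t-ratio. This is not the paper's framework, which allows for correlation among tests and for publication bias; Bonferroni is swapped in only because it is the simplest correction to write down. The function name `bonferroni_t_hurdle` and the test counts in the loop are assumptions made for illustration.

```python
from statistics import NormalDist


def bonferroni_t_hurdle(num_tests, alpha=0.05):
    """Two-sided critical value after a Bonferroni correction.

    Assumes a large-sample normal approximation to the t-ratio and
    treats every one of the num_tests factors as a separate test.
    Hypothetical helper, not taken from the paper.
    """
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    # Bonferroni: split the overall alpha evenly across all tests.
    per_test_alpha = alpha / num_tests
    # Two-sided test, so half of the per-test alpha sits in each tail.
    return NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)


if __name__ == "__main__":
    # Illustrative test counts only; not figures from the paper.
    for m in (1, 10, 100, 300):
        print(m, round(bonferroni_t_hurdle(m), 2))
```

With a single test this gives back the familiar cutoff of about 2.0, and the cutoff climbs as the number of tests grows. Bonferroni is conservative when tests are correlated, so the numbers are best read as an upper-end illustration.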

If you click on the link above, you also will see a picture summarizing their main points.

For the pointer I thank Noah Smith.


I would replace "false" with "not significant." It is not the same thing. And obviously the problem is not specific to financial economics, but any area where the same data are used repeatedly, or more broadly, where specification searching is used (cf. Ed Leamer).

Moreover there is a related problem with the use of instrumental variables (ungated link to NBER study):

I can't even tell exactly what you are saying here...what do you mean by "not significant?" Nobody discussing statistics should ever use the word "significant" without a modifier. If you mean "statistically significant" your comment makes no sense, and if you mean something else you should choose your words more carefully.

See Andrew Gelman for a useful classification of "Type M" and "Type S" errors:

You really couldn't tell that the word "statistically" was implied by the statistical context? I agree it would have been better to include it, but you are either faking your confusion or you are not capable of understanding the problem in the first place.

Give him a break.

So you think he's saying "most claimed research findings in financial economics are likely not statistically significant."

I don't think that even makes any sense.

Seconded, what does that even mean?

Even if it does mean something, we are discussing _research_ more than we are discussing statistics. If someone says some research finding is significant, the speaker is normally implying a judgment of at least some of: nontriviality, interestingness to someone, relevance or consequence to something, or similar. It does no one any service if there's an unwritten rule that, if anyone has recently spotted statistics lurking in the area, then none of the above apply and the word "significant" suddenly means something utterly different (moreover, something that does not imply, even subjectively, the tiniest degree of nontriviality or interestingness).

First of all he said "likely false". Second, there is a state of reality to claims, and false happens to be one of them. The purpose of statistics is to reach conclusions, not merely to exercise math skills. The author is raising concern that the number of false conclusions is much higher than we expected.

He also confined his conclusion to the data he examined, which is correct. It is obvious his conclusion has broader repercussions.

Wake me when it is over

It's never over - go back to sleep.

> "...we argue that most claimed research findings in financial economics are likely false."


so it's only "financial economics" research that's full of cr*p ?

luckily all other economics research is highly credible and replicable

of course this new revelation about the economics profession merely follows the alarming realities discovered in other professional/academic research (Biomedical, Psychology, Physics, etc.). most published research is cr*p. it's a huge scam

dates back to 2005 when John Ioannidis, an epidemiologist at Stanford University, caused a huge uproar with a paper showing that strict reliance on the basically subjective guideline of statistical significance resulted in very optimistic research findings relative to the researchers' starting objectives; he firmly argued that “most published research findings are probably false.” now we're seeing that problem everywhere, even in the pure and sacred economics profession.

He didn't say it was "only" financial economics. The conclusion was limited to the scope of his data.

Yes, it always seemed obvious that John Ioannidis's findings applied to other fields -- in fact probably applied even more to other fields which don't have the FDA attempting to ride herd on quality. It almost seems odd to have taken this long to document it for financial economics.

I'm not sure I'm with them that setting a t-test hurdle at 3.0 will accomplish much. Only replication is really going to clear some of the chaff away.

By replication do you mean Monte Carlo simulation? If not, please explain.

Researchers generally publish p-values so that if you disagree with their choice of alpha, you can derive your own conclusion. What he is saying is that the plethora of tests indicates that the selected alpha is lower than the actual alpha, i.e., a control factor of the experiment is flapping in the wind.
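A minimal sketch of that gap between the selected alpha and the actual one, assuming the tests are independent (which the tests in the paper generally are not, so this is illustrative only). The function name `familywise_error_rate` is made up for the example.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across num_tests tests.

    Assumes every null hypothesis is true and the tests are
    independent; both are simplifying assumptions for illustration.
    """
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    # Probability that no test rejects is (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Illustrative test counts only.
    for m in (1, 5, 14, 100):
        print(m, round(familywise_error_rate(m), 3))
```

At a nominal alpha of 0.05, the chance of at least one spurious finding passes one half after about fourteen independent tries.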

My view has always been that, in addition to considering this control error, one must weigh the relative costs of Type I and Type II errors in the selection of alpha and n. I seldom see research that attempts to do so.

This is an idea that has been around in the Biomedical sciences since about 2005. Here is a summary of some more recent developments:

Here is a paper about trying to estimate the rate of "false discoveries" in the medical literature:

with associated discussion:

Most published research findings are wrong. I see no reason why finance would be different.
