Comparing meta-analyses and preregistered multiple-laboratory replication projects

Many researchers rely on meta-analysis to summarize research evidence. However, there is a concern that publication bias and selective reporting may lead to biased meta-analytic effect sizes. We compare the results of meta-analyses to large-scale preregistered replications in psychology carried out at multiple laboratories. The multiple-laboratory replications provide precisely estimated effect sizes that do not suffer from publication bias or selective reporting. We searched the literature and identified 15 meta-analyses on the same topics as multiple-laboratory replications. We find that meta-analytic effect sizes are significantly different from replication effect sizes for 12 out of the 15 meta-replication pairs. These differences are systematic and, on average, meta-analytic effect sizes are almost three times as large as replication effect sizes. We also implement three methods of correcting meta-analysis for bias, but these methods do not substantively improve the meta-analytic results.
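The abstract does not say which three correction methods were implemented, but one widely used family of corrections is PET-PEESE, which regresses observed effect sizes on their standard errors (PET) or variances (PEESE) and takes the weighted-regression intercept as the bias-corrected estimate, i.e. the predicted effect for an infinitely precise study. A minimal sketch with made-up illustrative numbers (not from the paper):

```python
import numpy as np

def pet_peese(effects, ses):
    """PET-PEESE bias-corrected meta-analytic estimates.

    PET: inverse-variance-weighted regression of effect on standard
    error; the intercept estimates the effect at SE = 0.
    PEESE: same idea with the variance (SE^2) as the predictor.
    """
    effects = np.asarray(effects, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = 1.0 / ses**2  # inverse-variance weights

    def wls_intercept(x):
        # Weighted least squares, intercept of effects ~ 1 + x
        X = np.column_stack([np.ones_like(x), x])
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ effects)
        return beta[0]

    return wls_intercept(ses), wls_intercept(ses**2)

# Hypothetical studies where small (high-SE) studies report bigger
# effects -- the classic small-study-effects pattern:
pet, peese = pet_peese([0.45, 0.38, 0.30, 0.22, 0.15],
                       [0.20, 0.16, 0.12, 0.08, 0.05])
```

Because the effects rise with the standard errors in this toy example, both corrected intercepts come out well below the naive inverse-variance-weighted mean of about 0.20.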

That is from a new article in Nature Human Behaviour by Amanda Kvarven, Eirik Strømland, and Magnus Johannesson.


Hands up everyone who's astonished.

The problem with meta-analysis is that it is, in effect, a data dredge: you search all the data for anything that supports your biases and preconceived ideas, and lo and behold, the results support your biases and preconceived ideas. Who could have seen that coming?

Was this work pre-registered? Perhaps other studies attempted to find this difference, failed, and weren't published in Nature.

Meta-analysis is a fashionable but highly flawed technique in general research.
It is always subjective, and it falsely assumes a strict consistency across the various independent studies that it lumps together.

Some major approaches to meta-analysis, e.g. the Schmidt-Hunter one, are based on the assumption that the effects are heterogeneous across studies. Quantifying and characterizing that heterogeneity is one of the main goals of meta-analysis in this framework. A much larger problem is GIGO, i.e. publication bias and p-hacking render the whole literature unreliable and this is very hard to adjust for post hoc.
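For concreteness, heterogeneity in a random-effects framework is usually quantified with Cochran's Q, the between-study variance τ², and I² (the share of total variation attributable to heterogeneity rather than sampling error). A minimal DerSimonian-Laird sketch, using hypothetical numbers rather than any real literature:

```python
import numpy as np

def dersimonian_laird(effects, ses):
    """Random-effects meta-analysis with the DerSimonian-Laird tau^2.

    Returns the pooled random-effects estimate, tau^2 (between-study
    variance), and I^2 (proportion of total variation due to
    heterogeneity).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(ses, dtype=float) ** 2
    w = 1.0 / v                            # fixed-effect weights
    mu_fe = np.sum(w * y) / np.sum(w)      # fixed-effect pooled mean
    q = np.sum(w * (y - mu_fe) ** 2)       # Cochran's Q
    k = len(y)
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)     # DL estimator, truncated at 0
    w_re = 1.0 / (v + tau2)                # random-effects weights
    mu_re = np.sum(w_re * y) / np.sum(w_re)
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return mu_re, tau2, i2

# Four hypothetical studies with visibly different effect estimates:
mu, tau2, i2 = dersimonian_laird([0.1, 0.5, 0.3, 0.7],
                                 [0.1, 0.15, 0.1, 0.2])
```

Note that none of this touches the GIGO problem raised above: τ² and I² characterize dispersion in whatever studies made it into the literature, not bias in which studies got there.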

"the assumption that the effects are heterogeneous across studies" is irrelevant.
The problem assumption is that the data-collection & and analysis-models methodology are uniform across multiple independent studies -- and are therefore readily compared objectively.

Data sets vary widely in quality, and there are many ways to analyze a given data set, resulting in different study conclusions.

Thus, meta-analysis starts with an incoherent input basis.
Apples, oranges, and doorknobs.

Sadly, I am not surprised by these results.

So an average experiment in psychology is biased by a factor of 3? The typical exaggeration factor in economics is 2.
