The smell test for an academic paper

As recently as the 1990s, you could pick up an academic paper in economics and by examining the techniques, the citations, how clearly the model was explained, and so on, you could arrive pretty quickly at a decent sense of how good a paper it was.

Today there are still many evidently bad papers, but also many more papers where “the bodies” are buried much more deeply.  There are many more credible “contender papers” where the mistakes and limitations are far from transparent and yet the paper is totally wrong or misguided.  For instance, it is easier to “produce” a novel and striking result with falsified privately-built data than with publicly available macro data, which already have been studied to death and do not yield new secrets easily if at all.

One implied prediction is that a small number of absolute frauds will do quite well professionally.  Another prediction is that having close (and reputable) associates to vouch for you will go up in value.  (How reliable a method of certification is that in fact?)  It may be harder for some outsiders to rise to the top, given the greater difficulty of those outsiders in obtaining credible personal certification.  What would you think of a new paper from Belarus, or how about Changchun, which appeared to overturn all previous results?

What else can we expect?

I do not think we are ready for an academic world where our smell test does not work very well.


My sense from the sciences is that by thinking you are setting the bar high, you are actually guaranteeing frauds. It's better to have papers with obvious limitations than to have everything looking the same. My rule of thumb is "here is what we did, here is what we got." The problem is that advisors will not green-light something that isn't breathtaking. And if you have an advisor who thinks more of his reputation than of the abilities he has fostered in his group, you are in trouble.

With the caveat that I'm a chemist by training and not an economist: I see a great many economists, from both well-regarded university departments and more obscure ones (as well as many independents), blogging almost constantly. Some blogs are data-driven and some are not, though almost all have a political point of view (and I use this term broadly). Has this had any impact on the 'academic' publishing side of things?

The second broad point I have is one that was alluded to in a previous MR post a couple of weeks ago regarding where academic economists do their graduate training. I don't know if the field routinely does citation analyses that look at the correlation with the author's training university. I seem to recall that only a few universities were responsible for training those economists who are 'highly' rated (whatever that means). Perhaps this is an area worthy of more consideration, e.g., are the Harvard/MIT/Chicago-trained economists more highly regarded than those trained elsewhere? Maybe things are more self-selecting in economics than in other fields.

Ahem...I consider it ridiculous that there would be a federal policy based on a paper or two on the minimum wage, for example. And I'm not taking a political position or impugning the researchers. In science there might be 1000 papers on a small subject and no one would dare suggest that the science is settled.

A lot of orthodox wisdom in economics is based on intuition and plausibility arguments, which is why a handful of empirical papers find it so easy to knock these down.

Economics is young; in a recognizably modern form it is probably no older than Samuelson, and that was just half a century ago.

It probably helps that contemporary economics carefully trains its scholars in the Samuelsonian perspective of seeing every outcome as a policy outcome, i.e., the absence of a minimum wage is as much a policy stance as any given minimum wage. It is merely a policy that endorses the observed existing equilibrium as the right one.

On the other hand, there are plenty of empirical science papers that get results that everyone knows can't be generally true and no one thinks those papers overturn the theory.


What exactly is "privately-built data" in a macro article context? What are milestone papers for econ. fraud?

Data where you have to pay the data agency for access, or where you compile the data yourself from primary sources. This is more common than you might think; even state agencies may do this, because carrying out privacy censoring (in panel data, for instance) is laborious and costly.

The original statement reads strikingly like the start of a discussion of hedge fund fraud!

Should private data sets be subject to authentication or verification by an independent auditing firm prior to publication?

Why should (much) credence be paid to results based on private data that (a) cannot be replicated, and/or (b) appear to contradict what is verifiable from public sources, etc? Shouldn't some reasonable, Bayesian analysis apply?
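The Bayesian intuition here is easy to make concrete. A minimal sketch (all numbers below are illustrative assumptions, not estimates of anything):

```python
def posterior_sound(prior_sound, p_contradict_if_sound, p_contradict_if_unsound):
    """Posterior probability that a result is sound, given that it
    contradicts what is verifiable from public sources. Plain Bayes' rule."""
    num = p_contradict_if_sound * prior_sound
    den = num + p_contradict_if_unsound * (1.0 - prior_sound)
    return num / den

# Made-up numbers: sound results rarely contradict public sources (10%);
# unsound ones often do (60%). Start with 80% credence in the paper.
p = posterior_sound(prior_sound=0.8,
                    p_contradict_if_sound=0.1,
                    p_contradict_if_unsound=0.6)
# Credence drops from 0.8 to 0.08 / (0.08 + 0.12) = 0.4
```

The point of the exercise is only that a contradiction with verifiable public data should mechanically shave credence off a private-data result, however impressive the result looks.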

I would think that the important aspect would be the ability to replicate results across different data sets, not to replicate data sets themselves.

"different data sets"

Independent data sets!

As a first step, replicating published results on the same dataset is a real issue, but you are correct that it is more "valuable" to replicate conclusions on different datasets.

Is there something that can be called "reading software"? Like a search algorithm, but based on grammar: searching for patterns in texts, specific phrases, etc. After a matching string is found, the results are copied into a txt file. What Tyler proposes, but with the help of software. Just wondering.
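A minimal sketch of such a tool, assuming plain-text input and simple regex patterns rather than real grammatical parsing (function names and the example phrases are hypothetical):

```python
import re

def search_corpus(texts, patterns, out_path="matches.txt"):
    """Scan a collection of texts for regex patterns and dump hits to a file.

    texts: dict mapping a document name to its full text.
    patterns: list of regex strings (specific phrases, word patterns, etc.).
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    with open(out_path, "w") as out:
        for name, body in texts.items():
            for rx in compiled:
                for m in rx.finditer(body):
                    # record document name, character offset, and matched phrase
                    out.write(f"{name}\t{m.start()}\t{m.group(0)}\n")

# Example: flag stock phrases that often accompany overreaching claims
docs = {"paper1": "We find a striking effect. Results may not generalize."}
search_corpus(docs, [r"striking effect", r"may not generalize"])
```

Anything resembling a real grammar-aware search would need an NLP toolkit on top of this, but even crude string matching gets you the "copy hits into a txt file" workflow the comment describes.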

Aren't the incentives balanced? i.e. Producing sensational results also means your work gets a whole lot of immediate and close scrutiny?

The doctored data is more likely to be caught as well?

I don't want it to sound like it is the only factor, but fad chasing is a big driver. And sometimes it seems like the review committees and your stakeholders (think vampire stakes, not stakes as in they actually share any downside risk responsibility with you) seem to be more of a problem than a bulwark. From my vantage point, it literally seems that they would prefer you replicate a meaningless experiment with stem cells (adult or cell line, the embryonic fad seems to have waned) than create a meaningful experiment with less flashy established cell types. There are a lot of people that would like to ride the ripples of a splash.

"Producing sensational results also means your work gets a whole lot of immediate and close scrutiny?"

How effective can the scrutiny be if the underlying data is private?

Is this Tyler's wry commentary on the "The public funding of research and development" post from earlier?

Personally, I think there's a huge amount of theoretical work to be done straight out of public sources.

If you're around business or the economy, certain themes that keep popping up:

- Fixed Cost vs Variable Cost as the driver of recessions (investing myopia as a market failure, or why the futures curve is wrong)
- Fight versus Flight as the driver of price inelasticity (or elasticity) of demand (quantitative measures of "animal spirits," or how to generate price shocks without financial speculation)
- The Three Ideology Model as the proxies we use for an objective function for government (or why comments are almost always ideological in nature)
- Principal agent models as corresponding to (classical) liberals versus conservatives (it's still duty versus desire)
- A Theory of Conservatism (using the group as the unit of analysis)

I see a huge amount of really fundamental work to be done in economics. But I see very few conceptual thinkers in the field. Mostly it's quants substituting econometrics for concept and narrative. But there's no lack of original work to be done.

Oh, and the declining marginal utility of wealth and income as the driver of social progress (or why a democracy requires a middle class, or why democracy is a decayed form of dictatorship (and hence democracy can be a dynamic institution))

A variation of this theme: static versus dynamic: the declining marginal utility of wealth and income as its relates to the debate between (classical) liberals and egalitarians. (Declining marg utility of w&i is the theoretical foundation of egalitarianism. But does it hold up over time?)

Due to out-of-control "technicality," it takes much longer merely to read a new paper. Referees do not have much incentive (are they paid?) to read manuscripts thoroughly and do due diligence. The danger is that this imparts a certain conservatism to the major journals: if it is too different, it must be wrong. Perhaps referees don't have the time to evaluate a new approach.

No one, I think, has mentioned the trend that more and more authors voluntarily make their data and code available publicly. Increasingly, top journals require it. With legions of PhD students ready to replicate the results, there is a credible threat of discovery if the data or results are falsified. I believe encouraging PhD students to replicate papers plays a role that is almost as important as the official peer review process.

The point is, I believe we will reach a separating equilibrium in which empirical papers that voluntarily disclose data and code will be considered for publication, while those that do not will be rejected outright (unless the author is very well established and highly regarded).

Some lower-ranked journals will probably counter-signal by emphasizing they do not require data and codes, but these journals will remain lower-ranked and will not ''count'' for tenure and promotion among good research universities.

I think you'll also need the high-ranked journals to publish replications, or at least corroborating and disconfirming studies. The problem is entirely top-down. I forget which third-hand source said it, but: to be first and wrong is glorious; to be second and right is meaningless.

Yep, yep, yep.

Making your raw data and code available to all seems like a solution. Economics seems to lack a culture of replication of results. A paper trying to replicate a result with a different method, a different data set, a different natural experiment, or a different RCT will get little attention. Economics also seems to lack a culture of literature reviews. As a young researcher, you are not encouraged to do literature reviews (i.e., reading a hundred papers, explaining which results are bullshit and which are not, adding a quantitative analysis, and concluding) or to replicate old results.

Who among the "legions of PhD students" wants to discuss Solow's 1957 factor-productivity paper in detail?

From experience, I can attest that paper replication is a useful way for PhD students to learn the ropes. Unlike in natural and physical sciences, we do not have labs where we can get the hang of how to do research.

Increasingly, this paper replication exercise is required in PhD programs, and even if not, it is recommended. I do not think a student would take on a Nobel winner's key paper, but a recent paper in a top journal, sure. E.g., when the Hoxby or Acemoglu-Johnson-Robinson papers were challenged by PhD students.

Nobody has mentioned what the 102-year-old R. Coase recently complained about in the Harvard Business Review: econ quants who understand math but not history. I for one love math, but I hate those econ papers that 'prove' their theoretical model using set notation and machinery that only math majors love. I would rather have an explanation in words, with the math added as an appendix, just as source code can be.

Doubt that's possible. Math is so useful a language to precisely define and reason about complex phenomena that words are a poor alternative. At least for the physical sciences and I suppose it applies to Econ. too.

Math's (a part of) the main course not some afterthought or footnote.

Eh, Rahul, you may want to look at this:

" mathematical economics is unreasonably ineffective. Unreasonable, because the mathematical assumptions are economically unwarranted; ineffective because the mathematical formalisations imply non-constructive and uncomputable structures."

Present mathematical modeling in economics does not give quantifiable predictions. Its usefulness is therefore questionable.

Sure, mathematical models can be junk. GIGO. Math modelling is no magic bullet.

My point is: Can you give any better quantifiable predictions with non-mathematical, purely verbal arguments? No, methinks.

The solution isn't non-math-models but better-math-models; more specifically more realistic assumptions.

Why did you bother replying to me in words? Why not use symbols? GIGO indeed.

"The solution isn't non-math-models but better-math-models; more specifically more realistic assumptions."

The solution might very well be outside the DSGE framework.

The problem is not math, it is the overemphasis on using complicated math as a screening signal.

I very warmly recommend Coase's recent statement:

and his 2 famous papers from 1937, 1960

Well, most old econ papers, specifically including those by Nobel prize winners, wouldn't pass the smell test today either.

See e.g. Solow 1957.

And I am still waiting for one single person, who can point me to one specific Krugman paper, and can tell me personally, in a very few sentences, why he considers this paper good.

"I do not think we are ready for an academic world where our smell test does not work very well."

Does econ have an equivalent of arXiv? I ask because that may be one way of, if not getting around this problem altogether, then at least ameliorating it by making it possible to get good papers "out there" in a centralized fashion.

The lack of something like arXiv, combined with very long publishing lead times, has been a problem in the humanities for as long as I've been paying attention to it—and, to my mind, signals negatively about how much importance the humanities grant to its own work products. But that's another story.

What's the excuse for the super long publishing delays in Econ.? I've heard that some good journals take two years.

Are economists simply busier than the rest of us, or are the papers so much harder to referee?

Papers are longer, and expectations are for referee reports to be long and detailed (longer than in, say, statistics or political science, according to stats blogger Andrew Gelman). And I feel that this work is not highly valued by the profession.

Definitely: The profession does not value this work highly enough. I like Ted Bergstrom's way of addressing the problem: Not refereeing for journals that charge an arm and a leg for subscriptions while getting free referee labor.

Interesting points. Though, I doubt Physics, Chem, Engineering etc. pay / value a referee any more than econ (and yet get faster results).

I did notice Econ. papers tend to be longer: Is that a bug or a feature? Is paper-length something that the profession might want to gradually shrink?

The closest we have to arXiv is

People of a scientific bent could once have mocked you economists on such a topic. But given the disgraceful state of "Climate Science," and the endless dodgy stuff published on epidemiology, on diet and nutrition, on drug trials, and no doubt on other topics, I'm not sure we can now. How many recent Presidents of the Royal Society pass the "smell test," Nobelists or no?

And here is the other problem I perceive- the policy imperative. Climate science would be a wonderful thing if they just published thousands of papers and let people come to their own conclusions.

I am an attorney, not an academic but, to me anyway, this problem seems quite widespread in what passes for "science" these days. Perhaps it was ever thus, but I would be curious to know if our host has considered the intellectual decay of academic science as a potential source of "The Great Stagnation"?

I note that the conclusion of "TGS" argues, among other things, that society as a whole should give greater deference, power and respect to "scientists" (particularly, one assumes, professors of economics at major universities). I am past 50 and have been hearing this all my life. At the same time, I know a lawyer can find an expert economist attached to an academic institution who will be willing to testify under oath in support of just about ANYTHING the lawyer may want to prove (as long as the client is willing and able to pay generously). But I don't want to be too hard on economists since I have seen pretty much the same thing done in almost any field of "science" you would care to name.

What you describe seems more a flaw of the legal process. Sure, you can almost always find a few crackpots or mercenaries in every profession. It's a judicial flaw to rely on such testimonies.

Now, could you get fifty Professors from Top20 universities to support the same junk assertion? There's something to be said for consensus.

For every expert there is an equal and opposite expert.

If there's money to be made, you can always find an attorney willing to defend Hitler. If there's no money to be made, you'd be hard pressed to find an attorney to defend Mother Teresa.

If there's one thing I've learnt about experts, it's that they're experts in bugger-all.

Amazing what you can get people to believe under oath for a few thousand dollars. It works better than hypnosis.

But I think judges and juries get it right more often than not.

70% of whites believed OJ Simpson was guilty.
70% of blacks believed OJ Simpson was innocent.
70% of somebody were wrong.

From what I have seen the 'personal certification' process doesn't work so looks more like cronyism. Naturally insiders don't see it that way. A few QJE papers that conveniently left out important references come to mind. (Inevitably at least one author was from Harvard). There are lots of other examples. But then like some of the above comments have noted, a lot of economics papers these days are basically about deception; presenting an interesting question and then pretending the maths or instrument or whatever can really answer that question when it almost always can't.

The most obvious example of a bad paper that shouldn't have passed the Smell Test was Levitt and Donohue's 2001 paper on how abortion cut crime. In late 2005 Foote and Goetz tried to replicate it and found it was based on coding errors. In the meantime, a number of economists and noneconomists (Joyce, Lott, me, etc.) had pointed out why it smelled fishy. But the great majority of economists who expressed an opinion on it were highly enthusiastic.

I haven't seen much soul-searching within the economics profession over this fiasco.

Another example of "there's no such thing as bad publicity"?

Would that fall in the "Fraud" bin or the "Error" bin? In empirical work, is a referee expected to dig so deep as to discover even coding errors?

The "Incompetence" and "Self-Interest" bins.

Levitt, and all the economists he talked to, forgot about the Crack Years. A simple reality check, looking at the homicide rates by age group on the federal Bureau of Justice Statistics website, showed the implausibility of Levitt's theory, as I pointed out to him in our Slate debate in 1999, two years before his paper was published:

But, Levitt kept maintaining his analysis proved him right and he became a celebrity in 2005.

In hindsight, if Levitt was so obviously wrong, was the paper retracted?

Being a researcher means never having to say you are sorry. You publish a followup paper updating your research.

Being a celebrity means never having to admit you are wrong.

But then Levitt got in trouble for not taking global warming 110% seriously as you are supposed to.

Levitt and Donohue was a hard paper to argue with, even for people like me, a prosecutor, who was certain that higher incarceration rates were the predominant explanation. It made sense from both a general and a specific deterrence standpoint.

Learning that the coding of L&D was wrong salvages my ego from the severe bruising it took trying to find a flaw in their methodology.

I'm not as convinced that the death penalty or permissive gun laws are statistically significant. We execute about 100 people per year with more than 10,000 homicides per year. In effect, we don't have a death penalty. The vast majority of armed citizens never find a need to use their weapons even in places where crime rates are high and gun laws restrictive. That's not to say I oppose the death penalty or permissive gun laws.

Jack P. and the following comments hit the nail on the head. Until the journals begin publishing replication papers, the problem will persist. Didn't Tyler or Alex recently include a comment in MR by Richard Thaler, where he said a young PhD has to be novel and clever and cutting-edge? What is the payoff to writing a well-researched and well-written replication?

BTW: when I was in grad school one of the professors teaching first-year econometrics had the students do a replication. My office mate, a macro guy, decided on a recent paper in a leading macro journal by a well known economist at a top ten school. When he contacted this economist he was told the data set was "lost."

Lost - seriously?

I'd be a lot more impressed by this argument if it came with *even one example* of the sort of paper it's referring to. As a hypothetical problem, maybe. But is it anything more than the null set?

Click on the link to his CV and you will find several.

Just kidding. :)

I wonder how much of this is due to the incredible proliferation of journals.

Also I agree strongly with some commenters' accusations of cronyism; one piece of evidence about this is the fact that about 95% of the economists in "Top XXX Economists" lists are white men:

I bet 95% of them went through only 6 or 7 PhD programs; I assume also that these numbers are much more extreme for economists than for most fields of study, but I have no knowledge to back this up.

I wish I could somehow convert on my white manness. I guess it's a necessary but insufficient condition.

Now look at the percentage of them who are Jewish.

Yes, I'm a lawyer and now in financial services. How cliche.

Or professors and students at top-ranked universities with lots of cash to buy proprietary data sources will increase their clout for no reason other than exclusive access to unexplored data. Grants beget more grants.

The market test is ready:

Your article sounds brilliant, but I am concerned that the metric used to establish success seems to be citations. Yes, better-cited articles tend to be better articles, but it is well known that impact factor and subjective true quality (what actually influences the field) are only weakly correlated. You can find lots of examples of mediocre journals with higher impact factors than top journals.

However, this brings up another point: if top economists publishing in top journals do not in fact generate many more citations, is their work the best? Are the top journals really top? Maybe we are collectively deluded (hero worship), and objective metrics such as citations should trump what the profession believes (which is hierarchical).

"I do not think we are ready for an academic world where our smell test does not work very well." This sounds great to me: more scrutiny, more skepticism, and more provisional acceptance of results. What's not to like? That I can't rely on mood affiliation to get me through the day?

A litmus test for whether an econ theory (NOT econometrics) article is likely to ever yield any operational propositions with a connection to the world we live in is this: count the number of Lemmas it contains. If five or more, the odds are slim to none.
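The litmus test is easy to automate; a toy sketch, assuming the paper is available as plain text (function names are hypothetical, and the regex only catches numbered headings like "Lemma 1"):

```python
import re

def lemma_count(paper_text):
    """Count numbered lemma headings of the form 'Lemma 1', 'LEMMA 2.3', etc."""
    return len(re.findall(r"\bLemma\s+\d", paper_text, re.IGNORECASE))

def passes_litmus(paper_text, threshold=5):
    # Five or more lemmas: the odds of operational content are "slim to none".
    return lemma_count(paper_text) < threshold

sample = "Lemma 1 ... Lemma 2 ... Lemma 3 ..."
# passes_litmus(sample) -> True (only three lemmas)
```

Crude, of course, but so is the test itself; the point is that it is an operational heuristic, which is more than can be said for many of the papers it screens.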

As a mid-career graduate economics student (GPA 6.6), with a deep background in analytical methods from other disciplines, I think the key problem with economics is this: no other discipline has been so willing to treat empirical evidence that a theory doesn't hold in practice as an interesting "paradox" (Leontief, for example), and then continue on as if the theory were valid despite the failed empirical test. This is the reverse of the scientific method.

And the fundamentals of the mathematical approach are deeply flawed. The very base assumptions of neoclassical theory (rational economic man, markets free of distortions, etc.) have no validity except as the basis for a crude model. Throwing sophisticated mathematical techniques at a crude model, while losing sight of the fundamental truth that the model is crude, lends a veneer of respectability to what is essentially a rough guess. This would not be such a problem if economics were an arcane branch of political theory, but its capture of the public policy debate means that economics has a huge real-world impact.

Comments for this post are closed