Who shares data?

Perhaps I’ve linked to this before, I am not sure, but it is worth another look:

We provide evidence for the status quo in economics with respect to data sharing using a unique data set with 488 hand-collected observations randomly taken from researchers’ academic webpages. Out of the sample, 435 researchers (89.14%) neither have a data&code section nor indicate whether and where their data is available. We find that 8.81% of researchers share some of their data whereas only 2.05% fully share. We run an ordered probit regression to relate the decision of researchers to share to their observable characteristics. We find that three predictors are positive and significant across specifications: being full professor, working at a higher-ranked institution and personal attitudes towards sharing as indicated by sharing other material such as lecture slides.

That is from Patrick Andreoli-Versbach and Frank Mueller-Langer, with a thanks to Florens Sauerbruch for the pointer.
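As a quick sanity check on the abstract's figures, here is a minimal Python sketch that backs out the headcounts implied by the reported percentages. Only the 488 and 435 counts appear in the abstract; the 43 partial sharers and 10 full sharers are inferred here from 8.81% and 2.05%, and are not numbers the authors report.

```python
# Sanity check of the shares quoted in the abstract.
TOTAL = 488        # sample size, from the abstract
NO_SHARING = 435   # no data&code section and no availability note, from the abstract

# ASSUMPTION: headcounts inferred by rounding 8.81% and 2.05% of 488;
# the abstract itself only gives the percentages.
PARTIAL = round(0.0881 * TOTAL)
FULL = round(0.0205 * TOTAL)


def pct(count, total=TOTAL):
    """Share of the sample in percent, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {
    "none": pct(NO_SHARING),
    "partial": pct(PARTIAL),
    "full": pct(FULL),
}
print(shares)
# The three groups should partition the sample exactly.
print(NO_SHARING + PARTIAL + FULL == TOTAL)
```

If the inferred counts are right, the three groups add up to the full sample, so the abstract's percentages are internally consistent.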

Here (via @autismcrisis) is a new paper by John Ioannidis and Chris Doucouliagos, “What’s to Know About the Credibility of Empirical Economics?”, possibly gated for you. Here is the abstract:

The scientific credibility of economics is itself a scientific question that can be addressed with both theoretical speculations and empirical data. In this review, we examine the major parameters that are expected to affect the credibility of empirical economics: sample size, magnitude of pursued effects, number and pre-selection of tested relationships, flexibility and lack of standardization in designs, definitions, outcomes and analyses, financial and other interests and prejudices, and the multiplicity and fragmentation of efforts. We summarize and discuss the empirical evidence on the lack of a robust reproducibility culture in economics and business research, the prevalence of potential publication and other selective reporting biases, and other failures and biases in the market of scientific information. Overall, the credibility of the economics literature is likely to be modest or even low.


When I left the university to start working at the Dutch CBS I was surprised by the difference in cultures. At the CBS people were much, much more inclined to share data and to accept criticism, partly because there was no publish or perish rat race and partly because, well, it was their job to share. At this moment, these people are clearly gaining ground in Economia (FRED!). So, let's expand our definition of economists and include the economists working at statistical offices, the data departments of central banks and economic historians (we should have done this a long time ago, of course). All of these groups have a strong culture of sharing data and do so in ever more sophisticated ways. Academic economists are only a minority and do not always produce the most important economic research results.

A little background reading



'and include the economists working at statistical offices, the data departments of central banks and economic historians'

Absolutely - much of the 'proprietary' academic research in a field like economics (to stay in the open source framework, if only imperfectly) is based on the eminently public work provided through such mechanisms as taxpayer funding.

Much like how the Internet grew from taxpayer funding - and how the Internet's ability to open things remains a real challenge to those who previously enjoyed the comfort of their closed world.

I do wonder how the Mercatus Center project to redo the math and assumptions of literally years of published work based on factually inaccurate and apparently cherry picked data is going.

In a world with truly open access, charting this progress to remove errors and falsehoods would be a simple matter, thus allowing such Mercatus Center research to be seen as part of an ongoing enquiry where participants follow rigorous standards of accuracy, to prevent the discussion from being distorted by flawed data.

Something that people who are familiar with their work being scrutinized - like those who work with fully open sources and who fully open their work and its methods - are very well acquainted with. Like the group mentioned above.

After all, there is a reason that Linux and its developers have become the finest team of OS developers on the planet. It wasn't just the DIY perspective of the project being literally open to everyone with an interest, but the very public debates about flaws and solutions that the community carried out in the full glare of public openness.

If only the openness of the Internet would reveal your identity. Krugman? That guy in Germany who raped his daughter for 20 years and kept his incest family hidden in the basement?

When are you going to go back and correct all your blog comments containing blatantly incorrect facts? Don't you care about accuracy and the removal of errors and falsehoods? Or do you only care about your communist agenda?

Really? They share data with Bank of America or UBS?

Is building a good data set difficult work with little reward? It would make sense to keep it proprietary if there is no external recognition for doing the dirty work. I don't know if this is the case (maybe solid careers can be built on building good data as well as analyzing it), but it's just a supposition.

What is the "privateness-elasticity" of hidden data? Say, if enough good journals were to start refusing to publish papers based on private data, would we get fewer good papers, or would more erstwhile private data now become public?

I.e., is a good fraction of this secret data kept so just because they can get away with it?

I wonder if some folks see their data as part of their ticket to their next publishable study. Data collection can be an arduous process, but can allow for a number of analyses. I wonder if part of the fear is that someone who receives the data now has a similar head start on a future study that one might plan on doing oneself.

I agree. Data sets require a huge fixed cost to acquire but little marginal cost once that has been overcome. I know economists who delay sharing data because they still have a few projects they want to squeeze out of it.

When you freely give the data away you remove the fixed costs for other researchers. There are some benefits to this, but there's an intellectual property argument to be made here: we want to reward researchers for investing a lot to generate the data by giving them exclusive access for a while.

'we want to reward researchers for investing a lot to generate the data by giving them exclusive access for a while'

And how do we determine that the data isn't being cherry picked unless the methodology and data are available?

I'm certain that such a respected source of economic insight as the Mercatus Center is currently reworking all its published work that was based on flawed data, given the opportunity to repair flawed assumptions that have now become fully public.

Authority through obscurity is no better than security through obscurity. Opening the methodology is as important as avoiding math errors (an interesting problem in cryptography - which is why no serious member in that field considers any closed algorithm to be secure).

Well, for those interested in the actual pursuit of knowledge, that is.

we want to reward researchers for investing a lot to generate the data by giving them exclusive access for a while.

Why not figure out a way to reward them directly for generating the data-set in itself?

Right. I'm far removed from this type of academic pursuit, but if another researcher uses my data set in their own analysis, what type of credit do I get? I'm sure I'd be cited in the paper, but depending on the importance of my data to the project, it may be somewhat appropriate to give author credit as well. That would obviously help in the "publish-or-perish" world of getting your name in the authorship of articles... and I feel like we'd see a lot more data sharing if it could lead to the possibility of an easy author mention.

Having spent several years as a data collection software developer for a large nonprofit social science research organization, I can say that separating collection and analysis activities would be fraught with peril for the status quo of these types of organizations.

Data and results are judged trustworthy based on the analysis, but the cost of analysis is just the tip of the cost pyramid for a study - most is in data collection.

Having the base of the pyramid within your organization gives you better control over the collection, yes, but it also gives you a funding source to maintain your stable of junior analysts. While the lion's share of the collection costs will go to hired guns or labor - a handful of engineers, a few hundred field staff, capital investment in infrastructure - just having a fraction of that for analysts to "supervise" gives a stable funding stream to keep them on full time between bouts of analysis.

Surely that can't be the only viable funding model? Cross subsidizing is always a dangerous gamble.

Why can't the data *itself* be the publish-worthy part, so it can be released, and the author credited, as soon as possible, so we don't have to wade through the shitty "first bite" analysis?

I agree completely. Collecting data and writing a paper have a lot in common -- high fixed cost, low marginal cost, high public value, and it's citable, useful research.

Researchers' "Publications" page could easily include published datasets, even if those datasets weren't used by the researcher him/herself to write a paper. This actually exists, see for example some of the datasets on ICPSR, but it's not nearly common enough. As commenter C points out, the incentive is currently to withhold private data until the researcher has written something valuable using it, which is putting the cart before the horse.


Biology, particularly genomics and proteomics, benefited from some very specific NIH projects/grants merely to collect vast quantities of raw data. There are also citation standards if you use this data. I guess biology, and the other hard sciences as well, already had the culture of "instrumentation grants".

I think this is the biggest thing. I just read a pending article to be published this year or early next that, because I know the author's previous work, I can tell for certain is based on his dissertation data, which was collected almost ten years ago. If you are nearing tenure review or the associate->full review and need to squeeze out a couple papers, figuring out a new approach to old data sometimes works. This probably partly explains the "full professor" finding cited in the article.

I wonder to what extent access to data is a barrier to entry in the elite levels of economic research. So much of the research from the top schools seems to be coming from unique datasets that are not widely available. I also know that datasets can take a lot of work to develop and you could run into the R&D incentive problem. Perhaps a modest patent system (3 or 5 years?) and then data must be made available. I also realize some data are from sources where the original owners don't want the raw data released, such as in a paper I recently saw on car auctions. It is a tough but interesting question.

I don't have access to the paper, but full data sharing is unfeasible. Many economists use protected data, where they are bound by an agreement not to share.

Also, many data sets are publicly available. As long as the paper outlines how to construct the variables, the data can be accessed quickly. In fact, you could argue that not posting publicly available data is a service to the profession because it requires you to reconstruct the data for yourself, which allows you to check if the data was constructed correctly. It's similar to the difference between reading a proof and working it out on your own.

Finally, most economists are just bad at keeping up their websites. Most, including Tyler, have websites that look like they were designed in the 90s and never touched since. Although they don't post the data on their sites, I'm sure most would respond positively to an email request.

"Also, many data sets are publicly available. As long as the paper outlines how to construct the variables, the data can be accessed quickly. In fact, you could argue that not posting publicly available data is a service to the profession because it requires you to reconstruct the data for yourself, which allows you to check if the data was constructed correctly."

But this rarely happens in reality. If a paper is analyzing publicly available data, the computer code used to analyze the data (all the way from reading in the raw data files to producing regression tables) ought to be posted publicly. Computer programmers have long known the best way to ensure computer code is error-free is to have lots of different people inspecting it. Open source works for software enthusiasts, why shouldn't it work for social scientists? I don't see how it's a "service to the profession" to withhold the code one used in part of a published study, especially if it's a study on a very important topic.

'Most, including Tyler, have websites that look like they were designed in the 90s and never touched since.'

Personally, I'm confident that this website is fully up to date when it comes to collecting and collating visitor analytics.

Though probably not quite at the refined level I'm sure that 1250 billed inQbation project hours provide to MRU.

Not everything is just surface level shininess, after all.

And if this website is too blah for you, then you should not check out the boring, but nicely spider optimized, *.com domain named after a contributor to this very web site. But if you are reading anything on that particular site, it is just by accident - it isn't really meant for people anyways. Nor for spreading through comments here, at least apparently, in the not too distant past.

"Also, many data sets are publicly available. As long as the paper outlines how to construct the variables, the data can be accessed quickly. In fact, you could argue that not posting publicly available data is a service to the profession because it requires you to reconstruct the data for yourself, which allows you to check if the data was constructed correctly."

Why stop there, genius? Why not require every researcher to write the software they use to process the data? You know, to make sure that they understand how it works and that it's not introducing errors.

Every econ professor oughta be able to write Excel!

Data access is necessary but not sufficient. The incentives for academics need to change too. Economics puts almost zero weight on replication. These are projects for grad students to learn techniques. It is so rare that someone gets a replication published, especially in a top five journal (the currency of tenure at a top school). If you have no publication outlet, then why would an academic waste their time on a time-intensive data project? Also, economics has some deference issues (see the Justin Bieber birthday party reference yesterday), so these takedowns don't normally gain you appreciation, except in rare cases. I knew most of the cases in this article: http://economix.blogs.nytimes.com/2013/04/17/a-history-of-oopsies-in-economic-studies/?smid=tw-nytimeseconomix&seid=auto, but for the life of me I can't remember any of the economists who caught the errors. There are examples of replication in the publishing process, see Justin Wolfers's tweet on the Brookings Papers: https://twitter.com/justinwolfers/status/324576600941289472 But that's rare. What if top journals published articles only if accompanied by a critique/robustness check from another critical author?

I do think Merijn's point about the research culture is important. The Fed wants us to publish but more importantly they want our research to be balanced and honest. I have seen many working papers come in (mainly from job market hires) with catchy titles and big claims that get chopped down to size by our working paper editors. (Oh yeah, and who edits the content of NBER working papers?) Our competitive disadvantage relative to other working papers used to irk me, but I am happy that I am not paid by how well I spin my work or how well it makes a blog post. And we also get time to do replication and figure out what makes a paper tick. I am in a multi-year effort to get some proprietary data that has been used in an oft-cited paper. I am not trying to undermine that paper or the authors, I just want to see what makes it tick and whether I should change my policy analysis based on it. I would never be doing this if I were on a tenure clock. They are better economists than I will ever be and it's not a good way to make a name in academia. A more interactive research effort might actually benefit academics and would almost certainly benefit those trying to understand and use their research.

It is worse than that. I've seen cases where famous work has been replicated, tweaked, and shown to produce somewhat different, and sometimes completely new results. Yet journals routinely reject such articles as derivative, or a minor extension of the original.

1) Reinhart et al. ironically blast governments in their book TTID for not having and/or sharing economic data,

2) sometimes work at the fringe of statistical significance 'flip-flops' when replicated--for example, the famous study that showed productivity went up in a factory when the lights were changed (from dimmer to brighter then back again)--was later found to not be reproducible, but then later found to be reproducible; as well as numerous 'health facts' involving heart disease inter alia.

'who edits the content of NBER working papers'

Interesting - according to Sourcewatch (though admittedly, the data is older), these people would be the ones paying for such editing -

'Between 1985 and 2001, the organization received $9,963,301 in 73 grants from only four foundations:

* John M. Olin Foundation, Inc.
* Lynde and Harry Bradley Foundation
* Scaife Foundations (Sarah Mellon Scaife)
* Smith Richardson Foundation


Of course, I'm sure that the National Bureau of Economic Research will promptly revise all published work it is responsible for producing based on incorrect data. Just like the Mercatus Center will.

After all, the people responsible for funding such research claim they are primarily concerned about fair and accurate information being used in policy debates.

(I really enjoy getting to write such less than perfectly buried sarcasm without any Australian noticing it.)


I'll make a wacky suggestion: In addition to the traditional referee model I propose we add "data referees". These would not be busy senior academics but younger people, grad students or post-docs etc., who have the time and skills to do a bit of the grunt work. The primary referees review the work and, if they find it good, send across the dataset and a list of "data questions" to the data referees, e.g. "What are the regression coefficients of DataSet-X?", or "compute the following metrics", or "run these simulations", etc.

The data referees themselves do not get to see the results or conclusions of the paper, merely the raw data and methods etc. They don't judge the work; they just compute what the work proposes, but independently. The dataset is not externally exposed if private data was a concern.

The data-referees ought to be prominently acknowledged with a published paper and that should offer some incentive for putting in the work needed. It won't be a full replication but at least some errors should be weeded out by this validation.
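To make the proposal concrete, here is a minimal Python sketch of what one "data question" round might look like. Everything in it is hypothetical: the function names, the tolerance, and the toy data are made up for illustration, and a one-regressor OLS stands in for whatever the paper actually estimates. The referee only sees the raw data and the question; the comparison with the authors' reported numbers happens on the editor's side.

```python
# Hypothetical sketch of a "data referee" round; names and data are illustrative.
from statistics import mean


def referee_ols(x, y):
    """Data referee's side: recompute y = intercept + slope * x from raw data.

    The referee never sees the values the authors reported.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return {"slope": slope, "intercept": intercept}


def editor_compare(recomputed, reported, tol=1e-3):
    """Editor's side: flag each reported number the referee could not reproduce.

    `tol` is an arbitrary illustrative tolerance, not a recommended standard.
    """
    return {
        name: abs(recomputed[name] - reported[name]) <= tol
        for name in reported
    }


# Toy data standing in for "DataSet-X" (made up for this example).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

recomputed = referee_ols(x, y)
print(editor_compare(recomputed, {"slope": 1.99, "intercept": 0.05}))
print(editor_compare(recomputed, {"slope": 2.50, "intercept": 0.05}))
```

Keeping the recomputation and the comparison in separate functions mirrors the idea that the data referee works blind: only the editor holds both sets of numbers.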

Not wacky, actually quite interesting.

I would add that professors already DO assign replication projects to grad students in the form of problem sets. I have had to replicate several results of several published papers myself, for homework. That is a really useful way to learn the theory, the empirics, and the technical skills, and as an added bonus, if professors would just extend that practice to more recent (as-yet-unreplicated) papers, then they'd be doing exactly what Rahul is saying. The marginal cost of doing that is very little, as grad students are already doing something similar in their problem sets.

You would think that independent verification of a paper's results would be mandatory, nicht wahr?

That's called science. Economics has nothing to do with that, regardless of its pretensions.

In my half-assed aspirational science experience, I can't remember a single paper exactly replicating another paper.

But I am going to guess you never tried to replicate a paper's results without having that paper's data available to replicate.

This is not about replicating results, it is about access to the data which those results are based on.

Science is not about proprietary data.

Luckily, economics doesn't worry too much about such things.

I can even cite a source for just how far economics is from being a science using one very empirical measure -

'Out of the sample, 435 researchers (89.14%) neither have a data&code section nor indicate whether and where their data is available. We find that 8.81% of researchers share some of their data whereas only 2.05% fully share'

For the link, read the original post.

prior_approval, there is plenty of serious scholarship done in economics. in a lot of places, including your favorite think tanks and departments. I am suggesting some tweaks around the edges not abandoning ship. why not try being constructive rather than destructive for once. it's kinda fun.

It's about data and replicating results. Obtaining the data is the bulk of the work in a lot of cases. Why should someone give away their work? Turns out only 2% do. And probably not because they are just great guys.

'prior_approval, there is plenty of serious scholarship done in economics. in a lot of places, including your favorite think tanks and departments'

To be honest, I have no favorite think tanks or departments at all, but I do appreciate the effort made through taxpayer funding of government work. (While having watched that funding be cut in the U.S. throughout my adult life.)

And of course there is serious scholarship done in economics - Thorstein Veblen comes to mind as a fantastic economist dealing with behavioral economics, giving very concrete examples from his time and place. And though such insights may not be precisely replicable a century later, the basic framework is not difficult to grasp and then adapt to current social conditions. For those interested in his insights, the 1915 edition of one of his major works is freely available online at http://archive.org/details/theoryofleisurec00vebliala

(Those with any Apple product may want to consider his insights in a more distant manner. As for those using GPL software - congratulations, you are participating in something he did not describe. Showing how economic understanding evolves over a century, not that most people consider Stallman an economist in his concern for freedom.)

'It’s about data and replicating results. Obtaining the data is the bulk of the work in a lot of cases. Why should someone give away their work?'

You know, people used to say the same thing about all the software created under the GPL, until it became so omnipresent we simply no longer noticed it.

Because it was both a superior model of software development, and because it allowed people the freedom to improve it. People who were motivated to do their best work in the full glare of a very public process, where only the very best work survived.

Independent verification of experimental data rarely occurs in other sciences too, such as the biological/medical sciences. But people design experiments based on the results of other experiments, so a dishonest researcher will probably be outed sooner or later.

Or just ignored. But there is a high degree of skepticism. People don't even believe a result implicitly. You don't really expect your experiment design based on a prior result (even your own) to work. I did an experiment and got an awesome result. There is NO WAY I am going to try to repeat it. A hundred papers later, and a general trend becomes accepted.

And I will also add that spinning results, extravagant claim-making, and other types of bullshit are alive and well in biomedical research. I've never worked in economics research, but I would imagine that the problems that plague that field probably reflect larger problems in academic research more generally, because the incentives are the same.

A datum is like the quarterback. He gets way too much credit and way too much blame. BS is the norm. When it is uncovered, I suspect there is a lot more feigned outrage than actual outrage.

Karl Popper called this "scientism", which is the aping of the physical sciences by the social sciences. Economics is a very valuable branch of knowledge, but will never have the precision of the physical sciences. The attempt to do so, especially in macroeconomics, is an illusion.

Why are other sciences immune to scientism? Take a gander at some cell signaling, not to mention gene transcription and epidemiology. There are entire new departments being created (e.g. bioinformatics) just for crunching the numbers. That seems almost identical to the macro-econ problem to me.

Maybe not "immune to" but could have "a lot less of"? So what is really the macro-econ problem? Surely you aren't saying that number-crunching by itself is the problem? The LHC experiment was probably the biggest number cruncher in recent years.

Come on, however backward compared to physics, at least in biology you can run the experiments again. These don't get published, but it is the necessary first step, if only to get the instruments calibrated properly. If there is a mismatch that does not go away on scrutiny, that becomes publishable.

Are there any retractions of papers in economics?

I sat in on a data analysis course at some point, and they briefly talked about documentation tools like R Markdown, Knitr, and Sweave to enhance result reproducibility. Do people actually use such tools in the field?

One of the projects that I worked on prior to retirement was a set of Principles for Clinical Research for pharmaceutical companies. The issue of data transparency was always hotly debated and we had a lot of discussions among the company people and editors of the major medical journals. All clinical trials are now registered and summary results are published in a timely manner. There is an effort ongoing to strengthen this and I was pleased to see GSK move to support this: http://www.gsk.com/media/press-releases/2013/GSK-announces-support-forAll-Trials-campaign-for-clinical-data-transparency.html as they have long been a leader in transparency of data (disclosure: I don't have any direct association with the company but did staff the task force that worked on the industry principles and GSK was a member).

The economics community could learn a valuable lesson from the pharmaceutical industry and move towards greater transparency!

+1 Even the Chemicals Industry finds it pretty hard to keep secrets now that all shipping, sale etc. requires an MSDS.

It could be argued this harms IP but sometimes there are overriding concerns.

If you want to be taken seriously, transparency is crucial. If you decide there is more value in keeping 'proprietary' methods close, so be it. You may be vindicated ultimately, but you should be taken less seriously in the meantime.

What is far more of a concern is when the *government* bases its policy forecasts on what they say is "unpublished data". If you look through the Social Security Administration's annual projections, a lot of it is based on "unpublished data", and they cherry pick only a subset of their results to publish, which in some cases tries to hide problems (for instance, they give nominal figures out to 2090 but not real figures, to hide that their "high cost" scenario is high cost in nominal terms, but using their projected inflation it turns into the *low cost* scenario, which goes broke later, due to all the tweaking they do to make numbers look good).

This is a problem because the Social Security Administration has a history of being off on its forecasts, and the techniques it uses are very questionable. This page goes into gory detail, pointing out:

“The 1950 annual SSA report projected US population in 2000 to be worst case 199 million, best case 173 million. In reality it turned out to be 282 million, 42% above their highest projection. Imagine if their real costs turn out to be 42% higher than their current projections”

before getting into concrete problems with its current approach like its implication it can forecast GDP out to 2090 more accurately than other entities can for 2-5 years, in fact more accurately than it is measured (based on the GDI-GDP statistical discrepancy). The SSA might provide info if asked, but all their data and models should be easily available to invite critique.
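The 42% figure in that quote is easy to check. Here is a minimal Python sketch using only the three population numbers quoted above (in millions); the helper name `pct_over` is made up for this example, and the comparison against the lower projection is my own addition, not something the linked page is quoted as computing.

```python
# Check of the forecast miss quoted above; all figures in millions of people.
ACTUAL_2000 = 282      # actual US population in 2000, as quoted
HIGH_1950 = 199        # highest 1950 projection for 2000, as quoted
LOW_1950 = 173         # lowest 1950 projection for 2000, as quoted


def pct_over(actual, projected):
    """How far the actual value exceeded the projection, in percent."""
    return 100 * (actual - projected) / projected


miss_vs_high = round(pct_over(ACTUAL_2000, HIGH_1950), 1)
miss_vs_low = round(pct_over(ACTUAL_2000, LOW_1950), 1)
print(miss_vs_high)  # about 41.7, which the quote rounds to 42%
print(miss_vs_low)   # about 63.0
```

So the actual figure was outside the whole projected range, not just above its midpoint, which is the point being made about the range itself being off.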

Even the CBO and GAO long-term forecasts post data of their results in spreadsheets, but not the actual underlying model or all the data they use as inputs. Perhaps they will provide it if asked, but all government data should be

'The 1950 annual SSA report projected US population in 2000 to be worst case 199 million, best case 173 million. In reality it turned out to be 282 million, 42% above their highest projection. Imagine if their real costs turn out to be 42% higher than their current projections”'

Interesting - in 1950, a prediction for 5 decades later turned out to be less than accurate. Since I assume you already have the public data, how did the predictions for 2000 look in 1960? 1970? 1980? 1990?

Cherry picking data is a real problem - it can even lead to a 'fracas.' Which is a bit strange - there is no dispute in the least that the results of Reinhart and Rogoff are flawed, which means that 'fracas' in terms of dispute is simply inaccurate.

The example was illustrating a point, not providing conclusive evidence of exactly how far off their population forecasts are in general. The page linked to provides data for a few sample other years, and data for other factors over the years. Feel free to provide counter data. Of course the Social Security Administration should provide a repository of their past projections, instead of requiring you to hunt back through text documents to collect them.

The issue is that the SSA provides low, medium and high forecasts (or at various times in the past 2 as in that case, or 4). It is natural long range projections may be off.. the problem is that their range of projections was also off. Their current projections still imply too much accuracy despite the history of being off. They are overoptimistic in the accuracy of their forecasts, and as that page details there are reasons to be concerned that their forecasts, and not just their ranges, aren't being done realistically. The government should plan conservatively in case forecasts are wrong, but it can't do that if all the projections are based on the pretense of being more accurate than they likely are.

'The example was illustrating a point'

The point being that a prediction from 63 years ago was less than perfect? Call anyone younger than 20 shocked by this simple revelation, but don't insult anyone older than that.

'Feel free to provide counter data'

I feel zero need to provide any 'counter data,' being aware of the fact that such forecasts are refined - in this case, in the 1960s, the 1970s, the 1980s, and the 1990s - you need to show how they weren't refined to even begin to make something resembling a coherent case.

'the problem is that their range of projections was also off'

I remember a joke from some (SF?) source, where someone predicts Germany's future from 1900 to 2000, his tale starting in 1900, when the Kaiser's reign is supreme. It starts with Germany's glorious and expanding empire in 1910. When the listeners ask for more details of the next decade, the apparently time travelling source describes how the entire structure has collapsed in 1920, the Kaiser in exile, and Germany impoverished. When they ask about 1930, he tells them about a gathering force of national will, which in 1940 has catapulted Germany to the very cusp of world domination, its armies conquering everywhere they appear.

When the listeners ask about 1950, he tells them about the utter ruin of the Fatherland, and its division between its traditional enemies to the east and west. Seemingly distraught, his listeners ask about 1960, where they are told that both Germanies are in much better shape, with West Germany already becoming an economic powerhouse.

They then ask about the following decades, being told the two Germanies are reunited after a peaceful and democratic people's revolution in the oppressed east, following the dangers of a potential war involving weapons of unbelievable force and destruction.

Before he can explain what Germany looks like in 2010, he is locked up as an obvious madman.

Truly, predictions concerning the U.S. made before I was born are as valuable as predictions concerning Germany made before I was born. In other words, amusing tales from the past, best suited for humorous tales. Much like the 1950s prediction that electricity from atomic power would be too cheap to meter.

( 'Although sometimes attributed to Walter Marshall, a pioneer of nuclear power in the United Kingdom,[1] the phrase was coined by Lewis Strauss, then Chairman of the United States Atomic Energy Commission, who in a 1954 speech to the National Association of Science Writers said:

"Our children will enjoy in their homes electrical energy too cheap to meter... It is not too much to expect that our children will know of great periodic regional famines in the world only as matters of history, will travel effortlessly over the seas and under them and through the air with a minimum of danger and at great speeds, and will experience a lifespan far longer than ours, as disease yields and man comes to understand what causes him to age."[2]

It is often (understandably but erroneously) assumed that Strauss' prediction was a reference to conventional uranium fission nuclear reactors. Indeed, only ten days prior to his “Too Cheap To Meter” speech, Strauss was present for the groundbreaking of the Shippingport Atomic Power Station where he predicted that, "industry would have electrical power from atomic furnaces in five to fifteen years." However, Strauss was actually referring to hydrogen fusion power and Project Sherwood, which was conducting secret research on developing practical fusion power plants.' http://en.wikipedia.org/wiki/Too_cheap_to_meter

I've been waiting for that to happen since I was born - which is actually the time this prediction considers reasonable.

re: "The point being that a prediction from 63 years ago was less than perfect? Call anyone younger than 20 shocked by simple revelation, but don’t insult anyone older than that."

It is not clear there is reason to believe their current estimates are any better. In the business world people consider worst case scenarios to be prepared for them, even while hoping for the best. A rational business would look at its track record in forecasting and not pretend its future forecast is more accurate than its history demonstrates it should expect. The SSA postures as if its projections are doing that, but they aren't. Other government forecasts don't even pretend to try, even though plans should be based on conservative estimates. It is better for government to be surprised and have a surplus than be surprised with added deficit... except I suspect you are a statist who wouldn't agree and has no problem with government debt.

Actually it's GAO's policy to provide the data used in their reports only pursuant to an FOIA request and, if the report was done at the request of Congress, only if the congressional requesters explicitly authorize release of the information. See 4 CFR sec 81.6(a).

I agree with many of these comments, but they do not nearly go far enough. How many of you can refer me to papers which both acknowledge an error and then change their results (qualitatively) accordingly? I've seen a number of errors that have been caught and the usual (universal as far as I know) response is to acknowledge the mistake, and then change the model so that the qualitative results are intact.

If I am correct about this, it points to a much more serious error. Economists are far more interested in reaching policy conclusions than replicating empirical work. The journals, the media, our classes, and our colleagues reward us for making sweeping statements ("X is a bad government policy," "policy Y actually makes the problem it is trying to solve worse," etc.) while there is little reward (or none) for replicating anybody's work (as Claudia points out). This is the curse of the "Queen of the social sciences." Economists are so enamored by their methodology that the priority is pointing out how stupid every other field is.

Economics is a bit like where astronomy was before Galileo. When observations don't match with the predictions of your model, just add another epicycle into the model and everything will be OK.

'How many of you can refer me to papers which both acknowledge an error and then change their results (qualitatively) accordingly?'

I remain confident that Prof. Cowen, as the general (http://mercatus.org/tyler-cowen) director of the Mercatus Center, will make a major effort to place the Mercatus Center at the forefront of such a worthwhile endeavor. (And no one paid me a penny to write the preceding sentence - it was for personal amusement only.)

What's really shocking about this post is that many if not all leading journals in economics stipulate something like the following as a condition of publication: "Authors must also acknowledge they have read and agree to the policy on data access and estimation procedures. Data Access and Estimation Procedures. Authors are expected to document their data sources, data transformations, models, and estimation procedures as thoroughly as possible. Authors are also expected to make data available at cost for replication purposes for up to five years from publication." Journals should simply refuse to publish articles based upon proprietary data. If it cannot be replicated, it's just not credible.

