Category: Science

How economics has changed

Panel A illustrates a virtually linear rise in the fraction of papers, in both the NBER and top-five series, which make explicit reference to identification.  This fraction has risen from around 4 percent to 50 percent of papers.


Currently, over 40 percent of NBER papers and about 35 percent of top-five papers make reference to randomized controlled trials (RCTs), lab experiments, difference-in-differences, regression discontinuity, event studies, or bunching…The term Big Data suddenly sky-rockets after 2012, with a more recent uptick in the top five.

Note that about one-quarter of NBER working papers in applied micro make references to difference-in-differences. And:

The importance of figures relative to tables has increased substantially over time…

And about five percent of top five papers were RCTs in 2019.  Note also that “structural models” have been on the decline in Labor Economics, but on the rise in Public Economics and Industrial Organization.

That is all from a recent paper by Janet Currie, Henrik Kleven, and Esmee Zwiers, “Technology and Big Data are Changing Economics: Mining Text to Track Methods.”

Via Ilya Novak.

Damir Marusic and Aaron Sibarium interview me for *The American Interest*

It was far-ranging, here is the opening bit:

Damir Marusic for TAI: Tyler, thanks so much for joining us today. One of the themes we’re trying to grapple with here at the magazine is the perception that liberal democratic capitalism is in some kind of crisis. Is there a crisis?

TC: Crisis, what does that word mean? There’s been a crisis my whole lifetime.


TC: I think addiction is an underrated issue. It’s stressed in Homer’s Odyssey and in Plato, it’s one of the classic problems of public order—yet we’ve been treating it like some little tiny annoyance, when in fact it’s a central problem for the liberal order.


AS: What about co-determination?

TC: There are too many people with the right to say no in America as it is. We need to get things done speedier, with fewer obstacles that create veto points. So no, I don’t favor that.


AS: John Maynard Keynes.

TC: I suppose underrated. He was a polymath. Polymaths tend to be underrated, and Keynes was a phenomenal writer. I’m not a Keynesian on macroeconomics, but when you read him, it’s so fresh and startling and just fantastic. So I’d say underrated.


AS: Slavoj Zizek, the quirky communist philosopher you debated recently.

TC: Way underrated. I had breakfast with Zizek before my dialogue with him, and he’s one of the 10 people I’ve met who knows the most and can command it. Now that said, he speaks in code and he’s kind of “crazy,” and his style irritates many people because he never answers any question directly. You get his Hegelian whatever. He has his partisans who are awful, but ordinary intellectuals don’t notice him and he’s pretty phenomenal actually. So I’d say very underrated.

Here is the full interview, a podcast version is coming too.

Big Data+Small Bias << Small Data+Zero Bias

Among experts it’s well understood that “big data” doesn’t solve problems of bias. But how much should one trust an estimate from a big but possibly biased data set compared to a much smaller random sample? In Statistical paradises and paradoxes in big data, Xiao-Li Meng provides some answers which are shocking, even to experts.

Meng gives the following example. Suppose you want to estimate who will win the 2016 US Presidential election. You ask 2.3 million potential voters whether they are likely to vote for Trump or not. The sample is in all ways demographically representative of the US voting population but potential Trump voters are a tiny bit less likely to answer the question, just .001 less likely to answer (note they don’t lie, they just don’t answer).

You also have a random sample of voters where here random doesn’t simply mean chosen at random (the 2.3 million are also chosen at random) but random in the sense that Trump voters are as likely to answer as are other voters. Your random sample is of size n.

How big does n have to be for you to prefer (in the sense of having a smaller mean squared error) the random sample to the 2.3 million “big data” sample? Stop. Take a guess….

The answer is…here. Which is to say that your 2.3 million “big data” sample is no better than a random sample of that number minus 1!

On the one hand, this illustrates the tremendous value of a random sample but it also shows how difficult it is in the social sciences to produce a truly random sample.

Meng goes on to show that the mathematics of random sampling fools us because it seems to deliver so much from so little. The logic of random sampling implies that you only need a small sample to learn a lot about a big population and if the population is much bigger you only need a slightly larger sample. For example, you only need a slightly larger random sample to learn about the Chinese population than about the US population. When the sample is biased, however, then not only do you need a much larger sample, you need it to be large relative to the total population. A sample of 2.3 million sounds big but it isn’t big relative to the US population, which is what matters in the presence of bias.
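
To make that contrast concrete, here is a rough Python sketch under stated assumptions. The first function is the textbook sample-size formula for a proportion with a finite population correction. The second uses the error decomposition behind Meng's argument as it is usually stated: the error of a sample mean equals rho * sqrt((1 - f) / f) * sigma, where rho is the correlation between responding and the answer, f is the fraction of the population sampled, and sigma is the population standard deviation. The function names are hypothetical, and the population figures and the value of rho are round-number assumptions chosen only for illustration, not numbers from the post.

```python
import math


def srs_sample_size(population, margin, p=0.5, z=1.96):
    """Simple random sample size needed to estimate a proportion within
    `margin` at roughly 95% confidence, with finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # size for an infinite population
    return n0 / (1 + (n0 - 1) / population)     # adjusted for a finite population


def biased_sample_error(rho, sample_size, population, sigma=0.5):
    """Error of a sample mean when responding is correlated with the answer:
    rho * sqrt((1 - f) / f) * sigma, where f = sample_size / population."""
    f = sample_size / population
    return rho * math.sqrt((1 - f) / f) * sigma


# Assumed round numbers for the two adult populations (illustration only).
US_ADULTS = 250_000_000
CHINA_ADULTS = 1_100_000_000

# Hypothetical response/answer correlation; not a figure from the post.
RHO = 0.005

# Random sampling: the required sample barely depends on population size.
n_us = srs_sample_size(US_ADULTS, margin=0.01)
n_cn = srs_sample_size(CHINA_ADULTS, margin=0.01)
print(f"random sample needed, US:    {n_us:.2f}")
print(f"random sample needed, China: {n_cn:.2f}")

# Biased sampling: the same 2.3 million responses do much worse
# when they are a smaller fraction of the population.
err_us = biased_sample_error(RHO, 2_300_000, US_ADULTS)
err_cn = biased_sample_error(RHO, 2_300_000, CHINA_ADULTS)
print(f"biased-sample error, US:    {err_us:.4f}")
print(f"biased-sample error, China: {err_cn:.4f}")
```

Under these assumptions the two random samples differ by less than one respondent, while the same 2.3 million biased responses give roughly twice the error in the larger population. The error term depends on the fraction sampled, not on the raw count.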

A more positive way of thinking about this, at least for economists, is that what is truly valuable about big data is that there are many more opportunities to find random “natural experiments” within the data. If we have a sample of 2.3 million, for example, we can throw out huge amounts of data using an instrumental variable and still have a much better estimate than from a simple OLS regression.

How is Twitter disrupting academia?

Kris on Twitter asks that question.  I have a few hypotheses, none confirmed by any hard data, other than my “lyin’ eyes”:

1. Twitter exists as a kind of parallel truth/falsehood mechanism, and it is encroaching on traditional academic processes, for better or worse.

2. Hypotheses blaming people or institutions for failures and misdeeds will be more popular on Twitter than in academia, but over time they are spreading in academia too, in part because of their popularity on Twitter.  Blame makes for a more popular tweet.

3. Often the distribution of Twitter followers resembles a power law, and thus Twitter raises the influence of very well known contributors.  Twitter also raises the influence of the relatively busy, compared to say the 2009 world where blogs held more of that influence.  Writing blog posts required more time than does issuing tweets.

4. I believe Twitter raises the relative influence of women.  For one thing, women can coordinate with each other on Twitter more easily than they can in academic life across different universities.

5. Twitter can damage the career prospects of some of the more impulsive tweeting white males.

6. On Twitter it is easier to judge people by their (supposed) intentions than in academia, so many more people will be accused of acting and writing in bad faith.

7. On Twitter more people do in fact act in bad faith.

8. Hardly anyone looks better on Twitter, so that contributes to the polarization of many professions, especially economics and those professions linked to political issues.  Top economists don’t seem so glamorous any more, not even in their areas of specialization.

9. Academic fields related to current events will rise in status and attention, and those topics will garner the power-law retweets.  Right now that means political science most of all but of course this will vary over time.

10. Twitter lowers the power of institutions more broadly, as institutions typically are bad at Twitter.

What else?

Is scholarly refereeing productive at the margin?

No, basically:

In economics many articles are subjected to multiple rounds of refereeing at the same journal, which generates time costs of referees alone of at least $50 million. This process leads to remarkably longer publication lags than in other social sciences. We examine whether repeated refereeing produces any benefits, using an experiment at one journal that allows authors to submit under an accept/reject (fast-track or not) or the usual regime. We evaluate the scholarly impacts of articles by their subsequent citation histories, holding constant their sub-fields, authors’ demographics and prior citations, and other characteristics. There is no payoff to refereeing beyond the first round and no difference between accept/reject articles and others. This result holds accounting for authors’ selectivity into the two regimes, which we model formally to generate an empirical selection equation. This latter is used to provide instrumental estimates of the effect of each regime on scholarly impact.

That is from a new NBER paper by Aboozar Hadavand, Daniel S. Hamermesh, and Wesley W. Wilson.  This is exactly the kind of work (critical, data-driven self-reflection about science) that Progress Studies wishes to see more of.

Emergent Ventures, sixth cohort

Sonja Trauss of YIMBY, assistance to publish Nicholas Barbon, A Defence of the Builder.

Parnian Barekatain.

Anna Gát, for development as a public intellectual and also toward the idea and practice of spotting and mobilizing talent in others.

M.B. Malabu, travel grant to come to the D.C. area to help set up a market-oriented think tank in Nigeria.

Eric James Wang and Jordan Fernando Alexander, a joint award for their work on the project Academia Mirmidón, to help find, mobilize, and market programming and tech talent in Mexico.

Gonzalo Schwarz, Archbridge Institute, for research and outreach work to improve policy through reforms in Uruguay and Brazil. 

Nolan Gray, urban planner from NYC, to be in residence at Mercatus and write a book on YIMBY, Against Zoning.

Samarth Jajoo, an Indian boy in high school, to assist his purchase of study materials for math, computer science, and tutoring.  Here is his new book gifting project.

One other, not yet ready to be announced.  But a good one.

And EV winner Harshita Arora co-founded AtoB, a startup building a sustainable transportation network for intercity commuters using buses.

Here are previous MR posts on Emergent Ventures.

What libertarianism has become and will become — State Capacity Libertarianism

Having tracked the libertarian “movement” for much of my life, I believe it is now pretty much hollowed out, at least in terms of flow.  One branch split off into Ron Paul-ism and less savory alt right directions, and another, more establishment branch remains out there in force but not really commanding new adherents.  For one thing, it doesn’t seem that old-style libertarianism can solve or even very well address a number of major problems, most significantly climate change.  For another, smart people are on the internet, and the internet seems to encourage synthetic and eclectic views, at least among the smart and curious.  Unlike the mass culture of the 1970s, it does not tend to breed “capital L Libertarianism.”  On top of all that, the out-migration from narrowly libertarian views has been severe, most of all from educated women.

There is also the word “classical liberal,” but what is “classical” supposed to mean that is not question-begging?  The classical liberalism of its time focused on 19th century problems — appropriate for the 19th century of course — but from WWII onwards it has been a very different ballgame.

Along the way, I believe the smart classical liberals and libertarians have, as if guided by an invisible hand, evolved into a view that I dub with the entirely non-sticky name of State Capacity Libertarianism.  I define State Capacity Libertarianism in terms of a number of propositions:

1. Markets and capitalism are very powerful, give them their due.

2. Earlier in history, a strong state was necessary to back the formation of capitalism and also to protect individual rights (do read Koyama and Johnson on state capacity).  Strong states remain necessary to maintain and extend capitalism and markets.  This includes keeping China at bay abroad and keeping elections free from foreign interference, as well as developing effective laws and regulations for intangible capital, intellectual property, and the new world of the internet.  (If you’ve read my other works, you will know this is not a call for massive regulation of Big Tech.)

3. A strong state is distinct from a very large or tyrannical state.  A good strong state should see the maintenance and extension of capitalism as one of its primary duties, in many cases its #1 duty.

4. Rapid increases in state capacity can be very dangerous (earlier Japan, Germany), but high levels of state capacity are not inherently tyrannical.  Denmark should in fact have a smaller government, but it is still one of the freer and more secure places in the world, at least for Danish citizens albeit not for everybody.

5. Many of the failures of today’s America are failures of excess regulation, but many others are failures of state capacity.  Our governments cannot address climate change, much improve K-12 education, fix traffic congestion, or improve the quality of their discretionary spending.  Much of our physical infrastructure is stagnant or declining in quality.  I favor much more immigration, nonetheless I think our government needs clear standards for who cannot get in, who will be forced to leave, and a workable court system to back all that up and today we do not have that either.

Those problems require state capacity — albeit to boost markets — in a way that classical libertarianism is poorly suited to deal with.  Furthermore, libertarianism is parasitic upon State Capacity Libertarianism to some degree.  For instance, even if you favor education privatization, in the shorter run we still need to make the current system much better.  That would even make privatization easier, if that is your goal.

6. I will cite again the philosophical framework of my book Stubborn Attachments: A Vision for a Society of Free, Prosperous, and Responsible Individuals.

7. The fundamental growth experience of recent decades has been the rise of capitalism, markets, and high living standards in East Asia, and State Capacity Libertarianism has no problem or embarrassment in endorsing those developments.  It remains the case that such progress (or better) could have been made with more markets and less government.  Still, state capacity had to grow in those countries and indeed it did.  Public health improvements are another major success story of our time, and those have relied heavily on state capacity — let’s just admit it.

8. The major problem areas of our time have been Africa and South Asia.  They are both lacking in markets and also in state capacity.

9. State Capacity Libertarians are more likely to have positive views of infrastructure, science subsidies, nuclear power (requires state support!), and space programs than are mainstream libertarians or modern Democrats.  Modern Democrats often claim to favor those items, and sincerely in my view, but de facto they are very willing to sacrifice them for redistribution, egalitarian and fairness concerns, mood affiliation, and serving traditional Democratic interest groups.  For instance, modern Democrats have run New York for some time now, and they’ve done a terrible job building and fixing things.  Nor are Democrats doing much to boost nuclear power as a partial solution to climate change, if anything the contrary.

10. State Capacity Libertarianism has no problem endorsing higher quality government and governance, whereas traditional libertarianism is more likely to embrace or at least be wishy-washy toward small, corrupt regimes, due to some of the residual liberties they leave behind.

11. State Capacity Libertarianism is not non-interventionist in foreign policy, as it believes in strong alliances with other relatively free nations, when feasible.  That said, the usual libertarian “problems of intervention because government makes a lot of mistakes” bar still should be applied to specific military actions.  But the alliances can be hugely beneficial, as illustrated by much of 20th century foreign policy and today much of Asia — which still relies on Pax Americana.

It is interesting to contrast State Capacity Libertarianism to liberaltarianism, another offshoot of libertarianism.  On most substantive issues, the liberaltarians might be very close to State Capacity Libertarians.  But emphasis and focus really matter, and I would offer this (partial) list of differences:

a. The liberaltarian starts by assuring “the left” that they favor lots of government transfer programs.  The State Capacity Libertarian recognizes that demands of mercy are never ending, that economic growth can benefit people more than transfers, and, within the governmental sphere, it is willing to emphasize an analytical, “cold-hearted” comparison between government discretionary spending and transfer spending.  Discretionary spending might well win out at many margins.

b. The “polarizing Left” is explicitly opposed to a lot of capitalism, and de facto standing in opposition to state capacity, due to the polarization, which tends to thwart problem-solving.  The polarizing Left is thus a bigger villain for State Capacity Libertarianism than it is for liberaltarianism.  For the liberaltarians, temporary alliances with the polarizing Left are possible because both oppose Trump and other bad elements of the right wing.  It is easy — maybe too easy — to market liberaltarianism to the Left as a critique and revision of libertarians and conservatives.

c. Liberaltarian Will Wilkinson made the mistake of expressing enthusiasm for Elizabeth Warren.  It is hard to imagine a State Capacity Libertarian making this same mistake, since so much of Warren’s energy is directed toward tearing down American business.  Ban fracking? Really?  Send money to Russia, Saudi Arabia, lose American jobs, and make climate change worse, all at the same time?  Nope.

d. State Capacity Libertarianism is more likely to make a mistake of say endorsing high-speed rail from LA to SF (if indeed that is a mistake), and decrying the ability of U.S. governments to get such a thing done.  “Which mistakes they are most likely to commit” is an underrated way of assessing political philosophies.

You will note the influence of Peter Thiel on State Capacity Libertarianism, though I have never heard him frame the issues in this way.

Furthermore, “which ideas survive well in internet debate” has been an important filter on the evolution of the doctrine.  That point is under-discussed, for all sorts of issues, and it may get a blog post of its own.

Here is my earlier essay on the paradox of libertarianism, relevant for background.

Happy New Year everyone!

Which researchers really work long hours?

No, not work smart but put in what would appear to be lots of extra hours.  Why not measure who submits papers to journals in the off-work hours?:

Main outcome measures: Manuscript and peer review submissions on weekends, on national holidays, and by hour of day (to determine early mornings and late nights). Logistic regression was used to estimate the probability of manuscript and peer review submissions on weekends or holidays.

Results: The analyses included more than 49 000 manuscript submissions and 76 000 peer reviews. Little change over time was seen in the average probability of manuscript or peer review submissions occurring on weekends or holidays. The levels of out of hours work were high, with average probabilities of 0.14 to 0.18 for work on the weekends and 0.08 to 0.13 for work on holidays compared with days in the same week. Clear and consistent differences were seen between countries. Chinese researchers most often worked at weekends and at midnight, whereas researchers in Scandinavian countries were among the most likely to submit during the week and the middle of the day.

Emphasis added.  Get this, you lazy bastards:

The average probability of a manuscript being submitted at the weekend for both journals was 0.14, and for a peer review it was 0.18. Peer review submissions during holidays had average probabilities of 0.13 (The BMJ) and 0.12 (BMJ Open), which were higher than the probabilities for manuscripts of 0.08 (The BMJ) and 0.10 (BMJ Open).

For weekend paper submission, China appears to be at about 0.22, India at about 0.09, see Figure 1.  France, Italy, Spain, and Brazil all submit quite late in the afternoon, often a bit after 6 p.m.

That is from a new paper by Adrian Barnett, Inger Mewburn, and Sara Schroter.  They do not tell us when they submitted it, but I wrote this blog post a wee bit after 8 p.m.

Via Michelle Dawson.

Comparing meta-analyses and preregistered multiple-laboratory replication projects

Many researchers rely on meta-analysis to summarize research evidence. However, there is a concern that publication bias and selective reporting may lead to biased meta-analytic effect sizes. We compare the results of meta-analyses to large-scale preregistered replications in psychology carried out at multiple laboratories. The multiple-laboratory replications provide precisely estimated effect sizes that do not suffer from publication bias or selective reporting. We searched the literature and identified 15 meta-analyses on the same topics as multiple-laboratory replications. We find that meta-analytic effect sizes are significantly different from replication effect sizes for 12 out of the 15 meta-replication pairs. These differences are systematic and, on average, meta-analytic effect sizes are almost three times as large as replication effect sizes. We also implement three methods of correcting meta-analysis for bias, but these methods do not substantively improve the meta-analytic results.

That is from a new article in Nature Human Behaviour by Amanda Kvarven, Eirik Strømland, and Magnus Johannesson.

Charles Murray’s *Human Diversity*

His new book is coming out in January, and the subtitle is The Biology of Gender, Race, and Class. I will get to the details shortly, but my bottom-line review is “Not as controversial as you might think,” but do note the normalization at the end of that phrase.

Here is one bit from p.294 toward the end of the book:

Nothing we are going to learn will diminish our common humanity.  Nothing we learn will justify rank-ordering human groups from superior to inferior — the bundles of qualities that make us human are far too complicated for that.  Nothing we learn will lend itself to genetic determinism.  We live our lives with an abundance of unpredictability, both genetic and environmental.

Most of the book defends ten key propositions, laid out on pp.7-8.  The first four of those propositions concern differences between men and women (“Sex differences in personality are consistent worldwide…”) and I do not find those controversial, so I will not cover them.  The chapters on those propositions provide a good survey of the evidence, and a good answer to the denialists, though I doubt if Murray is the right person to win them over.  Let’s now turn to the other propositions, with my commentary along the way:

5. Human populations are genetically distinctive in ways that correspond to self-identified race and ethnicity.

True, but Murray’s analysis did not push me beyond the usual citations of lactose intolerance, sickle cell anemia, adaptation to high altitudes, and the like.  That said, pp.190-195 offer a very dense discussion of target alleles for various traits, such as schizophrenia, and how those target alleles vary across different groups.  I found those pages difficult to follow, and also wished that discussion had been fifty pages rather than five.  Toward the end of that discussion, Murray does write (p.194): “…proof of the role of natural selection for many genetic differences will remain unobservable without methodological breakthroughs.”  With that I definitely agree.

On p.195 he adds “It is implausible to expect that none of the imbalances will yield evidence of significant genetic differences related to phenotypic differences across continental populations.”  That returns to my core point about this book not shifting my priors.  You could agree with that sentence (noting the ambiguity in the word “significant”) and still have a quite modest vision of what those differences might mean.  In any case, nothing in the book pushes me beyond that sentence in the direction of the geneticists.

And here the contrast with the chapters on men and women becomes (unintentionally?) glaring: those biological differences are relatively easy to demonstrate, so perhaps hard-to-demonstrate biological differences are not so significant.  That too is just a conjecture, but there are multiple ways to play the “absence of evidence” and “how to interpret the residuals” cards, and I wish those had received a more extensive philosophy of science-like discussion.

Now let’s move to the next proposition:

6. Evolutionary selection pressure since humans left Africa has been extensive and mostly local.

That one strikes me as a miswording or misstatement, though I do not see that it corresponds to any actual mistakes in the broader text.  You might think that general, non-local evolutionary selection for all humans has been quite large over the millennia, relative to local selection.  I genuinely do not know the ratio here, but Murray does not seem to address the actual comparison of “across all human groups” vs. “local” as loci of selection pressures.

Next up:

7. Continental population differences in variants associated with personality, abilities, and social behavior are common.

Clearly true, but note this proposition does not claim biological roots for those differences.  The real question comes in the next proposition:

8. The shared environment usually plays a minor role in explaining personalities, abilities, and social behavior.

Here I have what I think is a major disagreement with Murray.  If he means the term “shared environment” in the narrow sense used by say twin studies, he is probably correct.  But in the more literal, Webster-derived conception of “shared environment” I very much disagree.  Culture is a truly major shaper of our personalities, abilities, and social behavior, and self-evidently so. For my taste the book did not contain nearly enough discussion of culture and in fact there is virtually no discussion of the concept or its power, as a look at the index will verify.  The real lesson of “twins studies plus anthropology” is that you have to control almost all of a person’s environment to have a major impact, but a major impact indeed can be had.  I behave very differently than my Irish potato famine ancestors, and not because I am genetically 1/8 from the Madeira Islands.  That said, within the narrower range of environmental variation measured in twins studies…well those studies seem to be fairly accurate.

9. Class structure is importantly based on differences in abilities that have a substantial genetic component.

Correct as stated, but I see those differences as much less genetic than Murray does.  For instance, IQ is to some extent heritable, but how much does that shape economic outcomes?  It is worth turning to Murray’s discussion on p.232 and the associated footnote 17 (pp.428-429).  His main source is what is to me a flawed meta-study on IQ and job performance (Murray to his credit does also cite the best-known critique of such studies).  I would opt more directly for the labor market literature on IQ and individual earnings, based on actual measured wages, which shows fairly modest correlations between IQ and earnings (read here, here and here).  So, at the very least, the inherited IQ-based permanent stratification version of The Bell Curve argument is much more compelling to Murray than it is to me.

10. Outside interventions are inherently constrained in the effects they can have on personality, abilities, and social behavior.

Clearly this is literally true, if only because of the meaning of “constrained.”  But mostly I would repeat my remarks on culture from #8.  Cultures change, and over time they are likely to change a great deal.  For instance, early in the 20th century, Korea, Japan, and China often were described as low work ethic cultures.  As cultures change, in turn those cultures can shape the personalities, abilities, and social behaviors of subsequent generations, in significant ways albeit constrained.  So while Murray is correct as stated, I believe I would disagree with his intended substantive point about the weight of relative forces.

Overall this is a serious and well-written book that presents a great deal of scientific evidence very effectively.  Anyone reading it will learn a lot.  But it didn’t change my mind on much, least of all the most controversial questions in this area.  If anything, in the Bayesian sense it probably nudged me away from geneticist-based arguments, simply because it did not push me any further towards them.

Murray of course will write the book he wants to, but my personal wish list was two-fold: a) a book leaving most of the normal science behind, and focusing only on the uncertain and controversial frontier issues, in great detail, and b) much more discussion of the import of culture.

Most of all, I am happy that America’s culture of achievement is inducing Murray to continue to produce major works at the age of 76, soon to be 77.

You can pre-order here.

Work on these things

Here are some projects I’d like to see funded, some through my own ventures, or others through alternative mechanisms. On these issues, the right person could have an enormous impact, whether through the research side or directly coming up with actionable ideas, including of course creating and building companies.

More studies of super-effective people. Either individually or collectively. If you take the outliers in any domain, what should our intuitions be for understanding the underlying processes determining how many people could have ended up in those positions? How many people had the right genes but had the wrong upbringing? How many people had the right genes and the right upbringing but the wrong luck, or perhaps society failed them in some other manner? The answers to these questions have significant policy implications.

A comprehensive analysis and critique of the NIH and NSF. The US funds more science research than any other country — about $35 billion per year on the NIH and $8 billion per year on the NSF. How exactly do these institutions work? How have they changed over time and have these changes been for good or bad? Based on what we now know, how might we better structure the NIH and NSF? What experiments should we run or what kind of studies should we perform?

Why is life expectancy so long in Hong Kong? Life expectancy in Hong Kong is 84.23 years, more than five years longer than the US and the highest in the world. Hong Kong is not that wealthy (median household income is $38,000 USD); it’s somewhat polluted; people don’t obviously eat what seems like a healthy diet; and they don’t seem to exercise a great deal. What should we learn from this?

Bloomberg Terminal for everything. This might be a nonprofit, a company, or a government project. To state the obvious, many analyses hinge on having the right data. If you’re in finance, getting the right data is often easy: just pull it up on your Bloomberg terminal. But there is no practical way to ask “what most correlates with life expectancy in Hong Kong?” (See above on that topic.) Figure out a way to build a growing corpus of structured data across the broadest variety of domains.

A comprehensive guide to the American healthcare system. The American healthcare system is by far the world’s biggest and also by a considerable margin the world’s most influential. Yet there is no comprehensive, dispassionate, and analytical disaggregation of how it all works. Who are the actors and what are their incentives? To the degree that the relationships between different entities are in equilibrium, what are the forces ensuring they stay there? What is the Sankey diagram of fund flows within the U.S. healthcare system?

Better answers for how to quantify worker productivity. In most knowledge industries, companies have nothing better than highly subjective measures (i.e., supervisors’ assessments) of worker productivity. In theory, it seems significant improvements should be possible. In the short term, is it possible to measure the productivity or efficacy of individual managers, software engineers, educators, scientists? How about teams, and what size of team? And can we do so without creating Goodhart’s Law problems?

What should Widodo do? Indonesia is a large, populous middle-income country. It faces no major near-term security threats. It has a small manufacturing base and no major non-commodity export sectors. What is the best non-bureaucratic 10-page economic development briefing document and set of prescriptions that one could write for Indonesia’s president? For Indonesia, substitute the Philippines, Chile, or Morocco.

A comparative study of foundations and their efficacy. Philanthropic foundations are behind a lot of important work. But how does a foundation decide what it wants and how the resulting grants should be structured? How effective are the programs of that foundation? In practice, how have its institutional mechanisms evolved? Imagine some kind of resource that answered these questions for the major American foundations.

Institutional critiques. More broadly, there is no discipline of institutional criticism. There is a very rich literature of policy criticism in economics, journalism, and non-fiction books. There is also a rich literature of “corporate criticism”: there are thousands of articles about how Facebook (budget: $20 billion) works and how it might be good or bad. But there is relatively little analysis of the most important institutions in our society: government departments. How is the Department of Agriculture (budget: $150 billion) organized and how effective or not is it? How about the Department of Energy (budget: $32 billion)? And why are not those questions paramount in the minds of policymakers?

Cultures of excellence. If you ask informed Filipinos why the street food is mediocre, they will tell you that the Philippines lacks a “culture of excellence”. It seems that some kind of “culture of doing things really well” has very persistent and generalizable effects. South Korea and Japan have developed much more rapidly than many Asian countries, despite many others adopting relatively free “Washington Consensus”-style trade policies. Russia still has higher GDP per capita than Mexico despite Mexico’s economic policies having been much better than Russia’s for many, many decades at this point. How should we think about cultures of excellence?

Regeneration at the government layer. Herbert Kaufman (unsurprisingly) concludes in an empirical study that government organizations don’t die. While we might all agree that this is a problem, actionable solutions are in short supply. What can or should we do about this?

IQ paradox. Ron Unz points out that intergenerational variation of IQ may be much higher than is often assumed, citing Ireland and Croatia as examples. For instance, not long ago Ireland had sub-par measured IQ and now that figure is much higher, following growth and prosperity. The policy implications of IQ disparities across nations may therefore be different to what might otherwise obviously follow: perhaps environment matters much more than is assumed. If so, what should we be doing more or less of?

Credible plans for new top-tier universities. 7 of the best 25 universities in the world (Times ranking) were started in the US between 1861 and 1891 by ambitious reformers. It’s probably harder in many ways to start an impactful new university today… but it’s likely not impossible and the returns to doing so successfully might be very high. What might be a good plan? Why have so few of these plans come to fruition? 

Summaries of the state of knowledge in different fields. As a general matter, a lot of oral knowledge in the world is still not readily available, and reflection on this fact might lead one in many interesting directions. One obvious application is helping people more readily understand the present state of affairs in different domains. If I want to know “how we’re doing” in, say, antiviral drug development, I could spend a few hours hunting for top researchers, email a few, and perhaps get on calls to obtain their candid assessments. Are we making good progress? What are the most important open problems? What’s holding things back? And so on. How can we make all of this knowledge publicly available across all fields?

Mechanisms for better matching. One of the single interventions that could do the most to improve global welfare would be to improve the efficiency of the partner/marriage matching ecosystem. Online dating demonstrates that significant change (and maybe even improvement?) is possible, with some figures suggesting that up to two thirds of relationships in the US may now be initiated through online dating services. Accomplished people often seem to struggle with this challenge. Good solutions would be important.

What should Durkan do? Jenny Durkan is the current mayor of Seattle. As cities become more important loci of economic activity in the world, the importance of effective city governance will increase. As with the Widodo challenge, what is the best 10-page briefing document and set of prescriptions that one could write for her? What about Baltimore and St. Louis?

My Conversation with Esther Duflo

Self-recommending if there ever was such a thing, here is the audio and transcript.  In addition to all of the expected topics, including gender in the economics profession, we even got around to Indian classical music and Bach cantatas (she prefers the latter).  Excerpt:

COWEN: Do you worry much that the RCT method — it centralizes authority in too few institutions? You need a certain amount of money. You need some managerial ability. You need connections abroad. It’s not like running regressions — everyone can do it on their PC. Is that, in some way, going to slow down science? You get more reliable results, but there’s much less competition of ideas, it seems.

DUFLO: I think it would be the case if we had not been mindful of this problem from the beginning. And it might still be the case to some extent. But I actually think that we’ve put a lot of effort in avoiding it to be the case.

When you take an organization like J-PAL, just in India we have 200 staff members. And we have, at any given time, 1,000 people running surveys. I say we, but these people are not running my project. These people are running the projects of dozens and dozens of researchers. When I started, I couldn’t have started without having the backing of my team because it was such a risky proposition that you needed to be able to easy risk capital kind of things.

But at this point, because of the infrastructure, it’s much more normal sense. People can get in with no funding of their own, in part because one of the things we are doing as a network is raising a lot of money to redistribute to other people widely. J-PAL has 400 researchers that are affiliated to it, or invited researchers, many of them quite, quite junior.

So that sort of mixture — it was very important to us, and I think we’ve been quite successful at making the tool marginally available. It’s never going to be like running a regression from your computer. But my philosophy is that if you have the drive and you’re willing to put in your own sweat equity, you can do it. And our students and many other students who are not at top institutions are doing it.


COWEN: On the internet, there’s a photo of a teenage Esther Duflo — at least it looks like you — protesting against fascism in Russia on top of a tank, is it?

DUFLO: That was a bus, and it was me. It was me. So that was in 1991. This was not when I lived for one year there. I lived one year in ’93–’94. But this was in ’91. I had gone to Russia about every year since I was a teen to learn Russian. I happened to be there the summer where there was this putsch against Gorbachev. That summer…

And someone gave me that fashizm ne poletit placard and asked me to hold it. And I’m like, “Sure, I’m going to hold it.” So I’m holding my placard. We stayed there for a long time when things were happening. Next time I saw in the evening, my parents called me, “What are you doing?” Because it turned out that that image was on all the TVs in the world. [laughs] And that’s how I very briefly became the face of this revolution.


COWEN: Does child-rearing in France strike you as more sensible than child-rearing in the United States?

DUFLO: Oh very much so, very much so.

COWEN: And why?

DUFLO: You know that book, Bringing Up Bébé?


DUFLO: I think she picked up on something which rings so true to me, which maybe is a marginal point about the US versus France. In France people are reasonably content to just go with the flow and do what everybody does. Every kid eats the same thing at 4:30, has dinner at the same time, has gone through the same experiences, learned the same songs, and everybody thinks they are totally free. But in fact, they are all on this pretty sensible railroad. And also, they don’t agonize about it.

In the US, child-rearing is one more occasion to make a statement about your identity. You’re the kind of mother that carries the baby, or you’re the kind of mother that puts the baby in a stroller. And somehow it almost can predict what you’re going to think about Donald Trump. That’s crazy. Some people are so concerned about what they do. Not only they feel that they have to invest a ton in their children, and they feel inadequate if they are not able to, but also, exactly what they do creates them as people.

In France that’s not there, and I think that makes everybody so much more laid back, children and adults.

Recommended throughout.

The gender gap in confidence, revisited — gender differences in research reporting

His team analyzed more than 100,000 medical studies and 6.2 million life sciences articles that were published over a 15-year period, finding that women-authored studies were 12 percent less likely to contain at least one of a group of 25 positive terms, including “favorable,” “excellent” and “prominent.” In the most prestigious and influential journals, women were 21 percent less likely to describe their findings with such words.

Male authors deployed the word “novel” 60 percent more often than their female counterparts. “Unique” was used 44 percent more often by male authors, and “promising” was used 72 percent more often by male authors.

Here is the article, here is the unique study itself.

We need more indices

That is the upshot of my latest Bloomberg column, as the Doing Business index, PISA scores, and the Corruption Perceptions Index have been highly influential.  Here are a few of my further requests:

These successes raise a question: Which other indexes might be useful? Think of the suggestions that follow as a kind of Christmas wish list.

How about a loneliness index? David Brooks has argued that America faces a crisis of loneliness, making us unhappy and impoverishing us spiritually. I find these claims plausible, especially since the median U.S. household size has been shrinking. Still, just how bad is this problem? One recent study found that American loneliness has not been rising lately, and that loneliness increases only after people reach their early 70s…

A stress index for Americans is another related idea: Just how much do our lives focus our attention on our worries rather than on our joys and hopeful expectations?

There are less emotional concerns as well. How about an infrastructure speed index? I worry about bureaucratization and the slow pace of building important public works. Construction on Manhattan’s Second Avenue subway line, for example, started in 1972, paused, resumed in 2004, and was finally completed (the first phase, anyway) in 2017. In contrast, construction of the core New York City subway system, with 28 stations, began in 1900 and finished in 1904. Similarly, construction of the Empire State Building took only 410 days.

Why do so many U.S. infrastructure projects today take so long? And if the process of improving and reshaping the environment to further human progress is now so much slower, doesn’t it make sense to try to measure this decline for the purpose of eventual improvement? Given the need for a greener energy infrastructure, this is a matter of the utmost urgency.

Speaking of energy infrastructure, how about a severity index for climate change and associated problems?

There are further noteworthy suggestions at the link.  Which indices do you wish for?
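As a rough illustration of what an infrastructure speed index might start from, here is a minimal Python sketch that uses only the dates quoted in the excerpt above. It measures elapsed calendar time from first construction start to completion, pauses included; that definition and the project labels are assumptions for illustration, not something the column specifies.

```python
# Naive "elapsed time" measure built only from the dates quoted in the excerpt.
# The definition (first start to completion, pauses included) is an assumption
# made for illustration; a real index would need to normalize for project scale.

DAYS_PER_YEAR = 365.25

# (start year, completion year) as quoted in the excerpt
projects = {
    "Second Avenue subway (phase 1)": (1972, 2017),
    "Core NYC subway (28 stations)": (1900, 1904),
}

elapsed_years = {name: end - start for name, (start, end) in projects.items()}

# The Empire State Building is quoted in days rather than years.
elapsed_years["Empire State Building"] = 410 / DAYS_PER_YEAR

for name, years in sorted(elapsed_years.items(), key=lambda item: item[1]):
    print(f"{name}: {years:.1f} years")

# How many times longer the Second Avenue line took than the core system.
slowdown_ratio = (
    elapsed_years["Second Avenue subway (phase 1)"]
    / elapsed_years["Core NYC subway (28 stations)"]
)
```

Elapsed time alone says nothing about project size or complexity, so this is at most a first column in such an index.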

Are we undermeasuring productivity gains from the internet? part I

From my new paper with Ben Southwood on whether the rate of scientific progress is slowing down:

Third, we shouldn’t expect mismeasured GDP simply from the fact that the internet makes many goods and services cheaper. Spotify provides access to a huge range of music, and very cheaply, such that consumers can listen in a year to albums that would have cost them tens of thousands of dollars in the CD or vinyl eras. Yet this won’t lead to mismeasured GDP. For one thing, the GDP deflator already tries to capture these effects. But even if those efforts are imperfect, consider the broader economic interrelations. To the extent consumers save money on music, they have more to spend or invest elsewhere, and those alternative choices will indeed be captured by GDP. Another alternative (which does not seem to hold for music) is that the lower prices will increase the total amount of money spent on recorded music, which would mean a boost in recorded GDP for the music sector alone. Yet another alternative, more plausible, is that many artists give away their music on Spotify and YouTube to boost the demand for their live performances, and the increase in GDP shows up there. No matter how you slice the cake, cheaper goods and services should not in general lower measured GDP in a way that will generate significant mismeasurement.

Moving to the more formal studies, the Federal Reserve’s David Byrne, with Fed and IMF colleagues, finds a productivity adjustment worth only a few basis points when attempting to account for the gains from cheaper internet-age and internet-enabled products. Work by Erik Brynjolfsson and Joo Hee Oh studies the allocation of time, and finds that people are valuing free Internet services at about $106 billion a year. That’s well under one percent of GDP, and it is not nearly large enough to close the measured productivity gap. A study by Nakamura, Samuels, and Soloveichik measures the value of free media on the internet, and concludes it is a small fraction of GDP, for instance 0.005% of measured nominal GDP growth between 1998 and 2012.

Economist Chad Syverson probably has done the most to deflate the idea of major unmeasured productivity gains through internet technologies. For instance, countries with much smaller tech sectors than the United States usually have had comparably sized productivity slowdowns. That suggests the problem is quite general, and not belied by unmeasured productivity gains. Furthermore, and perhaps more importantly, the productivity slowdown is quite large in scale, compared to the size of the tech sector. Using a conservative estimate, the productivity slowdown implies a cumulative loss of $2.7 trillion in GDP since the end of 2004; in other words, output would have been that much higher had the earlier rate of productivity growth been maintained. If unmeasured gains are to make up for that difference, those gains would have to be very large. For instance, consumer surplus would have to be five times higher in IT-related sectors than elsewhere in the economy, which seems implausibly large.

You can find footnotes and references in the original.  Here is my earlier post on the paper.
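The "well under one percent of GDP" claim in the excerpt is easy to check with back-of-the-envelope arithmetic. The short Python sketch below does so; the $106 billion figure comes from the excerpt, while the GDP level is a round number assumed here for illustration (roughly the order of magnitude of US nominal GDP in the 2010s), not a figure from the paper.

```python
def share_of_gdp(value_billion: float, gdp_billion: float) -> float:
    """Return a dollar value as a percentage of GDP (both in billions)."""
    return 100.0 * value_billion / gdp_billion


# From the excerpt: estimated annual value of free internet services.
FREE_SERVICES_BILLION = 106.0

# ASSUMPTION: a round US nominal GDP figure, not taken from the paper.
ASSUMED_GDP_BILLION = 18_000.0

share = share_of_gdp(FREE_SERVICES_BILLION, ASSUMED_GDP_BILLION)
print(f"Free internet services: about {share:.2f}% of assumed GDP")
```

Any plausible GDP figure for the period gives the same qualitative answer: the share stays below one percent unless GDP is assumed to be under $10.6 trillion.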