Category: Data Source

Why isn’t Sweden exploding?

…Swedish state epidemiologist Anders Tegnell remains calm: he is not seeing the kind of rapid increase that might threaten to overwhelm the Swedish health service, and unlike policymakers in the UK, he has been entirely consistent that that is his main objective.

That is from a new piece by Freddie Sayers, asserting that “the jury is still out” when it comes to Sweden.  I cannot reproduce all of the graphs in that piece, but scroll through and please note that in terms of per capita deaths Sweden seems to be doing better than Belgium, France, or the United Kingdom, all of which have serious lockdowns (Sweden does not).  If you measure extant trends, Sweden is in the middle of the pack for Europe.  And here is data on new hospital admissions:

Now I understand that ideally one should compare similar “time cohorts” across countries, not absolute numbers or percentages.  That point is logically impeccable, but still as the clock ticks it seems less likely to account for the Swedish anomaly.

Of course we still need more days and weeks of data.

To be clear, I am not saying the United States can or should copy Sweden.  Sweden has an especially large percentage of people living alone, the Swedes are probably much better at complying with informal norms for social distancing, and obesity is much less of a problem in Sweden than America, probably hypertension too.

But I’d like to ask a simple question: who predicted this and who did not?  And which of our priors should this cause us to update?

I fully recognize it is possible and maybe even likely that Sweden ends up being like Japan, in the sense of having a period when things seem (relatively) fine and then discovering they are not.  (Even in Singapore the second wave has arrived, from in-migration, and may well be worse than the first.)  But surely the chance of that scenario has gone down just a little?

And here is a new study on Lombardy by Daniil Gorbatenko:

The data clearly suggest that the spread had been trending down significantly even before the initial lockdown. They invalidate the fundamental assumption of the Covid-19 epidemiological models and with it, probably also the rationale for the harshest measures of suppression.

One possibility (and I stress that word possibility) is that these Lombardy data, shown at the link, are reflecting the importance of potent “early spreaders,” often family members, who give Covid-19 to their families fairly quickly, but after which the average rate of spread falls rapidly.

I’ll stand by my claim that the pieces on this one show an increasing probability of not really adding up.  In the meantime, I am very happy to pull out and signal boost the best criticisms of these results.

The Geographic Spread of COVID-19 Correlates with Structure of Social Networks as Measured by Facebook

There is a new NBER working paper (by economists) on Covid-19:

We use anonymized and aggregated data from Facebook to show that areas with stronger social ties to two early COVID-19 “hotspots” (Westchester County, NY, in the U.S. and Lodi province in Italy) generally have more confirmed COVID-19 cases as of March 30, 2020. These relationships hold after controlling for geographic distance to the hotspots as well as for the income and population density of the regions. These results suggest that data from online social networks may prove useful to epidemiologists and others hoping to forecast the spread of communicable diseases such as COVID-19.

That is by Theresa Kuchler, Dominic Russell, and Johannes Stroebel.

Policy for Covid-19, from Policy.NZ

Chris McIntyre from Wellington emails me:

“Pleased to share that COVID-19 Policy Watch is now live at [links fixed]. We currently cover federal policies for 12 countries, including the US, UK, and China, with 15 more underway. We expect to expand our network of publishing partners from three to five over the coming days. All feedback is most welcome.

A blurb for MR readers to edit as you see fit:

We aim for COVID-19 Policy Watch to be the most accessible source for governments’ policy responses to COVID-19 so that researchers, policymakers, journalists, and the general public can quickly learn about and compare governments’ responses. If you think this is important, we’d love your help: we’re seeking publishing partners (news media, universities, research groups) to keep country policies up to date, and experienced front- and back-end Drupal devs to help build new features. Interested parties should email [email protected].

TC again: I am pleased to announce that Policy.NZ is a new Emergent Ventures winner (not Fast Grants with its biomedical orientation, rather “classic” Emergent Ventures).

An econometrician on the SEIRD epidemiological model for Covid-19

There is a new paper by Ivan Korolev:

This paper studies the SEIRD epidemic model for COVID-19. First, I show that the model is poorly identified from the observed number of deaths and confirmed cases. There are many sets of parameters that are observationally equivalent in the short run but lead to markedly different long run forecasts. Second, I demonstrate using the data from Iceland that auxiliary information from random tests can be used to calibrate the initial parameters of the model and reduce the range of possible forecasts about the future number of deaths. Finally, I show that the basic reproduction number R0 can be identified from the data, conditional on the clinical parameters. I then estimate it for the US and several other countries, allowing for possible underreporting of the number of cases. The resulting estimates of R0 are heterogeneous across countries: they are 2-3 times higher for Western countries than for Asian countries. I demonstrate that if one fails to take underreporting into account and estimates R0 from the cases data, the resulting estimate of R0 will be biased downward and the model will fail to fit the observed data.

Here is the full paper.  And here is Ivan’s brief supplemental note on CFR.  (By the way, here is a new and related Andrew Atkeson paper on estimating the fatality rate.)
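For readers who want to see the moving parts, here is a minimal sketch of a textbook SEIRD model, integrated with simple Euler steps. It is not Korolev's code or specification: the function names, the fixed-population force of infection, and every parameter value below are illustrative assumptions. It is only meant to show which parameters (transmission, incubation, recovery, death) have to be pinned down before the deaths curve says anything about the long run.

```python
def basic_reproduction_number(beta, gamma, mu):
    """R0 for this SEIRD variant: new infections caused per infectious spell."""
    return beta / (gamma + mu)


def simulate_seird(beta, sigma, gamma, mu, n, e0, days, dt=0.1):
    """Euler-integrate a basic SEIRD model; return one (S, E, I, R, D) tuple per day.

    beta: transmission rate, sigma: 1 / incubation period,
    gamma: recovery rate, mu: death rate among the infectious.
    """
    s, e, i, r, d = n - e0, float(e0), 0.0, 0.0, 0.0
    path = [(s, e, i, r, d)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt       # E -> I
            new_recovered = gamma * i * dt        # I -> R
            new_dead = mu * i * dt                # I -> D
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        path.append((s, e, i, r, d))
    return path


# Illustrative values only (assumptions, not estimates from the paper).
path = simulate_seird(beta=0.5, sigma=0.2, gamma=0.1, mu=0.001,
                      n=1_000_000, e0=10, days=100)
deaths_by_day = [state[4] for state in path]
print(f"R0 = {basic_reproduction_number(0.5, 0.1, 0.001):.2f}")
print(f"deaths after 30 days: {deaths_by_day[30]:.1f}; "
      f"after 100 days: {deaths_by_day[100]:.1f}")
```

One way to get a feel for the identification problem in the abstract is to vary beta, gamma, and mu together and compare how similar the first few weeks of deaths look against how different the later path becomes.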

And here is a further paper on the IHME model, by statisticians from CTDS, Northwestern University and the University of Texas; here is an excerpt from the opener:

  • In excess of 70% of US states had actual death rates falling outside the 95% prediction interval for that state, (see Figure 1)
  • The ability of the model to make accurate predictions decreases with increasing amount of data. (figure 2)

Again, I am very happy to present counter evidence to these arguments.  I readily admit this is outside my area of expertise, but I have read through the paper and it is not much more than a few pages of recording numbers and comparing them to the actual outcomes (you will note the model predicts New York fairly well, and thus the predictions are of a “train wreck” nature).
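The first finding is a coverage check, and the arithmetic behind it is simple enough to sketch. The states, death counts, and intervals below are made up for illustration (they are not from the paper), and the binomial calculation assumes states miss independently, which is surely too strong since errors across states are correlated. Still, it gives a sense of how far a 70% miss rate is from what a well-calibrated 95% interval would produce.

```python
from math import comb


def share_outside(actuals, intervals):
    """Fraction of observations that fall outside their (lower, upper) interval."""
    misses = sum(1 for y, (lo, hi) in zip(actuals, intervals) if not lo <= y <= hi)
    return misses / len(actuals)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Hypothetical daily deaths and 95% prediction intervals for five made-up states.
actuals = [12, 40, 7, 95, 3]
intervals = [(5, 10), (30, 50), (8, 20), (20, 60), (1, 6)]
print(share_outside(actuals, intervals))  # 0.6: three of the five miss

# If intervals really had 95% coverage and states were independent (an assumption),
# how likely is it that at least 35 of 50 states (70%) land outside?
print(binomial_tail(50, 35, 0.05))  # vanishingly small
```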

Let me just repeat the two central findings:

  • In excess of 70% of US states had actual death rates falling outside the 95% prediction interval for that state, (see Figure 1)
  • The ability of the model to make accurate predictions decreases with increasing amount of data. (figure 2)

So now really is the time to be asking tough questions about epidemiology, and yes, epidemiologists.  I would very gladly publish and “signal boost” the best positive response possible.

And just to be clear (again), I fully support current lockdown efforts (best choice until we have more data and also a better theory), I don’t want Fauci to be fired, and I don’t think economists are necessarily better forecasters.  I do feel I am not getting straight answers.

Epidemiology and selection problems and further heterogeneities

Richard Lowery emails me this:

I saw your post about epidemiologists today.  I have a concern similar to point 4 about selection based on what I have seen being used for policy in Austin.  It looks to me like the models being used for projection calibrate R_0 off of the initial doubling rate of the outbreak in an area.  But, if people who are most likely to spread to a large number of people are also more likely to get infected early in an outbreak, you end up with what looks kind of like a classic Heckman selection problem, right? In any observable group, there is going to be an unobserved distribution of contact frequency, and it would seem potentially first order to account for that.

As far as I can tell, if this criticism holds, the models are going to (1) be biased upward, predicting a far higher peak in the absence of policy intervention and (2) overstate the likely severity of an outcome without policy intervention, while potentially understating the value of aggressive containment measures.  The epidemiology models I have seen look really pessimistic, and they seem like they can only justify any intervention by arguing that the health sector will be overwhelmed, which now appears unlikely in a lot of places.  The Austin report did a trick of cutting off the time axis to hide that total infections do not seem to change that much under the different social distancing policies; everything just gets dragged out.

But, if the selection concern is right, the pessimism might be misplaced if the late epidemic R_0 is lower, potentially leading to a much lower effective spread rate and the possibility of killing the thing off at some point before it infects the number of people the models predict is required to create that level of immunity.  This seems feasible based on South Korea and maybe China, at least for areas in the US that are not already out of control.

I do not know the answers to the questions raised here, but I do see the debate on Twitter becoming more partisan, more emotional, and less substantive.  You cannot say that about this communication.  From the MR comments this one — from Kronrad — struck me as significant:

One thing both economists and epidemiologists seem to be lacking is an awareness for the problems of aggregation. Most models in both fields see the population as one homogenous mass of individuals. But sometimes, individual variation makes a difference in the aggregate, even if the average is the same.

In the case of pandemics, it makes a big difference how that infection rate varies in the population. Most models assume that it is the same for everyone. But in reality, human interactions are not evenly distributed. Some people shake hands all day, while others spend their days mostly alone in front of a screen. This uneven distribution has an interesting effect: those who spread the virus the most are also the most likely to get it. This means that the infection rate looks much higher at the beginning of a pandemic, but sinks once the super spreaders have had the disease and gained immunity. Also, it means herd immunity is reached much earlier: not after 70% of the population is immune, but after people who are involved in 70% of all human interactions are immune. On average, this is the same. But in practice, it can make a big difference.

I did a small simulation on this and came to the conclusion that with a recursively applied Pareto distribution, where 1/3 of all people are responsible for 2/3 of all human interaction, herd immunity is already reached when 10% of the population has had the virus. So individual variation in the infection rate can make an enormous difference that is not captured in aggregate models.

My quick and dirty simulation can be found here:
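Kronrad's own simulation is not reproduced here. As a stand-in, the following is a minimal sketch of the same idea: a multi-group SIR model in which groups differ only in contact rates and mix in proportion to their activity. The function name, the two-group split, R0 = 2.5, and the other parameter values are all assumptions made for illustration. It applies the 1/3-to-2/3 split once rather than recursively, so it should not be expected to reproduce the 10% figure.

```python
def herd_immunity_share(shares, activity, r0, gamma=0.1, dt=0.05, seed=1e-6):
    """Share of the population no longer susceptible when the effective
    reproduction number first drops below 1.

    shares: population share of each group (should sum to 1).
    activity: relative contact rate of each group.
    Groups mix in proportion to their activity (a modeling assumption).
    """
    mean_sq = sum(p * a * a for p, a in zip(shares, activity))
    # Scale transmission so that every scenario starts from the same R0.
    beta = r0 * gamma / mean_sq
    s = [p * (1 - seed) for p in shares]
    i = [p * seed for p in shares]
    while True:
        r_eff = beta / gamma * sum(a * a * sk for a, sk in zip(activity, s))
        if r_eff < 1:
            return 1 - sum(s)
        # Infection pressure is weighted by how active the infected are.
        pressure = beta * sum(a * ik for a, ik in zip(activity, i))
        for k, a in enumerate(activity):
            new_inf = a * s[k] * pressure * dt
            s[k] -= new_inf
            i[k] += new_inf - gamma * i[k] * dt


# Everyone identical: should land near the textbook 1 - 1/R0.
homogeneous = herd_immunity_share([1.0], [1.0], r0=2.5)

# One third of people account for two thirds of contacts (activity 2.0 vs 0.5,
# so average activity is still 1). Same R0 as above.
two_group = herd_immunity_share([1 / 3, 2 / 3], [2.0, 0.5], r0=2.5)

print(f"homogeneous: {homogeneous:.2f}, two groups: {two_group:.2f}")
```

Holding R0 fixed across the two runs is a deliberate choice: it mimics a modeler who calibrates R0 from the early growth rate and then has to decide whether to treat the population as homogeneous.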

See also Robin Hanson’s earlier post on variation in R0.  C’mon people, stop your yapping on Twitter and write some decent blog posts on these issues.  I know you can do it.

Why has the Census become less productive over time?

For the 1970 and subsequent censuses, the Postal Service took on an even greater role. Most households received a machine-readable survey by mail and returned it the same way. This cut out two labor-intensive processes: canvassing most households and transcribing data by hand. Censuses since 1970 have generally followed the same process. The biggest change was that the 2010 survey dropped the “long-form” census – a major labor-saving change that nevertheless did not have an obvious impact on the amount of labor expended.

Despite the introduction of labor-saving technologies, the Census hires more people relative to the population than it did in earlier periods. In 1950, 46 million households – the entire country – were canvassed by 170,000 field staff. In 2000, 45 million addressees failed to mail back the survey, and the Census Bureau hired 539,000 field staff to take on the task of nonresponse follow-up. What seems like basically the same task took three times as many employees.

That is from Salim Furth, there is much more at the link, including footnotes for the above excerpt.  Of course this year the labor investment is likely to be much, much lower.
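The "three times as many employees" comparison can be checked directly from the figures in the excerpt. This is just the arithmetic, using only the numbers quoted above; the variable names are mine.

```python
# Figures as quoted in the excerpt.
households_1950, staff_1950 = 46_000_000, 170_000
addresses_2000, staff_2000 = 45_000_000, 539_000

per_worker_1950 = households_1950 / staff_1950  # households canvassed per field worker
per_worker_2000 = addresses_2000 / staff_2000   # nonresponse follow-ups per field worker
staff_ratio = staff_2000 / staff_1950

print(round(per_worker_1950), round(per_worker_2000), round(staff_ratio, 2))
# 271 83 3.17
```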

China estimate of the day

“The SpaceKnow data suggest a continued slowing in China’s economy, despite official data saying otherwise,” says Jeremy Fand, SpaceKnow’s chief executive.

Pollution data from SpaceKnow, collected via satellite by measuring things like methane and ozone over China, also suggest that activity remains depressed compared with previrus levels. That index, last updated on March 30, is unchanged from the end of February…

Regardless of why China’s activity remains lower than officially reported—whether it’s the virus, frozen demand, or a combination of factors—the point is that the country hasn’t yet begun to rebound.

Here is the full story by Lisa Beilfuss.  Given this data, as I have been arguing, we should not expect a V-shaped U.S. recovery.

Does working from home work?

Better than you might think.  Here is a paper from a few years back, by Nicholas Bloom, James Liang, John Roberts, and Zhichun Jenny Ying:

A rising share of employees now regularly engage in working from home (WFH), but there are concerns this can lead to ‘‘shirking from home.’’ We report the results of a WFH experiment at Ctrip, a 16,000-employee, NASDAQ-listed Chinese travel agency. Call center employees who volunteered to WFH were randomly assigned either to work from home or in the office for nine months. Home working led to a 13% performance increase, of which 9% was from working more minutes per shift (fewer breaks and sick days) and 4% from more calls per minute (attributed to a quieter and more convenient working environment). Home workers also reported improved work satisfaction, and their attrition rate halved, but their promotion rate conditional on performance fell. Due to the success of the experiment, Ctrip rolled out the option to WFH to the whole firm and allowed the experimental employees to reselect between the home and office. Interestingly, over half of them switched, which led to the gains from WFH almost doubling to 22%. This highlights the benefits of learning and selection effects when adopting modern management practices like WFH.

Via Matt Notowidigdo.  Of course in that paper, the schools were not all closed…

Do better incentives limit cognitive biases?

There is a new paper by Benjamin Enke, Uri Gneezy, Brian Hall, David Martin, Vadim Nelidov, Theo Offerman, and Jeroen van de Ven:

Despite decades of research on heuristics and biases, empirical evidence on the effect of large incentives – as present in relevant economic decisions – on cognitive biases is scant. This paper tests the effect of incentives on four widely documented biases: base rate neglect, anchoring, failure of contingent thinking, and intuitive reasoning in the Cognitive Reflection Test. In preregistered laboratory experiments with 1,236 college students in Nairobi, we implement three incentive levels: no incentives, standard lab payments, and very high incentives that increase the stakes by a factor of 100 to more than a monthly income. We find that cognitive effort as measured by response times increases by 40% with very high stakes. Performance, on the other hand, improves very mildly or not at all as incentives increase, with the largest improvements due to a reduced reliance on intuitions. In none of the tasks are very high stakes sufficient to debias participants, or come even close to doing so. These results contrast with expert predictions that forecast larger performance improvements.

Via Kadeem Noray (EV winner, btw).  This is perhaps related to behavior during and leading up to the lockdown…

Crisis Innovation

That is the title of a new working paper by Tania Babina, Asaf Bernstein, and Filippo Mezzanotti, here is the abstract:

The effect of financial crises on innovative activity is an unsettled and important question for economic growth, but one difficult to answer with modern data. Using a differences-in-differences design surrounding the Great Depression, we are able to obtain plausible variation in local shocks to innovative ecosystems and examine the long-run impact of their inventions. We document a sudden and persistent decline in patenting by the largest organizational form of innovation at this time—independent inventors. Parallel trends prior to the shock, evidence of a drop within every major technology class, and consistent results using distress driven by commodity shocks all suggest a causal effect of local distress. Despite this negative effect, our evidence shows that innovation during crises can be more resilient than it may appear at a first glance. First, the average quality of surviving patents rises so much that there is no observable change in the aggregate future citations of these patents, in spite of the decline in the quantity of patents. Second, the shock is in part absorbed through a reallocation of inventors into established firms, which overall were less affected by the shock. Over the long run, firms in more affected areas compensate for the decline in entrepreneurial innovation and produce patents with greater impact. Third, the results reveal no significant brain drain of inventors from the affected areas. Overall, our findings suggest that financial crises are both destructive and creative forces for innovation, and we provide the first systematic evidence of the role that distress from the Great Depression played in the long-run innovative activity and the organization of innovation in the U.S. economy.

Further data coming your way…

Measuring the Cost of Regulation: A Text-Based Approach

We derive a measure of firm-level regulatory costs from the text of corporate earnings calls. We then use this measure to study the effect of regulation on companies’ operating fundamentals and cost of capital. We find that higher regulatory cost results in slower sales growth, an effect which is mitigated for large firms. Furthermore, we find a one-standard deviation increase in our preferred measure of regulatory cost is associated with an increase in firms’ cost of capital of close to 3% per year. These findings suggest that regulatory risk is a major cost to firms, but the largest firms are able to manage that risk better.

That is the abstract of a new NBER paper by Charles W. Calomiris, Harry Mamaysky, and Ruoke Yang, a piece written in pre-Covid-19 times.  It has never been more relevant, except that the estimates for regulatory costs turn out to be far too low (no criticism of the authors is intended here).  To repeat my earlier point, America’s regulatory state is failing us.

That was then, this is now: Palantir privacy edition

Data-analytics company Palantir Technologies Inc. is in talks to provide software to governments across Europe to battle the spread of Covid-19 and make strained health-care systems more efficient, a person familiar with the matter said.

The software company is in discussions with authorities in France, Germany, Austria and Switzerland, the person said, asking not to be identified because the negotiations are private…

European Union Commissioner Thierry Breton said Monday that the bloc is collecting mobile-phone data to help predict epidemic peaks in various member states and help allocate resources.

Palantir has signed a deal with a regional government in Germany, where it already has a 14 million euro ($15 million) contract with law enforcement in North Rhine-Westphalia, the person said. Palantir is also seeking a contract at a national level, the person said, but talks have stalled, the person added.

When a nation or company buys access to Palantir, it can use the data analytics software to pull far-flung digital information into a single repository and mine it for patterns.

Here is the full story.  From a distance it is difficult to evaluate these deals, but I will stick with my general claim that the anti-tech intellectuals have become irrelevant, and for the most part they know it.

The fiscal multiplier during World War II

WWII is viewed as the quintessential example of fiscal stimulus and exerts an outsized influence on fiscal multiplier estimates, but the wartime economy was highly unusual. I use newly-digitized contract data to construct a state-level panel on U.S. spending in WWII. I estimate a relative fiscal multiplier of 0.25, implying an aggregate multiplier of roughly 0.3. Conversion from civilian manufacturing to war production reduced the initial shock to economic activity because war production directly displaced civilian manufacturing. Saving and taxes account for 75% of the income generated by war spending, implying that the add-on effects from increased consumption were minimal.

That is from a 2018 paper by Gillian Brunet, and you will note that it reflects the consensus of the literature as a whole.  I do favor the federal government borrowing and spending a great deal of money right now on things that we need.  If you think we are in a traditional Keynesian scenario, or are pulling out a traditional AS-AD model, you are going to be very badly disappointed.  Most of all, we need to be spending more on public health and remedies for Covid-19.  Here is my earlier Bloomberg column on analogies and disanalogies between Covid-19 and World War II.  And again, see Garett Jones and Dan Rothschild on the 2009 stimulus.