Category: Science
Fast Grants update
Fast Grants has now made over 100 grants and contributed over $18 million in funding biomedical research against Covid-19, all in a little over two weeks’ time since project conception. If you scroll down the home page, you can see a partial list of winners (we are more concerned with getting the money out the door than keeping the list fully updated, but it will continue to grow).
Fast Grants is part of Emergent Ventures, a project of the Mercatus Center, George Mason University. And I wish to thank again all of those who have contributed to this project, either financially or otherwise. A partial list of financial contributors can be found at the above link as well.
People are dying from coronavirus because research is too slow
For years, there’s been talk about making the clinical trial process more standardized, and cheaper, so that the same rules would apply each time a study needed to be run. There’s even been discussion that what are known as pragmatic trials — large, simple, randomized studies in which less data are collected — might be conducted using electronic health records. But that hasn’t happened at the pace it should.
The reason involves another part of the problem. Clinical trials are principally run by drug and medical device companies in order to obtain regulatory approvals, with public health authorities only picking up the slack in rare examples. But the result is that we have not built a system that would make studies simpler; most patients have little opportunity to participate in research; and we are too slow to figure out what works.
What would the system look like if we fixed it? It would make it easier to study drugs for heart disease, where studies are so large and expensive that many companies don’t test their medicines. It would ease studies for rare cancers, which are currently problematic because the right patients are hard to find. And it could create a medical information superhighway that would power health care through the next century.
That is from Matthew Herper in StatNews. Via Malinga Fernando.
It is better to do lots of tests, even if they are not entirely accurate
We find that the number of daily tests carried out is much more important than their sensitivity, for the success of a case-isolation based strategy.
Our results are based on a Susceptible-Exposed-Infectious-Recovered (SEIR) model, which is age-, testing-, quarantine- and hospitalisation-aware. This model has a number of parameters which we estimate from best-available UK data. We run the model with variations of these parameters – each of which represents a possible present state of circumstances in the UK – in order to test the robustness of our conclusion.
We implemented and investigated a number of potential exit strategies, focusing primarily on the effects of virus-testing based case isolation.
The implementation of our model is flexible and extensively commented, allowing us and others to investigate new policy ideas in a timely manner; we next aim to investigate the optimal use of the highly imperfect antibody tests that the United Kingdom already possesses in large numbers.
There is much more at the link, including the model, results, and source code. That is from a team led by Gergo Bohner and also Gaurav Venkataraman, Gaurav being a previous Emergent Ventures winner.
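For readers who have not met the acronym, here is a minimal sketch of a plain SEIR model. It is not the team's model, which is age-, testing-, quarantine- and hospitalisation-aware. The function name `simulate_seir` and every parameter value are assumptions chosen only for illustration.

```python
def simulate_seir(beta, sigma, gamma, initial, days, dt=0.1):
    """Forward-Euler integration of the basic SEIR equations.

    beta: transmission rate per day
    sigma: 1 / incubation period (days)
    gamma: 1 / infectious period (days)
    initial: (S, E, I, R) head counts on day 0
    Returns one (S, E, I, R) tuple per day, including day 0.
    """
    s, e, i, r = (float(x) for x in initial)
    n = s + e + i + r
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt       # E -> I
            new_recovered = gamma * i * dt        # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Hypothetical parameters: 5-day incubation, 5-day infectious period.
history = simulate_seir(beta=0.5, sigma=0.2, gamma=0.2,
                        initial=(999_990, 0, 10, 0), days=200)
peak_day = max(range(len(history)), key=lambda d: history[d][2])
print("infectious count peaks on day", peak_day)
```

A testing and case-isolation policy would enter a model like this as an extra flow out of the infectious compartment; the sketch leaves that out.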
My Conversation with Philip Tetlock
Here is the audio and transcript, here is part of the summary:
He joined Tyler to discuss whether the world as a whole is becoming harder to predict, whether Goldman Sachs traders can beat forecasters, what inferences we can draw from analyzing the speech of politicians, the importance of interdisciplinary teams, the qualities he looks for in leaders, the reasons he’s skeptical machine learning will outcompete his research team, the year he thinks the ascent of the West became inevitable, how research on counterfactuals can be applied to modern debates, why people with second cultures tend to make better forecasters, how to become more fox-like, and more.
Here is one excerpt:
COWEN: If you could take just a bit of time away from your research and play in your own tournaments, are you as good as your own best superforecasters?
TETLOCK: I don’t think so. I don’t think I have the patience or the temperament for doing it. I did give it a try in the second year of the first set of forecasting tournaments back in 2012, and I monitored the aggregates. We had an aggregation algorithm that was performing very well at the time, and it was outperforming 99.8 percent of the forecasters from whom the composite was derived.
If I simply had predicted what the composite said at each point in time in that tournament, I would have been a super superforecaster. I would have been better than 99.8 percent of the superforecasters. So, even though I knew that it was unlikely that I could outperform the composite, I did research some questions where I thought the composite was excessively aggressive, and I tried to second guess it.
The net result of my efforts — instead of finishing in the top 0.02 percent or whatever, I think I finished in the middle of the superforecaster pack. That doesn’t mean I’m a superforecaster. It just means that when I tried to make a forecast better than the composite, I degraded the accuracy significantly.
COWEN: But what do you think is the kind of patience you’re lacking? Because if I look at your career, you’ve been working on these databases on this topic for what? Over 30 years. That’s incredible patience, right? More patience than most of your superforecasters have shown. Is there some dis-aggregated notion of patience where they have it and you don’t?
TETLOCK: [laughs] Yeah, they have a skill set. In the most recent tournaments, we’ve been working on with them, this becomes even more evident — their willingness to delve into the details of really pretty obscure problems for very minimal compensation is quite extraordinary. They are intrinsically cognitively motivated in a way that is quite remarkable. How am I different from that?
I guess I have a little bit of attention deficit disorder, and my attention tends to roam. I’ve not just worked on forecasting tournaments. I’ve been fairly persistent in pursuing this topic since the mid 1980s. Even before Gorbachev became general party secretary, I was doing a little bit of this. But I’ve been doing a lot of other things as well on the side. My attention tends to roam. I’m interested in taboo tradeoffs. I’m interested in accountability. There’re various things I’ve studied that don’t quite fall in this rubric.
COWEN: Doesn’t that make you more of a fox though? You know something about many different areas. I could ask you about antebellum American discourse before the Civil War, and you would know who had the smart arguments and who didn’t. Right?
And another:
TETLOCK: …I had a very interesting correspondence with William Safire in the 1980s about forecasting tournaments. We could talk a little about it later. The upshot of this is that young people who are upwardly mobile see forecasting tournaments as an opportunity to rise. Old people like me and aging baby-boomer types who occupy relatively high status inside organizations see forecasting tournaments as a way to lose.
If I’m a senior analyst inside an intelligence agency, and say I’m on the National Intelligence Council, and I’m an expert on China and the go-to guy for the president on China, and some upstart R&D operation called IARPA says, “Hey, we’re going to run these forecasting tournaments in which we assess how well the analytic community can put probabilities on what Xi Jinping is going to do next.”
And I’ll be on a level playing field, competing against 25-year-olds, and I’m a 65-year-old, how am I likely to react to this proposal, to this new method of doing business? It doesn’t take a lot of empathy or bureaucratic imagination to suppose I’m going to try to nix this thing.
COWEN: Which nation’s government in the world do you think listens to you the most? You may not know, right?
Definitely recommended.
The Japanese coronavirus story
You may recall that some time ago MR posted an anonymous account of how the coronavirus problem actually was much worse in Japan than was being admitted by the Japanese government and broader establishment. It is now clear that this Cassandra was correct.
I can now reveal to you the full story of that posting behind the first link, including my role in it. Here is the opening excerpt:
By March 22nd, I strongly suspected there was a widespread coronavirus epidemic in Japan. This was not widely believed at the time. I, working with others, conducted an independent research project. By March 25th we had sufficient certainty to act. We projected that the default course of the epidemic would lead to a public health crisis.
We attempted to disseminate the results to appropriate parties, out of a sense of civic duty. We initially did this privately attached to our identities and publicly but anonymously to maximize the likelihood of being effective and minimize risks to the response effort and to the team. We were successful in accelerating the work of others.
The situation is, as of this writing, still very serious. In retrospect, our pre-registered results were largely correct. I am coming forward with them because the methods we used, and the fact that they arrived at a result correct enough to act upon prior to formal confirmation, may accelerate future work and future responses here and elsewhere.
I am an American. I speak Japanese and live in Tokyo. I have spent my entire adult life in Japan. I have no medical nor epidemiology background. My professional background is as a software engineer and entrepreneur. I presently work in technology. This project was on my own initiative and in my personal capacity.
I am honored to have played a modest role in this story, though full credit goes elsewhere. Do read the whole thing. Hashing plays a key role in the longer narrative.
What should we believe and not believe about R?
This is from my email, highly recommended, and I will not apply further indentation:
“Although there’s a lot of pre-peer-reviewed and strongly-incorrect work out there, I’ll single out Kevin Systrom’s rt.live as being deeply problematic. Estimating R from noisy real-world data when you don’t know the underlying model is fundamentally difficult, but a minimal baseline capability is to get sign(R-1) right (at least when |R-1| isn’t small), and rt.live is going to often be badly (and confidently) wrong about that because it fails to account for how the confirmed count data it’s based on is noisy enough to be mostly garbage. (Many serious modelers have given up on case counts and just model death counts.) For an obvious example, consider their graph for WA: it’s deeply implausible on its face that WA had R=.24 on 10 April and R=1.4 on 17 April. (In an epidemiological model with fixed waiting times, the implication would be that infectious people started interacting with non-infectious people five times as often over the course of a week with no policy changes.) Digging into the data and the math, you can see that a few days of falling case counts will make the system confident of a very low R, and a few days of rising counts will make it confident of a very high one, but we know from other sources that both can and do happen due to changes in test and test processing availability. (There are additional serious methodological problems with rt.live, but trying to nowcast R from observed case counts is already garbage-in so will be garbage-out.)
However, folks are (understandably, given the difficulty and the rush) missing a lot of harder stuff too. You linked a study and wrote “Good and extensive west coast Kaiser data set, and further evidence that R doesn’t fall nearly as much as you might wish for.” We read the study tonight, and the data set seems great and important, but we don’t buy the claims about R at all — we think there are major statistical issues. (I could go into it if you want, although it’s fairly subtle, and of course there’s some chance that *we’re* wrong…)
Ultimately, the models and statistics in the field aren’t designed to handle rapidly changing R, and everything is made much worse by the massive inconsistencies in the observed data. R itself is a surprisingly subtle concept (especially in changing systems): for instance, rt.live uses a simple relationship between R and the observed rate of growth, but their claimed relationship only holds for the simplest SIR model (not epidemiologically plausible at all for COVID-19), and it has as an input the median serial interval, which is also substantially uncertain for COVID-19 (they treat it as a known constant). These things make it easy to badly misestimate R. Usually these errors pull or push R away from 1 — rt.live would at least get sign(R – 1) right if their data weren’t garbage and they fixed other statistical problems — but of course getting sign(R – 1) right is a low bar, it’s just figuring out whether what you’re observing is growing or shrinking. Many folks would actually be better off not trying to forecast R and just looking carefully at whether they believe the thing they’re observing is growing or shrinking and how quickly.
All that said, the growing (not total, but mostly shared) consensus among both folks I’ve talked to inside Google and with academic epidemiologists who are thinking hard about this is:
- Lockdowns, including Western-style lockdowns, very likely drive R substantially below 1 (say .7 or lower), even without perfect compliance. Best evidence is the daily death graphs from Italy, Spain, and probably France (their data’s a mess): those were some non-perfect lockdowns (compared to China), and you see a clear peak followed by a clear decline after basically one time constant (people who died at peak were getting infected right around the lockdown). If R was > 1 you’d see exponential growth up to herd immunity, if R was 0.9 you’d see a much bigger and later peak (there’s a lot of momentum in these systems). This is good news if true (and we think it’s probably true), since it means there’s at least some room to relax policy while keeping things under control. Another implication is the “first wave” is going to end over the next month-ish, as IHME and UTexas (my preferred public deaths forecaster; they don’t do R) predict.
- Cases are of course massively undercounted, but the weight of evidence is that they’re *probably* not *so* massively undercounted that we’re anywhere near herd immunity (though this would of course be great news). Looking at Iceland, Diamond Princess, the other studies, the flaws in the Stanford study, we’re very likely still at < ~2-3% infected in the US. (25% in large parts of NYC wouldn’t be a shock though).
Anyways, I guess my single biggest point is that if you see a result that says something about R, there’s a very good chance it’s just mathematically broken or observationally broken and isn’t actually saying that thing at all.”
That is all from Rif A. Saurous, Research Director at Google, currently working on COVID-19 modeling.
Currently it seems to me that those are the smartest and best informed views “out there,” so at least for now they are my views too.
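To make the email's point about sign(R - 1) concrete, here is a minimal sketch. It is not rt.live's code. It assumes the simplest SIR-style relationship r = (R - 1)/T between the daily growth rate r and R, with T standing in for the serial interval. The function names and the case counts are made up for illustration.

```python
import math


def growth_rate(counts):
    """Least-squares slope of log(counts) against day index (growth per day)."""
    n = len(counts)
    xs = range(n)
    ys = [math.log(c) for c in counts]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den


def r_from_growth(r, serial_interval_days):
    """Assumed simple-SIR relationship: r = (R - 1) / T, so R = 1 + r * T."""
    return 1 + r * serial_interval_days


# Hypothetical daily confirmed counts, not real data.
rising = [100, 110, 121, 133, 146]
falling = [100, 90, 81, 73, 66]

for label, counts in (("rising", rising), ("falling", falling)):
    r = growth_rate(counts)
    for t in (4, 7):  # two candidate serial intervals, in days
        print(label, f"T={t}d", f"R={r_from_growth(r, t):.2f}")
```

Under this assumed relationship the sign of R - 1 is just the sign of the growth rate, whatever T is, while the magnitude of R moves with the assumed serial interval. A few days of noisy counts are enough to change the fitted slope, which is the garbage-in problem the email describes.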
COVID Prevalence and the Difficult Statistics of Rare Events
In a post titled Defensive Gun Use and the Difficult Statistics of Rare Events I pointed out that it’s very easy to go wrong when estimating rare events.
Since defensive gun use is relatively uncommon under any reasonable scenario there are many more opportunities to miscode in a way that inflates defensive gun use than there are ways to miscode in a way that deflates defensive gun use.
Imagine, for example, that the true rate of defensive gun use is not 1% but .1%. At the same time, imagine that 1% of all people are liars. Thus, in a survey of 10,000 people, there will be 100 liars. On average, 99.9 (~100) of the liars will say that they used a gun defensively when they did not and .1 of the liars will say that they did not use a gun defensively when they did. Of the 9900 people who report truthfully, approximately 10 will report a defensive gun use and 9890 will report no defensive gun use. Adding it up, the survey will find a defensive gun use rate of approximately (100+10)/10000=1.1%, i.e. more than ten times higher than the actual rate of .1%!
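The arithmetic in that example can be checked with a short sketch. The function name `reported_rate` is mine, and the model is the deliberately crude one in the text: liars always flip their true answer and everyone else reports accurately.

```python
def reported_rate(n, true_rate, liar_rate):
    """Expected survey rate when a fraction of respondents misreport."""
    liars = n * liar_rate
    honest = n - liars
    # Liars who did not use a gun defensively say that they did.
    false_yes = liars * (1 - true_rate)
    # Honest respondents who did use a gun defensively say so.
    true_yes = honest * true_rate
    return (false_yes + true_yes) / n


rate = reported_rate(n=10_000, true_rate=0.001, liar_rate=0.01)
print(f"reported rate: {rate:.2%}")  # reported rate: 1.10%
```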
Epidemiologist Trevor Bedford points out that a similar problem applies to tests of COVID-19 when prevalence is low. The recent Santa Clara study found a 1.5% rate of antibodies to COVID-19. The authors assume a false positive rate of just .005 and a sensitivity of ~.8 (that is, a false negative rate of ~.2). Thus, if you test 1000 individuals, ~5 will show up as having antibodies when they actually don’t, and x*.8 will show up as having antibodies when they actually do, where x is the number truly infected. Since (5+x*.8)/1000=.015, x=12.5, so the true rate is 12.5/1000=1.25%; the reported rate is pretty close to the true rate. (The authors then inflate their numbers for population weighting, which I am ignoring.) On the other hand, suppose that the false positive rate is .015, which is still very low and not implausible. Then we can easily have ~15/1000=1.5% showing up as having antibodies to COVID when none of them in fact do, i.e. all of the result could be due to test error.
In other words, when the event is rare the potential error in the test can easily dominate the results of the test.
Addendum: For those playing at home, Bedford uses sensitivity and specificity while I am more used to thinking about false positive and false negative rates, and I simplify the numbers slightly (.8 instead of his .803 and so forth), but the point is the same.
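The Santa Clara arithmetic above can be written as a small function. This is a sketch of the standard correction for an imperfect test (often called the Rogan-Gladen estimator), treating .8 as the share of true positives the test detects. The function name is mine, and it makes no claim about what the study's authors actually computed.

```python
def corrected_prevalence(observed, sensitivity, false_positive_rate):
    """Back out true prevalence p from the observed positive rate.

    Solves observed = p * sensitivity + (1 - p) * false_positive_rate for p,
    then clips the answer to the range [0, 1].
    """
    p = (observed - false_positive_rate) / (sensitivity - false_positive_rate)
    return min(max(p, 0.0), 1.0)


# Numbers from the post: 1.5% observed, sensitivity ~.8.
for fpr in (0.005, 0.015):
    p = corrected_prevalence(observed=0.015, sensitivity=0.8,
                             false_positive_rate=fpr)
    print(f"false positive rate {fpr}: implied true prevalence {p:.2%}")
```

The first case lands within rounding of the 1.25% in the text (the text treats all 1000 people, not just the uninfected, as candidates for a false positive). The second returns zero: the whole result could be test error.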
More on economists and epidemiologists
From my email box, here are perspectives from people in the world of epidemiology, the first being from Jacob Oppenheim:
I’d note that epidemiology is the field that has most embraced novel and principles-driven approaches to causal inference (eg those of Judea Pearl etc). Pearl’s cluster is at UCLA; there’s one at Berkeley, and another at Harvard.
The one at Harvard simultaneously developed causal methodologies in the ’70s (eg around Rubin), then a parallel approach to Pearl in the ’80s (James Robins and others), leading to a large collection of important epi people at HSPH (Miguel Hernan, etc). Many of these methods are barely touched in economics, which is unfortunate given their power in causal inference in medicine, disease, and environmental health.
These methods and scientists are very influential not only in public health / traditional epi, but throughout the biopharma and machine learning worlds. Certainly, in my day job running data science + ml in biotech, many of us would consider well trained epidemiologists from these top schools among the best in the world for quantitative modeling, especially where causality is involved.
From Julien SL:
I’m not an epidemiologist per se, but I think my background gives me some inputs into that discussion. I have a master in Mechatronics/Robotics Engineering, a master in Management Science, and an MBA. However, in the last ten years, epidemiology (and epidemiology forecasting) has figured heavily in my work as a consultant for the pharma industry.
[some data on most of epidemiology not being about pandemic forecasting]…
The result of the neglect of pandemics epidemiology is that there is precious little expertise in pandemics forecasting and prevention. The SIR model (and its variants) that we see a lot these days is a good teaching aid. Still, it’s not practically useful: you can’t fit exponentials with unstable or noisy parameters and expect good predictions. The only way to use R0 is qualitatively. When I saw the first R0 and mortality estimates back in January, I thought “this is going to be bad,” then sold my liquid assets, bought gold, and naked puts on indices. I confess that I didn’t expect it to be quite as bad as what actually happened, or I would have bought more put options.
…here are a few tentative answers about your “rude questions:”
a. As a class of scientists, how much are epidemiologists paid? Is good or bad news better for their salaries?
Glassdoor data show that epidemiologists in the US are paid $63,911 on average. CDC and FDA both pay better ($98k and $120k), as well as pharma (Merck: $94k-$115k). As explained above, most are working on cancer, diabetes, etc. So I’m not sure what “bad news” would be for them.
b. How smart are they? What are their average GRE scores?
I’m not sure where you could get data to answer that question. I know that in pharma, many – maybe most – people who work on epidemiology forecasting don’t have an epidemiology degree. They can have any type of STEM degree, including engineering, economics, etc. So my base rate answer would be average of all STEM GRE scores. [TC: Here are U. Maryland stats for public health students.]
c. Are they hired into thick, liquid academic and institutional markets? And how meritocratic are those markets?
Compared to who? Epidemiology is a smaller community than economics, so you should find less liquidity. Pharma companies are heavily clustered into few geographies (New Jersey, Basel in Switzerland, Cambridge in the UK, etc.) so private-sector jobs aren’t an option for many epidemiologists.
d. What is their overall track record on predictions, whether before or during this crisis?
CDC has been running flu forecasting challenges every year for years. From what I’ve seen, the models perform reasonably well. It should be noted that those models would seem very familiar to an econometric forecaster: the same time series tools are used in both disciplines. [TC: to be clear, I meant prediction of new pandemics and how they unfold]
e. On average, what is the political orientation of epidemiologists? And compared to other academics? Which social welfare function do they use when they make non-trivial recommendations?
Hard to say. Academics lean left, but medical doctors and other healthcare professionals often lean right. There is a conservative bias to medicine, maybe due to the “primo, non nocere” imperative. We see that bias at play in the hydroxychloroquine debate. Most health authorities are reluctant to push – or even allow – a treatment option before they see overwhelming positive proof, even when the emergency should encourage faster decision making.
…g. How well do they understand how to model uncertainty of forecasts, relative to say what a top econometrician would know?
As I mentioned above, forecasting is far from the main focus of epidemiology. However, epidemiologists as a whole don’t seem to be bad statisticians. Judea Pearl has been saying for years that epidemiologists are ahead of econometricians, at least when it comes to applying his own Structural Causal Model framework… (Oldish) link: http://causality.cs.ucla.edu/blog/index.php/2014/10/27/are-economists-smarter-than-epidemiologists-comments-on-imbenss-recent-paper/
I’ve seen a similar pattern with the adoption of agent-based models (common in epidemiology, marginal in economics). Maybe epidemiologists are faster to take up new tools than economists (which maybe also gives a hint about point e?)
h. Are there “zombie epidemiologists” in the manner that Paul Krugman charges there are “zombie economists”? If so, what do you have to do to earn that designation? And are the zombies sometimes right, or right on some issues? How meta-rational are those who allege zombie-ism?
I don’t think so. Epidemiology seems less political than economics. There are no equivalents to Smith, Karl Marx, Hayek, etc.
i. How many of them have studied Philip Tetlock’s work on forecasting?
Probably not many, given that their focus isn’t forecasting. Conversely, I don’t think that Tetlock has paid much attention to epidemiology. On the Good Judgement website, healthcare questions of any type are very rare.
And here is Ruben Conner:
Weighing in on your recent questions about epidemiologists. I did my undergraduate in Economics and then went on for my Masters in Public Health (both at University of Washington). I worked as an epidemiologist for Doctors Without Borders and now work as a consultant at the World Bank (a place mostly run by economists). I’ve had a chance to move between the worlds and I see a few key differences between economists and epidemiologists:
- Trust in data: Like the previous poster said, epidemiologists recognize that “data is limited and often inaccurate.” This is really drilled into the epidemiologist training – initial data collection can have various problems and surveys are not always representative of the whole population. Epidemiologists worry about genuine errors in the underlying data. Economists seem to think more about model bias.
- Focus on implementation: Epidemiologists expect to be part of the response and to deal with organizing data as it comes in. This isn’t a glamorous process. In addition, the government response can be well executed or poorly run and epidemiologists like to be involved in these details of planning. The knowledge here is practical and hands-on. (Epidemiologists probably could do with more training on organizational management, they’re not always great at this.)
- Belief in models: Epidemiologists tend to be skeptical of fancy models. This could be because they have less advanced quantitative training. But it could also be because they don’t have total faith in the underlying data (as noted above) and therefore see fancy specifications as more likely to obscure the truth than reveal it. Economists often seem to want to fit the data to a particular theory – my impression is that they like thinking in the abstract and applying known theories to their observations.
As with most fields, I think both sides have something to learn from each other! There will be a need to work together as we weigh the economic impacts of suppression strategies. This is particularly crucial in low-income places like India, where the disease suppression strategies will be tremendously costly for people’s daily existence and ability to earn a living.
Here is a 2014 blog post on earlier spats between economists and epidemiologists. Here is more from Joseph on that topic.
And here is an email from epidemiologist Dylan Green:
So with that…on to the modelers! I’ll merely point out a few important details on modeling which I haven’t seen in response to you yet. First, the urgency with which policy makers are asking for information is tremendous. I’ve been asked to generate modeling results in a matter of weeks (in a disease which I/we know very little about) which I previously would have done over the course of several months, with structured input and validation from collaborators on a disease I have studied for a decade. This ultimately leads to simpler rather than more complicated efforts, as well as difficult decisions in assumptions and parameterization. We do not have the luxury of waiting for better information or improvements in design, even if it takes a matter of days.
Another complicated detail is the publicity of COVID-19 projections. In other arenas (HIV, TB, malaria) model results are generated all the time, from hundreds of research groups, and probably <1% of the population will ever see these figures. Modeling and governance of models of these diseases is advanced. There are well organized consortia who regularly meet to present and compare findings, critically appraise methods, elegantly present uncertainty, and have deep insights into policy implications. In HIV for example, models are routinely parameterized to predict policy impact, and are ex-post validated against empirical findings to determine the best performing models. None of this is currently in scope for COVID-19 (unfortunately), as policy makers often want a single number, not a range, and they want it immediately.
I hope for all of our sakes we will see the modeling coordination efforts in COVID-19 improve. And I ask my fellow epidemiologists to stay humble during this pandemic. For those with little specialty in communicable disease, it is okay to say “this isn’t my area of expertise and I don’t have the answers”. I think there has been too much hubris in the “I-told-ya-so” from people who “said this would happen”, or in knowing the obvious optimal policy. This disease continues to surprise us, and we are learning every day. We must be careful in how we communicate our certainty to policy makers and the public, lest we lose their trust when we are inevitably wrong. I suspect this is something that economists can likely teach us from experience.
One British epidemiologist wrote me and told me they are basically all socialists in the literal sense of the term, not just leaning to the left.
Another person in the area wrote me this:
Another issue that isn’t spoken about a lot is most Epidemiologists are funded by soft money. It makes them terrifyingly hard working but it also makes them worried about making enemies. Every critic now will be reviewed by someone in IHME at some point in an NIH study section, whereas IHME, funded by the Gates Foundation, has a lot of resilience. It makes for a very muted culture of criticism. Ironically, outsiders (like economist Noah Haber) trying to push up the methods are more likely to be attacked because they are not a part of the constant funding cycle. I wonder if economists have ever looked at the potential perverse incentives of being fully grant funded on academic criticism?
Here is an earlier email response I reproduced, here is my original blog post, here is my update from yesterday.
Estimating the COVID-19 Infection Rate: Anatomy of an Inference Problem
That is a recent paper by Manski and Molinari, top people in econometrics. Here is the abstract:
As a consequence of missing data on tests for infection and imperfect accuracy of tests, reported rates of population infection by the SARS CoV-2 virus are lower than actual rates of infection. Hence, reported rates of severe illness conditional on infection are higher than actual rates. Understanding the time path of the COVID-19 pandemic has been hampered by the absence of bounds on infection rates that are credible and informative. This paper explains the logical problem of bounding these rates and reports illustrative findings, using data from Illinois, New York, and Italy. We combine the data with assumptions on the infection rate in the untested population and on the accuracy of the tests that appear credible in the current context. We find that the infection rate might be substantially higher than reported. We also find that the infection fatality rate in Italy is substantially lower than reported.
Here is a very good tweet storm on their methods, excerpt: “What I love about this paper is its humility in the face of uncertainty.” And: “…rather than trying to get exact answers using strong assumptions about who opts-in for testing, the characteristics of the tests themselves, etc, they start with what we can credibly know about each to build bounds on each of these quantities of interest.”
I genuinely cannot give a coherent account of “what is going on” with Covid-19 data issues and prevalence. But at this point I think it is safe to say that the mainstream story we have been living with for some number of weeks now just isn’t holding up.
For the pointer I thank David Joslin.
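The bounding idea in the abstract can be illustrated with the law of total probability. This is a minimal sketch, not the paper's procedure: it ignores test accuracy, the function name is mine, and the numbers and the assumption that the untested are infected at a rate no higher than the tested are hypothetical choices for illustration.

```python
def infection_rate_bounds(tested_share, positive_rate_tested,
                          untested_low, untested_high):
    """Bounds on the population infection rate.

    P(infected) = P(tested) * P(infected | tested)
                + P(untested) * P(infected | untested),
    where the last term is only assumed to lie in [untested_low, untested_high].
    """
    known_part = tested_share * positive_rate_tested
    untested_share = 1 - tested_share
    lower = known_part + untested_share * untested_low
    upper = known_part + untested_share * untested_high
    return lower, upper


# Hypothetical: 1% of people tested, 20% of tests positive, and the
# untested assumed to be infected at some rate between 0% and 20%.
lower, upper = infection_rate_bounds(0.01, 0.20, 0.0, 0.20)
print(f"infection rate between {lower:.1%} and {upper:.1%}")
```

With so few people tested, the bounds are wide, and nearly all of the width comes from what is assumed about the untested.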
Ahem
A widely followed model for projecting Covid-19 deaths in the U.S. is producing results that have been bouncing up and down like an unpredictable fever, and now epidemiologists are criticizing it as flawed and misleading for both the public and policy makers. In particular, they warn against relying on it as the basis for government decision-making, including on “re-opening America.”
“It’s not a model that most of us in the infectious disease epidemiology field think is well suited” to projecting Covid-19 deaths, epidemiologist Marc Lipsitch of the Harvard T.H. Chan School of Public Health told reporters this week, referring to projections by the Institute for Health Metrics and Evaluation at the University of Washington.
Other experts, including some colleagues of the model-makers, are even harsher. “That the IHME model keeps changing is evidence of its lack of reliability as a predictive tool,” said epidemiologist Ruth Etzioni of the Fred Hutchinson Cancer Center, home to several of the researchers who created the model, and who has served on a search committee for IHME. “That it is being used for policy decisions and its results interpreted wrongly is a travesty unfolding before our eyes.”
…The chief reason the IHME projections worry some experts, Etzioni said, is that “the fact that they overshot” — initially projecting up to 240,000 U.S. deaths, compared with fewer than 70,000 now — “will be used to suggest that the government response prevented an even greater catastrophe, when in fact the predictions were shaky in the first place.”
Here is the full story, from StatNews, by Sharon Begley with assistance from Helen Branswell, two very good and knowledgeable sources. Via Matt Yglesias.
To be clear, I am (and always have been) fully aware that there are more nuanced epidemiological models “sitting on the shelf,” just as is true for macroeconomics and many other areas. But I ask you, where are the numerous cases of leading epidemiologists screaming bloody murder to the press, or on their blogs, or in any other manner, that the most commonly used model for this all-important policy analysis is deeply wrong and in some regards close to a fraud? Yes I know you can point to a few tweets from the more serious people, but where has the profession as a whole been? Who organized the protest letter and petition to The Wall Street Journal?
And to be clear, I have heard this model cited and discussed in many (off the record) policy discussions; this is not just something you can pin on the Trump administration narrowly construed (though they are at fault as well).
What should I ask Glen Weyl?
I will be doing a Conversation with him, mostly about his ideas on Covid-19 response and testing, though we will cover other topics as well. So what should I ask him?
Fast Grants, a project of Emergent Ventures against Covid-19, status update
As you may recall, the goal of Fast Grants is to support biomedical research to fight back Covid-19, thus restoring prosperity and liberty.
Yesterday 40 awards were made, totaling about $7 million, and money is already going out the door with ongoing transfers today. Winners are from MIT, Harvard, Stanford, Rockefeller University, UCSF, UC Berkeley, Yale, Oxford, and other locales of note. The applications are of remarkably high quality.
Nearly 4000 applications have been turned down, and many others are being put in touch with other institutions for possible funding support, with that ancillary number set to top $5 million.
The project was announced April 8, 2020, only eight days ago. And Fast Grants was conceived of only about a week before that, and with zero dedicated funding at the time.
I wish to thank everyone who has worked so hard to make this a reality, including the very generous donors to the program, those at Stripe who contributed by writing new software, the quality-conscious and conscientious referees and academic panel members (about twenty of them), and my co-workers at Mercatus at George Mason University, which is home to Emergent Ventures.
I hope soon to give you an update on some of the supported projects.
Emergent Ventures Covid-19 prizes, second cohort
There is another round of prize winners, and I am pleased and honored to announce them:
1. Petr Ludwig.
Petr has been instrumental in building out the #Masks4All movement, and in persuading individuals in the Czech Republic, and in turn the world, to wear masks. That already has saved numerous lives and made possible — whenever the time is right — an eventual reopening of economies. And I am pleased to see this movement is now having an impact in the United States.
Here is Petr on Twitter, and here is the viral video he had a hand in creating and promoting. His work has been truly impressive, and I also would like to offer praise and recognition to all of the people who have worked with him.
2. covid19india.
The covid19india project is a website for tracking the progress of Covid-19 cases through India, and it is the result of a collaboration.
It is based on a large volunteer group that is rapidly aggregating and verifying patient-level data by crowdsourcing. They run a website for tracking the progress of Covid-19 cases through India and open-source all the (non-personally identifiable) data for researchers and analysts to consume. The data for the React-based website and the cluster graph come from a crowdsourced Google Sheet filled in by a large and hardworking Ops team at covid19india. They manually fill in each case, from various news sources, as soon as the case is reported. The top contributor among the 100-odd code contributors, and the maintainer of the website, is Jeremy Philemon, an undergraduate at SUNY Binghamton, majoring in Computer Science. Another interesting contribution is from Somesh Kar, a 15-year-old high school student at Delhi Public School RK Puram, New Delhi. For the COVID-19 India tracker he worked on the code for the cluster graph. He is interested in computer science and tech entrepreneurship and is a designer and developer in his free time. Somesh was joined in this effort by his brother, Sibesh Kar, a tech entrepreneur in New Delhi and the founder of MayaHQ.
3. Debes Christiansen, the head of department at the National Reference Laboratory for Fish and Animal Diseases in the capital, Tórshavn, Faroe Islands.
Here is the story of Debes Christiansen. Here is one part:
A scientist who adapted his veterinary lab to test for disease among humans rather than salmon is being celebrated for helping the Faroe Islands avoid coronavirus deaths, where a larger proportion of the population has been tested than anywhere in the world.
Debes was prescient in understanding the import of testing, and also in realizing in January that he needed to move quickly.
Please note that I am trying to reach Debes Christiansen — can anyone please help me in this endeavor with an email?
Here is the list of the first cohort of winners, here is the original prize announcement. Most of the prize money still remains open to be won. It is worth noting that the winners so far are taking the money and plowing it back into their ongoing and still very valuable work.
An econometrician on the SEIRD epidemiological model for Covid-19
There is a new paper by Ivan Korolev:
This paper studies the SEIRD epidemic model for COVID-19. First, I show that the model is poorly identified from the observed number of deaths and confirmed cases. There are many sets of parameters that are observationally equivalent in the short run but lead to markedly different long run forecasts. Second, I demonstrate using the data from Iceland that auxiliary information from random tests can be used to calibrate the initial parameters of the model and reduce the range of possible forecasts about the future number of deaths. Finally, I show that the basic reproduction number R0 can be identified from the data, conditional on the clinical parameters. I then estimate it for the US and several other countries, allowing for possible underreporting of the number of cases. The resulting estimates of R0 are heterogeneous across countries: they are 2-3 times higher for Western countries than for Asian countries. I demonstrate that if one fails to take underreporting into account and estimates R0 from the cases data, the resulting estimate of R0 will be biased downward and the model will fail to fit the observed data.
Here is the full paper. And here is Ivan’s brief supplemental note on CFR. (By the way, here is a new and related Andrew Atkeson paper on estimating the fatality rate.)
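For readers who have not seen a SEIRD model written down, here is a minimal sketch of the compartmental structure: Susceptible, Exposed, Infectious, Recovered, Dead. It is not Korolev's code or specification. It uses a simple Euler step, one common textbook parameterization in which R0 equals beta divided by gamma, and hypothetical parameter values.

```python
def simulate_seird(beta, sigma, gamma, death_share, population,
                   initial_infected, days, dt=0.1):
    """Euler simulation of a basic SEIRD model.

    beta        : transmission rate per day
    sigma       : rate at which the exposed become infectious (1 / incubation)
    gamma       : rate at which the infectious stop being infectious
    death_share : fraction of those leaving the infectious state who die
    Returns a list of daily (S, E, I, R, D) tuples, including day 0.
    """
    s = float(population - initial_infected)
    e, i, r, d = 0.0, float(initial_infected), 0.0, 0.0
    steps_per_day = round(1 / dt)
    path = [(s, e, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            leaving_infectious = gamma * i * dt
            new_deaths = death_share * leaving_infectious
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - leaving_infectious
            r += leaving_infectious - new_deaths
            d += new_deaths
        path.append((s, e, i, r, d))
    return path


# Hypothetical parameters: only beta differs between the two runs.
COMMON = dict(sigma=0.2, gamma=0.1, death_share=0.01,
              population=1_000_000, initial_infected=10, days=200)

high_r0_path = simulate_seird(beta=0.5, **COMMON)   # R0 = 0.5 / 0.1 = 5
low_r0_path = simulate_seird(beta=0.25, **COMMON)   # R0 = 0.25 / 0.1 = 2.5

print("deaths after 200 days, R0 = 5:  ", round(high_r0_path[-1][4]))
print("deaths after 200 days, R0 = 2.5:", round(low_r0_path[-1][4]))
```

Only deaths and confirmed cases are observed in practice, while the model has several free parameters plus unknown initial conditions. That gap is the source of the identification problem the abstract describes.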
And here is a further paper on the IHME model, by statisticians from CTDS, Northwestern University and the University of Texas, excerpt from the opener:
- In excess of 70% of US states had actual death rates falling outside the 95% prediction interval for that state, (see Figure 1)
- The ability of the model to make accurate predictions decreases with increasing amount of data. (figure 2)
Again, I am very happy to present counter evidence to these arguments. I readily admit this is outside my area of expertise, but I have read through the paper and it is not much more than a few pages of recording numbers and comparing them to the actual outcomes (you will note the model predicts New York fairly well, and thus the predictions are of a “train wreck” nature).
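The kind of check involved is simple enough to sketch. The snippet below is not the paper's code, and the state labels and numbers are made up. It counts how often an actual outcome lands outside the stated 95% prediction interval. For a well-calibrated model, that share should be near 5%.

```python
def share_outside_interval(actual, intervals):
    """Fraction of places whose actual value falls outside its interval.

    actual    : dict mapping place -> observed value
    intervals : dict mapping place -> (lower, upper) prediction interval
    """
    misses = 0
    for place, value in actual.items():
        lower, upper = intervals[place]
        if not (lower <= value <= upper):
            misses += 1
    return misses / len(actual)


# Hypothetical data for four made-up states.
actual_deaths = {"State A": 12, "State B": 40, "State C": 3, "State D": 95}
predicted_intervals = {
    "State A": (15, 30),   # actual below the interval
    "State B": (35, 50),   # actual inside the interval
    "State C": (5, 9),     # actual below the interval
    "State D": (60, 80),   # actual above the interval
}

miss_share = share_outside_interval(actual_deaths, predicted_intervals)
print(f"share outside the 95% interval: {miss_share:.0%}")
```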
Let me just repeat the two central findings again:
- In excess of 70% of US states had actual death rates falling outside the 95% prediction interval for that state, (see Figure 1)
- The ability of the model to make accurate predictions decreases with increasing amount of data. (figure 2)
So now really is the time to be asking tough questions about epidemiology, and yes, epidemiologists. I would very gladly publish and “signal boost” the best positive response possible.
And just to be clear (again), I fully support current lockdown efforts (best choice until we have more data and also a better theory), I don’t want Fauci to be fired, and I don’t think economists are necessarily better forecasters. I do feel I am not getting straight answers.
From my email, a note about epidemiology
This is all from my correspondent, I won’t do any further indentation and I have removed some identifying information, here goes:
“First, some background on who I am. After taking degrees in math and civil engineering at [very very good school], I studied infectious disease epidemiology at [another very, very good school] because I thought it would make for a fulfilling career. However, I became disillusioned with the enterprise for three reasons:
- Data is limited and often inaccurate in the critical forecasting window, leading to very large confidence bands for predictions
- Unless the disease has been seen before, the underlying dynamics may be sufficiently vague to make your predictions totally useless if you do not correctly specify the model structure
- Modeling is secondary to the governmental response (e.g., effective contact tracing) and individual action (e.g., social distancing, wearing masks)
Now I work as a quantitative analyst for [very, very good firm], and I don’t regret leaving epidemiology behind. Anyway, on to your questions…
What is an epidemiologist’s pay structure?
The vast majority of trained epidemiologists who would have the necessary knowledge to build models are employed in academia or the public sector; so their pay is generally average/below average for what you would expect in the private sector for the same quantitative skill set. So, aside from reputational enhancement/degradation, there’s not much of an incentive to produce accurate epidemic forecasts – at least not in monetary terms. Presumably there is better money to be made running clinical trials for drug companies.
On your question about hiring, I can’t say how meritocratic the labor market is for quantitative modelers. I can say though that there is no central lodestar, like Navier-Stokes in fluid dynamics, that guides the modeling framework. True, SIR, SEIR, and other compartmental models are widely used and accepted; however, the innovations attached to them can be numerous in a way that does not suggest parsimony.
How smart are epidemiologists?
The quantitative modelers are generally much smarter than the people performing contact tracing or qualitative epidemiology studies. However, if I’m being completely honest, their intelligence is probably lower than the average engineering professor – and certainly below that of mathematicians and statisticians.
My GRE scores were very good, and I found epidemiology to be a very interesting subject – plus, I can be pretty oblivious to what other people think. Yet when I told several of my professors in math and engineering of my plans, it was hard for me to miss their looks of disappointment. It’s just not a track that driven, intelligent people with a hint of quantitative ability take.
What is the political orientation of epidemiologists? What is their social welfare function?
Left, left, left. In the United States, I would be shocked if more than 2-5% of epidemiologists voted for Republicans in 2016 – at least among academics. At [aforementioned very very good school], I’d be surprised if the number was 1%. I remember the various unprompted bashing of Trump and generic Republicans on political matters unrelated to epidemiology in at least four classes during the 2016-17 academic year. Add that to the (literal) days of mourning after the election, it’s fair to say that academic epidemiologists are pretty solidly in the left-wing camp. (Note: I didn’t vote for Trump or any other Republican in 2016 or 2018)
I was pleasantly surprised during my time at [very, very good school] that there was at least some discussion of cost-benefit analysis for public health actions, including quarantine procedures. Realistically though, there’s a dominant strain of thought that the economic costs of an action are secondary to stopping the spread of an epidemic. To summarize the SWF: damn the torpedoes, full steam ahead!
Do epidemiologists perform uncertainty quantification?
They seem to play around with tools like the ensemble Kalman filter (found in weather forecasting) and stochastic differential equations, but it’s fair to say that mechanical engineers are much better at accounting for uncertainty (especially in parameters and boundary conditions) in their simulations than epidemiologists. By extension, that probably means that econometricians are better too.”
TC again: I am happy to pass along other well-thought-out perspectives on this matter, and I would like to hear a more positive take. Please note I am not endorsing these (or subsequent) observations, I genuinely do not know, and I will repeat I do not think economists are likely better. It simply seems to me that “who are these epidemiologists anyway?” is a question now worth addressing, and hardly anyone is willing to do that.
As an opening gambit, I’d like to propose that we pay epidemiologists more. (And one of my correspondents points out they are too often paid on “soft money.”) I know, I know, this plays with your mood affiliation. You would like to find a way of endorsing that conclusion, without simultaneously admitting that right now maybe the quality isn’t quite high enough.