What should we believe and not believe about R?

This is from my email, highly recommended, and I will not apply further indentation:

“Although there’s a lot of pre-peer-reviewed and strongly-incorrect work out there, I’ll single out Kevin Systrom’s rt.live as being deeply problematic. Estimating R from noisy real-world data when you don’t know the underlying model is fundamentally difficult, but a minimal baseline capability is to get sign(R-1) right (at least when |R-1| isn’t small), and rt.live is going to often be badly (and confidently) wrong about that because it fails to account for how the confirmed count data it’s based on is noisy enough to be mostly garbage. (Many serious modelers have given up on case counts and just model death counts.) For an obvious example, consider their graph for WA: it’s deeply implausible on its face that WA had R=.24 on 10 April and R=1.4 on 17 April. (In an epidemiological model with fixed waiting times, the implication would be that infectious people started interacting with non-infectious people five times as often over the course of a week with no policy changes.) Digging into the data and the math, you can see that a few days of falling case counts will make the system confident of a very low R, and a few days of rising counts will make it confident of a very high one, but we know from other sources that both can and do happen due to changes in test and test processing availability. (There are additional serious methodological problems with rt.live, but trying to nowcast R from observed case counts is already garbage-in so will be garbage-out.)

However, folks are (understandably, given the difficulty and the rush) missing a lot of harder stuff too. You linked a study and wrote “Good and extensive west coast Kaiser data set, and further evidence that R doesn’t fall nearly as much as you might wish for.” We read the study tonight, and the data set seems great and important, but we don’t buy the claims about R at all — we think there are major statistical issues. (I could go into it if you want, although it’s fairly subtle, and of course there’s some chance that *we’re* wrong…)

Ultimately, the models and statistics in the field aren’t designed to handle rapidly changing R, and everything is made much worse by the massive inconsistencies in the observed data. R itself is a surprisingly subtle concept (especially in changing systems): for instance, rt.live uses a simple relationship between R and the observed rate of growth, but their claimed relationship only holds for the simplest SIR model (not epidemiologically plausible at all for COVID-19), and it has as an input the median serial interval, which is also substantially uncertain for COVID-19 (they treat it as a known constant). These things make it easy to badly misestimate R. Usually these errors pull or push R away from 1 (rt.live would at least get sign(R-1) right if their data weren’t garbage and they fixed other statistical problems), but of course getting sign(R-1) right is a low bar: it’s just figuring out whether what you’re observing is growing or shrinking. Many folks would actually be better off not trying to forecast R and just looking carefully at whether they believe the thing they’re observing is growing or shrinking and how quickly.

All that said, the growing (not total, but mostly shared) consensus among both folks I’ve talked to inside Google and academic epidemiologists who are thinking hard about this is:

Lockdowns, including Western-style lockdowns, very likely drive R substantially below 1 (say .7 or lower), even without perfect compliance. Best evidence is the daily death graphs from Italy, Spain, and probably France (their data’s a mess): those were some non-perfect lockdowns (compared to China), and you see a clear peak followed by a clear decline after basically one time constant (people who died at peak were getting infected right around the lockdown). If R were > 1 you’d see exponential growth up to herd immunity; if R were 0.9 you’d see a much bigger and later peak (there’s a lot of momentum in these systems). This is good news if true (and we think it’s probably true), since it means there’s at least some room to relax policy while keeping things under control. Another implication is the “first wave” is going to end over the next month-ish, as IHME and UTexas (my preferred public deaths forecaster; they don’t do R) predict.
Cases are of course massively undercounted, but the weight of evidence is that they’re *probably* not *so* massively undercounted that we’re anywhere near herd immunity (though being near it would of course be great news). Looking at Iceland, Diamond Princess, the other studies, the flaws in the Stanford study, we’re very likely still below roughly 2-3% infected in the US. (25% in large parts of NYC wouldn’t be a shock though.)

Anyways, I guess my single biggest point is that if you see a result that says something about R, there’s a very good chance it’s just mathematically broken or observationally broken and isn’t actually saying that thing at all.”

That is all from Rif A. Saurous, Research Director at Google, currently working on COVID-19 modeling.

Currently it seems to me that those are the smartest and best informed views “out there,” so at least for now they are my views too.
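
To make the point about growth rates and R a bit more concrete, here is a minimal sketch of my own, not Rif's code and not rt.live's. It uses the standard textbook relationships between an exponential growth rate and R under three different assumptions about the generation interval: exponentially distributed (the simplest SIR case), exactly fixed, and gamma-distributed. The function names, the growth rates of plus or minus 10% per day, the 4-day and 7-day mean intervals, and the gamma shape of 2 are all illustrative assumptions, not estimates for COVID-19.

```python
import math


def r_sir(growth_rate, mean_interval):
    """R assuming an exponentially distributed generation interval (simplest SIR).

    Only meaningful while growth_rate > -1 / mean_interval.
    """
    return 1.0 + growth_rate * mean_interval


def r_fixed(growth_rate, mean_interval):
    """R assuming every generation interval is exactly mean_interval days."""
    return math.exp(growth_rate * mean_interval)


def r_gamma(growth_rate, mean_interval, shape):
    """R assuming a gamma-distributed generation interval with this mean and shape.

    Only meaningful while growth_rate > -shape / mean_interval.
    """
    return (1.0 + growth_rate * mean_interval / shape) ** shape


def compare(growth_rate, mean_interval, shape=2.0):
    """Return the three R values implied by one growth rate and mean interval."""
    return {
        "sir": r_sir(growth_rate, mean_interval),
        "fixed": r_fixed(growth_rate, mean_interval),
        "gamma": r_gamma(growth_rate, mean_interval, shape),
    }


if __name__ == "__main__":
    # Illustrative inputs only: daily growth rates and mean intervals in days.
    for growth_rate in (-0.10, 0.10):
        for mean_interval in (4.0, 7.0):
            estimates = compare(growth_rate, mean_interval)
            line = ", ".join(f"{name}={value:.2f}" for name, value in estimates.items())
            print(f"growth={growth_rate:+.2f}/day, interval={mean_interval:.0f}d: {line}")
```

Under these assumptions the three formulas always land on the same side of 1 as the growth rate does, but they can disagree noticeably on the magnitude, and the answer also moves when the assumed mean interval moves. That is consistent with the email's point that sign(R-1) is the easy part and the number itself is fragile.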