
Pooled Testing is Super-Beneficial

Tyler and I have been pushing pooled testing for months. The primary benefit of pooled testing is obvious. If 1% are infected and we test 100 people individually, we need 100 tests. If we instead split the group into five pools of twenty then, if we’re lucky, we need only five tests. Of course, chances are that at least one pool will contain a positive, and taking this into account we will require 23.2 tests on average (5 + (1 – (1 – .01)^20)*20*5). Thus, pooled testing reduces the number of needed tests by roughly a factor of 4. Or, to put it the other way, under these assumptions pooled testing increases our effective test capacity by a factor of 4. That’s a big gain and well understood.
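The arithmetic above can be checked directly. A minimal sketch of the Dorfman pooling calculation, using the post's figures (100 people, pools of twenty, 1% prevalence):

```python
def expected_tests(population: int, pool_size: int, prevalence: float) -> float:
    """Expected total tests to screen `population` people in pools of `pool_size`."""
    n_pools = population // pool_size
    p_pool_positive = 1 - (1 - prevalence) ** pool_size
    # One test per pool, plus pool_size individual retests for each positive pool.
    return n_pools + n_pools * p_pool_positive * pool_size

print(round(expected_tests(100, 20, 0.01), 1))  # -> 23.2
print(round(expected_tests(100, 20, 0.10), 1))  # -> 92.8 (the 10% case below)
```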

An important new paper from Augenblick, Kolstad, Obermeyer and Wang shows that the benefits of pooled testing go well beyond this primary benefit. Pooled testing works best when the prevalence rate is low. If 10% are infected, for example, then it’s quite likely that all five pools will have at least one positive test and thus you will still need nearly 100 tests (92.8 expected). But the reverse is also true: the lower the prevalence rate, the fewer tests are needed. This means that pooled testing is highly complementary to frequent testing. If you test frequently then the prevalence rate must be low, because the people who tested negative yesterday are very likely to test negative today. Thus, by the logic given above, the expected number of tests per testing round falls as you test more frequently.

Suppose instead that people are tested ten times as frequently. Testing individually at this frequency requires ten times the number of tests, for 1,000 total tests. It is therefore natural to think that group testing also requires ten times the number of tests, for more than 200 total tests. However, this estimate ignores the fact that testing ten times as frequently reduces the probability of infection at the point of each test (conditional on a negative previous test) from 1% to only around 0.1%. This drop in prevalence reduces the expected number of tests – given groups of 20 – to 6.9 at each of the ten testing points, so the total is only 69. That is, testing people ten times as frequently requires only slightly more than three times the number of tests. Or, put differently, there is a “quantity discount” of around 65% from increasing frequency.
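The frequency argument in numbers: testing ten times as often cuts the per-round prevalence from 1% to roughly 0.1%, so ten pooled rounds cost far less than ten times one round. (The paper rounds the per-round cost of 6.98 to 6.9.)

```python
def round_cost(pools: int, pool_size: int, prevalence: float) -> float:
    """Expected tests for one testing round: `pools` pools of `pool_size` each."""
    p_pos = 1 - (1 - prevalence) ** pool_size
    return pools * (1 + p_pos * pool_size)

one_round  = round_cost(5, 20, 0.01)        # ~23.2 tests at 1% prevalence
ten_rounds = 10 * round_cost(5, 20, 0.001)  # ten rounds at ~0.1% each
print(round(one_round, 1), round(ten_rounds, 1))  # -> 23.2 69.8
```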

Peter Frazier, Yujia Zhang and Massey Cashore also point out that you could do an array protocol in which each person is tested twice, in two different groups. This doubles the number of initial tests but limits the number of false positives (both tests must be positive) and the number of needed retests. (See figure.)
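A toy simulation of such an array protocol. The grid layout and retest rule here are my reading of the idea, not necessarily the authors' exact design: arrange people in a grid, pool each row and each column, and retest only individuals whose row pool and column pool both come back positive.

```python
import random

def array_protocol_tests(rows=10, cols=10, prevalence=0.01, seed=0):
    """Total tests used by a row/column array protocol on a rows x cols grid."""
    rng = random.Random(seed)
    infected = [[rng.random() < prevalence for _ in range(cols)] for _ in range(rows)]
    row_pos = [any(infected[r]) for r in range(rows)]
    col_pos = [any(infected[r][c] for r in range(rows)) for c in range(cols)]
    # Initial tests: one per row pool plus one per column pool.
    # Retests: only people at the intersection of a positive row and column.
    retests = sum(1 for r in range(rows) for c in range(cols)
                  if row_pos[r] and col_pos[c])
    return rows + cols + retests

print(array_protocol_tests())  # total tests for 100 people at 1% prevalence
```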

Moreover, we haven’t yet taken into account the point of testing which is to reduce the prevalence rate. If we test frequently we can reduce the prevalence rate by quickly isolating the infected population and by reducing the prevalence rate we reduce the number of needed tests. Indeed, under some parameters it’s possible to increase the frequency of testing and at the same time reduce the total number of tests!

We can do better yet if we group individuals whose risks are likely to be correlated. Consider an office building with five floors and 100 employees, 20 per floor. If the prevalence rate is 1% and we test people at random then we will need 23.2 tests on average, as before. But suppose that the virus is more likely to transmit to people who work on the same floor and now suppose that we pool each floor. Holding the total prevalence rate constant, we are now likely to have a zero prevalence rate on four floors and a 5% prevalence rate on one floor. We don’t know which floor but it doesn’t matter–the expected number of tests required now falls to 17.8.
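The floor arithmetic, as a quick check, using the numbers from the example above (total prevalence held at 1%, but concentrated at 5% on a single floor):

```python
def expected_tests(pool_prevalences, pool_size=20):
    """Expected tests when each pool has its own prevalence rate."""
    total = 0.0
    for p in pool_prevalences:
        p_pos = 1 - (1 - p) ** pool_size
        total += 1 + p_pos * pool_size  # one pool test plus expected retests
    return total

random_pools = expected_tests([0.01] * 5)            # pools drawn at random
floor_pools  = expected_tests([0.05, 0, 0, 0, 0])    # pools that follow floors
print(round(random_pools, 1), round(floor_pools, 1))  # -> 23.2 17.8
```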

The authors suggest using machine learning techniques to uncover correlations, which is a good idea, but much can be done simply by pooling families, co-workers, and so forth.

The government has failed miserably at controlling the pandemic. Tens of thousands of people have died who would have lived under a more competent government. The FDA only recently said they might allow pooled testing, if people ask nicely. Unbelievably, after telling us we don’t need masks (supposedly a noble lie to help limit shortages), the CDC is still disparaging testing of asymptomatic people (another noble lie?) which is absolutely disastrous. Paul Romer is correct, testing capacity won’t increase until we put soft drink money behind advance market commitments and start using techniques such as pooled testing. Fortunately or sadly, depending on how you look at it, it’s not too late to do better. Some universities are now proposing rapid, frequent testing using pooling. Harvard will test every three days. Cornell will test frequently. Delaware State will test weekly. Let’s hope the idea spreads from the ivory tower.

FDA Allows Pooled Tests and a Call for Prizes

The FDA has announced they will no longer forbid pooled testing:

In order to preserve testing resources, many developers are interested in performing their testing using a technique of “pooling” samples. This technique allows a lab to mix several samples together in a “batch” or pooled sample and then test the pooled sample with a diagnostic test. For example, four samples may be tested together, using only the resources needed for a single test. If the pooled sample is negative, it can be deduced that all patients were negative. If the pooled sample comes back positive, then each sample needs to be tested individually to find out which was positive.

…Today, the FDA is taking another step forward by updating templates for test developers that outline the validation expectations for these testing options to help facilitate the preparation, submission, and authorization under an Emergency Use Authorization (EUA).

This is good and will increase the effective number of tests by at least a factor of 2-3 and perhaps more.

In other news, Representative Beyer (D-VA), Representative Gonzalez (R-OH) and Paul Romer have an op-ed calling for more prizes for testing:

Offering a federal prize solves a critical part of that problem: laboratories lack the incentive and the funds for research and development of a rapid diagnostic test that will, in the best-case scenario, be rendered virtually unnecessary in a year.

…We believe in the ability of the American scientific community and economy to respond to the challenge presented by the coronavirus. Congress just has to give them the incentive.

The National Institutes of Health (NIH) have already begun a similar strategy with their $1.4 billion “shark tank,” awarding speedy regulatory approval to five companies that can produce these tests. Expanding the concept to academic labs through a National Institute of Standards and Technology (NIST)-sponsored competition has the added benefit of ultimately funding more groundbreaking research once the prize money has been awarded.

This is all good but frustrating. I made the case for prizes in Grand Innovation Prizes for Pandemics in March and Tyler and I have been pushing for pooled testing since late March. We were by no means the first to promote these ideas. I am grateful things are happening and relative to normal procedure I know this is fast but in pandemic time it is molasses slow.

The new quicker, cheaper, supply chain robust saliva test

The FDA has just approved a new and important Covid-19 test:

“Wide-spread testing is critical for our control efforts. We simplified the test so that it only costs a couple of dollars for reagents, and we expect that labs will only charge about $10 per sample. If cheap alternatives like SalivaDirect can be implemented across the country, we may finally get a handle on this pandemic, even before a vaccine,” said Grubaugh.

One of the team’s goals was to eliminate the expensive saliva collection tubes that other companies use to preserve the virus for detection. In a separate study led by Wyllie and the team at the Yale School of Public Health, and recently published on medRxiv, they found that SARS-CoV-2 is stable in saliva for prolonged periods at warm temperatures, and that preservatives or specialized tubes are not necessary for collection of saliva.

Of course this part warmed my heart (doubly):

The related research was funded by the NBA, National Basketball Players Association, and a Fast Grant from the Emergent Ventures at the Mercatus Center, George Mason University.

The NBA had the wisdom to use its unique “bubble” to run multiple tests on players at once, to see how reliable the less-known tests would be.  This WSJ article — “Experts say it could be key to increasing the nation’s testing capacity” — has the entire NBA back story.  At an estimated $10 a pop, this could be a game-changer, especially for poorer nations.  Furthermore, it has the potential to make pooled testing much easier as well.

Here is an excerpt from the research pre-print:

The critical component of our approach is to use saliva instead of respiratory swabs, which enables non-invasive frequent sampling and reduces the need for trained healthcare professionals during collection. Furthermore, we simplified our diagnostic test by (1) not requiring nucleic acid preservatives at sample collection, (2) replacing nucleic acid extraction with a simple proteinase K and heat treatment step, and (3) testing specimens with a dualplex quantitative reverse transcription PCR (RT-qPCR) assay. We validated SalivaDirect with reagents and instruments from multiple vendors to minimize the risk for supply chain issues. Regardless of our tested combination of reagents and instruments from different vendors, we found that SalivaDirect is highly sensitive with a limit of detection of 6-12 SARS-CoV-2 copies/μL.

No need to worry and fuss about RNA extraction now.  Here is the best simple explanation of the whole thing.

The researchers are not seeking to commercialize their advance; rather, they are making it available for the general benefit of mankind.  Here is Nathan Grubaugh on Twitter.  Here is Anne Wyllie, also a Kiwi and a Kevin Garnett fan.  A further implication of course is that the NBA bubble is not “just sports,” but also has boosted innovation by enabling data collection.

All good news of course, and Fast at that.  And this:

“This could be one the first major game changers in fighting the pandemic,” tweeted Andy Slavitt, a former acting administrator of the Centers for Medicare and Medicaid Services in the Obama administration, who expects testing capacity to be expanded significantly. “Rarely am I this enthusiastic… They are turning testing from a bespoke suit to a low-cost commodity.”

And here is coverage from Zach Lowe.  I am very pleased with the course of Fast Grants more generally, and you will be hearing more about it in the future.

Stack-Push-Pop COVID Testing

A COVID test that doesn’t come back in a few days is close to useless and PCR tests are taking a long time to process:

NYTimes: Most people who are tested for the virus do not receive results within the 24 to 48 hours recommended by public health experts to effectively stall the virus’s spread and quickly conduct contact tracing, according to a new national survey by researchers from Harvard University, Northeastern University, Northwestern University and Rutgers University….People who had been tested for the virus in July reported an average wait time of about four days. That is about the same wait time for those who reported taking a test in April. Over all, about 10 percent of people reported waiting 10 days or more.

…“A test result that comes back in seven or eight days is worthless for everybody — it shouldn’t even be counted,” said Dr. Amesh Adalja, a senior scholar at the Johns Hopkins University Center for Health Security and a physician in Pittsburgh. “It’s not a test in any kind of effective manner because it’s not actionable.”

One seemingly severe but workable solution is to change how tests are processed. Right now it’s mostly first-come, first-served, but this means we can easily have a situation where everyone eventually gets a test result yet all the results are useless because they take a week or more to process. I propose instead that any test that can’t be reported back in 3-4 days be thrown out immediately. Labs should focus only on processing tests that can be reported back quickly.

One way of thinking about this is to use a stack or last-in first-out (LIFO) model for testing. In a stack model the newest test request is pushed onto the top of the stack and the next test to be processed is popped off the top of the stack. One disadvantage of this model is that some test requests will never be processed (they should be removed from the bottom of the stack and returned as null results). Some people will be angry.

But the stack model of testing has a huge advantage over first-come, first-served. Namely, just as many tests will be completed as under the current model but the test results will all come back faster and be much more useful. What would you rather have: guaranteed stale test results, or fresh results with some possibility of a null return? Since a stale result is not much better than a null, it seems obvious that the stack system is superior. Most importantly, faster, more useful tests will help to end the crisis by reducing the number of infections.
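A minimal sketch of the stack model. The 4-day expiry threshold and the class interface are illustrative, not drawn from any real lab system: new samples are pushed, the newest is popped for processing, and anything older than the threshold is expired as a null result rather than returned stale.

```python
class TestStack:
    """LIFO test queue: process the newest sample first, expire stale ones."""

    def __init__(self, max_age_days=4):
        self.stack = []  # (sample_id, day_submitted), newest at the end
        self.max_age_days = max_age_days

    def push(self, sample_id, day):
        self.stack.append((sample_id, day))

    def pop_for_processing(self, today):
        # Expire stale samples (the "bottom of the stack") as null results.
        expired = [s for s, d in self.stack if today - d > self.max_age_days]
        self.stack = [(s, d) for s, d in self.stack if today - d <= self.max_age_days]
        newest = self.stack.pop() if self.stack else None
        return newest, expired

lab = TestStack()
lab.push("A", day=0)
lab.push("B", day=3)
sample, nulls = lab.pop_for_processing(today=3)
print(sample, nulls)  # -> ('B', 3) []  (B is newest; A, at age 3, is not yet stale)
```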

Addendum: See also my posts Pooled Testing is Super-Beneficial and Frequent, Fast, and Cheap is Better than Sensitive on other methods to improve testing.

Pooling to multiply SARS-CoV-2 testing throughput

Here is an email from Kevin Patrick Mahaffey, and I would like to hear your views on whether this makes sense:

One question I don’t hear being asked: Can we use pooling to repeatedly test the entire labor force at low cost with limited SARS-CoV-2 testing supplies?

Pooling is a technique used elsewhere in pathogen detection where multiple samples (e.g. nasal swabs) are combined (perhaps after the RNA extraction step of RT-qPCR) and run as one assay. A negative result confirms no infection of the entire pool, but a positive result indicates “one or more of the pool is infected.” If this is the case, then each individual in the pool can receive their own test (or, if we’re getting fancy [read: probably too hard to implement in the real world], perform an efficient search of the space using sub-pools).

To me, at least, the key questions seem to be:

– Are current assays sensitive enough to work? Technion researchers report yes in a pool as large as 60.

– Can we align limiting factors in testing cost/velocity with pooled steps? For example, if nasal swabs are the limiting reagent, then pooling doesn’t help; however if PCR primers and probes are limiting it’s great.

– Can we get a regulatory allowance for this? Perhaps the hardest step.

Example (readers, please check my back-of-the-envelope math): If we assume base infection rate of the population is 1%, then pooling of 11 samples has a ~10% chance of coming out positive. If you run all positive pools through individual assays, the expected number of tests per person is 0.196 or a 5.1x multiple on testing throughput (and a 5.1x reduction in cost). This is a big deal.

If we look at this from the view of whole-population biosurveillance after the outbreak period is over and we have a 0.1% base infection rate, pools of 32 samples have an expected number of tests per person at 0.0628 or a 15.9x multiple on throughput/cost reduction.

Putting prices on this, an initial whole-US screen at 1% rate would require about 64M tests. Afterward, performing periodic biosurveillance to find hot spots requires about 21M tests per whole-population screen. At $10/assay (what some folks working on in-field RT-qPCR tests believe marginal cost could be), this is orders of magnitude less expensive than mitigations that deal with a closed economy for any extended period of time.

I’m neither a policy nor medical expert, so perhaps I’m missing something big here. Is there really $20 on the ground or [something something] efficient market?

By the way, Iceland is testing many people and trying to build up representative samples.

Monday assorted links

1. Fryer responds, in my opinion we are back to where we started after initial publication of his piece.

2. German wealth inequality is extreme.

3. Japanese building a dam almost entirely with robots.

4. Emmanuel Farhi has passed away, here is an earlier interview with him.

5. It was the late Harvard economist Robert Dorfman who came up with the pooled testing idea in 1943 to solve WWII problems.  To test recruits for syphilis, of course.

Sunday assorted links

Friday assorted links

1. Scott Alexander reviews Toby Ord’s The Precipice, about existential risk.

2. Pooled testing in Germany.

3. A critique of the Paycheck Protection Program — it might help already stable restaurants the most.  See also this tweet storm.

4. Should we pivot to a service trade agenda?

5. Full paper assessing health care capacity in India.

6. Claims about Covid and the future economics of cultural institutions.

7. I could link to Matt Levine every day, but do read this one on liquidity transformation.

8. How is the cloud holding up?  A good post.

9. Immunity segregation comes to Great Britain.

10. Robin Hanson on the variance in R0 and how hard it is to halt the spread of the virus.

11. New program for on-line “Night Owls” philosophy by Agnes Callard.

12. The true story of the toilet paper shortage: it’s not about hoarding, rather a shift of demand away from the commercial sector into the household sector (you are doing more “business” at home these days).


14. Fan, Jamison, and Larry Summers 2016 paper on the economics of a pandemic.  I wrote at the end of the blog post: “In other words, in expected value terms an influenza pandemic is a big problem indeed.  But since, unlike global warming, it does not fit conveniently into the usual social status battles which define our politics, it receives far less attention.”

15. Buying masks from China just got tougher.

16. How to produce greater capacity flexibility for hospitals.

17. Paycheck Protection Program is steeped in chaos.

A Solution if We Act

Many simulations have been run in recent weeks using standard epidemiological models and the emerging consensus, as I read it, is that test, trace and isolate can be very effective. Paul Romer’s simulations are here and he notes that a COVID-19 test does not have to be especially accurate for the test, trace and isolate strategy to work. Indeed, you don’t even need to trace, if you test enough people. Linnarsson and Taipale agree, writing:

We propose an additional intervention that would contribute to the control of the COVID-19 pandemic and facilitate reopening of society, based on: (1) testing every individual (2) repeatedly, and (3) self-quarantine of infected individuals. By identification and isolation of the majority of infectious individuals, including the estimated 86% who are asymptomatic or undocumented, the reproduction number R0 of SARS-CoV-2 would be reduced well below 1.0, and the epidemic would collapse….Unlike sampling-based tests, population-scale testing does not need to be very accurate: false negative rates up to 15% could be tolerated if 80% comply with testing, and false positives can be almost arbitrarily high when a high fraction of the population is already effectively quarantined.

Similarly, Berger, Herkenhoff and Mongey conclude:

Testing at a higher rate in conjunction with targeted quarantine policies can (i) dampen the economic impact of the coronavirus and (ii) reduce peak symptomatic infections—relevant for hospital capacity constraints.

This is exactly the strategy I discussed in, Mass Testing to Fix the Labor Market, where I wrote “Testing, isolating and tracing will [get the economy back on track] much faster and cheaper than dealing with a prolonged recession.”

I want to expand on the costs because it’s clear that a mass testing regime will require millions of tests. Is that cost-effective? Yes. The two types of tests we have are an RT-PCR test for COVID-19 (there are several versions), which costs something like $100 but could probably be much less as we ramp up. (We can cut costs and greatly increase throughput, for example, by pooled testing.) The second test, a blood test for antibodies, is, as best as I can tell, in the realm of $10. Both types are useful. I am going to be very conservative and say that we use a combination of tests at $75 per test. To test the entire US population, therefore, it would cost on the order of $25 billion. Coincidentally, $25 billion is about what we spent on the Manhattan Project in current dollars. Thus, I am proposing a Manhattan Project for testing.

Twenty-five billion dollars to test the entire US population. Now suppose the pandemic knocks 5% off US GDP over the next year or two, that’s roughly a trillion dollars lost. Or to put it differently, $3 billion a day. Thus, if mass testing reduces the number of days we are away from work by 9, it pays for itself. Let’s again be conservative and say that testing will also require a $25 billion fixed cost to build the enzyme factories and so forth, for a total cost of $50 billion. 18 days and it’s worth it.
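The break-even arithmetic can be laid out explicitly. These are the post's round numbers, not forecasts:

```python
US_POPULATION  = 330e6
COST_PER_TEST  = 75           # the post's conservative blended cost per test
GDP_LOSS_DAILY = 1e12 / 365   # ~5% of GDP ~= $1T over a year, ~$2.7B/day
                              # (the post rounds this to $3B/day)

testing_cost = US_POPULATION * COST_PER_TEST  # ~= $25 billion
total_cost   = testing_cost + 25e9            # + $25B fixed cost (factories, etc.)

print(round(testing_cost / GDP_LOSS_DAILY))  # -> 9 days of avoided shutdown to break even
print(round(total_cost / GDP_LOSS_DAILY))    # -> 18 days including the fixed cost
```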

We would also save medical costs by suppressing the virus. (The focus on ventilators has perhaps been overdone given that ventilators in no way guarantee survival–better to stop people needing ventilators.) We would also save lives. Thus, a program of mass testing seems like a no-brainer. Yet, there is no direct funding for anything like this in the $2.2 trillion CARES bill which is stunning. Here’s Austan Goolsbee:

We literally put in a tax break for retailers and restaurants to expand their capacity but not money for production of more COVID tests.

Here’s Paul Romer:

We have an economic crisis because it is not safe for people to work or consume. Our Congress just passed a bill that will spend $2.2 trillion to deal with the crisis. Can anyone identify any spending in this bill devoted to making it safe for people to work and consume?

As I wrote:

We need to attack the virus with test, isolate, and trace. More money for counter-attack!

Objections will no doubt be raised. Isn’t there a shortage of reagents? Do we have the personnel to test everyone? To which I answer, $50 billion solves a lot of problems. We won’t know how many till we try. We don’t need all of final testing capacity at once and even poor tests like simple temperature checks will help but we need to move rapidly in the right direction. The main constraint is time. Social distancing and lockdowns are starting to have an effect. I expect the emergency will peak in mid-April and then things will slowly start to improve. Even when the worst of the emergency passes, however, we will still need lots of testing. This virus will be with us and the world for some time. Let’s get on it.

Wednesday assorted links

1. “Variation in skill can explain 44 percent of the variation in diagnostic decisions, and policies that improve skill perform better than uniform decision guidelines.”  Not a Covid-19 paper, but relevant of course, link here.

2. Which states are practicing social distancing the most? (NYT)

3. Human challenge studies to accelerate a vaccine.

4. My Bloomberg column on how the macroeconomics of Covid-19 do and do not resemble WWII.  Oops, correct link here.

5. The idea of “group testing” actually came from economist Robert Dorfman of Harvard (who taught me history of economic thought way back when).  And more on pooled tests.  And Nebraska is doing pooling.

6. “Use Surplus Federal Real Property to Expand Medical and Quarantine Capacity for COVID-19.”

7. Why scaling up testing is so hard (New Yorker).

8. We still don’t know the CFR for H1N1.

9. “Overlooked is the possibility that beauty can influence college admissions.”  But not for Chinese it seems.

10. Mullainathan and Thaler with some deregulatory suggestions (NYT).

11. “The Food and Drug Administration will allow doctors across the country to begin using plasma donated by coronavirus survivors to treat patients who are critically ill with the virus, under new emergency protocols approved Tuesday.”

12. Benjamin Yeoh on early vaccine use.

13. James Stock: “The most important conclusion from this exercise is that policy hinges critically on a key unknown parameter, the fraction of infected who are asymptomatic. Evidence on this parameter is scanty, however it could readily be estimated by randomized testing.”

14. Two elite factions in tension with each other (nasty stuff, please do not read).

Only in England, part III

From MissMarketCrash, apparently these words have been banned by British local government authorities:

across-the-piece, actioned, advocate, agencies, ambassador, area based,
area focused, autonomous, baseline, beacon, benchmarking, best
practice, blue sky thinking, bottom-up, CAAs, can do culture,
capabilities, capacity, capacity building, cascading, cautiously
welcome, challenge, champion, citizen empowerment, client, cohesive
communities, cohesiveness, collaboration, commissioning, community
engagement, compact, conditionality, consensual, contestability,
contextual, core developments, core message, core principles, core
value, coterminosity, coterminous, cross-cutting, cross-fertilization,
customer, democratic legitimacy, democratic mandate, dialogue,
direction of travel, distorts spending priorities, double devolution,
downstream, early win, edge-fit, embedded, empowerment, enabler,
engagement, engaging users, enhance, evidence base, exemplar, external
challenge, facilitate, fast-track, flex, flexibilities and freedoms,
framework, fulcrum, functionality, funding streams, gateway review,
going forward, good practice, governance, holistic, holistic
governance, horizon scanning, improvement levers, incentivising, income
streams, indicators, initiative, innovative capacity, inspectorates,
interdepartmental, interface, iteration, joined up, joint working,
LAAs, Level playing field, lever, leverage, localities, lowlights,
MAAs, mainstreaming, management capacity, meaningful consultation,
meaningful dialogue, mechanisms, menu of options, multi-agency,
multidisciplinary, municipalities, network model, normalising,
outcomes, output, outsourced, overarching, paradigm, parameter,
participatory, partnership working, partnerships, pathfinder, peer
challenge, performance network, place shaping, pooled budgets, pooled
resources, pooled risk, populace, potentialities, practitioners,
predictors of beaconicity, preventative services, prioritization,
priority, proactive, process driven, procure, procurement, promulgate,
proportionality, protocol, provider vehicles, quantum, quick hit, quick
win, rationalisation, rebaselining, reconfigured, resource allocation,
revenue streams, risk based, robust, scaled-back, scoping, sector wise,
seedbed, self-aggrandizement, service users, shared priority, shell
developments, signpost, single conversations, single point of contact,
situational, slippage, social contracts, social exclusion, spacial,
stakeholder, step change, strategic, strategic priorities, streamlined,
sub-regional, subsidiarity, sustainable, sustainable communities,
symposium, systematics, taxonomy, tested for soundness, thematic,
thinking outside of the box, third sector, toolkit, top-down,
trajectory, tranche, transactional, transformational, transparency,
upstream, upward trend, utilise, value-added, visionary, welcome,
wellbeing, worklessness.

Whenever I step off the plane in the U.K. or Netherlands a tear (or more) comes to my eye as I contemplate those countries as birthplaces of individual liberty.  But this: is it a move for or against liberty?

It's funny, but if you Google "predictors of beaconicity" you get lots and lots.

Does television viewing trigger autism?

Gregg Easterbrook says yes, citing this new study.  Here is part of the abstract:

…we empirically investigate the hypothesis that early childhood television viewing serves as such a trigger [for autism].  Using the Bureau of Labor Statistics’ American Time Use Survey, we first establish that the amount of television a young child watches is positively related to the amount of precipitation in the child’s community.  This suggests that, if television is a trigger for autism, then autism should be more prevalent in communities that receive substantial precipitation.  We then look at county-level autism data for three states – California, Oregon, and Washington – characterized by high precipitation variability.  Employing a variety of tests, we show that in each of the three states (and across all three states when pooled) there is substantial evidence that county autism rates are indeed positively related to county-wide levels of precipitation.  In our final set of tests we use California and Pennsylvania data on children born between 1972 and 1989 to show, again consistent with the television as trigger hypothesis, that county autism rates are also positively related to the percentage of households that subscribe to cable television.  Our precipitation tests indicate that just under forty percent of autism diagnoses in the three states studied is the result of television watching due to precipitation, while our cable tests indicate that approximately seventeen percent of the growth in autism in California and Pennsylvania during the 1970s and 1980s is due to the growth of cable television.  These findings are consistent with early childhood television viewing being an important trigger for autism.

I am unconvinced.  Precipitation, in these states, is a coastal phenomenon and is proxying for heterogeneity in the gene pool.  Perhaps the coastal areas attract a more "autism-ready" group of individuals.  In fairness to the authors, they do try to control for income and education and population density and diagnosis capacity, among other variables.  Note two worrying features in the results: in California precipitation is not correlated with autism rates at all (there is a north vs. south split for rain, rather than the coast vs. inland), and precipitation is a better predictor of autism than cable viewing is directly. 

Here is the latest autism news on the genetic front.

Addendum: Steve Levitt is also skeptical.