Peer Review of “COVID-19 Antibody Seroprevalence in Santa Clara County, California”

Here are 2300 words from Balaji, self-recommending.  Here is the original piece.  Balaji starts with:

The high reported positive rate in this serosurvey may be explained by the false positive rate of the test and/or by sample recruitment issues.

I look forward to posting more on this.


Dream on, I say. Clear outlier study.

Wuhan antibody estimates are finding ~1.8% implied CFR, per my calculations.

Though I do sincerely hope that the Stanford study numbers are correct and the Korean data is mistaken!

Me too. I'd be wrong but thrilled if it turned out the fatality and hospitalization rates were way lower, and we were getting close to some real herd immunity.

That's not even close to herd immunity. That is a small fraction of what is needed.

Balaji's comments indicate that the results of the Santa Clara study are not enough to definitively conclude anything. That's hardly the same as a dream that is a "clear outlier study".

The author's estimated false positive rate is based on balancing expected false positives and false negatives, as far as I can see. Balaji shows that if there are basically no false negatives and the maximum number of false positives, it is possible that the results of the study are invalid.
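To make the false positive argument concrete, here is a minimal sketch using the standard Rogan-Gladen correction for test error. It is a textbook estimator and not necessarily the adjustment the study's authors applied. The raw positive rate, sensitivity, and specificity values below are illustrative assumptions, not figures quoted in this thread.

```python
def corrected_prevalence(raw_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from a raw positive rate."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    estimate = (raw_rate + specificity - 1.0) / denom
    # Clamp to [0, 1]: a negative estimate means false positives alone
    # could account for every observed positive.
    return min(1.0, max(0.0, estimate))


RAW_RATE = 0.015  # hypothetical raw share of positive tests

# Hypothetical (sensitivity, specificity) pairs, from balanced to worst case.
for sens, spec in [(0.80, 0.995), (0.80, 0.990), (1.00, 0.985)]:
    est = corrected_prevalence(RAW_RATE, sens, spec)
    print(f"sens={sens:.2f} spec={spec:.3f} -> corrected prevalence {est:.4f}")
```

Under these assumed inputs the corrected estimate shrinks toward zero as the false positive rate approaches the raw positive rate, which is the scenario Balaji describes.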

As for the self selected cohort issue, it probably exists but it's hard to say how big of an effect and how to control for it.

This study is one data point. The Wuhan study is another one. Castiglione, Gangelt, and Netherlands study are others. More will come in the future, I expect. Saying one or another is definitive, or that one invalidates another, is spurious.

Personally I think the conclusion is pretty high certainty already, but we'll see.

I left something out. I don't understand how one of Balaji's critiques is relevant.

> In order to generate these thousands of excess deaths in just a few weeks with the very low infection fatality rate of 0.12–0.2% claimed in the paper, the virus would have to be wildly contagious. It would mean all the deaths are coming in the last few weeks as the virus goes vertical, churns out millions of infections per week to get thousands of deaths, and then suddenly disappears as it runs out of bodies.

This study implies COVID-19 is more infectious than previously thought. Balaji states that the excess death rate is impossible to square with this, since that would mean it is way too infectious.

On the other hand, if COVID-19 is less infectious than the study assumes, by what means are the excess deaths occurring?

The answer to both sides is the same, which is some combination of COVID-19 and some combination of people with dangerous conditions being afraid to seek medical attention. The proportion of each is uncertain.

Was the study itself the subject of a prior MR post, or is Balaji's critique the leadoff hitter on this subject?

If I were trying to be objective, I would think the study itself is also self-recommending and probably a better starting point.

But only if I were trying to be objective.

It seems that the critique is the leadoff, and there is a strong whiff of "this can’t be right". Here is the link to the study; read it yourself.

I think the authors laid out the methodology pretty clearly and it should be a high priority to ramp up testing capability and test other areas. I’m getting the distinct impression that many are hostile to good news.

For the first time in forever I agree with Rich Berger!!! I read the full paper yesterday when it came out. These guys are very good researchers and their work was carefully done. They took great pains to validate an off the shelf test using sera from locally identified SARS-CoV-2 patients. Life is too short to even visit Medium posted articles.

How embarrassing must that be!

They should start running public service announcements on TV telling people that emergency rooms are now fairly empty and that if they have symptoms of a heart attack or stroke they should call 911. We could lose a lot of people who try to tough out major health crises at home rather than go to the hospital.

100% agreed.

No, we are not going to have many deaths that way. "I feel as if an elephant stepped on my chest but won't call 911" cases will be extremely low.

Mr Macron in his recent speech advised that people should continue to access health services.

@Todd K - as I understand it, the problem isn't just the idea that people will never call 911, but that they will wait for very severe symptoms of a heart attack before doing so. Earlier intervention makes a difference.

> This study implies COVID-19 is more infectious than previously thought ... by what means are the excess deaths occurring?

It's possible infections started earlier than previously thought. That would explain the death rate, the prevalence, etc. We have too many data points from too many countries and methodologies indicating there are 5-15X more asymptomatics. They can't all be wrong.

This winter was bad for non-flu flu-like illnesses.

Sounds like Covid is being painted in the paper like the magic bullet from Oliver Stone's JFK. Basically it's infected everyone in the US in almost no time at all. So once the current 'crop' of patients in the hospital either recover or die, there will suddenly be no new cases and we will all be happy.

Of course how did China ever manage to do a nationwide lockdown in time to keep Wuhan the only city overwhelmed? For that matter how did it take three months or so to go from China to the US but only a few weeks to infect the entire nation despite our unprecedented lockdown efforts?

Never mind the bollocks.
Here is the explanation of the Santa Clara study from Dr. Bhattacharya (one of the investigators).

Those Chinese numbers could be accurate, but I once again struggle with whether we can accept any numbers out of China as accurate.

Consider the implications if a study indicated 10%, 20%, or 30% infection rate in Wuhan. That would be millions of actual infections compared to approximately 50k confirmed cases in Wuhan.

It would challenge much of the Chinese government's narrative. Testing would have detected only a very small fraction of cases, calling into question how effective the government's testing program was. It would raise yet more questions about Chinese underreporting of deaths: COVID-19 might be less lethal than otherwise thought, but is the fatality rate that low? It would make the Chinese government's response look later and less effective than the CCP storyline. Also, if it's that infectious, wouldn't it even more call into question whether the Chinese government had - and still has - the disease snuffed out in the rest of the country?

Given all the lies and obfuscations from China, I just have a tough time believing that the government would allow the release of data saying "actual infections were/are 20x (or more) confirmed cases".

Assuming deaths were under-reported, the fatality rate wouldn't be too out of line - the original Korean estimate was 0.5%.

Balaji hit all the key points, but I also wonder how much you could extrapolate from Santa Clara County in general. The obesity rate in the county is 21% - half that of the US average. Given that obesity seems to be a risk factor of higher rates of hospitalization and death with people under 60, the overall US death rate could be higher.

Exactly, this kind of problem runs right through all of these types of test so far. We need a larger, randomer, sero test. The criticisms in this post are fair, and I acknowledge that I am mood-affiliated with the conclusion of the study... but I'm beginning to think Tyler is mood-affiliated the other way. The stats are not convincing, but they are still consistent with their conclusion. One thing everyone is beginning to learn through this is not to reject all hypotheses until a perfect RCT shows the right p-value.

Thanks Tyler.

Would add to Balaji's very strong critique two quick questions:

1. The covariates seem strikingly imbalanced in Table 1, as you'd expect from a convenience sample.

They "adjust" for the population via reweighting, but that seems ... fraught. Would love someone who is more qualified to weigh in on that.

2. I noted these covariates -

"The survey asked for six data elements: zip code of residence, age, sex, race/ethnicity, underlying comorbidities, and prior clinical symptoms."

- are not comprehensively captured in Table 1. Namely, underlying comorbidities and prior clinical symptoms don't have a population-based comparison.

Comorbidities, in particular, don't strike me as difficult to track down? Would be good to have that comparison point.

The Stanford study's sample of 3,330 somewhat self-selected Silicon Valley residents found an infection rate somewhere between 1% and 6% as of April 3-4, which implies Herd Immunity is still a long way off.

Official case counts in Santa Clara County are now doubling every 21 days, so it's not likely there has been multiple times growth over the last two weeks.

So, this study isn't good news for Herd Immunity optimists.
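As a quick check on the doubling-time reasoning above, here is a small sketch assuming simple exponential growth at a constant doubling time. That is a simplification, and official case counts are only a proxy for infections.

```python
def growth_factor(days, doubling_time_days):
    """Multiplicative growth over `days` given a constant doubling time."""
    return 2.0 ** (days / doubling_time_days)


# Two weeks of growth with cases doubling every 21 days.
print(round(growth_factor(14, 21), 2))  # about 1.59
```

Under that assumption, two weeks of growth is well short of a doubling, consistent with the comment's point that prevalence is unlikely to have multiplied since the samples were taken.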

At what level of mortality rate do you think just letting the population build herd immunity would be the correct strategy- ala Sweden?

Santa Clara County went into a lockdown very early in the US, so it wouldn't be surprising to find that only 20-50K had been infected out of 1.8 million, even if we accepted the result at face value. Was it a mistake to lock down for a month? At what point could we judge it a mistake or not?

My opinion is the big, unexpected development was that hospital capacity has proven more elastic than was assumed.

Strikingly, this didn’t happen so much by expanding medical care supply as by depressing medical care demand. Non-COVID trips to the emergency room went way down, and NYC doctors innovated a new protocol for treating COVID patients with breathing problems (put them on their stomachs and treat with oxygen masks) that cut ventilator demand substantially. Because caring for patients on ventilators is hugely labor-intensive, that in turn cut demand for doctors and nurses to far below earlier estimates.

So, we appear to have Flattened the Curve.

Now what?

Yes because when you overestimate the number of beds needed by tens of thousands in one city, it has to be because you underestimated the awesome efficacy of the prone position.

It absolutely can’t be because of the fact that the dynamics of a novel virus that no one understands very well and that put out incredibly inconsistent and noisy data were somehow misunderstood.

There has been a disturbing amount of Public Intellectual flitting-about on this web-site recently. When I first looked at the site, it seemed that the boss had a lot of cachet, he could stand in the company of, say, pod-casting Ezra Klein, and speak charmingly on many topics that he had read into and had thought about.

Nowadays the faithful reader is getting mostly grumbling about various models, as if they claimed to be anything more than controlled fantasies. That stuff is taken up in the commentariat, by knowers of the latest data from Italy or Sweden, or by disappointed "front-line" body mechanics who can't get over the idea that China fed them some bad data to the WHO once upon a time. And then of course the TDS-callers who still profess fealty to the poor red-nosed clown-in-chief, that 1980's disco playboy with the improbable haircut, that guy on the podium wearing the fourteen-inch shoes. The guy who heard that you wear gloves to the debutante ball, so he showed up wearing a pair of gold-encrusted first-baseman's mitts.

I'm afraid that the New Normal is going to come up with professor TC fourteen or twenty points down. In the wayback machine this site for all its past glory might come to be the pages where you could learn that LA strip-malls were the best place to find good noodles, if you wanted to eat noodles, but then panic set in and it became not much more than reading the tea leaves from Sweden and Taiwan.

It's all a tragedy, because we need the guys who scored good on the GRE and who carefully built a following to keep it together. When the high-scoring GRE guys go pear-wise, there's no predicting how things will work out for the rest of us.

Are you suggesting that it is better when public intellectuals "stay in their lane" and comment on the quality of strip mall noodles?

That's called comparative advantage.

In the context of the current viral pandemic, here is where you are intellectually dishonest and lazy:
"body mechanics who can't get over the idea that China fed them some bad data to the WHO once upon a time."
This is not even remotely an honest description of what China did at the beginning of this mess.

So I am not surprised. Those people throw firecrackers at dancing paper dragons to solve their problems. What else can a first-line responder be but putty in their hands?

Did you really hold up Ezra Klein as some sort of intellectual? The guy just went on a book tour where he couldn't answer a single question on his book's topic with anything other than "I don't know." I don't disagree with your point. But I don't think this site's past is any indication of better discourse. I think public intellectuals are having their so-called intellect tested and they're failing left and right.

Solid parody. The Ezra Klein mention gave the game away though

If nothing else the world is getting a real-time look at how data analysis is done, with some examples of good work and some examples of weaker work.

I think overall I'm slightly disappointed by ... not so much the quality of the work, but by the researchers', and even more so the commenters' and pundits', over-confidence in the reliability of the research results that accorded with their biases and priors.

I knew to expect a lot of that, but there's been a bit more than I was expecting.

MR comments OTOH were just about exactly what I expected, about 1/3 thoughtful, 1/3 twisting the evidence to support their biases, and 1/3 not attempting even that and just bloviating or trolling. I.e. the usual mix at MR.

Tyler has done a good job of articulating just how difficult analyzing this new problem has turned out to be, how puzzlingly "heterogeneous" the various datapoints have been.


What I’m most interested in is this: in the event this study is proven fairly accurate, have we seen enough throughout this crisis to warrant a recalibration of our heuristics vis-à-vis expert opinion?

Could it be that the most talented “thought leaders” among us are not so because they are consistently right, but because they are exceptionally adept at concealing motivated reasoning with clever framing and a superficially objective and dispassionate tone?

Balaji is one of the smartest people out there.
but .....
What he has failed to recognize is that he is trying to figure out a creature that has been around for about a billion years, successfully.

Balaji, I assume, is homo sapiens, with no access to angelic inspiration.

He is outclassed.

Bright guy, of course, but just another outclassed materialist academic.

I hope he gets it right this time, but if he does, it will be based on luck, not on his ability to outwit a creature that is basically a billion years older than him and that thrives on the sort of things that ...

and now I am going to be a little more intense than I usually am ....

What Mr Balaji is proposing is this: that he is an intelligent person who understands a creature that has thrived for a billion years, taking down or taking advantage of millions among all the generations Mr Balaji is descended from.

I wish him the best in his efforts to outthink his virus enemy.

But he really really needs to step up his game.

Of course, the outcome is not dependent on us being intelligent. The outcome is already written in, or locked in, and what has already happened is the key to what is going to happen, no matter how many brilliant little people think they are going to make it happen another way.

These things, our barely alive coronavirus companions, have to win a little and lose a little. If they always won, they would be long gone.
If they always lost, they would be long gone.

Good luck,
and ----

next time one of your pals who graduated from some silly computer PHD program tells you about how he is helping create AI systems that will make life so so much more simple

take the poor little wretch aside, speak to him man to man, and tell him to give it up and find a real job.

His April 17, 2:38 pm twitter comment was particularly hubristic

You retest the positives, and then retest them again. This is how it is done.

I have a feeling, though, that the results will never be clean enough for some people. There is an enormous amount of reputation now invested in the worldwide lockdowns. Any result that shows that perhaps the virus isn't as fatal as was believed from the RT-PCR tests will find itself under both legitimate attack (Balaji's critique is solid) and illegitimate attack.

In the end, I expect we will find the true mortality rate is 0.5% or less -this does seem to be the forming consensus from the serological studies done so far along with the random RT-PCR results that are starting to be disclosed.

> You retest the positives, and then retest them again

Which corrects for sampling errors and testing errors, but not for false positives in the actual antibody test itself.

For interest, I reviewed the documentation for a Chinese antigen test last week. The stated false negative rate was 3% (not considering underlying pathogenic issues). I would have been quite surprised if it was lower. So I'm not impressed by any of these population scans using antigen testing. The paper doesn't even mention false negatives in the abstract. Instant fail from me if I peer reviewed it.

How does it fix sampling errors? It's clear that there was an incentive for people with recent symptoms to get a free test that they couldn't get elsewhere to confirm if they'd had the virus. This was not a random sample. People were self selected, and could bring one person from their household to also be tested.

What the Stanford team needs to do in addition to retesting their positives twice or more is to take samples from a much hotter hot spot, like The Bronx- there you are pretty certain to find a significantly higher ratio of anti-body positive people.

That might be interesting since the Covid-19 infection rate is a function of viral load, meaning, the more people that have C-19, the higher the chance of catching it. I wouldn't be surprised if "everybody" in the Bronx now has C-19 antibodies. As for the Stanford study, that good news lasted all of 15 seconds until I read this site.

Bonus trivia: C-19 causes: (1) early stage Alzheimer's, (2) long term mental health issues (hallucinations, hearing voices), (3) male infertility, (4) long term lung problems, (5) serious reactions in 20% of the infected, (6) higher risk of death to people over 60 years old.
But go on, take a magic bullet on behalf of Team Herd Immunity, 60% of the population needs to be infected. I plan on being part of the 40% free riding.

I really can't believe this week ended without a NY study finished. We had the Italian study 2-weeks ago and the German study a week ago. We are so far behind when we know decisively what needs to be done.

Maternity ward testing ought to be a good source of data from all across the country.

We saw the one hospital in NYC had a positive rate of about 15% on women coming in to deliver. However, it still needs to be demographically adjusted -- I saw one person claim that particular hospital did a lot of maternity ward business with ultra-Orthodox, who apparently have been harder hit.

Fortunately, we usually have pretty good demographic data on new mothers.

Positive infection rate was 20% if you add the estimate from likely false negatives. Note only 2% (of the total, so 10% of those testing positive) had symptoms. Maybe not the country, but really good for NYC. It was 2 hospitals, about 220 who came to deliver.

@WCO - Actually the data on symptoms for true positives is not 10% but 40%, from the US aircraft carrier TR (Reuters): "The Navy’s testing of the entire 4,800-member crew of the aircraft carrier - which is about 94% complete - was an extraordinary move in a headline-grabbing case that has already led to the firing of the carrier’s captain and the resignation of the Navy’s top civilian official. Roughly 60 percent of the over 600 sailors who tested positive so far have not shown symptoms of COVID-19, the potentially lethal respiratory disease caused by the coronavirus, the Navy says. The service did not speculate about how many might later develop symptoms or remain asymptomatic." - nevertheless this bodes well for a "two-tier" plan to reopen the economy with only young people (which is fine with me, since being in the 1% I like to employ young people to do my dirty work).

As we know, it’s always problematic to test in a low-incidence population because of the false positives.
It’s easier to estimate when we know the area is more heavily infected (e.g. New York). Let’s use some heuristics.
I estimate as follows. We know there are asymptomatic people. From the studies we get these estimates:
From the Princess cruise: 18%
From the Japanese evacuated from Wuhan: 41.6%
From Iceland deCODE: ~50%. There are many such estimates, and they vary a lot.

Let’s take a median number of ~50%.
In New York, from monitoring the Twitter feed of ER personnel, I gathered that only candidates for hospitalization were tested, due to test overload. From a CDC report I estimate this hospitalization rate at 20% of the symptomatic.
This would imply that in the population at large in NY, 2 * 5 = 10 times the officially recorded number of cases are actually infected or immune. That’s 2.3M people for New York (233k total cases currently).
If this estimate seems high, let’s recall that in Heinsberg/Gangelt in Germany, another highly infected area, this ratio was found to be 7.5.
In New York a study of random pregnant women ready for delivery found 15% infected and 86% asymptomatic.
The virus is fairly infectious because of its stronger affinity for the ACE2 receptor. It uses ACE2 for entry (same as SARS-CoV), but the host cell serine protease TMPRSS2 primes the S protein of SARS-CoV-2 for entry significantly more efficiently. We also know that people can be infectious for about 2 days prior to symptoms and thus spread the virus generally undetected.
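The 2 * 5 = 10 multiplier in the New York estimate above can be written out as a short sketch. It simply restates the commenter's heuristic and inherits its assumptions: that confirmed cases are essentially the symptomatic people who were hospitalization candidates, that 20% of symptomatic cases fall into that group, and that half of all infections are asymptomatic. The function and parameter names are made up for illustration.

```python
def implied_infections(confirmed_cases, asymptomatic_share, tested_share_of_symptomatic):
    """Scale confirmed cases up to total infections under the heuristic above."""
    # Confirmed cases are assumed to be only a fraction of symptomatic cases.
    symptomatic = confirmed_cases / tested_share_of_symptomatic
    # Symptomatic cases are in turn only part of all infections.
    return symptomatic / (1.0 - asymptomatic_share)


total = implied_infections(233_000, asymptomatic_share=0.5,
                           tested_share_of_symptomatic=0.2)
print(f"{total:,.0f} implied infections")
```

Changing either share shifts the multiplier a lot. With the 18% asymptomatic figure from the cruise ship, for example, the same formula gives a multiplier closer to 6 than 10.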

#12: A friend at NIH said that many there believe the Santa Clara study may be reflecting antibodies to several other coronaviruses as well, which for current serum screens are easy to confuse for ones specific to SARS-CoV-2. So that study may be way overestimating the number that have had the disease, and this doesn't bode well for the ease of using serum antibodies to get accurate estimates of previous infection specific to this virus.

Isn't there a problem with this theory- i.e. the negative controls? There are two populations of testing results- those pre-COVID in the design and production, and those post-COVID. 371 known negatives resulted in 369 negative results. At best, you can, maybe, assign those two false-positives to some other coronavirus exposure, but you are still left with the significantly higher ratio of positives in the sample itself.
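How much 369 negatives out of 371 known-negative samples pins down the specificity can be sketched with a Wilson score interval. This is one common choice of binomial interval, used here as an assumption. It is not confirmed to be the method the study's authors used.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# 369 correct negatives out of 371 known-negative samples, as cited above.
low, high = wilson_interval(369, 371)
print(f"specificity point estimate: {369 / 371:.4f}")
print(f"approx. 95% interval: {low:.4f} to {high:.4f}")
```

The lower bound comes out at roughly 98%, so on this sample alone a false positive rate near 2% cannot be ruled out. That is why the size of the negative-control set matters so much to the argument.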

When you explain this one away, the next three to do are the Dutch and Danish blood donor serological studies finding the same numbers as the Stanford study (0.16%-0.20%) and the German serological study finding a 0.37% IFR. I am certain they are also false for completely different reasons. :)

Please tell us the reasons. Do not hold back on us. We love you.

Ivan is spot on here. The "amused skeptic" seems to be the intellectual high ground these days.
This reminds me of when my youngest son was having belligerence issues at school at age 7. The dialogue would run like this: School: "Your son was in a fight today". Says me "Yes, but you cannot know the source of the fight; can you prove who was the antagonist?" Says School "Well, no..." Says me "so don't make presumptions about who was at fault." Week 2 there is a fight between my son and another child. Repeat that interchange. Week 3 there is a fight between my son and another child. This time the conversation ends with the school saying "Gee, while you may have a point on an individual basis, don't we kind of see a pattern here?".

Yes, we do see a pattern here. For a constituency that prides itself on rational science and believing the numbers, at some point you have to suck it up and believe the numbers.


Truly funny.

tldr (did read).

Concerns about false positives probably more substantive here. Selective recruitment less so. True of other low bound studies; blood donors in Scotland etc.

Gold standard is still Gangelt. 15% seroprevalence and 0.35% IFR ain't going away with false positives and selective recruitment "concerns".

If critics really want to really test, hopefully they could find a 3000 strong set of definite negatives, then replicate same result. Finding data might be hard though. In a more competent world, we should be taking a large subsample of blood donor results and freezing them, every couple months, to then test retrospectively (as a "pre-suspected epidemic baseline"). But I'd guess we aren't doing that. Hopefully the critics can find a sample of definite negatives to test, though.

"If critics really want to really test, hopefully they could find a 3000 strong set of definite negatives, then replicate same result. Finding data might be hard though. "

What are the rules about going to some old blood samples that were collected, say, 6 months or a year ago, and couldn't possibly have the virus, and testing them?

A couple of weeks ago Tyler linked to an article about a researcher at the Univ of WA who was planning to take a look at her experimental subjects. The experiment had nothing to do with coronavirus, but it gave her a handy set of people who had potentially been exposed early and could give us a retrospective look at how widespread the virus was in its early days in the US.

An older set of blood samples would allow us to test for false positives.

'Gold standard is still Gangelt.'

Ongoing, however. The reported 15% seroprevalence and 0.35% IFR is extremely preliminary, and based on a sample of 500, of the 1000 being studied. It has some extremely solid foundations - the date of first infection and its location is precisely known.

And the 15% is not enough, even with basically the longest shutdown in Germany, to stop 10-25 confirmed cases a day, as of April 17. A rate that has remained fairly constant for the last two weeks, to provide a bit more insight into what herd immunity percentage is at least reasonable to discuss as a concept (though talking about herd immunity without a vaccine is close to unnecessary in the first place). 15%, even with a shutdown, is clearly not providing it.

The participants in this study were volunteers. They were recruited through targeted internet ads... that is a problem in itself.

However a bigger problem is not adjusting for the likelihood of volunteering and the reasons for volunteering. If one had symptoms they are MUCH more likely to want to know whether they had an infection and hence participate in the study.

While those participants are important, another indication, possibly a better one, would be testing those who never had symptoms.

Using those two groups would give a high and a low for the range of infections.

As it is, this study was NOT RANDOM.
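The volunteer-bias concern above can be illustrated with a small sketch. It assumes a single hypothetical parameter: how many times more likely a previously infected person is to volunteer than an uninfected one. Both that ratio and the true prevalence below are made-up numbers, not estimates from the study.

```python
def observed_rate(true_prevalence, relative_participation):
    """Share of infected people in a self-selected sample.

    relative_participation: how many times more likely an infected person
    is to volunteer than an uninfected person (hypothetical parameter).
    """
    infected = true_prevalence * relative_participation
    uninfected = 1.0 - true_prevalence
    return infected / (infected + uninfected)


TRUE_PREVALENCE = 0.01  # hypothetical
for ratio in (1, 2, 3, 5):
    rate = observed_rate(TRUE_PREVALENCE, ratio)
    print(f"participation ratio {ratio}x -> observed rate {rate:.3%}")
```

At low prevalence the observed rate scales almost linearly with the participation ratio, so even a modest preference among symptomatic people for getting tested can inflate the estimate severalfold.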

Did Tyler ever post or comment on the Stanford study in question? If not, he is clearly showing his bias against the Corona Truthers.

A peer review suggests this one study might be totally inaccurate for the following reasons
1. The false positive of the test is high
2. Were participants enriched for Covid-19?
(People got tested based on the suspicion they already had the virus and some of these recruiters may have infected other people in the study)
3. The study would imply faster spread than any pandemic we've seen

I have several issues with this study:
1. The specificity is stated as a point estimate. There is no reason to believe that the specificity is exactly 99.5%. The stated interval should have been used instead.
2. The final estimate is very sensitive to changes in the specificity. If you lower the estimated value from 99.5% by 0.5 percentage points, the PPV drops from about 80% to 67%. At the lower end of the interval, the PPV drops to 15.5%, meaning that approximately 85% of the positive test results are in error.
3. The distributor's reported numbers include 2 false positives. False positives are almost always cross contamination. Any cross contamination would be a cause for significant concern. It is very difficult to predict the amount of contamination. It could easily vary both within and between lots of the test kit. An investigation should have been conducted to determine the cause of any false positive. Typically a regulator would ask for the investigation and corrective action as part of the review for the approval process.
4. I did not see a reference to the validation study used by the distributor. It is very difficult to have confidence in any number presented by the distributor.
5. I do not agree that patients waiting for hip surgery are representative of the challenge the greater population poses for this test. The public at large has a number of different conditions that are unlikely to be present in a population of patients awaiting surgery.
My conclusion is that we should not use unvalidated test kits to evaluate something that is this important. I do not believe that this data should be published as is.
In my 30-plus years as a medical device professional, I have been a part of the decision making on the development and release of a number of innovative devices. I have also been part of the decision-making process for a number of recalls and near misses. I have encountered a large number of cases in which we were misled by poorly collected data at some point in the investigative process. I would recommend conducting the study with a test kit that has undergone review and approval by a trusted regulatory body.
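The sensitivity of the PPV to specificity described in the second point of the list above can be sketched with the usual Bayes formula. The comment does not state which prevalence or test sensitivity it used. The values below (about 2.4% prevalence and 80% sensitivity) are assumptions chosen because they roughly reproduce the quoted 80% and 67% figures.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value: P(truly infected | positive test)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


PREVALENCE = 0.0244   # assumed, not stated in the comment
SENSITIVITY = 0.80    # assumed, not stated in the comment

for spec in (0.995, 0.990):
    value = ppv(PREVALENCE, SENSITIVITY, spec)
    print(f"specificity {spec:.3f} -> PPV {value:.1%}")
```

Because the false positive term is multiplied by the large uninfected share of the population, small changes in specificity move the PPV far more than comparable changes in sensitivity do.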
