
Mehdi • 4 years ago

A Stanford whistleblower complaint alleges that the controversial John Ioannidis study failed to disclose important financial ties and ignored scientists’ concerns that their antibody test was inaccurate https://www.buzzfeednews.co...

Tim McNaughton • 4 years ago

I wrote a graphical model to help understand the "false positive problem"

in general:
https://sites.google.com/vi...

and specifically for this paper:
https://sites.google.com/vi...

Choplifter • 4 years ago

I applaud the authors for publishing this study. It is not without its flaws: obviously you can pick apart whether the sample was representative of the community given the limited way testing candidates were solicited. But it is important to get tests like this out to the public quickly rather than wait months to get statistically perfect data. I suspect the estimate of fatality rate they give is a little low, but the scale is consistent with similar studies out of NYC and elsewhere, all suggesting that the fatality rate from COVID-19 is far, far, far less than the terrifying 5% that continues to be touted on the CDC website and by people who support continued lockdowns. The real rate is likely well under 0.4%, making it worse than seasonal flu (even the historically virulent 2017-2018 flu season) but still orders of magnitude less than the 5% Armageddon numbers that were used to scare the public into accepting complete lockdowns and the widespread ruin to economies and livelihoods that they caused.

buzzbree • 4 years ago

Beyond the seroprevalence conclusions of the study, which are generally consistent with other reports, another very important issue that needs to be clarified by the authors is whether the study fully adhered to Good Clinical Practice (GCP) standards.

To be fully compliant with GCP, the Stanford IRB really needed to be informed of the email Jay Bhattacharya's wife (https://www.buzzfeednews.co...) sent to potential subjects. The email had several erroneous statements: that the test was FDA approved (it's not), and that subjects would know if they were now immune from COVID-19, free from getting sick, and no longer able to spread the virus. These statements could have impacted subject safety by encouraging riskier behavior (i.e. ignoring social distancing) from the study subjects if they believed that the test was FDA approved and a positive result was definitive proof of protective immunity.

In the Buzzfeed article Dr. Bhattacharya has stated that he did not know about the email or approve of it, but he still had an ethical duty to report it to the IRB when he found out. There is only one line in the manuscript stating that the IRB approved the study. How the IRB addressed this email should be expounded upon in the final manuscript given these new issues that have come to light.

Relevant GCP sections:

"3.3.8 Specifying that the investigator should promptly report to the IRB/IEC:(b) Changes increasing the risk to subjects and/or affecting significantly the conduct of the trial (see 4.10.2).

4.10.2 The investigator should promptly provide written reports to the sponsor, the IRB/IEC (see 3.3.8) and, where applicable, the institution on any changes significantly affecting the conduct of the trial, and/or increasing the risk to subjects."

DaveSezThings • 4 years ago

The analysis has made a significant, basic error in handling the uncertainty associated with the specificity of the test. Leaving aside concerns regarding the applicability of the delta method, the mistake arises in computing the standard error: the values for Var(s) and Var(r) should be divided by the sample sizes of the studies used to establish these values, not by the main study sample size n=3,330, which is used across all terms in the relevant equation in the appendix (it's in the middle of page 3). We can see the problem because the range of specificity (95% CI 98.3% to 100%) is sufficient to explain the observed data with zero genuinely positive cases.

Basically this destroys the conclusions which should now be along the lines of "unfortunately the test used for this study was not specific enough to support any conclusions beyond setting a maximum level of infection."

Stuff happens, time is short, etc. The authors should just issue a correction. It'll be quick and easy and save a lot of irrelevant speculation.

Sun Wu Kong • 4 years ago

This is why confidence intervals are misleading.

Although 95% CI 98.3% to 100% looks informative, it's not. The probability of the specificity being 98.3% is very small.

In fact, under the assumption that the specificity really was 98.5% for instance, there's only a roughly 5% chance that the specificity would have been measured at 99.5%.
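The binomial reasoning in the paragraph above can be checked with a short calculation. This is a minimal sketch, assuming the 371 known-negative samples from the manufacturer's IgG validation and a hypothetical true specificity of 98.5%; the variable names are illustrative only.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k events in n independent trials with event probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_negatives = 371             # known-negative samples in the manufacturer's IgG validation
assumed_fp_rate = 1 - 0.985   # ASSUMPTION: true specificity of 98.5%

# Chance of exactly 2 false positives, i.e. a measured specificity of 369/371 (~99.5%).
p_exactly_2 = binom_pmf(2, n_negatives, assumed_fp_rate)
# Chance of 2 or fewer false positives, i.e. a measured specificity of 369/371 or better.
p_at_most_2 = sum(binom_pmf(k, n_negatives, assumed_fp_rate) for k in range(3))

print(f"P(exactly 2 false positives) = {p_exactly_2:.3f}")
print(f"P(at most 2 false positives) = {p_at_most_2:.3f}")
```

Whether "measured at 99.5%" should be read as exactly 2 false positives or as 2 or fewer is a judgment call, so the sketch prints both.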

Going further with a simple binomial analysis, there's a 76% probability that the real specificity is 99% or greater.

Confidence intervals have been used in the past to mislead and confuse the public; and it seems like the impulse is still quite healthy.

The deeper problem is that the specificity estimates from the manufacturer were IgG (369/371~99.5%) and IgM (368/371~99.2%); and while the survey presents a positive as IgG positive OR IgM positive, it leaves out the IgM specificity in the error calculations. (http://en.biotests.com.cn/n...)

mendel • 4 years ago

Also, test specificity s is too high because the manufacturer data for IgM specificity (368/371) was not considered in the paper at all. The test is positive as a whole if either the IgM or the IgG test strip is visible, so the combined specificity by manufacturer data is in the range of 368/371 to 366/371; the latter value is 98.65%, which, if observed in the study, gives them 5 true positives and 45 false positives out of 3330 samples, with a wide error margin.
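The arithmetic behind that range can be sketched as follows. This is a rough illustration, not the paper's method: it treats essentially all 3330 participants as true negatives when estimating expected false positives, and the variable names are made up for the example.

```python
n_negatives = 371    # known-negative samples in the manufacturer's validation
igg_false_pos = 2    # from the 369/371 IgG specificity
igm_false_pos = 3    # from the 368/371 IgM specificity
n_study = 3330       # study participants
observed_pos = 50    # raw positives (IgG or IgM strip visible)

scenarios = {
    # Best case: every IgG false positive is also an IgM false positive.
    "full overlap": max(igg_false_pos, igm_false_pos),
    # Worst case: the false positives fall on entirely separate samples.
    "no overlap": igg_false_pos + igm_false_pos,
}

results = {}
for label, false_pos in scenarios.items():
    specificity = (n_negatives - false_pos) / n_negatives
    # APPROXIMATION: apply the false positive rate to the whole study sample.
    expected_fp = n_study * false_pos / n_negatives
    results[label] = (specificity, expected_fp)
    print(f"{label}: specificity {specificity:.2%}, "
          f"expected false positives {expected_fp:.0f} of {observed_pos} observed")
```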

Sun Wu Kong • 4 years ago

I can confirm that the paper did not mention IgM specificity for their test and also that both IgM OR IgG test strip visible results were deemed positive.

I cannot however find reference to the lower bound in the manufacturer supplied specificity, i.e. 366/371. Do you have a reference?

http://en.biotests.com.cn/n...

mendel • 4 years ago

In the source you cite, IgM has 3 false positives and IgG has 2 false positives. If these occur on separate samples, the combined false positives are 5/371, for a 366/371 specificity.

The covidtestingproject.org data had no overlap for the 2 and 1 false positives in 108, the distributor's CDC filing had overlap for the 4 and 1 false positives in 150 samples of symptomatic patients.

So I don't know if the 2 and 3 overlap or not, but worst case is that they don't, and that gives 366/371.

LOLUVA • 4 years ago

Though I am not exactly familiar with their estimation process, what you stated is not always true. Some types of error analysis will adjust the variance or sigma by the total number of samples used. It sounds like you are saying they should only have adjusted the values based on the more limited calibration data from before their study, and then fixed them. I think of it this way: each sample has a bias and a random error component. Estimators (depending on the data distribution) can attempt to average out the error with increased samples. That also sometimes makes bias observable in the data set, so it can be adjusted for. I work in position estimation that uses TDOA, FDOA or interferometric data. We apply error covariance processes using WLS estimators. Again, I'm not familiar with this specific data set, but could you be a little more specific? Is there something unique about this data that I am missing?

DaveSezThings • 4 years ago

The error the authors have made is a basic one in the propagation of uncertainty. We have a test kit which has some uncertainty regarding its performance, hence the confidence intervals on the specificity and sensitivity of the test. These confidence intervals are established by the manufacturer's trial using 85 confirmed positive and 371 confirmed negative samples and the authors' tests on 37 RT-PCR-positive samples and 30 pre-COVID samples. Without further testing of the test kit using known samples there is nothing that can be done to change the confidence intervals for the sensitivity and specificity - it doesn't matter how many samples are tested for the main study. However, this is not correctly represented in the authors' calculation of the propagation of the uncertainty. In their work they use the variances for the individual binomial distributions, compute the combined variance on this basis (called Var(pi) in the appendix) and then get the standard error by taking the square root of this combined variance divided by the main study sample size 3,330: SE(pi)=sqrt(Var(pi)/3330)=0.0034. This is incorrect. It confuses the variance of binomial distributions with the uncertainty regarding the estimate of parameters which will be binomially distributed. The authors should have computed the square of the standard error using the delta method, replacing each of the terms Var(q), Var(r) and Var(s) with their respective standard errors squared (which will bring in some different n's for the various studies, although it'll be a bit more complicated for Var(r) and Var(s) given that more than one study has been combined to get these). This will give a much larger standard error for pi - the test-kit corrected prevalence. Alternative/better methods than the delta method will yield the same result, and it's such a large change that it essentially reduces the study to an upper-limit estimate only: the lower limit goes down to zero (or very close to zero).
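The difference between the two calculations can be illustrated numerically. This is a minimal sketch under stated assumptions: it uses the standard correction pi = (q + s - 1) / (r + s - 1) for the test-kit corrected prevalence, which may not match the appendix exactly; the sensitivity value 0.80 is purely illustrative; the specificity 0.995 is the figure quoted elsewhere in this thread; and the sample counts are those given in the comment above.

```python
from math import sqrt

# Sample sizes taken from the comment above.
n_q = 3330       # main study sample size
n_r = 85 + 37    # known-positive samples behind the sensitivity estimate
n_s = 371 + 30   # known-negative samples behind the specificity estimate

q = 50 / 3330    # raw (unweighted) positive rate
r = 0.80         # ASSUMPTION: illustrative sensitivity, not the paper's pooled value
s = 0.995        # specificity as quoted in this thread

def corrected_prevalence(q: float, r: float, s: float) -> float:
    """Prevalence corrected for test sensitivity r and specificity s."""
    return (q + s - 1) / (r + s - 1)

def delta_method_se(q, r, s, n_q, n_r, n_s) -> float:
    """First-order (delta method) standard error of the corrected prevalence.

    Each binomial variance is divided by the sample size of the study
    that produced the corresponding estimate.
    """
    d = r + s - 1
    dpi_dq = 1 / d
    dpi_dr = -(q + s - 1) / d**2
    dpi_ds = (r - q) / d**2
    variance = (
        dpi_dq**2 * q * (1 - q) / n_q
        + dpi_dr**2 * r * (1 - r) / n_r
        + dpi_ds**2 * s * (1 - s) / n_s
    )
    return sqrt(variance)

pi_hat = corrected_prevalence(q, r, s)
se_own_n = delta_method_se(q, r, s, n_q, n_r, n_s)     # each term uses its own n
se_single_n = delta_method_se(q, r, s, n_q, n_q, n_q)  # every term divided by 3330

print(f"corrected prevalence: {pi_hat:.4f}")
print(f"SE with each study's own n: {se_own_n:.4f}")
print(f"SE with n=3330 everywhere:  {se_single_n:.4f}")
```

Because the specificity term dominates and rests on only a few hundred known-negative samples, dividing it by 3330 instead understates the standard error. The exact size of the gap depends on the assumed inputs.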

It seems there are many other concerns as well regarding the sampling method and technical details of the serology, but I'm not able to comment on those. I'm actually an optical physicist so I just have an amateur interest in this, but a basic mathematical error is a basic mathematical error and that's what they've done. It's that bad. Looking through the comments here, the point has been made many times in one form or another. There are 17 authors on this - it should take less than a day for one of them to correct the results and rewrite the conclusions. It's no longer an interesting outcome of course - but it's scientifically the right thing to do - the fact that things are hurried because of the pandemic excuses errors but not a failure to correct them quickly once they have been pointed out. And there is no debate on this one - I might not be articulating the argument as well as others can and of course there are better analysis techniques than the delta method - but as sure as 2+2 isn't 5 their standard error for these results isn't what the authors claim. It's a big whoops - I hope the authors go easy on whoever of them made the mistake as I can well understand the time pressures surrounding this work. The classy thing to do is for all of them to take collective responsibility, issue the correction and move on from there.

LOLUVA • 4 years ago

Interesting, I see a lot of parallels between their process and what we do in position estimation of an unknown source. Each individual measurement/sample has its own variance (both random and bias components). We generally assume AWGN, so the samples are similarly distributed. As you collect measurements you get an overdetermined system of equations. We then use a WLS estimation process to weight the better-quality data, generally by SNR. Using error covariance we are able to determine the 95% containment, which reduces to a 2x2 matrix. What ends up happening is that those overall combined variances are in fact reduced by the total population size: an averaging effect. This leads to a bias generated solution. Obviously I am unfamiliar with the medical field, including antibody detection and strength, so I was curious whether this was a possibility in that estimation process. That sounds like a no, based on the number of responses.

chedca • 4 years ago

Then what is there to say of our government policy? Are they not the same kits?

outdoorgirl0814 • 4 years ago

My primary question on this study is why the IgM and IgG specific results were not presented, but rather pooled together. This seems like important information. From what I can tell, the test identifies them separately.

mendel • 4 years ago

They weren't pooled, the 368/371 IgM specificity was omitted for some unknown reason.

outdoorgirl0814 • 4 years ago

What I mean by "pooled" is that of the 3330 patients included in the study, they counted 50 tests in which either IgG, IgM or both IgG and IgM were positive. However, they could have stated how many of the 50 positive tests were IgM, IgG or both. That would be important information. So, you're right "pooled" is the wrong term-it was how they scored the tests. But the end result is still pooling IgM and IgG positives. And the IgM specificity was not omitted; it was discussed in the paper. I am not sure why 2 people downvoted my question, which is just plain odd. It wasn't even a criticism or political or really anything other than a simple question about why those data were omitted. What is going on here?

mendel • 4 years ago

Thank you for the explanation. For the record, I had upvoted your comment. ;) And my own criticism, posted here very soon after the paper was published, was even marked as spam; make of that what you will.

The paper mentions 99.5% specificity on page 5, first paragraph of the "Results" section, and 369/371 on page 3 of the appendix. Sun Wu Kong provided a source in his comment that shows the 369/371 is the IgG specificity; the 368/371 IgM specificity is not used in the analysis at all. It is not discussed in the paper. (And neither is IgG sensitivity.) They chose the figures that would give them the highest prevalence in the analysis, which looks almost like it was done deliberately.

I wish they'd simply taken a picture of each completed test with a cheap digital camera at each station, and uploaded the whole set.

John Smith • 4 years ago

1. A local website (SFGate, I think) mentioned a person who emailed many friends about the free test and this selected wealthier people who might have more exposure to international travel. This would boost the percentage with antibodies above a population sample that had more poor people in the sample. It did mention the team tried to correct for this email by recruiting from other areas of the county. 2. Santa Clara county has more international travel than most other areas of the USA that have fewer immigrants, so people who are saying other areas of the USA might have the same higher level of recovered patients would be wrong.

endmathabusenow • 4 years ago

The recruitment methods from this study are unbelievable. They include an email that makes inaccurate claims that 1) the FDA approved the test and 2) "In China and U.K. they are asking for proof of immunity before returning to work"

https://old.reddit.com/r/Co...

Sun Wu Kong • 4 years ago

Your first point is misleading. https://premierbiotech.com/...

As for your second point, this would not have been an unreasonable assertion circa the beginning of the study. https://www.theguardian.com...

endmathabusenow • 3 years ago

The study is *not* approved by the FDA; the FDA is just not objecting to the usage of the tests and the manufacturer claims to have done the experiments needed to validate the test.

As to the second point. It is false. Considering immunity passports is not the same as requiring them.

When one is recruiting subjects for a clinical study, one must be far more careful than this.

gfrenke • 4 years ago

According to the CDC website, their flu statistics are based solely on people who were symptomatic. The CDC doesn't do antibody tests after the flu season to see how many were infected with a flu virus but never had symptoms.

Nathan R. • 4 years ago

The CDC does do antibody tests for the flu.

gfrenke • 4 years ago

The CDC may do antibody tests for the flu, but they don't use the results in their flu statistics. Refer to "How CDC Estimates the Burden of Seasonal Influenza in the U.S." on the CDC website, which verifies that asymptomatic cases are not included in their Estimated Influenza Disease Burden statistics.

From an analysis published on the NIH website titled "The fraction of influenza virus infections that are asymptomatic: a systematic review and meta-analysis": "...in longitudinal studies in which infections were identified using serology the point estimates of the asymptomatic fraction adjusted for illness from other causes fell in the range 65%–85%."

Nathan R. • 3 years ago

Exactly. They do antibody tests after the flu season and extrapolate the data.

gfrenke • 3 years ago

Not according to the CDC website where they detail how they estimate the impact of the flu every season. https://www.cdc.gov/flu/abo...

Also, the CDC's published seasonal flu statistics use "Symptomatic Illnesses", not infections, for the total. https://www.cdc.gov/flu/abo...

I can't find any documentation that the CDC performs "antibody tests after the flu season and extrapolate the data." Please provide your source(s).

Nathan R. • 3 years ago

Here it says it's used for research but not for diagnosis: https://www.cdc.gov/flu/pro...

Brent Tharp • 4 years ago

The CDC info shows that solely from confirmed cases, nearly 20% of those that have been tested are positive. That does not include asymptomatic people, who would rarely seek testing, nor would they be allowed to have a test given limited resources, nor does it include those who would have tested positive because the virus is no longer in their system (i.e., they would only test positive for antibodies). The real infection level is well over 20% now, just proving that the distancing measures have had virtually no impact on contagion, and further that this "killer" is pretty weak.

gfrenke • 4 years ago

The "real" infection level of the coronavirus is 21% in NYC, not in the rest of the country. Serology tests in LA and Santa Clara counties in CA show an approximate 4% infection level. Likely the majority of the country has a lower infection level than Santa Clara and LA counties. I don't think you could say the virus is weak in NYC if, even with 21% infected, the mortality rate ranges from .6% to .9% (if probable covid-19 deaths are included). Also, if you extrapolate from the 35K non-NY-state covid-19 deaths in the US based on an estimate that 4% of the non-NY population is infected, then to get to a herd immunity level of 60% would require multiplying by 15, which would equal 525K deaths. Of course there are so many unknowns, but this suggests the "killer" is not so weak.
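The extrapolation in the last step can be written out explicitly. This is only a sketch of the comment's own back-of-envelope reasoning, assuming deaths scale linearly with the share of the population infected; the variable names are illustrative.

```python
non_ny_deaths = 35_000    # non-NY-state COVID-19 deaths cited in the comment
infected_pct = 4          # estimated share of the non-NY population infected (%)
herd_immunity_pct = 60    # herd immunity level used in the comment (%)

# ASSUMPTION: deaths grow in proportion to the share of the population infected.
multiplier = herd_immunity_pct / infected_pct
projected_deaths = non_ny_deaths * multiplier

print(f"multiplier: {multiplier:.0f}x, projected deaths: {projected_deaths:,.0f}")
```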

Tomas Hull • 4 years ago

"New York antibody study estimates 13.9% of residents have had the coronavirus, Gov. Cuomo says"
If false negatives were included - those who have undetectable levels of antibodies, mainly the young population - it could mean that 30% or more of people in NY already have the antibodies...

The study, as well as Dr. John Ioannidis and Dr. Jay Bhattacharya, who have gone public with these findings, stand vindicated.

https://www.cnbc.com/2020/0...

Will herd immunity be achieved by the end of summer, or earlier, as predicted by another brilliant scientist, Dr. Wittkowski? It remains to be seen...

https://www.medrxiv.org/con...

Steve Condie • 4 years ago

You expect herd immunity with undetectable levels of antibodies?

chedca • 4 years ago

Sir Frank Macfarlane Burnet showed in 1940 that antibodies are not really a sign of immunity. Burnet and White. Natural History of Infectious Disease. Cambridge University Press, 1940.

Jonathan G. Harris • 4 years ago

This contradicts the Stanford study and does not vindicate it. Somewhere between .12 and .18% of NYC has died of covid. About 22% was infected. Even if this 22% is too low and the correct figure were 33%, you would have fatality rates of .36 to .54%, far more than the claimed .1 to .2%.
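The division behind those figures is easy to reproduce. This is a minimal sketch using only the percentages quoted in the comment; the function name is made up for the example.

```python
def implied_ifr_pct(deaths_pct: float, infected_pct: float) -> float:
    """Infection fatality rate (%) implied by population death and infection shares (%)."""
    return 100 * deaths_pct / infected_pct

for infected_pct in (22, 33):
    low = implied_ifr_pct(0.12, infected_pct)
    high = implied_ifr_pct(0.18, infected_pct)
    print(f"{infected_pct}% infected -> implied IFR {low:.2f}% to {high:.2f}%")
```

A smaller infected share makes the implied rate larger, which is why the 33% case is the more generous one for the study.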

Lourenço • 4 years ago

It's the same order of magnitude. Not the catastrophe that was announced. Flu casualties can vary by that order of magnitude from one year to another, I think.

think • 4 years ago

The Stanford study, as well as the USC study, demonstrated that the case fatality rate is much, much lower than was projected and is currently being reported. The CDC is currently showing a cfr of 5.6%. The NYC survey also corroborates the conclusion that the cfr is orders of magnitude lower than what was projected and used to inform our policy decisions and drive our panic and mass hysteria. Obviously, different geographical regions with different demographics are going to have disparate outcomes in cfr. A finding of .36% most certainly does vindicate a finding of .2% when the figure being challenged by these tests is 5.6%

Jonathan G. Harris • 3 years ago

You are confusing case fatality rates with infection fatality rates. People were using .5 to 1% as the infection fatality rate; almost nobody claimed an IFR that was much over 1%.

Steve Condie • 4 years ago

The IFR is roughly one order of magnitude lower than the CFR pretty much everywhere except in this study. That is not unexpected given the paucity of testing. But consider: 0.12% of the entire population of New York ***State*** has died with Covid-19 - ignoring the undercount from non-hospital deaths. As you note, the statewide infection rate from the serology testing undertaken there was about 14%. That yields a current statewide IFR of about 0.86% for an entire state of ~20 million people. You can try to massage that if you like - and no, you really can't get to 30% from there - but ... the IFR mid-contagion - where we are now - will be lower than the final IFR because deaths continue even as new cases decline and stop. So realistically the New York State level data shows that the final IFR in that state will probably be above 1% - an order of magnitude between that large scale, real world data and the projections of this study based on a relative handful of samples.
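The statewide figure, and the infection rate that a much lower IFR would require, can be sketched as below. The inputs are the two percentages quoted in the comment; the helper name and the target IFR values of 0.1% and 0.2% (the range attributed to the study elsewhere in this thread) are used here only for illustration.

```python
deaths_pct = 0.12    # share of New York State's population that has died (%)
infected_pct = 14    # statewide seroprevalence from the serology survey (%)

current_ifr_pct = 100 * deaths_pct / infected_pct
print(f"implied current IFR: {current_ifr_pct:.2f}%")

def required_infected_pct(deaths_pct: float, target_ifr_pct: float) -> float:
    """Share of the population (%) that must be infected for deaths to imply target_ifr_pct."""
    return 100 * deaths_pct / target_ifr_pct

for target_ifr_pct in (0.1, 0.2):
    needed = required_infected_pct(deaths_pct, target_ifr_pct)
    print(f"IFR of {target_ifr_pct}% would need {needed:.0f}% of the state infected")
```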

mendel • 4 years ago

You are confusing the case fatality rate (which is an observed value) with the infection fatality rate (which is not).

Back on February 19th, WHO situation report 30 with "focus on modeling" reported the reliable IFR estimates for Covid-19 as ranging from 0.3% to 1.0%. This is the figure that needs to be challenged, but all available data (including the Santa Clara study if corrected for its errors) does seem to hit this range, which is 10-40 times higher than the estimated IFR for influenza, but also 10 times lower than 2003 SARS.

Dennis Menace • 4 years ago

The children they tested were brought in by their parents, so these samples are not independent. Is this a problem?

JM V • 4 years ago

Oh, and for people who compare this to the flu, here is some lowballing of the disease:
~5 times higher expected infection fatality rate
~5 times higher expected infection rate w/o control measures
Multiply those out for me please.
I think that is a comparison.

Lourenço • 4 years ago

I would dispute both statements. Especially the second one. Very different countries with very strong measures varied a lot in their outcomes. And Sweden - that's all we have as a control group - with very soft, non-compulsory measures is doing far better than Belgium, for example, and twice as bad as Denmark on the elderly, but Denmark will have to open up some time and I believe the end result will be the same.

JM V • 4 years ago

With 80 (1.7%) people dead in Castiglione d'Adda (Caveats: Old/Smoking/Unlucky/Collapse of Health care system/Some would have died anyway) this was already extremely unlikely. Now, with NYC 0.22% excess deaths and 21.2% of shoppers having antibodies, an IFR of 0.8% - 1.2% appears plausible.

Tesla Coil • 4 years ago

In addition to the criticisms raised in other comments, I see a fatal flaw in the study's "Statistical Analysis" that I believe has not been raised (apologies if I have missed it).

The authors appear to first re-weight the sample by demographic factors, and only then adjust for test sensitivity and specificity. This appears to me to be the obviously incorrect order.

If, say, in the unweighted sample, the true false positive rate of the test were 1.5% (which is within the 95% confidence interval of 0.1% to 1.7% calculated by the authors), and the authors only found 1.5% positive samples, the actual true positives would be 0%. So for the unweighted sample, the lower bound for prevalence of antibodies should be 0% true positives. Any further re-weighting of the sample cannot change this and the lower bound must remain 0%.

However, as the authors re-weight the sample first, they apply the false positive rate of 0.1% to 1.7% to their re-weighted estimate of 2.8% positive samples.
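The effect of the ordering can be illustrated with a deliberately simplified calculation. This sketch ignores test sensitivity and simply subtracts the upper end of the false positive range from the positive rate, which is a cruder correction than the one in the paper; the function name is made up for the example.

```python
fp_rate_upper_pct = 1.7    # upper end of the authors' false positive range (%)
raw_positive_pct = 1.5     # unweighted share of positive samples (%)
reweighted_pct = 2.8       # positive share after demographic re-weighting (%)

def lower_bound_pct(positive_pct: float, fp_rate_pct: float) -> float:
    """SIMPLIFICATION: positives minus worst-case false positives, floored at zero."""
    return max(0.0, positive_pct - fp_rate_pct)

# Correcting the raw sample first: the lower bound is zero.
lower_unweighted = lower_bound_pct(raw_positive_pct, fp_rate_upper_pct)
# Correcting only after re-weighting: the lower bound appears to be well above zero.
lower_reweighted = lower_bound_pct(reweighted_pct, fp_rate_upper_pct)

print(f"lower bound, correct then re-weight: {lower_unweighted:.1f}%")
print(f"lower bound, re-weight then correct: {lower_reweighted:.1f}%")
```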

1ProudPatriot • 4 years ago

They are continuing to retest/check for Type 1 and 2 errors. The statistical differences this study and many others are showing are so glaringly obvious as to nearly mitigate your criticisms, even if they are 100% correct. This study is most likely to be peer-reviewed. The number of authors across departments suggests the authors are well aware of the study's shortcomings. The Stanford group are not overstating their findings.

I have two requests:
1. Please provide your expertise and critical analysis of the studies done by Imperial College and the University of WA.
2. I challenge you to find a single college text on virology or epidemiology that suggests throwing out the principles of biology and all we have learned about viruses to follow the path of lockdowns. EVERY textbook says vaccine or herd immunity, period. Vaccines are difficult to develop due to cost, mutations and the time they take to develop and be safe for humans. The track record on vaccines is clear. That leaves herd immunity which we have done in the past. Mitigation of capacity is reasonable; but we are way beyond that now. There are reasons our "leaders" are doing this; but they don't imply common sense, nor are they scientific.

gmshedd • 4 years ago

So you're advocating what might be called the "Philadelphia Approach."
https://uploads.disquscdn.c...

hispresencematters • 4 years ago

"A combination of factors caused Philadelphia to be hit especially hard by the influenza epidemic. The city already had a population of about 1.7 million, and there were an additional 300,000 wartime workers. Many people, specifically poor and working-class immigrants and African Americans, lived and worked in crowded, unhygienic conditions. Doctors and public health officials did not understand the cause or the cure of the disease, nor did they have the necessary medical advances to properly combat it. Further, because they did not understand much about the influenza epidemic, public health officials underestimated its severity and overestimated their ability to keep it contained. Finally, more than a quarter of Philadelphia doctors and nurses had been called away to work in the war effort, which put a strain on every hospital in Philadelphia, even before the influenza epidemic hit."

https://archives.upenn.edu/...

Philip Machanick • 4 years ago

The most honest statement in the paper: “For example, if new estimates indicate test specificity to be less than 97.9%, our SARS-CoV-2 prevalence estimate would change from 2.8% to less than 1%, and the lower uncertainty bound of our estimate would include zero” – yet they arrive at a conclusion that overlooks this.

Secondly, the mortality in the hardest-hit parts of NY is already at the high end of their case fatality rate so they are at least an order of magnitude out in either their case fatality rate or prevalence.

What most are missing is the very high rate of asymptomatic cases; this is what makes NPIs effective. If more than half are asymptomatic, as suggested in the few relatively complete studies, this explains the very rapid spread and why NPIs slow it so much. Consider Germany, where they reacted late but rapidly scaled up trace-test-isolate. Their case fatality rate is < 4%. Italy's is 13.5%. Two things can reasonably explain this: Germany contained asymptomatic spread, and its test coverage is higher or more accurate. Italy has more tests per capita than Germany. Italy focused on testing the most ill; Germany did contact tracing and isolation.

If you can stop asymptomatic spread the asymptomatic recover and are no longer a source of contagion (as long as reinfection is not possible). The German model helps with this as they catch a higher fraction of these and don’t need to wait until a reliable antibody test is developed.

Imperial College: very widely discussed. Search for "Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand" on Google Scholar and you will find over 200 citations.

mendel • 4 years ago

They had a raw prevalence of 50/3330, that's 1.5%. If the specificity is "less than 97.9%", their estimate wouldn't change to "less than 1%", it should change to 0. This shows they flubbed the maths.
The test cartridges have two stripes, one for IgG and one for IgM. They say twice, once in the main paper and once in the appendix, that the test is positive if one of these stripes is. That means they can generate a false positive from an IgM false positive, but the paper omits the manufacturer's IgM specificity of 368/371.
If IgM and IgG false positives didn't overlap, the test specificity is 366/371=98.65%. That makes 45 of their 50 samples likely to be false positives.
That is what the paper should have stated if it was honest about its specificity. But then it would hardly have made headlines.
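How the corrected estimate responds to specificity can be sketched with the standard correction pi = (q + s - 1) / (r + s - 1), floored at zero. This is an illustration under assumptions rather than the paper's calculation: it uses the raw 50/3330 rate instead of the re-weighted one, and the sensitivity of 0.80 is a placeholder value.

```python
def corrected_prevalence(q: float, sensitivity: float, specificity: float) -> float:
    """Prevalence corrected for test error, floored at zero."""
    estimate = (q + specificity - 1) / (sensitivity + specificity - 1)
    return max(0.0, estimate)

raw_rate = 50 / 3330          # raw positive rate, about 1.5%
assumed_sensitivity = 0.80    # ASSUMPTION: illustrative value only

specificities = {
    "99.5% (figure used in the paper)": 0.995,
    "366/371 (no IgG/IgM overlap)": 366 / 371,
    "97.9% (threshold quoted from the paper)": 0.979,
}

estimates = {}
for label, spec in specificities.items():
    estimates[label] = corrected_prevalence(raw_rate, assumed_sensitivity, spec)
    print(f"specificity {label}: corrected prevalence {estimates[label]:.2%}")
```

Once the false positive rate (1 minus specificity) exceeds the raw positive rate, the numerator goes negative and the floored estimate is zero.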

Tesla Coil • 4 years ago

2. SARS-CoV-1? Ebola? Just two of many examples where outbreaks were suppressed.

In any case, this is not a general discussion forum, but a place to discuss this specific study.

CP • 4 years ago

Cuomo of NY announced two hours ago (4/23) results of antibody study of 3,000 in 19 NY counties. Result showed average infection rate of 13.9% - higher in NYC, lower in rural areas...