Don’t you love it when people who don’t know better think that they know better, and then they end up making fools of themselves? There is a particularly interesting anti-vaccine man by the name of Brian S. Hooker. He has a doctorate in biochemical engineering, according to his Wikipedia page. Maybe you remember BS Hooker from his foray into epidemiology, which went fantastically badly. So bad was his “re-analysis” of a study looking into the MMR vaccine and its association with autism that the journal that published it had to retract the paper and apologize for ever letting it into the wild.
I also tore the paper a new one here, here, and here.
Anyway, BS Hooker has decided to dive into biostatistics this time. He wrote a letter to the editor about a study looking into the influenza vaccine given to pregnant women and autism diagnoses in children born to those women. Here’s what he wrote. It’s a bit long, so bear with me:
“To the Editor: The JAMA Pediatrics article by Zerbo et al reported a statistically significant association between the administration of the maternal influenza vaccine in the first trimester of pregnancy and the incidence of autism spectrum disorder. The authors stated that the analysis adjusted for covariates yielded a P value of .01 when applying a Cox proportional hazards regression model to the data.
However, this P value was erroneously adjusted to reduce the possibility of type I errors by applying the Bonferroni adjustment for 8 separate analyses completed on the data sampling. Using this adjustment, the authors stated that this association “could be due to chance (P = .10).” In this instance, it is inappropriate to apply a Bonferroni adjustment because the associations were highly interdependent, contrary to the independence assumption used by the adjustment. This can be seen by the fact that knowing the results for each trimester will yield the result for the total period.
In the Zerbo et al article, comparison is made of the autism spectrum disorder incidence in each of 3 groups depending on the trimester in which the mother received the influenza vaccination against the autism spectrum disorder incidence in a “zero exposure” control group. Rather than a set of independent tests where “set A” is compared with “set B,” “set C” is compared with “set D,” and so on, in this instance, all maternal vaccinated data sets were compared with the same control set (ie, the unvaccinated sampling). In addition, in a fourth comparison, 3 sets were combined for a comparison of vaccination in any time during pregnancy to the unvaccinated control set. Thus, the full data set in this case was a dependent combination of the data from the first, second, and third trimesters in pregnancy.
Bland and Altman 1995 warned against the use of the Bonferroni adjustment when associations are correlated and cite the danger of missing “real differences.” The study authors apply a degree of caution regarding the autism spectrum disorder finding for influenza vaccination in the first trimester of pregnancy by stating that the findings “suggest the need for additional studies on maternal influenza vaccination and autism.” However, the application of the Bonferroni adjustment in this instance is inappropriate. Furthermore, the use of any adjustment for the first trimester is especially questionable because it has long been suspected a priori that an effect, if any, is likely to be concentrated in that trimester.”
My emphasis in bold.
The explanation for a lay audience is the following… There are two big types of errors you can make in conducting a study, Type I and Type II. A Type I error is when you reject the null hypothesis that there is no association between an exposure and an outcome when, in fact, there is no association. In essence, you have a false positive. A Type II error is when you fail to reject that null hypothesis when, in fact, there is an association. In essence, you have a false negative.
There is always a balance between these two errors, but it’s the Type I error that you want to avoid the most. (This all depends on the impact of your study, but, for academic purposes, it’s the Type I error that is the big one.) If you commit a Type II error, well, you might get to try again at a later time. In the gibberish above, BS Hooker is trying to say that, in making their adjustment, the authors of the study not only did away with a statistically significant result (p-value less than 0.05), but they also increased the chances of false negatives happening. (They did increase that chance of false negatives. More on that later.)
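To make the false-positive idea concrete, here’s a minimal simulation (my own illustration, not anything from the Zerbo et al study): we repeatedly test a coin that really is fair, so the null hypothesis is true by construction and every “significant” result is a Type I error. The rejection rate lands near the 5% threshold.

```python
import math
import random

random.seed(42)

def z_test_fair_coin(heads, n):
    """Two-sided z-test of the null hypothesis that the coin is fair (p = 0.5)."""
    p_hat = heads / n
    se = math.sqrt(0.25 / n)  # standard error under the null
    z = (p_hat - 0.5) / se
    # two-sided p-value from the normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Simulate 10,000 experiments in which the null is TRUE (a fair coin,
# flipped 100 times each). Every rejection below is a false positive.
trials, n_flips, alpha = 10_000, 100, 0.05
false_positives = sum(
    z_test_fair_coin(sum(random.random() < 0.5 for _ in range(n_flips)), n_flips) < alpha
    for _ in range(trials)
)
print(false_positives / trials)  # close to alpha = 0.05
```

Run it and the Type I error rate hovers around 5%, which is exactly the risk the 0.05 threshold is designed to cap for a single test.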
Furthermore, BS Hooker warns that there was a violation of the assumption of independence between the observations. The observations, in this case, were giving the influenza vaccine in trimester 1, 2, or 3. As you can imagine, there is a dependence between these three observations since a vaccinated woman falls into exactly one trimester group: if she wasn’t vaccinated in trimester 1, she could only have been vaccinated in trimester 2 or 3. However, temporality complicates that picture because you cannot go back in time. That is, if the vaccine isn’t given by trimester 2, there’s no way to go back and give it in trimester 1. If it isn’t given in trimester 3, it isn’t given at all.
Thus, there is independence, of sorts. The analysis is valid. (More about the “dependence/independence” thing later.)
The other thing that I found interesting was that BS Hooker wanted to compare one group to another, one by one. This is the same mistake he made in his “re-analysis” of the MMR-autism study. Doing it that way misses the interactions between different factors in the analysis. That’s why you do the more complex analyses, the less “simple” statistics that give you more realistic results.
What is that Bonferroni Adjustment he speaks of, though?
In a study, you want to keep the chance of a Type I error at less than 5%. That 5% is your significance level. It’s basically saying that if there were truly no association and you replicated the study 100 times, you’d expect to see about 5 false positives, and anything worse than that is unacceptable. If your p-value comes in below that level, you say that the probability of your association being a false positive is very low, so your results are “statistically significant.”
But what if you’re doing a bunch of different comparisons at the same time with the big dataset? This paper explains it very well:
“Say you have a set of hypotheses that you wish to test simultaneously. The first idea that might come to mind is to test each hypothesis separately, using some level of significance α. At first blush, this doesn’t seem like a bad idea. However, consider a case where you have 20 hypotheses to test, and a significance level of 0.05. What’s the probability of observing at least one significant result just due to chance?

P(at least one significant result) = 1 − P(no significant results) = 1 − (1 − 0.05)^20 ≈ 0.64
So, with 20 tests being considered, we have a 64% chance of observing at least one significant result, even if all of the tests are actually not significant. In genomics and other biology-related fields, it’s not unusual for the number of simultaneous tests to be quite a bit larger than 20… and the probability of getting a significant result simply due to chance keeps going up. Methods for dealing with multiple testing frequently call for adjusting α in some way, so that the probability of observing at least one significant result due to chance remains below your desired significance level.”
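The quoted calculation is a one-liner, if you want to check it yourself:

```python
# Probability of at least one false positive across 20 independent tests,
# each run at a 0.05 significance level (the calculation from the quote above).
alpha, m = 0.05, 20
p_at_least_one = 1 - (1 - alpha) ** m
print(round(p_at_least_one, 2))  # 0.64
```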
The Bonferroni Adjustment takes care of that by dividing 0.05 (or whatever your desired level of probability is) by the number of comparisons (hypotheses) being tested. In the case of the paper that BS Hooker seems to be trying to discredit, the formula is more like this:
P(at least one significant result) = 1 – P(no significant results) = 1 – (1 – 0.05)^8 ≈ 0.34
So, in this study, running all 8 comparisons unadjusted would give you about a 34% chance of committing at least one Type I error. That’s pretty high. Imagine the consequences of a false positive in this case. Influenza can kill a pregnant woman and her child. At the very least, influenza in a pregnant woman is serious business. Using the Bonferroni Adjustment, the authors correctly diminished the probability of a false positive. Yes, they increased the probability of a false negative, but what’s the harm in that? What’s the harm in seeing no association between the influenza vaccine and autism when there might be one? Probably none, given that autism is nowhere near as bad as, say, death… Or all the other complications from influenza.
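Here is the Bonferroni arithmetic for this study’s 8 comparisons, sketched in Python. Note a hedge: multiplying the published raw p of .01 by 8 gives .08 rather than the paper’s reported adjusted value of .10, presumably because the published .01 is itself rounded; the .01 figure below is just that rounded number, used for illustration.

```python
alpha, m = 0.05, 8

# Familywise chance of at least one false positive with 8 unadjusted tests
p_familywise = 1 - (1 - alpha) ** m
print(round(p_familywise, 2))  # 0.34

# Bonferroni: either shrink the per-test threshold...
threshold = alpha / m
print(threshold)  # 0.00625

# ...or, equivalently, inflate each raw p-value by the number of tests.
# (raw p of .01 as published, rounded; hence the small gap from the reported .10)
adjusted_p = min(1.0, 0.01 * m)
print(adjusted_p)  # 0.08
```

Either way you slice it, the first-trimester result no longer clears the bar once you account for the 8 tests, which is exactly the authors’ point.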
But the true sign of an anti-vaccine believer is to compare autism to death, to say that autistic children might as well be dead. That’s where they make their bread and butter. It’s a trope as old as the false association between vaccines and autism.
You don’t have to take my word for it, though. The authors of the study slapped down BS Hooker’s assertions themselves in a response to his letter to the editor. A response that, in my opinion, didn’t need to be done. BS Hooker is not a biostatistician, nor is he an epidemiologist. Why he continues to dabble in these disciplines is beyond me, though some have suggested to me that he’s doing it because vaccines causing autism are his only lifeline to a cash reward in the vaccine court, a claim denied last year. If he can somehow tie his child’s autism to a vaccine — any vaccine, at this point, given how he’s gone after the MMR and now influenza vaccines — maybe he can revive his claim?
Anyway, here’s the authors’ response, my emphasis in bold:
“In Reply: We appreciate the comments presented by Donzelli and colleagues and Hooker about our study titled “Association Between Influenza Infection and Vaccination During Pregnancy and Risk of Autism Spectrum Disorder.” Statisticians and epidemiologists have debated at length whether this type of epidemiologic study should adjust for multiple testing, and no consensus has been reached. We used the conservative Bonferroni adjustment following suggestions received from JAMA Pediatrics reviewers. We agree with Donzelli et al and Hooker that the 3 trimesters are not independent of the entire pregnancy period. However, a less-conservative adjustment for multiple testing, accounting for the dependence of the entire pregnancy on the trimesters, would still yield a P value of .07 or higher, which should not change interpretations of our findings.
We do not see enough evidence of risk to suggest changes in vaccination guidelines and policies, but additional studies of maternal influenza vaccination during pregnancy are needed.”
(Donzelli et al, by the way, wrote a letter to the editor that was less fallacious than BS Hooker’s, in my opinion. You can read it here.)
But wait, the authors admit that there was dependence. Yeah, that’s why I wrote that there is independence “of sorts.” See, the design of this study leads to some dependence between the time periods when you give the vaccine, but, because of temporality, it leads to independence because you can’t say that women were given the vaccine in the first or second trimester because they were not given it in the third. Likewise, you can’t say that giving the vaccine in the third trimester caused them not to get it in the first or second… Or that not giving it in the third assured that they got it in the first or second. And so on.
Policy is not only about statistical significance.
In the end, good policy decisions are not made solely based on one scientific study. Heck, good policy decisions sometimes are not made based on a hundred studies. Good policy decisions require people who can see the forest for the trees, the big picture, if you will. When the government was looking at the anthrax vaccine for use in children, Dr. Paul Offit (a “vaccine industrialist,” according to his detractors) opposed using the vaccine in children. It’s not that the vaccine wouldn’t be safe or effective in children. It’s just that the risk of them catching anthrax is negligible compared to, say, a soldier on the front line of a war where the opposing army is known to have a bioweapons program.
In essence, you weigh the pros and the cons of a vaccine both under ideal conditions (i.e. clinical trials and such) and under real-world conditions (i.e. taking into account the risk of the disease in the general population). You certainly don’t do it based on one study, Bonferroni adjustment or not, and you certainly don’t do it based on the thoughts of someone who is neither a biostatistician nor an epidemiologist, and who likes to do biostatistics the “simple” way.
(Special thanks to The Spaniard for his review of this blog post for accuracy regarding the biostats.)