by Fernando Botto, MD, MSc
During 2020, most research focused on powerful, modern anti-inflammatory drugs that are unfortunately expensive and not widely available. Colchicine, by contrast, is cheap and widely available. This is why COLCORONA and other ongoing studies (RECOVERY from the UK, COLCOVID from Argentina) are important.
Trial summary
Question: In non-hospitalized patients aged 40 years or older with COVID-19 plus 1 additional risk factor, does treatment with colchicine reduce the composite of death or hospitalization at 30 days compared with placebo?
Design: international, investigator-initiated, double-blind, placebo-controlled RCT with 1:1 allocation. Estimated sample size 6,000, assuming a 25% risk reduction with colchicine (power 80%, alpha 5%) and a placebo arm event rate of 7%.
Results: Primary endpoint 4.7% with colchicine and 5.8% with placebo (OR 0.79; 95%CI 0.61 to 1.03; p=0.08). Secondary endpoints: death (OR 0.56; 95% CI 0.19 to 1.67), hospitalization (OR 0.79; 95% CI 0.60 to 1.03) and need for mechanical ventilation (OR 0.53; 95% CI 0.25 to 1.09). Pre-specified analysis of COVID-19 confirmed by PCR (n=4,159): primary endpoint 4.6% and 6.0% in the colchicine and placebo group, respectively (OR 0.75; 95% CI 0.57 to 0.99; p=0.04).
Methodological topics and comments
COLCORONA included a low-risk population compared to the moderate and severe patients hospitalized with COVID-19. The sample size calculation was based on an expected 7% event rate in the control group; the observed rate was 5.8%. Not far.
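As a rough check on those design assumptions, here is a minimal sketch of a sample size calculation for two proportions. It assumes a plain normal approximation with unpooled variance; the trial's own calculation may have used a different formula or adjustments, and the function name is mine, for illustration only.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_control, relative_reduction, alpha=0.05, power=0.80):
    """Per-arm sample size to compare two proportions (normal approximation)."""
    p_treat = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2
    return ceil(n)


# Design assumptions quoted in the trial summary: 7% placebo rate, 25% reduction.
n_planned = sample_size_per_arm(0.07, 0.25)
# Same calculation with the 5.8% placebo rate that was actually observed.
n_at_observed_rate = sample_size_per_arm(0.058, 0.25)

print(f"per arm at 7.0%: {n_planned}, total: {2 * n_planned}")
print(f"per arm at 5.8%: {n_at_observed_rate}, total: {2 * n_at_observed_rate}")
```

Under these assumptions the planned total lands close to the 6,000 stated in the design, while a 5.8% control rate would call for more patients to keep the same power.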
1- Stopping early: Surprisingly, the COLCORONA authors stopped the trial early, close to the end, with a total of 235 events. The planned sample size was 6,000 and they stopped at 75% of recruitment (n=4,488), with a primary outcome of 4.7% (104 events) with colchicine vs 5.8% (131 events) with control, giving an OR of 0.79 (95% CI 0.61 to 1.03, p=0.08).
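To see where an odds ratio and its confidence interval of this size come from, here is a minimal sketch using a Wald interval on the log scale. The exact arm sizes are not given in this post, so the code assumes an equal split of 2,244 patients per arm (4,488 / 2); with that assumption the output only approximates the published figures.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Odds ratio of arm A vs arm B with a Wald confidence interval."""
    a, b = events_a, n_a - events_a  # arm A: events, non-events
    c, d = events_b, n_b - events_b  # arm B: events, non-events
    odds_ratio = (a / b) / (c / d)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ASSUMPTION: equal arms of 2,244 each; the real arm sizes may differ slightly.
odds_ratio, lower, upper = odds_ratio_ci(104, 2244, 131, 2244)
print(f"OR {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

The interval crosses 1, which is the same message as the non-significant p-value: with 235 events, the data are compatible with a sizeable benefit and with no effect at all.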
The trial was stopped early due to logistical issues and the need to provide results during the pandemic, reasons that are difficult to understand. Stopping early always reduces the quality of an RCT, because it limits the number of events, usually magnifies the effect size and increases the chance of a false positive result. Here, the trial was stopped close to the end, without applying statistical stopping rules for benefit (indeed, the p-value was not significant).
The unbridled pursuit of success, breaking some classical rules or strategies, seems to have been a constant in this pandemic. One example is the French study of hydroxychloroquine in 42 patients, released early in April, which claimed a miraculous effect and whose results were ultimately not confirmed by a large RCT (RECOVERY). Something similar happened with other studies that were later retracted by scientific journals.
Did the authors understand that a completed trial would not have shown a positive result, and that reporting these data would help in the course of the pandemic? Did they believe in the need to report a pre-specified subgroup with COVID-19 confirmed by PCR (OR 0.75; 95% CI 0.57 to 0.99; p=0.04)?
2- P-value or alpha: alpha is the false positive rate that we are willing to accept before starting an RCT. In other words, we will reject the null hypothesis and conclude that the intervention is beneficial compared to control, accepting a 5% probability of being wrong.
So, what does p=0.08 mean? It means that there is an 8% probability that the observed 21% event reduction with colchicine (OR 0.79) is a false positive finding; therefore, the study is “not positive”, since by convention the limit is 5% or less. Had it been p=0.05, the chance of a false positive result would have been 5%, and we would have said that the study is “positive”.
Is a false positive rate of 8% very different from 5% in the pandemic era? What a mess, huh?
If the study had been completed, assuming 3,000 patients in each group and the same event rates, we would have observed an RR of 0.89 (95% CI 0.78 to 1.00) with p=0.056. Would it have been negative? Just a 5.6% chance of a false positive result… BUT, with a difference of 2 more events between groups (yes, I said two), the p-value would have been 0.049. So, a positive result? A 4.9% chance of a false positive result…
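The projection above can be approximated with a simple two-proportion test. This is a minimal sketch, assuming a pooled z-test without continuity correction and rounded event counts (4.7% and 5.8% of 3,000 give 141 and 174 events); a different test or different rounding will move the p-values a little, so the shifted scenario need not reproduce the 0.049 quoted above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (events_a / n_a - events_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# ASSUMPTION: 3,000 per arm at the observed rates, i.e. 141 vs 174 events.
p_projected = two_proportion_p_value(141, 3000, 174, 3000)
# Hypothetical shift: one event fewer with colchicine, one more with placebo.
p_shifted = two_proportion_p_value(140, 3000, 175, 3000)

print(f"projected: p = {p_projected:.3f}")
print(f"after shifting two events: p = {p_shifted:.3f}")
```

Under these assumptions the projected p-value sits just above 0.05 and the shifted one just below it, which is the whole point: the verdict changes with a couple of events.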
Undoubtedly, borderline p-values generate confusion, not to mention that many statisticians and scientists call for lowering the significance threshold from 0.05 to 0.01 or even less. That is another discussion.
3- Random error, sample size, event rate: As seen above, the level of imprecision or “fragility” is clearly revealed when statistical significance changes by moving 2 or 3 events between groups. Chance can easily do that.
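A crude way to put a number on this fragility is to count how many events would have to disappear from one arm before the p-value crosses 0.05. The sketch below is an illustration only, not the formal fragility index: it assumes the hypothetical completed trial (141 vs 174 events, 3,000 per arm) and a pooled two-proportion z-test, and the function names are mine.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (events_a / n_a - events_b / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def events_to_cross(events_a, n_a, events_b, n_b, alpha=0.05):
    """Smallest number of events removed from arm A that gives p < alpha."""
    for removed in range(events_a + 1):
        if two_sided_p(events_a - removed, n_a, events_b, n_b) < alpha:
            return removed
    return None  # threshold never crossed


# ASSUMPTION: hypothetical completed trial, 141 vs 174 events, 3,000 per arm.
flips = events_to_cross(141, 3000, 174, 3000)
print(f"events that must change to reach p < 0.05: {flips}")
```

Whatever the exact count under a given test, it is tiny compared with 6,000 patients, and chance alone can move that many events.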
In COLCORONA there were 235 total events, not bad… Imagine what happens in those small studies published during the pandemic with fewer than 100 total events. Confidence barely starts with 200 events; I would say better with 300… And with 600 events we already have solid results. However, I'd say that during the pandemic, with no time to spare and a need for pragmatism to generate a “desperate evidence-based medicine”, I would accept 200 events or more.
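Why the number of events matters so much can be sketched with a back-of-the-envelope approximation. Assuming a rare outcome, equal arms and an effect not far from 1, the standard error of the log odds ratio is roughly sqrt(4 / total events), so the confidence interval narrows only with the square root of the event count. This is an approximation for intuition, not a substitute for the real analysis.

```python
from math import exp, sqrt
from statistics import NormalDist


def ci_factor(total_events, level=0.95):
    """Approximate multiplicative half-width of the CI around a ratio near 1."""
    se_log_ratio = sqrt(4 / total_events)  # rare outcome, events split evenly
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return exp(z * se_log_ratio)


for events in (100, 200, 300, 600):
    factor = ci_factor(events)
    print(f"{events} events: CI from about {1 / factor:.2f}x to {factor:.2f}x the estimate")
```

With about 100 events the interval spans from roughly two thirds to one and a half times the estimate, which is why small studies can look miraculous or useless by chance alone.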
Conclusion
Consider that during the pandemic there were multiple approvals of compassionate treatments based on faith, small sample sizes, pathophysiological or in vitro studies, and soft clinical and often surrogate endpoints. Against that background, COLCORONA is an RCT of about 4,500 patients that compared colchicine with a control group and showed a 21% risk reduction in primary events (hospitalization and death), with a “non-significant” p-value. In the context of this health emergency, where hospitals run out of beds and oxygen, I would suggest that colchicine might be considered for clinical use in patients with mild COVID-19, assuming an 8% chance that this suggestion is incorrect (or 92% that it is correct). Colchicine is a cheap drug, widely available and without SAEs, and its evidence is (at the moment) much more robust than the existing evidence for other commonly used drugs.
I promise that I will never again suggest a treatment with a p=0.08. Just something to think about. Amen.