Written by Bo Stubblefield
Statistical methods were developed and applied to clinical trials to quantify uncertainty, not as a decision tool to be used in the face of uncertainty. We should become comfortable with uncertainty, as it is present irrespective of the arbitrary thresholds used for interpretation of data.
Why does this matter?
P-values are routinely reported in the literature and judged against thresholds for statistical significance. Investigators then draw definitive conclusions and make decisions based on these arbitrary thresholds. Consequently, there is a not-so-silent revolution underway that advocates retiring the p-value as our test for statistical significance. This recently published opinion piece uses an analogy to illustrate how we should be conducting and evaluating clinical trials.
Be more like a quantum physicist – uncertainty is the world in which we live
This is an opinion piece published in Circulation by cardiologist Dr. Milton Packer. Packer uses the difference between Newtonian and quantum physics to draw a parallel to research frameworks that are probabilistic rather than deterministic. The analogy he chooses is an erudite one: the thought experiment of Schrödinger’s Cat. Yes, we have included a “cat video” on JF, but the concept is valuable for illustrating randomized clinical trials and their reported p-values.
When a new drug is evaluated in a randomized clinical trial, the researcher reports the result in probabilistic terms, with a confidence interval that estimates the effect size of the drug. An effect may reach statistical significance as the sample size increases, yet still be clinically trivial. The result of a study is only actionable if the researchers determine the degree of uncertainty they are willing to tolerate before the study starts. But these levels of tolerance are arbitrary, and no result carries absolute certainty. Furthermore, findings that are statistically significant in one study frequently cannot be reproduced in a second study (1). This is especially true when there is a large degree of uncertainty (e.g. wide confidence intervals) (2).
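To make the sample-size point concrete, here is a minimal sketch that is not taken from Packer's article. It assumes a hypothetical trial in which a drug lowers systolic blood pressure by a fixed 0.5 mmHg (a difference most clinicians would call trivial), with a known standard deviation of 10 mmHg and a simple two-sample z-test. All numbers and function names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers chosen for illustration only.
TRUE_DIFF = 0.5   # observed mean difference between arms (mmHg)
SD = 10.0         # assumed known standard deviation in each arm (mmHg)


def standard_error(sd, n_per_arm):
    """Standard error of the difference between two equal-sized arms."""
    return sd * sqrt(2 / n_per_arm)


def two_sided_p(diff, sd, n_per_arm):
    """Two-sided p-value from a two-sample z-test with known SD."""
    z = diff / standard_error(sd, n_per_arm)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def ci95(diff, sd, n_per_arm):
    """95% confidence interval for the difference in means."""
    half_width = NormalDist().inv_cdf(0.975) * standard_error(sd, n_per_arm)
    return diff - half_width, diff + half_width


# The observed effect is held fixed; only the sample size changes.
for n in (100, 1_000, 10_000, 100_000):
    low, high = ci95(TRUE_DIFF, SD, n)
    print(f"n per arm = {n:>7}: p = {two_sided_p(TRUE_DIFF, SD, n):.4f}, "
          f"95% CI = ({low:.2f}, {high:.2f}) mmHg")
```

Under these assumptions the same 0.5 mmHg difference moves from "not significant" to "significant" purely because the sample grows. The confidence interval narrows around an effect that stays just as small, which is why the interval tells you more than the p-value does.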
What is the answer? – We must become more comfortable with uncertainty and be careful not to use a p-value to make artificial distinctions between levels of uncertainty that are tolerable and levels that are not. The declaration of a decision (p<0.05 = good) does not resolve the uncertainty. As the p-value falls out of favor in the scientific community, Bayesian analysis has been gaining favor and may be the future of statistical analyses (3,4).
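For a sense of what a Bayesian summary looks like, here is a rough sketch under stated assumptions; it is not the method proposed in the article or its references. It assumes a hypothetical two-arm trial with a binary outcome, a flat Beta(1, 1) prior on each arm's event rate, and a simple Monte Carlo estimate of the posterior probability that the treatment arm has the lower event rate. The trial counts and the function name are invented for illustration.

```python
import random


def prob_treatment_better(events_t, n_t, events_c, n_c, draws=20_000, seed=0):
    """Estimate P(treatment event rate < control event rate | data).

    Assumes independent flat Beta(1, 1) priors, so each posterior is
    Beta(1 + events, 1 + non-events). The probability is estimated by
    drawing from both posteriors and counting how often treatment wins.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    wins = 0
    for _ in range(draws):
        rate_t = rng.betavariate(1 + events_t, 1 + n_t - events_t)
        rate_c = rng.betavariate(1 + events_c, 1 + n_c - events_c)
        if rate_t < rate_c:
            wins += 1
    return wins / draws


# Hypothetical trial: 20/100 events on treatment vs 30/100 on control.
p_better = prob_treatment_better(20, 100, 30, 100)
print(f"Posterior probability that treatment lowers the event rate: {p_better:.2f}")
```

Instead of a verdict of significant or not significant, the output is a statement of remaining uncertainty: how probable it is, given the data and the assumed prior, that the treatment helps. The reader still has to decide how much uncertainty is tolerable.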
Another Spoonful/FOAMed Links
Evolution in reporting p-values in the biomedical literature – https://www.ncbi.nlm.nih.gov/pubmed/26978209
John Ioannidis’ proposal to lower the p-value threshold to 0.005 – https://www.ncbi.nlm.nih.gov/pubmed/29566133
Scientists rise up against statistical significance – https://www.ncbi.nlm.nih.gov/pubmed/30894741
The Bottom Line – https://www.thebottomline.org.uk/blog/ebm/p-value/
The Parable of Schrödinger’s Cat and the Illusion of Statistical Significance in Clinical Trials. Circulation. 2019 Sep 9;140(10):799-800. doi: 10.1161/CIRCULATIONAHA.119.041245. Epub 2019 Sep 3.
Reviewed by Thomas Davis
1. The behavior of the P-value when the alternative hypothesis is true. Biometrics. 1997 Mar;53(1):11-22.
2. Double Vision: Replicating a Trial Showing a Survival Benefit. JACC Heart Fail. 2017 Mar;5(3):232-235. doi: 10.1016/j.jchf.2016.12.017.
3. Scientists rise up against statistical significance. Nature. 2019 Mar;567(7748):305-307. doi: 10.1038/d41586-019-00857-9.
4. The Proposal to Lower P Value Thresholds to .005. JAMA. 2018 Apr 10;319(14):1429-1430. doi: 10.1001/jama.2018.1536.