# How to Not Sound Dumb When Describing Statistics

August 3, 2021

Written by Clay Smith

Spoon Feed

Why does this matter?

Statistical pearls for mere mortals

Here are the stats pearls I take away.

• There is a right and wrong way to do a study. An RCT should follow CONSORT guidelines. Observational studies should follow STROBE guidelines. See the EQUATOR Network for more. If the study you’re reading doesn’t mention or seem to follow these, beware: you are probably reading a poorly done paper.

• There should be a clear study question and clear explanation of the analysis used to answer the primary question. If you can’t find this, throw the paper in the trash.

• P values – We don’t accept the null hypothesis; we either reject or do not reject it. P values just above 0.05 should not be described as “a trend.” A trend indicates something is moving. A p value over 0.05 is not a trend; it just means an endpoint did not meet that measure of statistical significance.

• A p value of 0.03 does not mean there is a 3% probability that results are due to chance. It doesn’t quantify the probability of a hypothesis. Rather, it is the probability of seeing results at least as extreme as those observed if the null hypothesis were really true. (The probability of rejecting the null hypothesis when it is really true is the significance level, alpha, which is chosen in advance.) Similarly, a 95% CI does not mean there is a 95% chance the true value falls in that range of numbers; it means that if the experiment were repeated in many different samples, about 95% of the intervals calculated this way would contain the true parameter value. At first glance, this seems like splitting hairs, but smart people swear it’s not. You stats wonks put something in the comments for the rest of us.

• Statistical significance does not equal clinical significance. For example, you may find a statistically significant ½ point difference on a pain scale, but is decreasing pain from a 10 to a 9.5 really making your patient feel better?

• Multivariate and propensity analyses help mitigate but do not remove the impact of confounders and cannot act as a substitute for a randomized controlled trial when it comes to determining causality. For instance, we shouldn’t say, “multivariate analyses removed confounding.”

• It is more helpful to discuss the clinical impact of a result, rather than just the statistical facts. For example, one could report the sensitivity, specificity, and AUC for a test. But more relevant would be to report the stats facts and clinical import. For example, if the sensitivity for appendix ultrasound increased, X% of CT scans could be avoided.

• The authors say we should avoid saying “may” or “might.” Oh boy…I do that all the time when I don’t want to convey causality. The authors note that saying a hypothesis “may” be true is the reason we do a study. It is also always a true statement unless a hypothesis is proven false, which the authors point out is very difficult in science. Instead, we should say, “There is evidence that X was associated with Y, and an RCT is needed.” I may change the way I write in the future 🙂.
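
The confidence interval pearl above is easier to see in a simulation. Here is a minimal Python sketch, not anything taken from the CHEST paper. It assumes a made-up, normally distributed measurement (true mean 5, SD 2, 100 patients per study; all of these are hypothetical values) and uses the normal approximation of 1.96 standard errors. It repeats the "study" many times and counts how often the 95% CI captures the true mean.

```python
import random
import statistics


def ci_covers_truth(rng, true_mean, sd, n, z=1.96):
    """Run one simulated study; return True if its 95% CI contains true_mean."""
    sample = [rng.gauss(true_mean, sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the sample standard deviation
    sem = statistics.stdev(sample) / n ** 0.5
    return mean - z * sem <= true_mean <= mean + z * sem


def coverage(n_experiments=10_000, true_mean=5.0, sd=2.0, n=100, seed=0):
    """Share of simulated studies whose 95% CI contains the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        ci_covers_truth(rng, true_mean, sd, n) for _ in range(n_experiments)
    )
    return hits / n_experiments


if __name__ == "__main__":
    print(f"Share of 95% CIs containing the true mean: {coverage():.3f}")
```

The share should land close to 0.95. Any single interval either contains the true mean or it doesn't; the "95%" describes how often the procedure works across repeated studies, not the odds for the one interval in front of you.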

Source
Statistical Analysis and Reporting Guidelines for CHEST. Chest. 2020 Jul;158(1S):S3-S11. doi: 10.1016/j.chest.2019.10.064.

## 2 thoughts on “How to Not Sound Dumb When Describing Statistics”

• harri816@gmail.com says:

Regarding 95%CI –

Second (my own biggest way-too-nitpicky-stats-nerd pet peeve), people tend to assume that values towards the center of a confidence interval are more likely to be the true value than those at the borders, and this isn’t true. If a study shows an OR of 4.0 with 95%CI 1.1 – 6.0, the value of the true OR is no more or less likely to be 4.0 than it is 1.1, or 6.0 for that matter. This is due to similar reasons as what is described above regarding repeating the experiment 100 times – in the true definition of a CI, it’s wholly plausible that 1.1 is the true value, and it just happens that in this experiment it was found on the tail of the CI drawn. There’s nothing in the definition to say that in 94 of the next 99 replications the number all of those CIs contain is 1.1 any less so than 4.0, since you have not actually done those experiments yet. There IS a distant cousin of the confidence interval, called a highest-density interval (HDI), where central values are more likely than those on the tails (and which therefore is much more helpful in interpreting uncertainty). Unfortunately, you won’t see this often; it’s an output of Bayesian statistics and not Frequentist statistics (aka what most think of as just “statistics”, since the Bayesian family of stats methods is rarely used in biomedical science).

Hope that long-winded answer is slightly less clear than mud.

-nick harrison, MD MSc, indiana university EM

• geerg@USACS.com says:

Finally, someone pointed out a serious problem throughout the literature!

"P values just above 0.05 should not be described as “a trend.” A trend indicates something is moving. A p value over 0.05 is not a trend; it just means an endpoint did not meet that measure of statistical significance."

This is pervasive in medical articles and throughout the blogs. Even my idols in the EMA/EMRap world do this every single month. "Trend" gets used simply when the number you want is not significantly larger than another. This is NOT correct. A trend only indicates a movement in value over time. In a single article, multiple numbers will be not significantly different, but an author will pick the ones that they want & call it a trend – and the ones they don’t like get ignored.

I would suggest article reviewers do a simple word search for "trend", "tends to" … If this is not a proper use of the word, I would reject the paper. A tip to an aspiring researcher out there – Take a year of articles from a journal (maybe the Annals of EM), text search the entire contents for "trend" and similar terms. Review any ones that pop up. I hypothesize you will find it is commonly misused. I also think you might find that in the same paper, results unfavorable to the author's hypothesis will not be listed as trends.

Thanks Clay – first time I have seen this brought up.