“How many statisticians does it take to ensure at least a 50% chance of a disagreement about p-values?”
So asks George Cobb of Mount Holyoke College in his commentary on the ASA’s March 7th ‘Statement on P-Values: Context, Process and Purpose’.
The association took the unprecedented step of publishing the statement following a long litany of criticism of misuse of the p-value. This included the decision last year by the journal Basic and Applied Social Psychology to restrict the publication of papers in which hypothesis testing is used for analysis.
The six principles in the statement were summarized as:
P-values can indicate how incompatible the data are with a specified statistical model.
P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
Proper inference requires full reporting and transparency.
A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
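Principle five — that statistical significance does not measure the size or importance of an effect — is easy to demonstrate with a simulation. The sketch below (illustrative only, not part of the ASA statement; the data and effect size are invented for the example) runs a simple one-sample z-test on a very large sample drawn from a population whose true mean is a practically negligible 0.02. The p-value comes out vanishingly small even though the effect itself is trivial.

```python
import math
import random

def z_test_p_value(sample, mu0=0.0):
    """Two-sided one-sample z-test p-value (large-sample approximation)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    se = math.sqrt(var / n)          # standard error of the mean
    z = (mean - mu0) / se
    # Two-sided p-value from the standard normal CDF, via erf
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(42)

# True mean is 0.02 -- practically negligible against a standard
# deviation of 1 -- yet with a million observations the test is
# overwhelmingly "significant": significance is not importance.
tiny_effect = [random.gauss(0.02, 1.0) for _ in range(1_000_000)]
p = z_test_p_value(tiny_effect)
mean = sum(tiny_effect) / len(tiny_effect)
print(f"observed mean = {mean:.4f}, p-value = {p:.3g}")
```

With a large enough sample, almost any nonzero effect will clear a significance threshold, which is precisely why the statement warns against reading a small p-value as evidence of a large or important effect.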
Of course, this is not a new topic for statisticians, who have been debating the issue for decades. What is new is that the ASA has formally spoken out and, importantly, brought together a concise piece of guidance of this kind.
The ASA statement also includes 20 commentaries by eminent discussants, which are well worth taking the time to read.
One compelling argument is made by Professor Donald Berry of The University of Texas M.D. Anderson Cancer Center. He comments that ‘More important than the credibility of our discipline is the impact that misuse and misinterpretation of statistical significance and p-values has on science and society. Patients with serious diseases have been harmed. Researchers have chased wild geese, finding too often that statistically significant conclusions could not be reproduced. The economic impacts of faulty statistical conclusions are great.’
Berry also goes on to contextualize the difficulties of communicating and discussing statistical significance within drug development.
In his response, Stephen Senn urges caution. He notes that ‘inferences will be all the better for recognizing their limitations but will be worse if we attempt to replace them rather than supplement them.’ Yoav Benjamini of Tel Aviv University argues that the p-value is unfairly singled out, saying that ‘The well-phrased statement demonstrates our mistake in singling out the p value: posing the P value as a culprit, rather than the way most statistical tools are used in the modern world of industrialized science.’ He goes on to say that the p-value offers ‘a first line of defense’ against being ‘fooled by randomness’, allowing researchers to separate ‘signal from noise’. He argues that the p-value should be complemented (rather than replaced) with confidence intervals and effect size estimates.
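The kind of complementary reporting Benjamini describes can be sketched in a few lines. The example below (a hypothetical illustration with invented data, not code from any of the discussants) computes a point estimate of an effect together with a 95% confidence interval, so that a reader sees the magnitude and precision of the estimate rather than a bare verdict of significance.

```python
import math
import random

def mean_with_ci(sample, z=1.96):
    """Sample mean plus a normal-approximation 95% confidence interval."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    se = math.sqrt(var / n)
    return mean, (mean - z * se, mean + z * se)

random.seed(1)

# Invented measurements: true effect 0.5, standard deviation 2.0.
data = [random.gauss(0.5, 2.0) for _ in range(200)]
effect, (lo, hi) = mean_with_ci(data)

# Report effect size and interval, not just "p < 0.05".
print(f"estimated effect = {effect:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Reporting the interval alongside any p-value makes clear both how large the effect plausibly is and how precisely it has been measured, which is the supplement, rather than replacement, that Senn and Benjamini both advocate.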
One area on which many agreed is that proper education is fundamental to addressing these issues moving forward. This holds true within the statistical discipline itself, and in how statistical principles are communicated to non-statisticians. To that end, the ASA is urging its members to spread the word throughout their networks, to ‘users of statistics’ as well as to statisticians.