Why P Values Are Not a Useful Measure of Evidence in Statistical Significance Testing

1 February 2008

journal article
review article
Published by SAGE Publications in Theory & Psychology

Vol. 18 (1), 69-88
https://doi.org/10.1177/0959354307086923

Abstract

Reporting p values from statistical significance tests is common in psychology's empirical literature. Sir Ronald Fisher saw the p value as playing a useful role in knowledge development by acting as an 'objective' measure of inductive evidence against the null hypothesis. We review several reasons why the p value is an unobjective and inadequate measure of evidence when statistically testing hypotheses. A common theme throughout many of these reasons is that p values exaggerate the evidence against H₀. This, in turn, calls into question the validity of much published work based on comparatively small p values, including .05. Indeed, if researchers were fully informed about the limitations of the p value as a measure of evidence, this inferential index could not possibly enjoy its ongoing ubiquity. Replication with extension research focusing on sample statistics, effect sizes, and their confidence intervals is a better vehicle for reliable knowledge development than using p values. Fisher would also have agreed with the need for replication research.
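The claim that p values exaggerate the evidence against H₀ can be illustrated numerically. The sketch below is not taken from the article; it assumes one widely used calibration, the lower bound -e · p · ln(p) on the Bayes factor in favor of H₀ (valid for p < 1/e), and an assumed prior probability of .5 for H₀. The function names are hypothetical and chosen only for this illustration.

```python
import math


def min_bayes_factor(p):
    """Lower bound on the Bayes factor for H0 versus H1, -e * p * ln(p).

    Assumption: this calibration is used for illustration only; it holds
    for 0 < p < 1/e.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def min_posterior_h0(p, prior_h0=0.5):
    """Smallest posterior probability of H0 implied by the bound above.

    prior_h0 is an assumed prior probability for H0 (default .5).
    """
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = min_bayes_factor(p) * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(f"p = {p}: min Bayes factor = {min_bayes_factor(p):.3f}, "
              f"min P(H0 | data) = {min_posterior_h0(p):.3f}")
```

Under these assumptions, p = .05 corresponds to a posterior probability of H₀ of at least roughly .29, far larger than the .05 a reader might naively infer from the p value itself.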

Cited by 157 articles