Why P Values Are Not a Useful Measure of Evidence in Statistical Significance Testing
- 1 February 2008
- journal article
- review article
- Published by SAGE Publications in Theory & Psychology
- Vol. 18 (1), 69-88
- https://doi.org/10.1177/0959354307086923
Abstract
Reporting p values from statistical significance tests is common in psychology's empirical literature. Sir Ronald Fisher saw the p value as playing a useful role in knowledge development by acting as an 'objective' measure of inductive evidence against the null hypothesis. We review several reasons why the p value is an unobjective and inadequate measure of evidence when statistically testing hypotheses. A common theme throughout many of these reasons is that p values exaggerate the evidence against H0. This, in turn, calls into question the validity of much published work based on comparatively small, including .05, p values. Indeed, if researchers were fully informed about the limitations of the p value as a measure of evidence, this inferential index could not possibly enjoy its ongoing ubiquity. Replication with extension research focusing on sample statistics, effect sizes, and their confidence intervals is a better vehicle for reliable knowledge development than using p values. Fisher would also have agreed with the need for replication research.
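The abstract's claim that p values exaggerate the evidence against H0 can be illustrated with a small calculation. The sketch below uses one well-known calibration, the lower bound -e * p * ln(p) on the Bayes factor in favour of H0 (Sellke, Bayarri and Berger); this is an assumption chosen for illustration and is not necessarily the argument the article itself relies on. The function names and the equal prior odds are likewise hypothetical choices.

```python
import math


def min_bayes_factor(p):
    """Lower bound on the Bayes factor for H0 versus H1 given a p value.

    Uses the calibration -e * p * ln(p), which applies only for 0 < p < 1/e.
    Illustrative assumption: not taken from the article itself.
    """
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def min_posterior_prob_h0(p, prior_h0=0.5):
    """Smallest posterior probability of H0 consistent with the bound.

    prior_h0 is the prior probability of H0 (0.5 means equal prior odds,
    a hypothetical default chosen for this example).
    """
    prior_odds = prior_h0 / (1.0 - prior_h0)
    posterior_odds = prior_odds * min_bayes_factor(p)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.001):
        print(
            f"p = {p}: Bayes factor for H0 >= {min_bayes_factor(p):.3f}, "
            f"P(H0 | data) >= {min_posterior_prob_h0(p):.3f}"
        )
```

Under these assumptions, p = .05 corresponds to a posterior probability of H0 of at least about .29 when H0 and H1 start with equal prior probability, which is far weaker evidence against H0 than the figure .05 suggests on its face.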