Coverage is not strongly correlated with test suite effectiveness
- 31 May 2014
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 435-445
- https://doi.org/10.1145/2568225.2568271
Abstract
The coverage of a test suite is often used as a proxy for its ability to detect faults. However, previous studies that investigated the correlation between code coverage and test suite effectiveness have failed to reach a consensus about the nature and strength of the relationship between these test suite characteristics. Moreover, many of the studies were done with small or synthetic programs, making it unclear whether their results generalize to larger programs, and some of the studies did not account for the confounding influence of test suite size. In addition, most of the studies were done with adequate suites, which are rare in practice, so the results may not generalize to typical test suites. We have extended these studies by evaluating the relationship between test suite size, coverage, and effectiveness for large Java programs. Our study is the largest to date in the literature: we generated 31,000 test suites for five systems consisting of up to 724,000 lines of source code. We measured the statement coverage, decision coverage, and modified condition coverage of these suites and used mutation testing to evaluate their fault detection effectiveness. We found that there is a low to moderate correlation between coverage and effectiveness when the number of test cases in the suite is controlled for. In addition, we found that stronger forms of coverage do not provide greater insight into the effectiveness of the suite. Our results suggest that coverage, while useful for identifying under-tested parts of a program, should not be used as a quality target because it is not a good indicator of test suite effectiveness.
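As a rough illustration of what "controlling for the number of test cases" can mean, the sketch below groups suites by size and computes a rank correlation between coverage and mutation score within each group. It is a minimal sketch under stated assumptions, not the authors' analysis pipeline: the choice of Kendall's tau-b, the record layout, the function names, and the sample numbers are all hypothetical and are not taken from the abstract.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def mutation_score(killed, total):
    """Fraction of generated mutants that a suite detects (kills)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return killed / total


def kendall_tau_b(xs, ys):
    """Kendall's tau-b rank correlation, which adjusts for ties."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: excluded from both factors
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return (concordant - discordant) / denom


def tau_by_size(suites):
    """Correlate coverage with mutation score separately for each suite size."""
    groups = defaultdict(list)
    for s in suites:
        groups[s["size"]].append(s)
    return {
        size: kendall_tau_b(
            [s["coverage"] for s in members],
            [mutation_score(s["killed"], s["total"]) for s in members],
        )
        for size, members in groups.items()
    }


# Hypothetical records: size = number of test cases, coverage in [0, 1].
suites = [
    {"size": 10, "coverage": 0.21, "killed": 110, "total": 1000},
    {"size": 10, "coverage": 0.25, "killed": 105, "total": 1000},
    {"size": 10, "coverage": 0.30, "killed": 140, "total": 1000},
    {"size": 100, "coverage": 0.52, "killed": 400, "total": 1000},
    {"size": 100, "coverage": 0.55, "killed": 430, "total": 1000},
    {"size": 100, "coverage": 0.61, "killed": 420, "total": 1000},
]
per_size_tau = tau_by_size(suites)
```

Computing the correlation within each size group, rather than over all suites pooled together, keeps suite size from inflating the apparent relationship, since larger suites tend to have both higher coverage and higher mutation scores.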