Using regression trees to classify fault-prone software modules
- 10 December 2002
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Reliability
- Vol. 51 (4), 455-462
- https://doi.org/10.1109/tr.2002.804488
Abstract
Software faults are defects in software modules that might cause failures. Software developers tend to focus on faults because they are closely related to the amount of rework necessary to prevent future operational software failures. The goal of this paper is to predict which modules are fault-prone, and to do so early enough in the life cycle to be useful to developers. A regression tree is an algorithm represented by an abstract tree, where the response variable is a real quantity. Software modules are classified as fault-prone or not by comparing the predicted value to a threshold. A classification rule is proposed that allows one to choose a preferred balance between the two types of misclassification rates. A case study of a very large telecommunications system considered software modules to be fault-prone if any faults were discovered by customers. Our research shows that classifying fault-prone modules with regression trees, using the classification rule in this paper, resulted in predictions with satisfactory accuracy and robustness.
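The thresholding step described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's actual rule, which the abstract does not spell out. The sketch assumes that predicted values come from some already-fitted regression tree, that a Type I error means a not fault-prone module flagged as fault-prone, that a Type II error means a fault-prone module that was missed, and that the "preferred balance" is roughly equal rates. The function names and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple


def classify(predicted: Sequence[float], threshold: float) -> List[bool]:
    """Flag a module as fault-prone when its predicted value exceeds the threshold."""
    return [p > threshold for p in predicted]


def misclassification_rates(
    predicted: Sequence[float],
    actual_fault_prone: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (type_1_rate, type_2_rate) for a given threshold.

    Assumed definitions (not taken from the abstract):
      Type I  = not fault-prone module classified as fault-prone
      Type II = fault-prone module classified as not fault-prone
    """
    flagged = classify(predicted, threshold)
    n_not_fp = sum(1 for a in actual_fault_prone if not a)
    n_fp = sum(1 for a in actual_fault_prone if a)
    type_1 = sum(1 for f, a in zip(flagged, actual_fault_prone) if f and not a)
    type_2 = sum(1 for f, a in zip(flagged, actual_fault_prone) if not f and a)
    return type_1 / n_not_fp, type_2 / n_fp


def choose_threshold(
    predicted: Sequence[float],
    actual_fault_prone: Sequence[bool],
    candidates: Sequence[float],
) -> float:
    """Pick the candidate threshold whose two error rates are closest to equal.

    Equal balance is only one possible preference; a project could instead
    weight one error type more heavily.
    """
    def imbalance(t: float) -> float:
        type_1, type_2 = misclassification_rates(predicted, actual_fault_prone, t)
        return abs(type_1 - type_2)

    return min(candidates, key=imbalance)


# Hypothetical predicted fault counts for eight modules, and whether
# customers actually discovered faults in each module.
predicted = [0.0, 0.1, 0.2, 0.4, 0.6, 0.9, 1.5, 2.0]
actual = [False, False, False, True, False, True, True, True]

best = choose_threshold(predicted, actual, candidates=[0.3, 0.5, 1.0])
print(best, misclassification_rates(predicted, actual, best))
```

Lowering the threshold flags more modules, which trades Type II errors for Type I errors; scanning candidate thresholds makes that trade-off explicit.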