Are Fix-Inducing Changes a Moving Target? A Longitudinal Case Study of Just-In-Time Defect Prediction
- 12 April 2017
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Software Engineering
- Vol. 44 (5), 412-428
- https://doi.org/10.1109/tse.2017.2693980
Abstract
Just-In-Time (JIT) models identify fix-inducing code changes. JIT models are trained using techniques that assume that past fix-inducing changes are similar to future ones. However, this assumption may not hold, e.g., as system complexity tends to accrue, expertise may become more important as systems age. In this paper, we study JIT models as systems evolve. Through a longitudinal case study of 37,524 changes from the rapidly evolving QT and OPENSTACK systems, we find that fluctuations in the properties of fix-inducing changes can impact the performance and interpretation of JIT models. More specifically: (a) the discriminatory power (AUC) and calibration (Brier) scores of JIT models drop considerably one year after being trained; (b) the role that code change properties (e.g., Size, Experience) play within JIT models fluctuates over time; and (c) those fluctuations yield consistent over- or underestimates of the future impact of code change properties on the likelihood of inducing fixes. To avoid erroneous or misleading predictions, JIT models should be retrained using recently recorded data (within three months). Moreover, quality improvement plans should be informed by JIT models that are trained using six months (or more) of historical data, since they are more resilient to period-specific fluctuations in importance.
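The two performance measures named in the abstract, AUC (discriminatory power) and the Brier score (calibration), can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the tiny label and probability lists below are hypothetical, chosen only to show how the same model's scores might be compared on changes from shortly after training and on changes from a later period.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a randomly chosen fix-inducing change (label 1)
    receives a higher score than a randomly chosen clean change (label 0).
    Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome.
    Lower is better; 0 is a perfect score."""
    if len(labels) == 0:
        raise ValueError("need at least one observation")
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


# Hypothetical data: 1 = change later found to be fix-inducing, 0 = clean.
# Predicted probabilities from one (imaginary) JIT model, evaluated on
# changes recorded soon after training and on changes recorded much later.
recent_labels = [1, 0, 1, 0]
recent_probs = [0.9, 0.2, 0.7, 0.1]
later_labels = [1, 0, 1, 0]
later_probs = [0.4, 0.6, 0.7, 0.1]

if __name__ == "__main__":
    print("recent: AUC", auc(recent_labels, recent_probs),
          "Brier", round(brier(recent_labels, recent_probs), 4))
    print("later:  AUC", auc(later_labels, later_probs),
          "Brier", round(brier(later_labels, later_probs), 4))
```

In this made-up example the later period shows a lower AUC and a higher Brier score, which is the kind of degradation the abstract reports for models evaluated one year after training. In practice an established library routine would normally replace the hand-written functions.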
Funding Information
- Natural Sciences and Engineering Research Council of Canada (NSERC)
- JSPS KAKENHI (15H05306)