It's not a bug, it's a feature: How misclassification impacts bug prediction
- 1 May 2013
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all bug reports to be misclassified: rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in bug prediction models, confusing bugs and features: on average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.
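To illustrate the bias the abstract describes, the following sketch shows how misclassified issue reports propagate into file-level defect labels. The reports, files, and numbers here are invented for illustration; only the mechanism (linking "BUG"-labeled reports to the files their fixing commits touched) follows the paper's setting.

```python
# Illustrative sketch (not the paper's dataset): each entry links an
# issue report, its tracker label, its true nature after manual
# validation, and the files touched by the fixing commit.
fixes = [
    ("issue-1", "BUG", True,  ["parser.c"]),            # real bug
    ("issue-2", "BUG", False, ["ui.c", "help.txt"]),    # actually a feature request
    ("issue-3", "BUG", True,  ["parser.c", "lexer.c"]), # real bug
    ("issue-4", "BUG", False, ["docs.md"]),             # documentation update
]

# Labels as an unvalidated bug prediction dataset would see them:
marked_defective = {f for _, label, _, files in fixes
                    if label == "BUG" for f in files}

# Ground truth after manual validation of each report:
truly_defective = {f for _, _, is_bug, files in fixes
                   if is_bug for f in files}

false_positives = marked_defective - truly_defective
ratio = len(false_positives) / len(marked_defective)
print(f"{ratio:.0%} of files marked defective never had a bug")
```

In this toy example, three of the five files marked defective (ui.c, help.txt, docs.md) never contained a bug; the paper reports an average of 39% such false positives across real projects.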