Automatic Bug Triage in Software Systems Using Graph Neighborhood Relations for Feature Augmentation
- 1 September 2020
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Computational Social Systems
- Vol. 7 (5), 1288-1303
- https://doi.org/10.1109/tcss.2020.3017501
Abstract
Bug triaging is the process of prioritizing bugs based on their severity, frequency, and risk so that they can be assigned to appropriate developers for validation and resolution. This article introduces a graph-based feature augmentation approach for enhancing bug triaging systems that use machine learning. The approach utilizes graph partitioning based on neighborhood overlap, an effective technique for discovering relationships in social graphs. Terms of bug summaries are represented as nodes in a graph, which is then partitioned into clusters of terms. Terms in strong clusters are added to the original feature vectors of bug summaries based on the similarity between the terms in each cluster and a bug summary. We also employed other techniques, such as term frequency, term correlation, and topic modeling, to identify latent terms and add them to the original feature vectors of bug summaries. Consequently, we used the frequency, correlation, and neighborhood overlap techniques to create another feature augmentation approach that enriches the feature vectors of bug summaries for use in bug triaging. The augmented vectors are used to classify bug reports into different priorities. Bug triage in this context means correctly recognizing the priority of new bugs. Several classification algorithms are tested using the proposed methods. Experimental results on a data set of Eclipse bug reports extracted from the Bugzilla tracking system show that our approach outperformed existing bug triaging systems, including modern techniques that utilize deep learning.
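As an illustration of the graph step described in the abstract, the following is a minimal sketch, not the authors' implementation. It uses the common definition of neighborhood overlap for an edge: the number of neighbors shared by both endpoints divided by the number of nodes adjacent to at least one endpoint, excluding the endpoints themselves. The abstract does not say how edges are formed or how clusters are extracted, so linking terms that co-occur in a summary, the `threshold` parameter, and the use of connected components are all assumptions, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_term_graph(summaries):
    """Link two terms when they co-occur in one summary (assumed edge rule)."""
    graph = defaultdict(set)
    for text in summaries:
        terms = set(text.lower().split())
        for a, b in combinations(sorted(terms), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def neighborhood_overlap(graph, a, b):
    """Shared neighbors of a and b over neighbors of either, excluding a and b."""
    shared = graph[a] & graph[b]
    either = (graph[a] | graph[b]) - {a, b}
    return len(shared) / len(either) if either else 0.0


def strong_clusters(graph, threshold):
    """Keep edges with overlap >= threshold (assumed cutoff), then return
    the connected components of what remains as term clusters."""
    strong = defaultdict(set)
    for a in graph:
        for b in graph[a]:
            if a < b and neighborhood_overlap(graph, a, b) >= threshold:
                strong[a].add(b)
                strong[b].add(a)

    seen, clusters = set(), []
    for start in strong:
        if start in seen:
            continue
        # Depth-first traversal over strong edges only.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(strong[node] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

In this sketch, a term that is connected to the rest of the graph only through low-overlap edges ends up in no cluster, which matches the intuition that weak ties bridge groups rather than belong to them. Augmenting a bug summary's feature vector would then mean adding terms from clusters judged similar to that summary; the similarity measure is not given in the abstract and is left out here.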