Parameter tuning in KNN for software defect prediction: an empirical analysis
Open Access
- 10 August 2019
- journal article
- Published by Institute of Research and Community Services Diponegoro University (LPPM UNDIP) in Jurnal Teknologi dan Sistem Komputer
- Vol. 7 (4), 121-126
- https://doi.org/10.14710/jtsiskom.7.4.2019.121-126
Abstract
Software Defect Prediction (SDP) provides insights that help software teams allocate their limited resources when developing software systems. It predicts which modules are likely to be defective and helps avoid the pitfalls associated with such modules. However, these insights may be inaccurate and unreliable if the parameters of SDP models are not taken into consideration. This study investigated the effect of parameter tuning on the k nearest neighbor (k-NN) classifier in SDP. More specifically, it examined the impact of varying and selecting an optimal k value, the influence of distance weighting, and the impact of distance functions on k-NN. An experiment was designed to investigate this problem over 6 software defect datasets. The experimental results revealed that the k value should be greater than 1 (the default): the average RMSE of k-NN when k > 1 (0.2727) is lower than when k = 1 (0.3296). In addition, the predictive performance of k-NN with distance weighting improved by 8.82% and 1.7% based on AUC and accuracy, respectively. In terms of the distance function, k-NN models based on the Dilca distance function performed better than those based on the Euclidean distance function (the default). Hence, we conclude that parameter tuning has a positive effect on the predictive performance of k-NN in SDP.
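The three parameters named in the abstract (the k value, distance weighting, and the distance function) can be illustrated with a minimal pure-Python sketch. It is not the authors' experimental setup. The function names, the two hypothetical module metrics, and the toy data are assumptions made for illustration. The Dilca distance is not implemented here; Manhattan distance is swapped in only to show that the distance function is pluggable.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan distance, used here only as a stand-in alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_defect_probability(train, query, k=1, weighted=False, distance=euclidean):
    """Estimate P(defective) for `query` from its k nearest training modules.

    train: list of (features, label) pairs, label 1 = defective, 0 = clean.
    weighted: if True, each neighbor votes with weight 1 / distance.
    """
    neighbors = sorted(
        ((distance(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        # A small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (dist + 1e-9) if weighted else 1.0
    return votes[1] / sum(votes.values())


def rmse(probabilities, labels):
    """Root mean squared error between predicted probabilities and 0/1 labels."""
    squared = sum((p - y) ** 2 for p, y in zip(probabilities, labels))
    return math.sqrt(squared / len(labels))


# Hypothetical modules described by two made-up metrics (size, complexity).
train = [
    ((10.0, 1.0), 0),
    ((12.0, 2.0), 0),
    ((15.0, 2.0), 0),
    ((20.0, 3.0), 1),  # an isolated defective module among clean ones
    ((40.0, 9.0), 1),
    ((45.0, 11.0), 1),
]
query = (19.0, 3.0)

p_k1 = knn_defect_probability(train, query, k=1)
p_k3 = knn_defect_probability(train, query, k=3)
p_k3_weighted = knn_defect_probability(train, query, k=3, weighted=True)
p_k3_manhattan = knn_defect_probability(train, query, k=3, distance=manhattan)
print(p_k1, p_k3, p_k3_weighted, p_k3_manhattan)
```

On this toy data, k = 1 follows the single closest module, while k = 3 also consults two clean neighbors. Distance weighting then shifts the estimate back toward the closest module. The sketch shows only the mechanics of each parameter and says nothing about which setting performs better on real defect datasets.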
Funding Information
- University of Ilorin