On software engineering repositories and their open problems
Open Access
- 1 June 2012
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2012 First International Workshop on Realizing AI Synergies in Software Engineering (RAISE)
Abstract
In the last decade, a large number of software repositories have been created for different purposes. In this paper we present a survey of the publicly available repositories, classify the most common ones, and discuss the problems researchers face when applying machine learning or statistical techniques to them.
This publication has 29 references indexed in Scilit:
- On the reproducibility of empirical software engineering studies based on data retrieved from development repositories. Empirical Software Engineering, 2011
- On the dataset shift problem in software engineering prediction models. Empirical Software Engineering, 2011
- Evaluating defect prediction approaches: a benchmark and an extensive comparison. Empirical Software Engineering, 2011
- Tools for the Study of the Usual Data Sources found in Libre Software Projects. International Journal of Open Source Software and Processes, 2009
- Sourcerer: mining and searching internet-scale software repositories. Data Mining and Knowledge Discovery, 2008
- The role of replications in empirical software engineering—a word of warning. Empirical Software Engineering, 2008
- Finding the Right Data for Software Cost Modeling. IEEE Software, 2005
- Supporting Controlled Experimentation with Testing Techniques: An Infrastructure and its Potential Impact. Empirical Software Engineering, 2005
- Organizational benchmarking using the ISBSG Data Repository. IEEE Software, 2001
- Small sample size effects in statistical pattern recognition: recommendations for practitioners. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1991