An Empirical Study on Real Bug Fixes

Abstract
Software bugs can cause significant financial loss and even the loss of human lives. To reduce such loss, developers devote substantial efforts to fixing bugs, which generally requires much expertise and experience. Various approaches have been proposed to aid debugging. An interesting recent research direction is automatic program repair, which achieves promising results, and attracts much academic and industrial attention. However, people also cast doubt on the effectiveness and promise of this direction. A key criticism is to what extent such approaches can fix real bugs. As only research prototypes for these approaches are available, it is infeasible to address the criticism by evaluating them directly on real bugs. Instead, in this paper, we design and develop BUGSTAT, a tool that extracts and analyzes bug fixes. With BUGSTAT's support, we conduct an empirical study on more than 9,000 real-world bug fixes from six popular Java projects. Comparing the nature of manual fixes with automatic program repair, we distill 15 findings, which are further summarized into four insights on the two key ingredients of automatic program repair: fault localization and faulty code fix. In addition, we provide indirect evidence on the size of the search space to fix real bugs and find that bugs may also reside in non-source files. Our results provide useful guidance and insights for improving the state-of-the-art of automatic program repair.

This publication has 37 references indexed in Scilit: