Evaluating defect prediction approaches: a benchmark and an extensive comparison

D'Ambros, Marco; Lanza, Michele; Robbes, Romain

In: Empirical Software Engineering, 2012, vol. 17, no. 4-5, p. 531-577

Title

Evaluating defect prediction approaches: a benchmark and an extensive comparison

Authors

D'Ambros, Marco. REVEAL @ Faculty of Informatics, University of Lugano, 6900, Lugano, Switzerland
Lanza, Michele. REVEAL @ Faculty of Informatics, University of Lugano, 6900, Lugano, Switzerland
Robbes, Romain. PLEIAD Lab @ Computer Science Department (DCC), University of Chile, Santiago, Chile

Document type

Postprint

Language

English

Published in

Empirical Software Engineering, 2012, vol. 17, no. 4-5, p. 531-577. Springer US; http://www.springer-ny.com

Other electronic version

Publisher's version : https://doi.org/10.1007/s10664-011-9173-9

Classification

Computer science

Keywords

Defect prediction ; Source code metrics ; Change metrics

OAI-PMH-ID

oai:doc.rero.ch:317717

Summary

Reliably predicting software defects is one of the holy grails of software engineering. Researchers have devised and implemented a plethora of defect/bug prediction approaches varying in terms of accuracy, complexity and the input data they require. However, the absence of an established benchmark makes it hard, if not impossible, to compare approaches. We present a benchmark for defect prediction, in the form of a publicly available dataset consisting of several software systems, and provide an extensive comparison of well-known bug prediction approaches, together with novel approaches we devised. We evaluate the performance of the approaches using different performance indicators: classification of entities as defect-prone or not, ranking of the entities, with and without taking into account the effort to review an entity. We performed three sets of experiments aimed at (1) comparing the approaches across different systems, (2) testing whether the differences in performance are statistically significant, and (3) investigating the stability of approaches across different learners. Our results indicate that, while some approaches perform better than others in a statistically significant manner, external validity in defect prediction is still an open problem, as generalizing results to different contexts/learners proved to be a partially unsuccessful endeavor.
