Can We Predict the Quality of Spectrum-based Fault Localization?

Title: Can We Predict the Quality of Spectrum-based Fault Localization?
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Golagha, M., Pretschner, A., Briand, L. C.
Conference Name: 2020 IEEE 13th International Conference on Software Testing, Validation and Verification (ICST)
Keywords: automated fault localization, Complexity theory, Couplings, Debugging, dynamic test metrics, dynamic test suite, effective fault localization, Electronic mail, fault diagnosis, fault localization effectiveness, fault localization techniques, fault-related metrics, machine learning, Measurement, Metrics, potential effectiveness, prediction model, Predictive models, program debugging, program testing, pubcrawl, Resiliency, Scalability, software fault tolerance, software metrics, spectrum-based fault localization, static test metrics, static test suite, Tools, work factor metrics
Abstract: Fault localization and repair are time-consuming and tedious. There is a significant and growing need for automated techniques to support such tasks. Despite significant progress in this area, existing fault localization techniques are not widely applied in practice yet and their effectiveness varies greatly from case to case. Existing work suggests new algorithms and ideas as well as adjustments to the test suites to improve the effectiveness of automated fault localization. However, important questions remain open: Why is the effectiveness of these techniques so unpredictable? What are the factors that influence the effectiveness of fault localization? Can we accurately predict fault localization effectiveness? In this paper, we try to answer these questions by collecting 70 static, dynamic, test suite, and fault-related metrics that we hypothesize are related to effectiveness. Our analysis shows that a combination of only a few static, dynamic, and test metrics enables the construction of a prediction model with excellent discrimination power between levels of effectiveness (eight metrics yielding an AUC of .86; fifteen metrics yielding an AUC of .88). The model hence yields a practically useful confidence factor that can be used to assess the potential effectiveness of fault localization. Given that the metrics are the most influential metrics explaining the effectiveness of fault localization, they can also be used as a guide for corrective actions on code and test suites leading to more effective fault localization.
Citation Key: golagha_can_2020
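
The abstract reports the prediction model's discrimination power as an AUC. As a reading aid, the following is a minimal sketch of how an AUC can be computed for a binary split into "effective" and "not effective" cases: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The function name `auc`, the binary labelling, and the sample scores and labels are all hypothetical illustrations; none of them come from the paper, and the authors' actual model and evaluation setup are not shown here.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted confidence per case (higher = more likely positive).
    labels: 1 for a positive case, 0 for a negative case.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical data (assumption, not from the paper): model confidence that
# fault localization will be effective, and whether it actually was.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of .86 and .88 should be read.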