Predicting Fault-Prone Classes in Object-Oriented Software: An Adaptation of an Unsupervised Hybrid SOM Algorithm

Title: Predicting Fault-Prone Classes in Object-Oriented Software: An Adaptation of an Unsupervised Hybrid SOM Algorithm
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Boucher, A., Badri, M.
Conference Name: 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS)
Date Published: July
ISBN Number: 978-1-5386-0592-9
Keywords: Adaptation models, class-level granularity, Data models, fault data history, fault-prone classes prediction model, fault-prone code identification, function-level source code metrics, HySOM model, Measurement, Metrics, Multilayer Perceptron, Naive Bayes Network, Object oriented modeling, Object-Oriented Metrics Threshold Values, object-oriented programming, object-oriented software systems, Prediction algorithms, Predictive Metrics, Predictive models, predictive security metrics, pubcrawl, self-organising feature maps, Self-Organizing Map, semisupervised fault-proneness prediction models, software metrics, software quality, Software systems, source code (software), supervised learning algorithms, Unsupervised Fault-Proneness Prediction, unsupervised fault-proneness prediction models, unsupervised hybrid SOM algorithm, unsupervised learning

Many fault-proneness prediction models have been proposed in the literature to identify fault-prone code in software systems. Most of these approaches use fault data history and supervised learning algorithms to build the models. However, since fault data history is not always available, some approaches instead use semi-supervised or unsupervised fault-proneness prediction models. The HySOM model, proposed in the literature, uses function-level source code metrics to predict fault-prone functions in software systems without using any fault data. In this paper, we adapt the HySOM approach to object-oriented software systems to predict fault-prone code at class-level granularity using object-oriented source code metrics. This adaptation makes it easier to prioritize the efforts of the testing team, as unit tests in object-oriented software systems are often written for classes rather than for methods. Our adaptation also generalizes one main element of the HySOM model: the calculation of the source code metrics threshold values. We conducted an empirical study using 12 public datasets. Results show that adapting the HySOM model for class-level fault-proneness prediction improves the consistency and the performance of the model. We additionally compared the performance of the adapted model to supervised approaches based on the Naive Bayes Network, ANN and Random Forest algorithms.

Citation Key: boucher_predicting_2017
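
The abstract describes the general idea of combining a self-organizing map with metric threshold values to flag fault-prone classes without fault history. The sketch below is only an illustration of that idea, not the authors' implementation. The metric columns (WMC, CBO, LOC), the sample values, the threshold values, the "any metric above its threshold" rule, and all function names are assumptions made for this example; the paper's actual threshold calculation and the later stages of HySOM are not reproduced here.

```python
def best_unit(units, row):
    """Index of the map unit closest to a metric vector (squared Euclidean distance)."""
    return min(range(len(units)),
               key=lambda i: sum((u - x) ** 2 for u, x in zip(units[i], row)))


def train_som(rows, n_units=2, epochs=50, lr0=0.5):
    """Train a tiny 1-D self-organizing map on class-level metric vectors."""
    dims = len(rows[0])
    lo = [min(r[d] for r in rows) for d in range(dims)]
    hi = [max(r[d] for r in rows) for d in range(dims)]
    # Deterministic initialisation: units spread linearly between the
    # per-metric minimum and maximum (requires n_units >= 2).
    units = [[lo[d] + (hi[d] - lo[d]) * k / (n_units - 1) for d in range(dims)]
             for k in range(n_units)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)           # decaying learning rate
        radius = 1 if epoch < epochs // 2 else 0  # neighbourhood shrinks to zero
        for row in rows:
            bmu = best_unit(units, row)
            for i, unit in enumerate(units):
                dist = abs(i - bmu)
                if dist > radius:
                    continue
                h = 1.0 if dist == 0 else 0.3     # neighbours move less than the winner
                for d in range(dims):
                    unit[d] += lr * h * (row[d] - unit[d])
    return units


def label_units(units, thresholds):
    """Assumed rule: a unit is fault-prone if any weight exceeds its metric threshold."""
    return [any(w > t for w, t in zip(unit, thresholds)) for unit in units]


def predict(rows, units, unit_labels):
    """Each class inherits the label of its best-matching unit."""
    return [unit_labels[best_unit(units, row)] for row in rows]


# Hypothetical class-level metrics: [WMC, CBO, LOC]. Not taken from the paper.
classes = [
    [3, 1, 40], [5, 2, 60], [4, 2, 55], [6, 3, 80],   # small, loosely coupled classes
    [35, 14, 600], [40, 12, 700], [30, 15, 550],      # large, highly coupled classes
]
# Placeholder thresholds, chosen only for illustration.
thresholds = [20, 9, 200]

units = train_som(classes)
unit_labels = label_units(units, thresholds)
predictions = predict(classes, units, unit_labels)
print(predictions)
```

No fault labels are used at any point: the map groups classes with similar metric profiles, and the thresholds alone decide which groups are treated as fault-prone. In practice the metrics would likely need scaling, since an unscaled LOC column dominates the distance computation in this sketch.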