Abstract
Defect prediction is one of the most active research areas in the software engineering community. Bridging the gap between data mining and software engineering is necessary to increase the rate of software success. Software defect prediction identifies, before the testing phase, where defects are most likely to occur in the source code. Defect prediction methods are widely used to investigate the affected areas of software, drawing on techniques such as clustering, statistical methods, neural networks, and machine learning models. The goal of this research is to examine various machine learning algorithms for predicting software defects. Many fault prediction techniques have been introduced, but no single technique or approach works for all types of datasets. To achieve maximum accuracy and to uncover the largest subset of defects that can be predicted, several machine learning algorithms were applied: Bayesian Net, Logistic Regression, Multilayer Perceptron, Rule ZeroR, J48, Lazy IBk, Support Vector Machine, Neural Networks, Random Forest, and Decision Stump. The study predicts defects on five NASA datasets: JM1, CM1, KC1, KC2, and PC1. Logistic Regression produced the best result relative to the other algorithms (93%).
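As an illustration only, the following minimal sketch shows how two of the simpler classifiers named above, ZeroR and a decision stump, can be compared by accuracy on a held-out set. It is not the experimental setup of this study. The toy data, the two metrics (lines of code and cyclomatic complexity), and all function names are assumptions made for the example, and the stump is simplified to flag a module as defective only when a metric exceeds a threshold.

```python
from collections import Counter

# Hypothetical toy data: (lines_of_code, cyclomatic_complexity) -> defective?
# These values are invented for illustration; they are NOT from the NASA datasets.
TRAIN = [
    ((20, 2), False), ((35, 3), False), ((50, 4), False), ((60, 5), False),
    ((80, 6), False), ((90, 7), False), ((120, 12), True), ((150, 15), True),
]
TEST = [((30, 2), False), ((70, 5), False), ((140, 14), True), ((200, 20), True)]


def train_zero_r(rows):
    """ZeroR: ignore the features and always predict the majority training class."""
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    return lambda features: majority


def train_decision_stump(rows):
    """Simplified decision stump: pick the single feature and threshold that
    give the fewest training errors for the rule 'defective if value > threshold'."""
    best = None  # (errors, feature_index, threshold)
    n_features = len(rows[0][0])
    for i in range(n_features):
        for threshold in sorted({x[i] for x, _ in rows}):
            errors = sum((x[i] > threshold) != label for x, label in rows)
            if best is None or errors < best[0]:
                best = (errors, i, threshold)
    _, i, threshold = best
    return lambda features: features[i] > threshold


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(x) == label for x, label in rows) / len(rows)


zero_r = train_zero_r(TRAIN)
stump = train_decision_stump(TRAIN)
zero_r_acc = accuracy(zero_r, TEST)
stump_acc = accuracy(stump, TEST)
print(f"ZeroR accuracy: {zero_r_acc:.2f}")
print(f"Decision stump accuracy: {stump_acc:.2f}")
```

ZeroR serves as a baseline here: any classifier worth using should score higher than a model that always predicts the majority class.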