Machine learning model approach in cyber attack threat detection in security operation center

Muhammad Ajran Saputra, Deris Stiawan, Rahmat Budiarto

Abstract


The expanding role of technology has attracted cybersecurity threats that not only compromise otherwise stable systems but also cause significant financial losses for organizations and individuals. As a result, organizations must create and implement a comprehensive cybersecurity strategy to minimize further losses. One of the best strategies to adopt is to establish a cybersecurity surveillance center, known as a security operation center (SOC), which has become the front line of digital systems protection. We propose optimizing this strategy to prevent or mitigate cyberattacks by analyzing logs and detecting anomalies in them with machine learning models. This study employs two machine learning models: the naïve Bayes model with multinomial, Gaussian, and Bernoulli variants, and the support vector machine (SVM) model with radial basis function (RBF), linear, polynomial, and sigmoid kernel variants. The hyperparameters of both models are optimized, and the optimized models are then trained and tested. The experimental results indicate that the best-performing SVM variant is the RBF kernel model, with an accuracy of 79.75%, precision of 80.8%, recall of 79.75%, and F1-score of 80.01%, and the best-performing naïve Bayes variant is the Gaussian model, with an accuracy of 70.0%, precision of 80.27%, recall of 70.0%, and F1-score of 70.66%. Overall, both models perform relatively well and are classified in the very good category (75%‒89%).
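
For both models, the reported recall is identical to the reported accuracy. That pattern is what support-weighted averaging of per-class scores produces, although the abstract does not state which averaging scheme was used, so treating it as weighted averaging is an assumption. The sketch below uses only the Python standard library and hypothetical toy labels (not the paper's data) to show how the four metrics could be computed and why weighted recall coincides with accuracy.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label, actual in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0  # per-class precision
        r_c = tp / actual                           # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n                         # class share of the data
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical toy labels for illustration only (assumption, not study data).
y_true = ["normal", "normal", "normal", "attack",
          "attack", "normal", "attack", "normal"]
y_pred = ["normal", "normal", "normal", "attack",
          "normal", "normal", "normal", "normal"]

scores = weighted_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because each class's recall is weighted by its share of the samples, the weighted sum reduces to the number of correct predictions divided by the total, which is the accuracy. Weighted precision is not tied to accuracy in the same way, which fits the gap between precision and recall in the reported figures.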

Keywords


Cyber attack; Detection; Hyperparameter; Naïve Bayes; Support vector machine

DOI: https://doi.org/10.11591/csit.v6i1.p80-90

Computer Science and Information Technologies
p-ISSN: 2722-323X, e-ISSN: 2722-3221
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Universitas Ahmad Dahlan (UAD).

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.