ToN-IoT Intrusion Detection Using an XGBoost–LightGBM Stacking Ensemble
DOI:
https://doi.org/10.33005/santika.v6i1.1249
Keywords:
Intrusion Detection, Machine Learning, ToN-IoT, Stacking Ensemble, XGBoost, LightGBM
Abstract
The growing use of the Internet of Things (IoT) and the Industrial Internet of Things (IIoT) across sectors simplifies automation, monitoring, and data exchange, but it also widens the network attack surface. IoT/IIoT systems therefore need intrusion detection mechanisms that accurately distinguish normal traffic from attack traffic. Machine learning approaches are widely used because they learn patterns from data and adapt better than rule-based detection. This study develops an intrusion detection model based on a stacking ensemble that uses XGBoost and LightGBM as base learners and Logistic Regression as the meta learner, trained on the Network subset of the ToN-IoT dataset. The task is binary classification: normal versus attack. The research stages cover data understanding; preprocessing of numeric, boolean, and categorical features; a stratified data split; hyperparameter optimization with Bayesian Optimization; construction of meta-features through out-of-fold prediction; and model evaluation on the test data. The optimization results show that XGBoost, LightGBM, and the Logistic Regression meta learner each achieved a validation PR-AUC above 0.999. On the test data, the stacking ensemble achieved a PR-AUC of 0.999984, a macro-F1 of 0.995202, a recall of 0.997103, and an accuracy of 0.9967. These results indicate that the XGBoost–LightGBM stacking ensemble delivers very strong intrusion detection performance on the Network subset of ToN-IoT and can serve as an alternative machine learning approach to support the development of intrusion detection systems in IoT/IIoT environments.
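The out-of-fold step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two base learners here are simple stand-ins (a centroid scorer and a k-nearest-neighbour scorer) in place of XGBoost and LightGBM, the data are synthetic rather than ToN-IoT, no Bayesian Optimization is performed, and every function and class name is hypothetical. What it shows is only the mechanism: each row's meta-features come from base models trained on stratified folds that exclude that row, and a logistic regression is then fitted on those meta-features.

```python
import math
import random


def stratified_folds(y, k, seed=0):
    """Assign every sample to one of k folds while preserving class ratios."""
    rng = random.Random(seed)
    fold_of = [0] * len(y)
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k
    return fold_of


class CentroidScorer:
    """Stand-in for one base learner: scores by distance to class centroids."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, v in zip(X, y) if v == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, X):
        c0, c1 = self.centroids[0], self.centroids[1]
        # Closer to the attack centroid than to the normal one -> score above 0.5.
        return [1 / (1 + math.exp(math.dist(x, c1) - math.dist(x, c0))) for x in X]


class KnnScorer:
    """Stand-in for the other base learner: attack share among k neighbours."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_proba(self, X):
        scores = []
        for x in X:
            order = sorted(range(len(self.X)), key=lambda i: math.dist(x, self.X[i]))
            scores.append(sum(self.y[i] for i in order[: self.k]) / self.k)
        return scores


def oof_meta_features(factories, X, y, k=5, seed=0):
    """Build meta-features: each row is scored only by models that never saw it."""
    fold_of = stratified_folds(y, k, seed)
    meta = [[None] * len(factories) for _ in X]
    for fold in range(k):
        train = [i for i in range(len(X)) if fold_of[i] != fold]
        held = [i for i in range(len(X)) if fold_of[i] == fold]
        X_train, y_train = [X[i] for i in train], [y[i] for i in train]
        for j, make in enumerate(factories):
            probs = make().fit(X_train, y_train).predict_proba([X[i] for i in held])
            for i, p in zip(held, probs):
                meta[i][j] = p
    return meta, fold_of


def fit_logistic(Z, y, lr=0.5, epochs=500):
    """Meta learner: logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(Z[0]), 0.0, len(Z)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            p = 1 / (1 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            grad_b += p - t
            for j, zj in enumerate(z):
                grad_w[j] += (p - t) * zj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy data (an assumption, not ToN-IoT): 60 "normal" and 40 "attack" points.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(60)]
X += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(40)]
y = [0] * 60 + [1] * 40

meta, fold_of = oof_meta_features([CentroidScorer, KnnScorer], X, y, k=5)
w, b = fit_logistic(meta, y)
pred = [int(sum(wi * zi for wi, zi in zip(w, z)) + b > 0) for z in meta]
# Sanity check on the training rows only; the paper evaluates on a held-out test set.
train_accuracy = sum(p == t for p, t in zip(pred, y)) / len(y)
```

Fitting the meta learner on out-of-fold scores, rather than on scores the base models produce for their own training rows, keeps the meta learner from rewarding base models that merely memorized the data. In practice the stand-in scorers would be replaced by the tuned gradient boosting models, and the final metrics would be computed on a separate stratified test split.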
