Bot Detection Based on Data Characterization

Hélder João Chissingui

doi:10.54580/R0801.14

Authors

Hélder João Chissingui, Instituto Superior Técnico Militar (ISTM), Author. ORCID: https://orcid.org/0000-0002-7538-3865


Keywords:

Bot detection, Meta-learning, Multiclassifiers, Data description

Abstract

In recent years, mitigating bot threats has become a major challenge. Beyond the heavy impact of the malicious activities that bots carry out, the growth of internet use has contributed significantly to the current situation. Damage to IT infrastructure, economic losses, and user dissatisfaction in certain service-delivery environments, among other problems, are directly associated with malicious bots. The problem becomes more complex still because users sometimes run mobile applications with their own user accounts to obtain privileged access to certain e-commerce services. In other words, bots are increasingly sophisticated, so that in some circumstances human activity patterns exhibit the same characteristics as bot activity. At this level of development, detection becomes both harder and more vital. This study proposes a meta-learning-based detection approach that supports detection by characterizing user data (bots and humans). The characterization process relies on a multiclassifier built from data from previous episodes, using a Proactive Forest-based classifier. A statistical analysis was carried out to select the most suitable multiclassifier among the Bagging, Boosting, Voting, and Stacking types. Performance, measured as the percentage of correctly characterized instances, showed that the Voting multiclassifier was the best fit, with an average of 99.6% of instances correctly characterized.
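To make the Voting idea and the evaluation metric concrete, the following is a minimal sketch in Python, not the study's implementation. It combines three base classifiers by hard (majority) voting and reports the percentage of correctly characterized instances. Everything in it is an assumption made for illustration: the session features, the toy data, the thresholds, and the three rule-based classifiers, which stand in for the Proactive Forest-based classifiers mentioned in the abstract.

```python
from collections import Counter

# Hypothetical sessions: (requests per minute, mean seconds between clicks, label).
# The features and values are invented for illustration only.
SESSIONS = [
    (120, 0.2, "bot"), (95, 0.4, "bot"), (80, 3.0, "bot"),
    (12, 0.8, "human"), (8, 4.1, "human"), (15, 2.7, "human"),
    (60, 0.3, "bot"), (20, 6.0, "human"), (45, 0.3, "bot"),
    (30, 1.2, "human"),
]

# Three hypothetical base classifiers (simple threshold rules).
def by_rate(rpm, gap):
    return "bot" if rpm > 50 else "human"

def by_gap(rpm, gap):
    return "bot" if gap < 1.0 else "human"

def by_ratio(rpm, gap):
    return "bot" if rpm / gap > 20 else "human"

BASE_CLASSIFIERS = [by_rate, by_gap, by_ratio]

def hard_vote(rpm, gap):
    """Return the label predicted by the majority of the base classifiers."""
    votes = Counter(clf(rpm, gap) for clf in BASE_CLASSIFIERS)
    return votes.most_common(1)[0][0]

def percent_correct(classifier, sessions):
    """Percentage of sessions whose predicted label matches the true label."""
    hits = sum(classifier(rpm, gap) == label for rpm, gap, label in sessions)
    return 100.0 * hits / len(sessions)

for clf in BASE_CLASSIFIERS + [hard_vote]:
    print(f"{clf.__name__}: {percent_correct(clf, SESSIONS):.1f}% correct")
```

On this toy data the vote corrects the isolated mistakes of each rule because the rules err on different sessions. That behaviour is built into the example and says nothing about the 99.6% average reported in the study.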

