Malicious URLs detection using data streaming algorithms
Open Access
- 9 July 2021
- journal article
- Published by Institute of Research and Community Services Diponegoro University (LPPM UNDIP) in Jurnal Teknologi dan Sistem Komputer
- Vol. 9 (4), 224-229
- https://doi.org/10.14710/jtsiskom.2021.13965
Abstract
As a result of advancements in technology and technological devices, data is now generated at an ever-increasing rate, emanating from a vast array of networks, devices, and daily operations such as credit card transactions and mobile phone use. A data stream consists of sequential, continuous, real-time data in the form of an evolving stream. However, the traditional machine learning approach is characterized by a batch learning model: labeled training data are given a priori to train a model with some machine learning algorithm. This technique requires the entire training sample to be available before the learning process begins, and training is mainly done offline because of its high cost. Consequently, the traditional batch learning technique suffers severe drawbacks, such as poor scalability for real-time detection of phishing websites, and the model usually has to be re-trained from scratch when new training samples arrive. This paper presents the application of streaming algorithms for detecting malicious URLs based on selected online learners: Hoeffding Tree (HT), Naïve Bayes (NB), and Ozabag. Ozabag produced promising results in terms of accuracy, Kappa, and Kappa Temp on the dataset with large samples, while HT and NB had the lowest prediction time, with accuracy and Kappa comparable to those of Ozabag, for the real-time detection of phishing websites.
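The contrast between batch and stream learning described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a small incremental Bernoulli Naïve Bayes learner written from scratch, a test-then-train (prequential) loop, and three hypothetical binary URL features (`has_ip_address`, `has_at_symbol`, `long_url`) on a synthetic toy stream. The class and function names are illustrative only.

```python
import math
from collections import defaultdict


class OnlineBernoulliNB:
    """Incremental Naive Bayes over binary features (Laplace smoothing)."""

    def __init__(self, n_features):
        self.n_features = n_features
        self.class_counts = defaultdict(int)
        # feature_counts[label][j]: instances of `label` seen with feature j == 1
        self.feature_counts = defaultdict(lambda: [0] * n_features)

    def predict(self, x):
        if not self.class_counts:
            return 0  # arbitrary default before any instance has been seen
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_c in self.class_counts.items():
            score = math.log(n_c / total)  # log prior
            for j, v in enumerate(x):
                p1 = (self.feature_counts[label][j] + 1) / (n_c + 2)
                score += math.log(p1 if v else 1.0 - p1)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def learn_one(self, x, y):
        # Update sufficient statistics with a single instance; no re-training.
        self.class_counts[y] += 1
        counts = self.feature_counts[y]
        for j, v in enumerate(x):
            if v:
                counts[j] += 1


def prequential(stream, model):
    """Test-then-train evaluation: predict each instance, then learn from it."""
    n = correct = 0
    confusion = defaultdict(int)  # (true label, predicted label) -> count
    for x, y in stream:
        y_hat = model.predict(x)
        confusion[(y, y_hat)] += 1
        correct += int(y_hat == y)
        n += 1
        model.learn_one(x, y)
    accuracy = correct / n
    labels = {label for pair in confusion for label in pair}
    # Chance agreement from the marginals of the confusion matrix.
    p_chance = sum(
        (sum(c for (t, _), c in confusion.items() if t == label) / n)
        * (sum(c for (_, p), c in confusion.items() if p == label) / n)
        for label in labels
    )
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    return accuracy, kappa


def toy_stream(repeats=50):
    """Synthetic stream; features are [has_ip_address, has_at_symbol, long_url]."""
    patterns = [
        ([1, 0, 1], 1),  # 1 = malicious (hypothetical labelling rule)
        ([0, 0, 0], 0),  # 0 = benign
        ([0, 1, 0], 1),
        ([0, 0, 1], 0),
    ]
    for _ in range(repeats):
        yield from patterns


accuracy, kappa = prequential(toy_stream(), OnlineBernoulliNB(n_features=3))
print(f"prequential accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Because the learner only updates running counts, each instance is processed once and can then be discarded, which is what makes this kind of model suitable for unbounded streams. An ensemble in the style of online bagging would wrap several such learners and present each instance to each of them a random number of times.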
Funding Information
- University of Ilorin