In-Depth Analysis of Phishing Email Detection: Evaluating the Performance of Machine Learning and Deep Learning Models Across Multiple Datasets

Abeer Alhuzali; Ahad Alloqmani; Manar Aljabri; Fatemah Alharbi

In-Depth Analysis of Phishing Email Detection: Evaluating the Performance of Machine Learning and Deep Learning Models Across Multiple Datasets

2025

Abeer Alhuzali | Ahad Alloqmani | Manar Aljabri | Fatemah Alharbi

Phishing emails remain a primary vector for cyberattacks, necessitating advanced detection mechanisms. Existing studies often focus on limited datasets or a small number of models, lacking a comprehensive evaluation approach. This study develops a novel framework for implementing and testing phishing email detection models to address this gap. A total of fourteen machine learning (ML) and deep learning (DL) models are evaluated across ten datasets, including nine publicly available datasets and a merged dataset created for this study. The evaluation is conducted using multiple performance metrics to ensure a comprehensive comparison. Experimental results demonstrate that DL models consistently outperform their ML counterparts in both accuracy and robustness. Notably, transformer-based models BERT and RoBERTa achieve the highest detection accuracies of 98.99% and 99.08%, respectively, on the balanced merged dataset, outperforming traditional ML approaches by an average margin of 4.7%. These findings highlight the superiority of DL in phishing detection and emphasize the potential of AI-driven solutions in strengthening email security systems. This study provides a benchmark for future research and sets the stage for advancements in cybersecurity innovation.

Показать больше [+]