Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis
Date
2025
Publisher
MDPI
Abstract
Image-based spam poses a significant challenge for traditional text-based filters, because malicious content is often embedded within images to bypass keyword detection. This study compares the performance of six machine learning models (ResNet50, XGBoost, Logistic Regression, LightGBM, Support Vector Machine (SVM), and VGG16) on a curated dataset containing 678 legitimate (ham) and 520 spam images. The novelty of this research lies in its side-by-side evaluation of diverse models on the same dataset, using standardized preprocessing, balanced data splits, and common validation techniques. Model performance was assessed using accuracy, precision, recall, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The results indicate that ResNet50 achieved the highest classification performance, followed closely by XGBoost and Logistic Regression. This work provides practical insights into the strengths and limitations of traditional, ensemble-based, and deep learning models for image-based spam detection. The findings can support the development of more effective and generalizable spam filtering solutions for multimedia-rich communication platforms.
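For readers unfamiliar with the metrics named in the abstract, the following is a minimal, dependency-free sketch of how accuracy, precision, recall, and ROC AUC can be computed for a binary ham/spam classifier. It is not the authors' evaluation code. The function names, the label convention (1 = spam, 0 = ham), and the 0.5 decision threshold are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 (spam) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random spam image receives a higher
    score than a random ham image; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores, invented purely for illustration.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
    print(accuracy(y_true, y_pred), precision(y_true, y_pred),
          recall(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason to report them together. As a point of reference for the dataset described above, a classifier that always predicts ham would reach 678/1198 ≈ 0.566 accuracy while recalling no spam at all.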
Keywords
Spam Detection, Image Spam, Machine Learning, Support Vector Machine, XGBoost, Logistic Regression, ResNet50, LightGBM, VGG16
WoS Q
Q2
Scopus Q
Q3
Volume
15
Issue
11