Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis

Date
2025
Publisher
MDPI
Open Access Color
GOLD
Green Open Access
No
Publicly Funded
No
Abstract
Image-based spam poses a significant challenge for traditional text-based filters, because malicious content is often embedded within images to bypass keyword detection. This study investigates and compares the performance of six machine learning models (ResNet50, XGBoost, Logistic Regression, LightGBM, Support Vector Machine (SVM), and VGG16) on a curated dataset containing 678 legitimate (ham) and 520 spam images. The novelty of this research lies in its side-by-side evaluation of diverse models on the same dataset, using standardized preprocessing, balanced data splits, and common validation techniques. Model performance was assessed using accuracy, precision, recall, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC). The results indicate that ResNet50 achieved the highest classification performance, followed closely by XGBoost and Logistic Regression. This work provides practical insights into the strengths and limitations of traditional, ensemble-based, and deep learning models for image-based spam detection. The findings can support the development of more effective and generalizable spam filtering solutions for multimedia-rich communication platforms.
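The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = spam, 0 = ham) and a per-image spam score from some classifier. The function names, the 0.5 threshold, and the toy data are hypothetical and chosen only for illustration; they are not the study's code or results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 (spam) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random spam image outscores a random ham image.

    Ties between a spam score and a ham score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and AUC for one model's scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "auc": roc_auc(y_true, scores),
    }


# Toy data, invented for illustration only (not results from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
metrics = evaluate(y_true, scores)
print(metrics)
```

Running the same `evaluate` function on every model's scores for one shared test split is one way to keep a side-by-side comparison consistent, since accuracy, precision, and recall depend on the chosen threshold while AUC does not.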
Keywords
Spam Detection, Image Spam, Machine Learning, Support Vector Machine, XGBoost, Logistic Regression, ResNet50, LightGBM, VGG16, Technology, T, Physics, QC1-999, Chemistry, QD1-999, Biology (General), QH301-705.5, Engineering (General). Civil engineering (General), TA1-2040
WoS Q
Q2
Scopus Q
Q2

OpenCitations Citation Count
N/A
Source
Applied Sciences
Volume
15
Issue
11
Start Page
6158


