Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis

No Thumbnail Available

Date

2025

Journal Title

Journal ISSN

Volume Title

Publisher

MDPI

Open Access Color

OpenAIRE Downloads

OpenAIRE Views

Research Projects

Organizational Units

Journal Issue

Events

Abstract

Image-based spam poses a significant challenge for traditional text-based filters, as malicious content is often embedded within images to bypass keyword detection techniques. This study investigates and compares the performance of six machine learning models-ResNet50, XGBoost, Logistic Regression, LightGBM, Support Vector Machine (SVM), and VGG16-using a curated dataset containing 678 legitimate (ham) and 520 spam images. The novelty of this research lies in its comprehensive side-by-side evaluation of diverse models on the same dataset, using standardized dataset preprocessing, balanced data splits, and validation techniques. Model performance was assessed using evaluation metrics such as accuracy, receiver operating characteristic (ROC) curve, precision, recall, and area under the curve (AUC). The results indicate that ResNet50 achieved the highest classification performance, followed closely by XGBoost and Logistic Regression. This work provides practical insights into the strengths and limitations of traditional, ensemble-based, and deep learning models for image-based spam detection. The findings can support the development of more effective and generalizable spam filtering solutions in multimedia-rich communication platforms.

Description

Keywords

Spam Detection, Image Spam, Machine Learning, Support Vector Machine, XGBoost, Logistic Regression, ResNet50, LightGBM, VGG16

Turkish CoHE Thesis Center URL

Fields of Science

Citation

WoS Q

Q2

Scopus Q

Q3

Source

Volume

15

Issue

11

Start Page

End Page