Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis

dc.contributor.author Jamil, Mahnoor
dc.contributor.author Trpcheska, Hristina Mihajloska
dc.contributor.author Popovska-Mitrovikj, Aleksandra
dc.contributor.author Dimitrova, Vesna
dc.contributor.author Creutzburg, Reiner
dc.date.accessioned 2025-07-15T18:46:01Z
dc.date.available 2025-07-15T18:46:01Z
dc.date.issued 2025
dc.description.abstract Image-based spam poses a significant challenge for traditional text-based filters, as malicious content is often embedded within images to bypass keyword detection techniques. This study investigates and compares the performance of six machine learning models, namely ResNet50, XGBoost, Logistic Regression, LightGBM, Support Vector Machine (SVM), and VGG16, using a curated dataset containing 678 legitimate (ham) and 520 spam images. The novelty of this research lies in its comprehensive side-by-side evaluation of diverse models on the same dataset, using standardized dataset preprocessing, balanced data splits, and validation techniques. Model performance was assessed using evaluation metrics such as accuracy, receiver operating characteristic (ROC) curve, precision, recall, and area under the curve (AUC). The results indicate that ResNet50 achieved the highest classification performance, followed closely by XGBoost and Logistic Regression. This work provides practical insights into the strengths and limitations of traditional, ensemble-based, and deep learning models for image-based spam detection. The findings can support the development of more effective and generalizable spam filtering solutions in multimedia-rich communication platforms. en_US
dc.description.sponsorship European Union [101082683]; Faculty of Computer Science and Engineering at Ss. Cyril and Methodius University in Skopje en_US
dc.description.sponsorship This work was supported partially by the European Union in the framework of ERASMUS MUNDUS, Project CyberMACS #101082683 and Faculty of Computer Science and Engineering at Ss. Cyril and Methodius University in Skopje en_US
dc.identifier.doi 10.3390/app15116158
dc.identifier.issn 2076-3417
dc.identifier.scopus 2-s2.0-105007702913
dc.identifier.uri https://doi.org/10.3390/app15116158
dc.identifier.uri https://hdl.handle.net/20.500.12469/7387
dc.language.iso en en_US
dc.publisher MDPI en_US
dc.relation.ispartof Applied Sciences
dc.rights info:eu-repo/semantics/closedAccess en_US
dc.subject Spam Detection en_US
dc.subject Image Spam en_US
dc.subject Machine Learning en_US
dc.subject Support Vector Machine en_US
dc.subject XGBoost en_US
dc.subject Logistic Regression en_US
dc.subject ResNet50 en_US
dc.subject LightGBM en_US
dc.subject VGG16 en_US
dc.title Advancing Image Spam Detection: Evaluating Machine Learning Models Through Comparative Analysis en_US
dc.type Article en_US
dspace.entity.type Publication
gdc.author.scopusid 59206867000
gdc.author.scopusid 59938845100
gdc.author.scopusid 55225971200
gdc.author.scopusid 37010805100
gdc.author.scopusid 6602924425
gdc.bip.impulseclass C5
gdc.bip.influenceclass C5
gdc.bip.popularityclass C5
gdc.coar.access metadata only access
gdc.coar.type text::journal::journal article
gdc.collaboration.industrial false
gdc.description.department Kadir Has University en_US
gdc.description.departmenttemp [Jamil, Mahnoor; Trpcheska, Hristina Mihajloska; Popovska-Mitrovikj, Aleksandra; Dimitrova, Vesna] Ss Cyril & Methodius Univ, Fac Comp Sci & Engn, Skopje 1000, North Macedonia; [Jamil, Mahnoor] Kadir Has Univ, Sch Grad Studies, TR-34083 Istanbul, Turkiye; [Creutzburg, Reiner] SRH Univ Appl Sci Heidelberg, Sch Technol & Architecture, D-12059 Berlin, Germany; [Creutzburg, Reiner] TH Brandenburg, Fachbereich Informat & Medien, D-14770 Brandenburg, Germany en_US
gdc.description.issue 11 en_US
gdc.description.publicationcategory Article - International Refereed Journal - Institutional Faculty Member en_US
gdc.description.scopusquality Q2
gdc.description.startpage 6158
gdc.description.volume 15 en_US
gdc.description.woscitationindex Science Citation Index Expanded
gdc.description.wosquality Q2
gdc.identifier.openalex W4410895597
gdc.identifier.wos WOS:001505753600001
gdc.index.type WoS
gdc.index.type Scopus
gdc.oaire.accesstype GOLD
gdc.oaire.diamondjournal false
gdc.oaire.impulse 0.0
gdc.oaire.influence 2.4895952E-9
gdc.oaire.isgreen false
gdc.oaire.keywords Technology
gdc.oaire.keywords Support Vector Machine
gdc.oaire.keywords QH301-705.5
gdc.oaire.keywords T
gdc.oaire.keywords Physics
gdc.oaire.keywords QC1-999
gdc.oaire.keywords Engineering (General). Civil engineering (General)
gdc.oaire.keywords Chemistry
gdc.oaire.keywords machine learning
gdc.oaire.keywords image spam
gdc.oaire.keywords Logistic Regression
gdc.oaire.keywords spam detection
gdc.oaire.keywords TA1-2040
gdc.oaire.keywords Biology (General)
gdc.oaire.keywords QD1-999
gdc.oaire.keywords XGBoost
gdc.oaire.popularity 2.7494755E-9
gdc.oaire.publicfunded false
gdc.openalex.collaboration International
gdc.openalex.fwci 0.0
gdc.openalex.normalizedpercentile 0.16
gdc.openalex.toppercent TOP 10%
gdc.opencitations.count 0
gdc.plumx.mendeley 10
gdc.plumx.scopuscites 0
gdc.scopus.citedcount 0
gdc.wos.citedcount 0
relation.isOrgUnitOfPublication b20623fc-1264-4244-9847-a4729ca7508c
relation.isOrgUnitOfPublication.latestForDiscovery b20623fc-1264-4244-9847-a4729ca7508c
