Predicting User Purchases From Clickstream Data: A Comparative Analysis of Clickstream Data Representations and Machine Learning Models
dc.authorwosid | Tokuç, A. Aylin/Ixn-5337-2023 | |
dc.authorwosid | Dag, Tamer/K-7830-2014 | |
dc.contributor.author | Dağ, Tamer | |
dc.contributor.author | Dag, Tamer | |
dc.contributor.other | Computer Engineering | |
dc.date.accessioned | 2025-04-15T23:42:53Z | |
dc.date.available | 2025-04-15T23:42:53Z | |
dc.date.issued | 2025 | |
dc.department | Kadir Has University | en_US |
dc.department-temp | [Tokuc, A. Aylin] Kadir Has Univ, Dept Comp Engn, TR-34083 Fatih, Istanbul, Turkiye; [Tokuc, A. Aylin] Valinor AI, London SE9 4HA, England; [Dag, Tamer] Amer Univ Middle East, Coll Engn & Technol, Egaila 54200, Kuwait | en_US |
dc.description.abstract | Predicting purchase events from e-commerce clickstream data is a critical challenge with significant implications for optimizing marketing strategies and enhancing customer experience. This study addresses this challenge by systematically evaluating and comparing multiple data representations (aggregated session attributes, recent user actions, and hybrid combinations), bridging gaps in the existing literature and demonstrating the superiority of hybrid approaches. Unlike prior research, which typically focuses on single representations, our approach combines aggregated session-level summaries with granular, sequential user actions to capture both long-term and short-term behavioral patterns. Through comprehensive experimentation, we compared multiple machine learning models, including LightGBM, decision trees, gradient boosting, SVC, and logistic regression, using real-world e-commerce clickstream data. Notably, the hybrid representation with LightGBM achieved superior predictive performance, significantly outperforming alternative methods. Feature importance analysis revealed key factors influencing purchase likelihood, such as time since the last event, session duration, and product interactions. This study provides actionable insights into real-time marketing interventions by demonstrating the practical utility of hybrid data representations and efficient tree-based models. Our findings offer a scalable and interpretable framework for e-commerce platforms to enhance purchase predictions and optimize marketing strategies. | en_US |
dc.description.woscitationindex | Science Citation Index Expanded | |
dc.identifier.doi | 10.1109/ACCESS.2025.3548267 | |
dc.identifier.endpage | 43817 | en_US |
dc.identifier.issn | 2169-3536 | |
dc.identifier.scopus | 2-s2.0-105001064548 | |
dc.identifier.scopusquality | Q1 | |
dc.identifier.startpage | 43796 | en_US |
dc.identifier.uri | https://doi.org/10.1109/ACCESS.2025.3548267 | |
dc.identifier.volume | 13 | en_US |
dc.identifier.wos | WOS:001445086900045 | |
dc.identifier.wosquality | Q2 | |
dc.language.iso | en | en_US |
dc.publisher | IEEE-Inst Electrical Electronics Engineers Inc | en_US |
dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | en_US |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.scopus.citedbyCount | 0 | |
dc.subject | Predictive Models | en_US |
dc.subject | Data Models | en_US |
dc.subject | Hidden Markov Models | en_US |
dc.subject | Electronic Commerce | en_US |
dc.subject | Computational Modeling | en_US |
dc.subject | Data Visualization | en_US |
dc.subject | Analytical Models | en_US |
dc.subject | Real-Time Systems | en_US |
dc.subject | Random Forests | en_US |
dc.subject | Machine Learning Algorithms | en_US |
dc.subject | Clickstream Data | en_US |
dc.subject | Customer Behavior Modeling | en_US |
dc.subject | Data Representations | en_US |
dc.subject | Feature Importance | en_US |
dc.subject | Gradient Boosting | en_US |
dc.subject | E-Commerce | en_US |
dc.subject | LightGBM | en_US |
dc.subject | Machine Learning | en_US |
dc.subject | Model Selection | en_US |
dc.subject | Purchase Prediction | en_US |
dc.title | Predicting User Purchases From Clickstream Data: A Comparative Analysis of Clickstream Data Representations and Machine Learning Models | en_US |
dc.type | Article | en_US |
dc.wos.citedbyCount | 0 | |
dspace.entity.type | Publication | |
relation.isAuthorOfPublication | 6e6ae480-b76e-48a0-a543-13ef44f9d802 | |
relation.isAuthorOfPublication.latestForDiscovery | 6e6ae480-b76e-48a0-a543-13ef44f9d802 | |
relation.isOrgUnitOfPublication | fd8e65fe-c3b3-4435-9682-6cccb638779c | |
relation.isOrgUnitOfPublication.latestForDiscovery | fd8e65fe-c3b3-4435-9682-6cccb638779c |