Product collaborative filtering based recommendation systems for large-scale E-commerce

Bibliographic Details
Published in: International Journal of Information Management Data Insights Vol. 5; no. 1; p. 100322
Main Authors: Trinh, Trang; Nguyen, Van-Ho; Nguyen, Nghia; Nguyen, Duy-Nghia
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.06.2025
ISSN: 2667-0968
Description
Summary:
• E-commerce demands multi-choice products, challenging businesses.
• Recommender systems reshape E-commerce with personalized experiences.
• Scalability is a pressing issue for recommendation systems.
• Parallel techniques tackle scalability challenges in E-commerce.
• Apache Spark accelerates training time for large-scale E-commerce.

The rapid growth in e-commerce and the increasing diversity of customer preferences necessitate the development of an effective recommender system for a business offering a wide range of products. This paper introduces a product-based collaborative filtering approach utilizing Apache Spark, a powerful parallel processing framework, to address the scalability issues of recommender systems in the cloud computing environment. Using Spark's distributed computing capabilities, our model attains a 7.6-fold speedup in training time compared to traditional single-machine methods while preserving accuracy, with a Root Mean Square Error (RMSE) of 0.9. These results demonstrate the effectiveness of parallel and distributed techniques in developing efficient and accurate recommender systems for large-scale e-commerce applications. Future work will focus on applying multi-model approaches to enhance prediction accuracy and on tuning the cluster configuration to optimize the cost of cluster operations.
DOI:10.1016/j.jjimei.2025.100322
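
Illustrative note: the summary above refers to product-based (item-based) collaborative filtering and to RMSE as the accuracy measure. The sketch below shows one common form of that idea on a single machine in plain Python. It is not the authors' Spark implementation: the toy `ratings` data, the choice of cosine similarity, and the helper names `item_vectors`, `cosine`, `predict`, and `rmse` are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical toy data: user -> {product: rating}. Not taken from the paper.
ratings = {
    "u1": {"A": 5.0, "B": 4.0, "C": 1.0},
    "u2": {"A": 4.0, "B": 5.0, "C": 2.0},
    "u3": {"A": 1.0, "B": 2.0, "C": 5.0},
    "u4": {"A": 5.0, "C": 1.0},
}


def item_vectors(ratings):
    """Invert user -> product ratings into product -> user ratings."""
    items = {}
    for user, products in ratings.items():
        for product, rating in products.items():
            items.setdefault(product, {})[user] = rating
    return items


def cosine(a, b):
    """Cosine similarity restricted to users who rated both products."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict(ratings, items, user, target):
    """Similarity-weighted average of the user's ratings on other products."""
    num = den = 0.0
    for product, rating in ratings[user].items():
        if product == target:
            continue
        sim = cosine(items[target], items[product])
        num += sim * rating
        den += abs(sim)
    return num / den if den else None


def rmse(pairs):
    """Root Mean Square Error over (predicted, actual) pairs."""
    pairs = list(pairs)
    return sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))


items = item_vectors(ratings)
prediction = predict(ratings, items, "u4", "B")  # u4 has not rated product B
print(f"Predicted rating of B for u4: {prediction:.2f}")
```

Each product-pair similarity in this sketch is computed independently of the others, which is the property that lets this kind of workload be split across the workers of a distributed framework.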