An efficient parallel algorithm for mining weighted clickstream patterns

Published in: Information Sciences, Volume 582, pp. 349-368
Main authors: Huynh, Huy M.; Nguyen, Loan T.T.; Vo, Bay; Oplatková, Zuzana Komínková; Fournier-Viger, Philippe; Yun, Unil
Format: Journal Article
Language: English
Published: Elsevier Inc, 01.01.2022
ISSN: 0020-0255, 1872-6291
Description
Summary:
• We propose a parallel depth-first search with dynamic load balancing.
• We propose a parallel algorithm called PCompact-SPADE for mining weighted frequent clickstream patterns.
• We experiment on various datasets to illustrate the algorithm’s performance and scalability.

In the Internet age, analyzing the behavior of online users can help webstore owners understand customers’ interests. Insights from such analysis can be used to improve both user experience and website design. A prominent task in online behavior analysis is clickstream mining, which identifies customer browsing patterns that reveal how users interact with websites. Recently, this task was extended to consider weights in order to find more impactful patterns. However, most algorithms for mining weighted clickstream patterns are serial: they execute from start to finish on a single thread. Real-world datasets are often very large, and serial algorithms can have long runtimes because they do not take full advantage of the parallelism of modern multi-core CPUs. To address this limitation, this paper presents two parallel algorithms, DPCompact-SPADE (Depth load balancing Parallel Compact-SPADE) and APCompact-SPADE (Adaptive Parallel Compact-SPADE), for weighted clickstream pattern mining. Experiments on various datasets show that the proposed parallel algorithms are efficient and outperform state-of-the-art serial algorithms in terms of runtime, memory consumption, and scalability.
DOI: 10.1016/j.ins.2021.08.070