A parallel hybrid approach integrating clonal selection with artificial bee colony for logistic regression in spam email detection

Bibliographic Details
Published in:	Neural computing & applications Vol. 37; no. 27; pp. 22401 - 22419
Main Authors:	Dedeturk, Bilge Kagan; Akay, Bahriye
Format:	Journal Article
Language:	English
Published:	London: Springer London, 01.09.2025; Springer Nature B.V.
Subjects:	Accuracy; Algorithms; Artificial Intelligence; Bees; Blacklisting; Classification; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data Mining and Knowledge Discovery; Datasets; Deep learning; Efficiency; Electronic mail systems; Feature selection; Image Processing and Computer Vision; Machine learning; Methods; Neural networks; Optimization; Probability and Statistics in Computer Science; Regression analysis; Robustness; S.I.: Hybrid Approaches to Nature-inspired Optimization Algorithms and Their Applications; Spamming; Special Issue on Hybrid Approaches to Nature-inspired Optimization Algorithms and Their Applications; Success; Support vector machines; Swarm intelligence; Logistic regression; Spam filtering; Artificial bee colony; Email spam detection; Clonal selection algorithm
ISSN:	0941-0643, 1433-3058

Description
Summary:	Spam emails are sent to recipients for advertisement and phishing purposes. In either case, they disturb recipients and reduce communication quality. Addressing this issue requires classifying emails on servers as either spam or ham. Numerous methods have been proposed for this classification task. Among them, logistic regression (LR) stands out for its simplicity, speed, and ease of implementation. However, LR suffers from low detection rates caused by the gradient descent algorithm used in its training phase. To overcome this limitation, we propose a novel method based on the clonal selection algorithm (CSA), renowned for its success in optimization problems due to its local and global search capabilities. Despite CSA’s effective optimization performance, it suffers from limited robustness and slow training. Therefore, the CSA and artificial bee colony (ABC) algorithms are hybridized to improve CSA’s robustness and are parallelized to reduce the training time significantly. This hybrid method is employed to optimize the weights of LR by minimizing the cost at the output of LR. The empirical results indicate that the proposed method, named CSA–ABC–LR, yields better classification performance than state-of-the-art models reported in previous studies, achieving an accuracy of 99.13% on the Enron-1 dataset, 99.22% on the CSDMC2010 dataset, and 94.49% on the Spambase dataset.
DOI:	10.1007/s00521-024-10505-7
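
Illustration: The Summary states that the weights of LR are optimized by a population-based search that minimizes the LR cost, in place of gradient descent. The sketch below shows only that general idea in Python, under loud assumptions: it is not the authors' CSA–ABC–LR, it omits the ABC phase and the parallelization entirely, and every name and parameter in it (`lr_cost`, `clonal_search`, `pop_size`, `n_clones`, the mutation schedule, the toy data) is hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def lr_cost(weights, X, y):
    """Mean cross-entropy of a logistic regression model; weights[0] is the bias."""
    eps = 1e-12
    total = 0.0
    for row, label in zip(X, y):
        z = weights[0] + sum(w * x for w, x in zip(weights[1:], row))
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(X)


def clonal_search(X, y, pop_size=20, n_clones=5, generations=60, seed=0):
    """Hypothetical clonal-selection-style search over LR weights (assumption,
    not the published algorithm): clone good candidates, mutate the clones,
    keep the best of each family, and refill with random newcomers."""
    rng = random.Random(seed)
    dim = len(X[0]) + 1  # one weight per feature plus the bias
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: lr_cost(w, X, y))
        next_pop = []
        for rank, parent in enumerate(pop[: pop_size // 2]):
            # Better-ranked parents receive smaller mutations (local search).
            step = 0.5 * (rank + 1) / pop_size
            clones = [[w + rng.gauss(0, step) for w in parent]
                      for _ in range(n_clones)]
            # The parent competes with its clones, so the best cost never worsens.
            next_pop.append(min(clones + [parent], key=lambda w: lr_cost(w, X, y)))
        # Random newcomers keep some global exploration.
        while len(next_pop) < pop_size:
            next_pop.append([rng.uniform(-1, 1) for _ in range(dim)])
        pop = next_pop
    return min(pop, key=lambda w: lr_cost(w, X, y))


# Toy, made-up data: two features, label 1 stands for "spam".
X = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]
best = clonal_search(X, y)
print("cost with zero weights:", lr_cost([0.0, 0.0, 0.0], X, y))
print("cost after search:     ", lr_cost(best, X, y))
```

Because the search only needs cost evaluations, each candidate can be scored independently, which is what makes this family of methods a natural fit for the parallelization the Summary mentions.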