GSC-DVIT: A vision transformer based deep learning model for lung cancer classification in CT images

Bibliographic Details
Published in: Biomedical Signal Processing and Control, Vol. 103; p. 107371
Main Authors: Mannepalli, Durgaprasad; Kuan Tak, Tan; Bala Krishnan, Sivaneasan; Sreenivas, Velagapudi
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.05.2025
ISSN: 1746-8094
Description
Summary:
• To remove noise from the input images, the Gaussian enclosed Bilateral Filtering (GaBF) method is used.
• To extract deep features with low dimensionality, a Conditional Variational Autoencoder (CVA) model is used.
• To detect lung cancer, a GroupWise Separable Convolutional based dual attention assisted ViT (gSC-DViT) is implemented.

Vision transformer (ViT)-based techniques are advancing in medical artificial intelligence (AI) and cancer imaging, including lung cancer applications. Recently, numerous works have applied ViT-based AI techniques to computed tomography (CT) images for lung cancer diagnosis and prognosis. However, the existing methods often suffer from large parameter counts and high computational complexity, particularly when training data are limited. Therefore, this paper proposes a lightweight vision transformer (LwViT)-based deep learning (DL) approach for classifying lung cancer from CT images. First, the proposed model pre-processes the images with Gaussian enclosed Bilateral Filtering (GaBF) to remove noise. Then, features are extracted using a conditional variational autoencoder (CVA). Based on the extracted features, a GroupWise Separable Convolutional based dual attention-assisted Vision Transformer (gSC-DViT) performs the classification. The hyperparameters of the gSC-DViT model are tuned with a puma optimizer to minimize the error and maximize the classification rate. The proposed LwViT-DL model is implemented in Python and evaluated on the Chest CT Scan image dataset and the Lung Cancer dataset, and its performance is compared with existing classifiers on several evaluation measures. The maximum classification accuracy obtained by the LwViT-DL model is 99.52% on the Chest CT Scan image dataset and 99.69% on the Lung Cancer dataset, higher than that of the existing classifiers it was compared with.
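
The abstract's concern with large parameter counts can be made concrete with a small calculation. The sketch below is illustrative and not taken from the paper: it counts the weights of a standard convolution, a grouped convolution, and a depthwise separable convolution using the usual formulas. The function names (`conv_params`, `separable_params`) and the layer sizes are assumptions chosen for the example, and the record does not specify how gSC-DViT combines grouping and separability.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with `groups` groups (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one k x k filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # 1 x 1 conv mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration (not from the paper).
c_in, c_out, k = 64, 128, 3
print(conv_params(c_in, c_out, k))            # 73728
print(conv_params(c_in, c_out, k, groups=4))  # 18432
print(separable_params(c_in, c_out, k))       # 8768
```

For these assumed sizes, the separable layer needs roughly an eighth of the weights of the standard one, which is the kind of saving that motivates lightweight designs when training data are limited.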
DOI: 10.1016/j.bspc.2024.107371