Multiobjective feature selection for microarray data via distributed parallel algorithms

Bibliographic details
Published in:	Future Generation Computer Systems, Vol. 100, pp. 952-981
Main authors:	Cao, Bin; Zhao, Jianwei; Yang, Po; Yang, Peng; Liu, Xin; Qi, Jun; Simpson, Andrew; Elhoseny, Mohamed; Mehmood, Irfan; Muhammad, Khan
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 1 November 2019
Subjects:	Distributed parallelism; Feature redundancy; High dimension; Microarray dataset; Multiobjective feature selection
ISSN:	0167-739X, 1872-7115

Description
Summary:	Many real-world problems are large in scale and hence difficult to address. Due to the large number of features in microarray datasets, feature selection and classification are even more challenging for such datasets. Not all of these numerous features contribute to the classification task, and some even impede performance. Through feature selection, a feature subset that contains only a small quantity of essential features can be generated to increase the classification accuracy and significantly reduce the time consumption. In this paper, we construct a multiobjective feature selection model that simultaneously considers the classification error, the feature number and the feature redundancy. For this model, we propose several distributed parallel algorithms based on different encodings and an adaptive strategy. Additionally, to reduce the time consumption, various tactics are employed, including a feature number constraint, distributed parallelism and sample-wise parallelism. For a batch of microarray datasets, the proposed algorithms are superior to several state-of-the-art multiobjective evolutionary algorithms in terms of both effectiveness and efficiency.
Highlights:
• A multiobjective feature selection model is presented and tackled.
• Algorithms with two encoding methodologies are proposed.
• An adaptive technique is explored.
• An explicit feature number threshold and distributed parallelism are employed for efficiency.
DOI:	10.1016/j.future.2019.02.030