Feature Selection for Classification using Principal Component Analysis and Information Gain
| Published in: | Expert Systems with Applications, Volume 174, p. 114765 |
|---|---|
| Main authors: | Erick Odhiambo Omuya; George Onyango Okeyo; Michael Waema Kimwele |
| Medium: | Journal Article |
| Language: | English |
| Published: | New York: Elsevier, 15 July 2021 |
| ISSN: | 0957-4174 (print); 1873-6793 (electronic) |
| Online access: | Get full text |
Abstract

Highlights:
• Feature selection improves the performance of machine learning algorithms.
• Feature selection with more n-tier techniques is simpler and more stable.
• A feature selection model that is not specific to any data set is widely applicable.

Feature selection and classification have previously been widely applied in areas such as business, medicine and media. High dimensionality in datasets is one of the main challenges in data classification, data mining and sentiment analysis. Irrelevant and redundant attributes also increase the complexity of classification algorithms and degrade their operation; consequently, the algorithms record poor results. Some existing work uses all attributes for classification, including attributes that are insignificant for the task, leading to poor performance. This paper therefore develops a hybrid filter model for feature selection based on principal component analysis and information gain. The hybrid model is then applied to support classification using machine learning techniques, e.g. the Naïve Bayes technique. Experimental results demonstrate that the hybrid filter model reduces data dimensionality, selects appropriate feature sets, and reduces training time, hence providing better classification performance as measured by accuracy, precision and recall.
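The abstract describes a filter-style pipeline: rank and reduce features with information gain and principal component analysis, then classify with Naïve Bayes and score by accuracy, precision and recall. The paper's exact hybrid model is not reproduced here; a minimal sketch of the general idea, assuming scikit-learn (where `mutual_info_classif` estimates information gain), might look like this:

```python
# Hypothetical sketch (not the authors' exact method): keep the
# features with the highest estimated information gain, project the
# survivors with PCA, then classify with Gaussian Naive Bayes.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

# Any labeled tabular dataset works; breast cancer (30 features) is a stand-in.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

pipe = Pipeline([
    # Filter step 1: keep the 15 features with the highest estimated
    # mutual information (information gain) with the class label.
    ("ig", SelectKBest(mutual_info_classif, k=15)),
    # Filter step 2: reduce the surviving features to 5 principal components.
    ("pca", PCA(n_components=5)),
    ("nb", GaussianNB()),
])
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)

print(f"accuracy  {accuracy_score(y_test, y_pred):.3f}")
print(f"precision {precision_score(y_test, y_pred):.3f}")
print(f"recall    {recall_score(y_test, y_pred):.3f}")
```

The choices of `k=15` and 5 components are illustrative; in practice they would be tuned, e.g. with cross-validation over the pipeline.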
Authors:
• Erick Odhiambo Omuya (omuya.erick@mksu.ac.ke), School of Engineering and Technology, Machakos University, Kenya
• George Onyango Okeyo (gokeyo@andrew.cmu.edu), Carnegie Mellon University Africa
• Michael Waema Kimwele (mkimwele@jkuat.ac.ke), School of Computing and Information Technology, Jomo Kenyatta University of Agriculture and Technology, Kenya
Copyright: 2021 Elsevier Ltd
DOI: 10.1016/j.eswa.2021.114765
Keywords: Feature selection; Dimensionality reduction; Filter model; Information gain; Classification; Principal component analysis