Performance and efficiency of machine learning algorithms for analyzing rectangular biomedical data
Most biomedical datasets, including those of ‘omics, population studies, and surveys, are rectangular in shape and have few missing data. Recently, their sample sizes have grown significantly. Rigorous analyses on these large datasets demand considerably more efficient and more accurate algorithms....
| Published in: | Laboratory investigation Vol. 101; no. 4; pp. 430 - 441 |
|---|---|
| Main Authors: | Deng, Fei; Huang, Jibing; Yuan, Xiaoling; Cheng, Chao; Zhang, Lanjing |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Elsevier Inc, 01.04.2021; Nature Publishing Group US; Nature Publishing Group |
| Subjects: | |
| ISSN: | 0023-6837, 1530-0307 |
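
The abstract in this record describes ranking the 54 dichotomized features by information entropy and keeping only the most informative ones. The following is a minimal sketch of that idea, not the authors' MATLAB implementation: it computes the information gain of each binary feature with respect to a multi-category outcome and keeps the top `k`. The function names, the toy data, and the value of `k` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature, labels):
    """Outcome entropy minus the conditional entropy given one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy data (hypothetical, not taken from the study).
labels = ["alive", "alive", "breast cancer", "breast cancer", "other", "other"]
columns = {
    "feature_a": [0, 0, 1, 1, 1, 1],  # separates "alive" from the rest
    "feature_b": [0, 1, 0, 1, 0, 1],  # independent of the outcome
}

ranking = rank_features(columns, labels)
k = 1  # assumed cut-off; the study reports needing 32 or more of 54 features
selected = [name for name, _ in ranking[:k]]
```

In this toy case `feature_a` ranks first because it splits off one outcome category cleanly, while `feature_b` carries no information about the outcome. In practice the cut-off `k` would be chosen by checking cross-validated accuracy as features are removed, which is the kind of comparison the abstract reports.
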
| Abstract | Most biomedical datasets, including those of 'omics, population studies, and surveys, are rectangular in shape and have few missing data. Recently, their sample sizes have grown significantly. Rigorous analyses of these large datasets demand considerably more efficient and more accurate algorithms. Machine learning (ML) algorithms, including random forests (RF), decision trees (DT), artificial neural networks (ANN), and support vector machines (SVM), have been used to classify outcomes in biomedical datasets. However, their performance and efficiency in classifying multi-category outcomes of rectangular data are poorly understood. Therefore, we compared these metrics among the four ML algorithms. As an example, we created a large rectangular dataset from the female breast cancers in the Surveillance, Epidemiology, and End Results-18 (SEER-18) database that were diagnosed in 2004 and followed up until December 2016. The outcome was the five-category cause of death: alive, non-breast cancer, breast cancer, cardiovascular disease, and other cause. We analyzed the 54 dichotomized features from ~45,000 patients using MATLAB (version 2018a) and tenfold cross-validation. The accuracy in classifying the five-category cause of death with DT, RF, ANN, and SVM was 69.21%, 70.23%, 70.16%, and 69.06%, respectively, each higher than the 68.12% accuracy of multinomial logistic regression. Based on the features' information entropy, we optimized dimension reduction (i.e., reducing the number of features in the models). We found that 32 or more features were required to maintain similar accuracy, while the running time decreased from 55.57 s with 54 features to 25.99 s with 32 features in RF, from 12.92 s to 10.48 s in ANN, and from 175.50 s to 67.81 s in SVM. In summary, we show that RF, DT, ANN, and SVM had similar accuracy for classifying multi-category outcomes in this large rectangular dataset. Dimension reduction based on information gain will increase the model's efficiency while maintaining classification accuracy. |
|---|---|
| AbstractList | Most current biomedical datasets are rectangular in shape and have few missing data, but the sample sizes are very large. Rigorous analyses of these huge datasets demand considerably more efficient and more accurate machine-learning algorithms to classify outcomes. This paper aims to determine the performance and efficiency of classifying multi-category outcomes of such rectangular data. |
| Author | Huang, Jibing; Yuan, Xiaoling; Cheng, Chao; Deng, Fei; Zhang, Lanjing |
| Author_xml | – sequence: 1 givenname: Fei orcidid: 0000-0002-7600-6295 surname: Deng fullname: Deng, Fei organization: School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China – sequence: 2 givenname: Jibing surname: Huang fullname: Huang, Jibing organization: School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai, China – sequence: 3 givenname: Xiaoling surname: Yuan fullname: Yuan, Xiaoling organization: Department of Infectious Disease, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine Shanghai, Shanghai, China – sequence: 4 givenname: Chao orcidid: 0000-0002-5002-3417 surname: Cheng fullname: Cheng, Chao organization: Department of Medicine, Baylor College of Medicine, 77030, Houston, TX, USA – sequence: 5 givenname: Lanjing orcidid: 0000-0001-5436-887X surname: Zhang fullname: Zhang, Lanjing email: lanjing.zhang@rutgers.edu organization: Department of Pathology, Princeton Medical Center, Plainsboro, NJ, USA |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/33574440$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1093_bib_bbaf398 crossref_primary_10_3390_s24227258 crossref_primary_10_1007_s10586_025_05318_9 crossref_primary_10_53941_tai_2025_100005 crossref_primary_10_1016_j_labinv_2023_100320 crossref_primary_10_1109_ACCESS_2024_3367602 crossref_primary_10_1002_rcm_9991 crossref_primary_10_1016_j_artmed_2025_103074 crossref_primary_10_1016_j_intonc_2025_06_003 crossref_primary_10_1016_j_jad_2023_03_080 crossref_primary_10_1016_j_saa_2021_119680 crossref_primary_10_2196_28036 crossref_primary_10_1038_s41598_025_08915_1 crossref_primary_10_32604_cmc_2022_029823 crossref_primary_10_1007_s10994_024_06556_5 crossref_primary_10_2166_ws_2024_141 crossref_primary_10_1038_s41374_021_00662_x crossref_primary_10_1186_s12911_025_02990_0 crossref_primary_10_1007_s10439_024_03459_3 crossref_primary_10_1007_s11356_021_17956_8 crossref_primary_10_1039_D4SU00705K crossref_primary_10_1007_s12665_025_12420_z crossref_primary_10_1145_3687025 crossref_primary_10_1093_carcin_bgad062 crossref_primary_10_1038_s41598_025_02954_4 crossref_primary_10_3390_buildings14123774 crossref_primary_10_1080_07853890_2024_2357742 crossref_primary_10_1016_j_ins_2024_120270 crossref_primary_10_12998_wjcc_v13_i22_104379 crossref_primary_10_7717_peerj_cs_2418 crossref_primary_10_3390_rs15225338 crossref_primary_10_1016_j_afres_2025_100952 crossref_primary_10_1016_j_pbiomolbio_2023_03_001 crossref_primary_10_1186_s12872_022_02999_7 crossref_primary_10_1007_s10815_024_03372_7 crossref_primary_10_1177_21514593231179316 crossref_primary_10_1016_j_scitotenv_2024_173605 |
| Cites_doi | 10.6004/jnccn.2018.0083 10.1007/s12665-014-3661-3 10.1023/A:1010933404324 10.1016/j.csbj.2014.11.005 10.1002/cncr.32648 10.1038/s41374-018-0125-5 10.5858/arpa.2017-0099-OA 10.6004/jnccn.2019.0009 10.3322/caac.21590 10.1093/bioinformatics/btn374 10.1089/dna.2018.4533 10.1016/j.cmpb.2019.04.008 10.1016/j.catena.2019.104179 10.1111/his.13328 10.1111/his.12936 10.5858/arpa.2019-0435-OA 10.1109/ACCESS.2019.2930235 10.7150/thno.22065 10.1016/j.ejca.2011.06.016 10.4236/jbise.2013.65070 10.1109/2.485891 10.1111/jcmm.14231 10.1016/j.catena.2018.01.005 10.1186/1471-2164-9-S1-S13 10.1016/j.asr.2020.01.036 10.1109/21.97458 10.2202/1544-6115.1492 10.1007/11941439_114 10.5772/50893 10.1023/A:1022627411411 10.1109/EBBT.2018.8391453 |
| ContentType | Journal Article |
| Copyright | 2021 United States & Canadian Academy of Pathology The Author(s), under exclusive licence to United States and Canadian Academy of Pathology 2021 The Author(s), under exclusive licence to United States and Canadian Academy of Pathology 2021. |
| Copyright_xml | – notice: 2021 United States & Canadian Academy of Pathology – notice: The Author(s), under exclusive licence to United States and Canadian Academy of Pathology 2021 – notice: The Author(s), under exclusive licence to United States and Canadian Academy of Pathology 2021. |
| DBID | 6I. AAFTH AAYXX CITATION CGR CUY CVF ECM EIF NPM 3V. 7QL 7QP 7QR 7T5 7T7 7TK 7TM 7U9 7X7 7XB 88E 8AO 8C1 8FD 8FE 8FH 8FI 8FJ 8FK ABUWG AFKRA AZQEC BBNVY BENPR BHPHI C1K CCPQU DWQXO FR3 FYUFA GHDGH GNUQQ H94 HCIFZ K9. LK8 M0S M1P M7N M7P P64 PHGZM PHGZT PJZUB PKEHL PPXIY PQEST PQGLB PQQKQ PQUKI PRINS 7X8 |
| DOI | 10.1038/s41374-020-00525-x |
| DatabaseName | ScienceDirect Open Access Titles Elsevier:ScienceDirect:Open Access CrossRef Medline MEDLINE MEDLINE (Ovid) MEDLINE MEDLINE PubMed ProQuest Central (Corporate) Bacteriology Abstracts (Microbiology B) Calcium & Calcified Tissue Abstracts Chemoreception Abstracts Immunology Abstracts Industrial and Applied Microbiology Abstracts (Microbiology A) Neurosciences Abstracts Nucleic Acids Abstracts Virology and AIDS Abstracts Health & Medical Collection ProQuest Central (purchase pre-March 2016) Medical Database (Alumni Edition) ProQuest Pharma Collection Public Health Database Technology Research Database ProQuest SciTech Collection ProQuest Natural Science Journals Hospital Premium Collection Hospital Premium Collection (Alumni Edition) ProQuest Central (Alumni) (purchase pre-March 2016) ProQuest Central (Alumni) ProQuest Central UK/Ireland ProQuest Central Essentials Biological Science Collection ProQuest Central Database Suite (ProQuest) Natural Science Collection Environmental Sciences and Pollution Management ProQuest One Community College ProQuest Central Engineering Research Database Health Research Premium Collection Health Research Premium Collection (Alumni) ProQuest Central Student AIDS and Cancer Research Abstracts SciTech Premium Collection ProQuest Health & Medical Complete (Alumni) Biological Sciences ProQuest Health & Medical Collection Medical Database ProQuest Algology Mycology and Protozoology Abstracts (Microbiology C) Biological Science Database Biotechnology and BioEngineering Abstracts Proquest Central Premium ProQuest One Academic (New) ProQuest Health & Medical Research Collection ProQuest One Academic Middle East (New) ProQuest One Health & Nursing ProQuest One Academic Eastern Edition (DO NOT USE) ProQuest One Applied & Life Sciences ProQuest One Academic (retired) ProQuest One Academic UKI Edition ProQuest Central China MEDLINE - Academic |
| DatabaseTitle | CrossRef MEDLINE Medline Complete MEDLINE with Full Text PubMed MEDLINE (Ovid) ProQuest Central Student ProQuest Central Essentials Nucleic Acids Abstracts SciTech Premium Collection ProQuest Central China Environmental Sciences and Pollution Management ProQuest One Applied & Life Sciences Health Research Premium Collection Natural Science Collection Health & Medical Research Collection Biological Science Collection Chemoreception Abstracts Industrial and Applied Microbiology Abstracts (Microbiology A) ProQuest Central (New) ProQuest Medical Library (Alumni) Virology and AIDS Abstracts ProQuest Biological Science Collection ProQuest One Academic Eastern Edition ProQuest Hospital Collection Health Research Premium Collection (Alumni) Biological Science Database Neurosciences Abstracts ProQuest Hospital Collection (Alumni) Biotechnology and BioEngineering Abstracts ProQuest Health & Medical Complete ProQuest One Academic UKI Edition Engineering Research Database ProQuest One Academic Calcium & Calcified Tissue Abstracts ProQuest One Academic (New) Technology Research Database ProQuest One Academic Middle East (New) ProQuest Health & Medical Complete (Alumni) ProQuest Central (Alumni Edition) ProQuest One Community College ProQuest One Health & Nursing ProQuest Natural Science Collection ProQuest Pharma Collection ProQuest Central ProQuest Health & Medical Research Collection Health and Medicine Complete (Alumni Edition) ProQuest Central Korea Bacteriology Abstracts (Microbiology B) Algology Mycology and Protozoology Abstracts (Microbiology C) AIDS and Cancer Research Abstracts ProQuest Public Health ProQuest SciTech Collection ProQuest Medical Library Immunology Abstracts ProQuest Central (Alumni) MEDLINE - Academic |
| DatabaseTitleList | ProQuest Central Student MEDLINE MEDLINE - Academic |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: BENPR name: ProQuest Central url: https://www.proquest.com/central sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Medicine |
| EISSN | 1530-0307 |
| EndPage | 441 |
| ExternalDocumentID | 33574440 10_1038_s41374_020_00525_x S0023683722006481 |
| Genre | Journal Article |
| GroupedDBID | --- -Q- -~X .55 .GJ 0R~ 1KJ 29L 2WC 36B 39C 3V. 4.4 53G 5GY 5RE 6I. 70F 7X7 88E 8AO 8C1 8FI 8FJ 8R4 8R5 8WZ A6W AADWK AAFTH AANZL AAWBL AAXUO AAYFA AAYJO AAZLF ABAWZ ABCQX ABGIJ ABJNI ABLJU ABUWG ACBMV ACBRV ACBYP ACGFO ACGFS ACIGE ACIWK ACKTT ACPRK ACRQY ACTTH ACVWB ACZOJ ADBBV ADHDB ADMDM ADQMX ADYYL AEDAW AEFTE AEJRE AENEX AEXYK AFFNX AFKRA AFOSN AFRAH AFSHS AGAYW AGEZK AGGBP AGHAI AHMBA AHPSJ AHSBF AILAN AJDOV AJRNO ALFFA ALMA_UNASSIGNED_HOLDINGS AMRAJ AMRJV AMYLF ASPBG AVWKF AXYYD AZFZN BAWUL BBNVY BENPR BHPHI BKKNO BPHCQ BVXVI CAG CCPQU COF CS3 DIK DNIVK DU5 E3Z EBLON EBS EE. EIOEI EJD EMB F5P FDB FDQFY FEDTE FERAY FIZPM FSGXE FYUFA GX1 HCIFZ HMCUK HVGLF HZ~ IH2 IWAJR JSO JZLTJ KQ8 M1P M7P MVM NAO NQJWS NYICJ O9- OK1 P2P P6G PQQKQ PROAC PSQYO Q2X RNS RNT RNTTT ROL S10 SNX SNYQT SOHCF SRMVM SV3 SWTZT TAOOD TBHMF TDRGL TR2 TSG TWZ UKHRP X7M Y6R YFH YKV YOC YQI YQT ZA5 ZGI ZXP AAHOK AALRI ADVLN AFJKZ AITUG AKRWK ALIPV PKN AAYWO AAYXX ACVFH ADCNI ADXHL AEUPX AFFHD AFPUW AIGII AKBMS AKYEP APXCP CITATION EFKBS PHGZM PHGZT PJZUB PPXIY PQGLB CGR CUY CVF ECM EIF NPM 7QL 7QP 7QR 7T5 7T7 7TK 7TM 7U9 7XB 8FD 8FE 8FH 8FK AZQEC C1K DWQXO FR3 GNUQQ H94 K9. LK8 M7N P64 PKEHL PQEST PQUKI PRINS 7X8 PUEGO |
| ID | FETCH-LOGICAL-c472t-20332395d17ed8422c3ed7782ef47edf4d8b4510f8bf1d732b873b335e1aa31e3 |
| IEDL.DBID | M7P |
| ISICitedReferencesCount | 49 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000617101000002&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0023-6837 1530-0307 |
| IngestDate | Sun Sep 28 04:49:35 EDT 2025 Mon Oct 06 16:59:44 EDT 2025 Thu Apr 03 06:52:31 EDT 2025 Sat Nov 29 07:26:03 EST 2025 Tue Nov 18 20:58:53 EST 2025 Fri Feb 21 02:38:02 EST 2025 Fri Feb 23 02:39:23 EST 2024 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | English |
| License | This article is made available under the Elsevier license. |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c472t-20332395d17ed8422c3ed7782ef47edf4d8b4510f8bf1d732b873b335e1aa31e3 |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 |
| ORCID | 0000-0002-5002-3417 0000-0002-7600-6295 0000-0001-5436-887X |
| OpenAccessLink | https://dx.doi.org/10.1038/s41374-020-00525-x |
| PMID | 33574440 |
| PQID | 2503525700 |
| PQPubID | 25033 |
| PageCount | 12 |
| ParticipantIDs | proquest_miscellaneous_2489257804 proquest_journals_2503525700 pubmed_primary_33574440 crossref_primary_10_1038_s41374_020_00525_x crossref_citationtrail_10_1038_s41374_020_00525_x springer_journals_10_1038_s41374_020_00525_x elsevier_sciencedirect_doi_10_1038_s41374_020_00525_x |
| PublicationCentury | 2000 |
| PublicationDate | April 2021 20210400 2021-04-00 20210401 |
| PublicationDateYYYYMMDD | 2021-04-01 |
| PublicationDate_xml | – month: 04 year: 2021 text: April 2021 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York – name: United States |
| PublicationSubtitle | Advancing the understanding of human and experimental disease |
| PublicationTitle | Laboratory investigation |
| PublicationTitleAbbrev | Lab Invest |
| PublicationTitleAlternate | Lab Invest |
| PublicationYear | 2021 |
| Publisher | Elsevier Inc Nature Publishing Group US Nature Publishing Group |
| Publisher_xml | – name: Elsevier Inc – name: Nature Publishing Group US – name: Nature Publishing Group |
| StartPage | 430 |
| SubjectTerms | 14/105; 631/1647/48; 692/308/2056; Accuracy; Aged; Algorithms; Artificial neural networks; Biomedical data; Breast cancer; Breast Neoplasms - diagnosis; Breast Neoplasms - epidemiology; Cardiovascular diseases; Classification; Databases, Factual; Datasets; Decision trees; Diagnosis, Computer-Assisted - methods; Efficiency; Entropy (Information theory); Epidemiology; Female; Humans; Laboratory Medicine; Learning algorithms; Learning theory; Machine Learning; Medicine; Medicine & Public Health; Middle Aged; Missing data; Model accuracy; Neural networks; Pathology; Population studies; Reduction; Regression analysis; Reproducibility of Results; Support Vector Machine; Support vector machines |
| Title | Performance and efficiency of machine learning algorithms for analyzing rectangular biomedical data |
| URI | https://dx.doi.org/10.1038/s41374-020-00525-x https://link.springer.com/article/10.1038/s41374-020-00525-x https://www.ncbi.nlm.nih.gov/pubmed/33574440 https://www.proquest.com/docview/2503525700 https://www.proquest.com/docview/2489257804 |
| Volume | 101 |