Sample size and statistical power considerations in high-dimensionality data settings: a comparative study of classification algorithms

Bibliographic Details
Published in: BMC Bioinformatics, Vol. 11, no. 1, p. 447
Main Authors: Guo, Yu; Graber, Armin; McBurney, Robert N; Balasubramanian, Raji
Format: Journal Article
Language: English
Published: London: BioMed Central, 03.09.2010
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN: 1471-2105
Abstract
Background: Data generated using 'omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of subjects in the study. In this paper, we consider issues relevant in the design of biomedical studies in which the goal is the discovery of a subset of features and an associated algorithm that can predict a binary outcome, such as disease status. We compare the performance of four commonly used classifiers (K-Nearest Neighbors, Prediction Analysis for Microarrays, Random Forests and Support Vector Machines) in high-dimensionality data settings. We evaluate the effects of varying levels of signal-to-noise ratio in the dataset, imbalance in class distribution and choice of metric for quantifying performance of the classifier. To guide study design, we present a summary of the key characteristics of 'omics' data profiled in several human or animal model experiments utilizing high-content mass spectrometry and multiplexed immunoassay based techniques.
Results: The analysis of data from seven 'omics' studies revealed that the average magnitude of effect size observed in human studies was markedly lower when compared to that in animal studies. The data measured in human studies were characterized by higher biological variation and the presence of outliers. The results from simulation studies indicated that the classifier Prediction Analysis for Microarrays (PAM) had the highest power when the class conditional feature distributions were Gaussian and outcome distributions were balanced. Random Forests was optimal when feature distributions were skewed and when class distributions were unbalanced. We provide a free open-source R statistical software library (MVpower) that implements the simulation strategy proposed in this paper.
Conclusion: No single classifier had optimal performance under all settings. Simulation studies provide useful guidance for the design of biomedical studies involving high-dimensionality data.
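The simulation-based approach summarized in the abstract can be illustrated with a minimal sketch. This is not the MVpower library or its API. The nearest-centroid classifier with t-statistic feature screening used here is a simplified stand-in for the four classifiers compared in the paper, and every function name, parameter name and default value (`p`, `m`, `delta`, `k`, `reps`, `n_test`) is an assumption made for illustration only.

```python
import numpy as np


def simulate(n, p, m, delta, frac_pos, rng):
    """Draw n subjects with p Gaussian features.

    The first m features are shifted by delta (in SD units) in class 1;
    the remaining p - m features are pure noise. Class sizes are fixed
    by frac_pos, with at least two subjects per class.
    """
    n1 = min(max(int(round(n * frac_pos)), 2), n - 2)
    y = np.concatenate([np.zeros(n - n1, dtype=int), np.ones(n1, dtype=int)])
    x = rng.standard_normal((n, p))
    x[:, :m] += delta * y[:, None]
    return x, y


def fit_predict(x_tr, y_tr, x_te, k):
    """Select the k features with the largest |t| on the training data,
    then classify test subjects by the nearest class centroid."""
    a, b = x_tr[y_tr == 0], x_tr[y_tr == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = (b.mean(axis=0) - a.mean(axis=0)) / se
    keep = np.argsort(-np.abs(t))[:k]
    c0, c1 = a[:, keep].mean(axis=0), b[:, keep].mean(axis=0)
    d0 = ((x_te[:, keep] - c0) ** 2).sum(axis=1)
    d1 = ((x_te[:, keep] - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)


def estimated_accuracy(n, p=1000, m=20, delta=1.0, frac_pos=0.5,
                       k=20, reps=50, n_test=500, seed=0):
    """Average test-set accuracy over `reps` simulated training sets of size n."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(reps):
        x_tr, y_tr = simulate(n, p, m, delta, frac_pos, rng)
        x_te, y_te = simulate(n_test, p, m, delta, frac_pos, rng)
        accs.append(np.mean(fit_predict(x_tr, y_tr, x_te, k) == y_te))
    return float(np.mean(accs))


if __name__ == "__main__":
    # Compare candidate training-set sizes under one assumed effect size.
    for n in (20, 50, 100):
        print(n, round(estimated_accuracy(n), 3))
```

Under these assumptions, a study designer could vary `n`, `delta` and `frac_pos` to see how the estimated accuracy responds to sample size, signal strength and class imbalance. Other performance metrics could be substituted for accuracy in `estimated_accuracy`.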
ArticleNumber 447
Audience Academic
Author McBurney, Robert N
Guo, Yu
Balasubramanian, Raji
Graber, Armin
AuthorAffiliation 1 BG Medicine, Inc., 610 Lincoln St., Waltham, MA 02451, USA
2 Institute for Bioinformatics and Translational Research, UMIT, Eduard Wallnoefer Zentrum 1, 6060 Hall in Tyrol, Austria
3 Optimal Medicine Ltd., Warwick Enterprise Park, Wellesbourne, Warwick CV35 9EF, UK
4 Division of Biostatistics and Epidemiology, University of Massachusetts - Amherst, 715 North Pleasant Street, Amherst, MA 01003, USA
Author_xml – sequence: 1
  givenname: Yu
  surname: Guo
  fullname: Guo, Yu
  organization: BG Medicine, Inc
– sequence: 2
  givenname: Armin
  surname: Graber
  fullname: Graber, Armin
  organization: Institute for Bioinformatics and Translational Research, UMIT
– sequence: 3
  givenname: Robert N
  surname: McBurney
  fullname: McBurney, Robert N
  organization: Optimal Medicine Ltd
– sequence: 4
  givenname: Raji
  surname: Balasubramanian
  fullname: Balasubramanian, Raji
  email: rbalasub@schoolph.umass.edu
  organization: Division of Biostatistics and Epidemiology, University of Massachusetts - Amherst
BackLink https://www.ncbi.nlm.nih.gov/pubmed/20815881$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright 2010 Guo et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
DOI 10.1186/1471-2105-11-447
DeliveryMethod fulltext_linktorsrc
Discipline Biology
EISSN 1471-2105
EndPage 447
ExternalDocumentID oai_doaj_org_article_fe9d5de05b5343858f741d8070c38721
PMC2942858
2501705021
A237478384
20815881
10_1186_1471_2105_11_447
Genre Journal Article
Comparative Study
GeographicLocations United States
United Kingdom
ISICitedReferencesCount 75
ISSN 1471-2105
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Keywords Simulated Dataset
Recursive Feature Elimination
Support Vector Machine
Average Classification Accuracy
Random Forest
Language English
License This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
OpenAccessLink https://link.springer.com/10.1186/1471-2105-11-447
PMID 20815881
PublicationCentury 2000
PublicationDate 2010-09-03
PublicationDecade 2010
PublicationPlace London
PublicationPlace_xml – name: London
– name: England
PublicationTitle BMC bioinformatics
PublicationTitleAbbrev BMC Bioinformatics
PublicationYear 2010
Publisher BioMed Central
BioMed Central Ltd
Springer Nature B.V
BMC
Snippet Background Data generated using 'omics' technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds...
SourceID doaj
pubmedcentral
proquest
gale
pubmed
crossref
springer
SourceType Open Website
Open Access Repository
Aggregation Database
Index Database
Enrichment Source
Publisher
StartPage 447
SubjectTerms Accuracy
Algorithms
Animal models
Animals
Bioinformatics
Biological markers
Biomarkers
Biomedical and Life Sciences
Classification - methods
Comparative studies
Computational biology
Computational Biology/Bioinformatics
Computer Appl. in Life Sciences
Databases, Factual
Design
Discriminant analysis
Gene expression
Gene Expression Profiling - methods
Genetic algorithms
Humans
Life Sciences
Mass spectrometry
Methods
Microarrays
Models, Statistical
Oligonucleotide Array Sequence Analysis - methods
Pattern Recognition, Automated
Proteomics
Research Article
Sample Size
Standard deviation
Studies
Title Sample size and statistical power considerations in high-dimensionality data settings: a comparative study of classification algorithms
URI https://link.springer.com/article/10.1186/1471-2105-11-447
https://www.ncbi.nlm.nih.gov/pubmed/20815881
https://www.proquest.com/docview/901860285
https://www.proquest.com/docview/755164187
https://pubmed.ncbi.nlm.nih.gov/PMC2942858
https://doaj.org/article/fe9d5de05b5343858f741d8070c38721
Volume 11