Scalable nonparametric clustering with unified marker gene selection for single-cell RNA-seq data

Detailed bibliography
Published in: bioRxiv
Main authors: Nwizu, Chibuikem; Hughes, Madeline; Ramseier, Michelle L; Navia, Andrew W; Shalek, Alex K; Fusi, Nicolo; Raghavan, Srivatsan; Winter, Peter S; Amini, Ava P; Crawford, Lorin
Format: Journal Article, Paper
Language: English
Published: United States, Cold Spring Harbor Laboratory Press, 12 February 2024
Cold Spring Harbor Laboratory
Edition: 1.1
ISSN: 2692-8205
Abstract Clustering is commonly used in single-cell RNA-sequencing (scRNA-seq) pipelines to characterize cellular heterogeneity. However, current methods face two main limitations. First, they require user-specified heuristics which add time and complexity to bioinformatic workflows; second, they rely on post-selective differential expression analyses to identify marker genes driving cluster differences, which has been shown to be subject to inflated false discovery rates. We address these challenges by introducing nonparametric clustering of single-cell populations (NCLUSION): an infinite mixture model that leverages Bayesian sparse priors to identify marker genes while simultaneously performing clustering on single-cell expression data. NCLUSION uses a scalable variational inference algorithm to perform these analyses on datasets with up to millions of cells. Through simulations and analyses of publicly available scRNA-seq studies, we demonstrate that NCLUSION (i) matches the performance of other state-of-the-art clustering techniques with significantly reduced runtime and (ii) provides statistically robust and biologically relevant transcriptomic signatures for each of the clusters it identifies. Overall, NCLUSION represents a reliable hypothesis-generating tool for understanding patterns of expression variation present in single-cell populations.
Competing Interest Statement: SR holds equity in Amgen and receives research funding from Microsoft. All other authors have declared that no competing interests exist.
Footnotes: https://github.com/microsoft/nclusion; http://microsoft.github.io/nclusion
Author_xml – sequence: 1
  givenname: Chibuikem
  orcidid: 0000-0002-2075-1747
  surname: Nwizu
  fullname: Nwizu, Chibuikem
  organization: Warren Alpert Medical School of Brown University, Providence, RI, USA
– sequence: 2
  givenname: Madeline
  surname: Hughes
  fullname: Hughes, Madeline
  organization: Microsoft Research, Cambridge, MA, USA
– sequence: 3
  givenname: Michelle L
  orcidid: 0000-0002-9201-7656
  surname: Ramseier
  fullname: Ramseier, Michelle L
  organization: Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
– sequence: 4
  givenname: Andrew W
  orcidid: 0000-0002-5429-8012
  surname: Navia
  fullname: Navia, Andrew W
  organization: Broad Institute of MIT and Harvard, Cambridge, MA, USA
– sequence: 5
  givenname: Alex K
  orcidid: 0000-0001-5670-8778
  surname: Shalek
  fullname: Shalek, Alex K
  organization: Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
– sequence: 6
  givenname: Nicolo
  orcidid: 0000-0002-4102-0169
  surname: Fusi
  fullname: Fusi, Nicolo
  organization: Microsoft Research, Cambridge, MA, USA
– sequence: 7
  givenname: Srivatsan
  orcidid: 0000-0002-5374-9918
  surname: Raghavan
  fullname: Raghavan, Srivatsan
  organization: Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
– sequence: 8
  givenname: Peter S
  orcidid: 0000-0002-6557-3219
  surname: Winter
  fullname: Winter, Peter S
  organization: Broad Institute of MIT and Harvard, Cambridge, MA, USA
– sequence: 9
  givenname: Ava P
  orcidid: 0000-0002-8601-6040
  surname: Amini
  fullname: Amini, Ava P
  organization: Microsoft Research, Cambridge, MA, USA
– sequence: 10
  givenname: Lorin
  orcidid: 0000-0003-0178-8242
  surname: Crawford
  fullname: Crawford, Lorin
  organization: Microsoft Research, Cambridge, MA, USA
PubMed record: https://www.ncbi.nlm.nih.gov/pubmed/38405697
ContentType Journal Article
Paper
Copyright 2024. This article is published under http://creativecommons.org/licenses/by-nc/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
2024, Posted by Cold Spring Harbor Laboratory
DOI 10.1101/2024.02.11.579839
Discipline Biology
EISSN 2692-8205
Edition 1.1
ExternalDocumentID 2024.02.11.579839v1
38405697
Genre Journal Article
Preprint
Working Paper/Pre-Print
GrantInformation_xml – fundername: NIGMS NIH HHS
  grantid: T32 GM128596
– fundername: NCI NIH HHS
  grantid: K08 CA260442
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
License This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at http://creativecommons.org/licenses/by-nc/4.0
PMID 38405697
PageCount 37
  publication-title: Nature Communications
  doi: 10.1038/s41467-021-23196-8
– volume: 34
  start-page: 139
  issue: 1
  year: 2018
  end-page: 146
  ident: 2024.02.11.579839v1.35
  article-title: Dimm-sc: a dirichlet mixture model for clustering droplet-based single cell transcriptomic data
  publication-title: Bioinformatics
– volume: 19
  issue: 1 15
  year: 2018
  ident: 2024.02.11.579839v1.89
  article-title: Scanpy: large-scale single-cell gene expression data analysis
  publication-title: Genome Biology
  doi: 10.1186/s13059-017-1382-0
– volume: 54
  start-page: 259
  issue: 2
  year: 2021
  end-page: 275
  ident: 2024.02.11.579839v1.54
  article-title: Distinct developmental pathways from blood monocytes generate human lung macrophage diversity
  publication-title: Immunity
  doi: 10.1016/j.immuni.2020.12.003
– volume: 60
  start-page: 20
  issue: 1
  year: 1999
  end-page: 31
  ident: 2024.02.11.579839v1.50
  article-title: Natural t cells in the human liver: cytotoxic lymphocytes with dual t cell and natural killer cell phenotype and function are phenotypically heterogenous and include vα24-jαq and γδ t cell receptor bearing cells
  publication-title: Human Immunology
  doi: 10.1016/S0198-8859(98)00098-6
– volume: 1
  start-page: 121
  issue: 1
  year: 2006
  end-page: 143
  ident: 2024.02.11.579839v1.30
  article-title: Variational inference for dirichlet process mixtures
  publication-title: Bayesian Analysis
  doi: 10.1214/06-BA104
– volume: 16
  issue: 1 39
  year: 2015
  ident: 2024.02.11.579839v1.72
  article-title: Dgeclust: differential expression analysis of clustered count data
  publication-title: Genome Biology
  doi: 10.1186/s13059-015-0604-6
– volume: 8
  start-page: 14049
  issue: 11
  year: 2017
  ident: 2024.02.11.579839v1.45
  article-title: Massively parallel digital transcriptional profiling of single cells
  publication-title: Nature Communications
  doi: 10.1038/ncomms14049
– volume: 11
  start-page: 2020
  issue: 1
  year: 1585
  ident: 2024.02.11.579839v1.75
  article-title: Integrative differential expression and gene set enrichment analysis using summary statistics for scrna-seq studies
  publication-title: Nature Communications
– volume: 8
  year: 2017
  ident: 2024.02.11.579839v1.61
  article-title: The molecular mechanism of natural killer cells function and its importance in cancer immunotherapy
  publication-title: Frontiers in Immunology
– volume: 24
  start-page: 15
  issue: 1
  year: 2023
  ident: 2024.02.11.579839v1.49
  article-title: Cd40 ligand stimulation affects the number and memory phenotypes of human peripheral cd8+ t cells
  publication-title: BMC Immunology
  doi: 10.1186/s12865-023-00547-2
– volume: 116
  start-page: 466
  issue: 2
  year: 2019
  end-page: 471
  ident: 2024.02.11.579839v1.42
  article-title: Semisoft clustering of single-cell data
  publication-title: Proceedings of the National Academy of Sciences
  doi: 10.1073/pnas.1817715116
– volume: 18
  start-page: e3000722
  issue: 6
  year: 2020
  ident: 2024.02.11.579839v1.68
  article-title: Ncx1 represents an ionic na+ sensing mechanism in macrophages
  publication-title: PLoS Biology
  doi: 10.1371/journal.pbio.3000722
– volume: 5
  start-page: 285
  issue: 1
  year: 1974
  end-page: 307
  ident: 2024.02.11.579839v1.94
  article-title: The measurement of species diversity
  publication-title: Annual review of ecology and systematics
– volume: 112
  start-page: 859
  issue: 518
  year: 2017
  end-page: 877
  ident: 2024.02.11.579839v1.76
  article-title: Variational inference: A review for statisticians
  publication-title: Journal of the American Statistical Association
– volume: 9
  start-page: 503
  issue: 55
  year: 2008
  end-page: 510
  ident: 2024.02.11.579839v1.60
  article-title: Functions of natural killer cells
  publication-title: Nature Immunology
  doi: 10.1038/ni1582
– volume: 48
  start-page: 1070
  year: 2016
  end-page: 1079
  ident: 2024.02.11.579839v1.32
  article-title: Dirichlet process mixture model for correcting technical variation in single-cell gene expression data
  publication-title: JMLR Workshop and Conference Proceedings
– volume: 175
  start-page: 1031
  issue: 4
  year: 2018
  end-page: 1044
  ident: 2024.02.11.579839v1.67
  article-title: Lung single-cell signaling interaction map reveals basophil role in macrophage imprinting
  publication-title: Cell
  doi: 10.1016/j.cell.2018.09.009
– volume: 16
  start-page: 133
  issue: 33
  year: 2015
  end-page: 145
  ident: 2024.02.11.579839v1.6
  article-title: Computational and analytical challenges in single-cell transcriptomics
  publication-title: Nature Reviews Genetics
  doi: 10.1038/nrg3833
– volume: 112
  start-page: 859
  issue: 518
  year: 2017
  end-page: 877
  ident: 2024.02.11.579839v1.40
  article-title: Variational inference: A review for statisticians
  publication-title: Journal of the American Statistical Association
  doi: 10.1080/01621459.2017.1285773
– volume: 9
  year: 2015
  ident: 2024.02.11.579839v1.84
  article-title: Reliable and scalable variational inference for the hierarchical dirichlet process
  publication-title: Artificial Intelligence and Statistics, page
– volume: 11
  start-page: 1561
  issue: 3
  year: 2017
  end-page: 1592
  ident: 2024.02.11.579839v1.83
  article-title: Bayesian large-scale multiple regression with summary statistics from genome-wide association studies
  publication-title: The Annals Applied Statistics
  doi: 10.1214/17-AOAS1046
– volume: 21
  start-page: 3439
  issue: 16
  year: 2005
  end-page: 3440
  ident: 2024.02.11.579839v1.56
  article-title: Biomart and bioconductor: a powerful link between biological databases and microarray data analysis
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/bti525
– volume: 19
  start-page: 1
  issue: 1
  year: 2018
  end-page: 12
  ident: 2024.02.11.579839v1.27
  article-title: An interpretable framework for clustering single-cell rna-seq datasets
  publication-title: BMC bioinformatics
– year: 2013
  ident: 2024.02.11.579839v1.85
  publication-title: Statistical Power Analysis for the Behavioral Sciences
– volume: 15
  start-page: 593
  issue: 77
  year: 2014
  end-page: 594
  ident: 2024.02.11.579839v1.71
  article-title: New ingredients for brewing cd4+t (cells): Tcf-1 and lef-1
  publication-title: Nature Immunology
  doi: 10.1038/ni.2927
– volume: 67
  start-page: 381
  issue: 2
  year: 2011
  end-page: 390
  ident: 2024.02.11.579839v1.80
  article-title: A spatial dirichlet process mixture model for clustering population genetics data
  publication-title: Biometrics
– start-page: 2116331
  year: 2022
  ident: 2024.02.11.579839v1.22
  article-title: Selective inference for hierarchical clustering
  publication-title: Journal of the American Statistical Association
  doi: 10.1080/01621459.2022.2116331
– volume: 163
  start-page: 688
  issue: 4148
  year: 1949
  end-page: 688
  ident: 2024.02.11.579839v1.92
  article-title: Measurement of diversity
  publication-title: nature
– volume: 13
  issue: 2 485
  year: 2018
  ident: 2024.02.11.579839v1.78
  article-title: Variational hamiltonian monte carlo via score matching
  publication-title: Bayesian Analysis
– year: 2007
  ident: 2024.02.11.579839v1.91
  publication-title: Comparing clusterings - an overview
– start-page: 1073
  year: 2009
  end-page: 1080
  ident: 2024.02.11.579839v1.95
  article-title: Information theoretic measures for clusterings comparison: is a correction for chance necessary?
– volume: 11
  start-page: 4318
  issue: 11
  year: 2020
  ident: 2024.02.11.579839v1.25
  article-title: A clustering-independent method for finding differentially expressed genes in single-cell transcriptome data
  publication-title: Nature Communications
  doi: 10.1038/s41467-020-17900-3
– volume: 20
  start-page: 273
  issue: 55
  year: 2019
  end-page: 282
  ident: 2024.02.11.579839v1.7
  article-title: Challenges in unsupervised clustering of single-cell rna-seq data
  publication-title: Nature Reviews Genetics
  doi: 10.1038/s41576-018-0088-9
– volume: 20
  start-page: 1
  year: 2019
  end-page: 16
  ident: 2024.02.11.579839v1.29
  article-title: Feature selection and dimension reduction for single-cell rna-seq based on a multinomial model
  publication-title: Genome Biology
– volume: 18
  start-page: 5678
  issue: 10
  year: 1998
  end-page: 5689
  ident: 2024.02.11.579839v1.65
  article-title: Tumor necrosis factor alpha transcription in macrophages is attenuated by an autocrine factor that preferentially induces nf-κb p50
  publication-title: Molecular and Cellular Biology
– volume: 23
  issue: 1 49
  year: 2022
  ident: 2024.02.11.579839v1.90
  article-title: Benchmarking clustering algorithms on estimating the number of cell types from single-cell rna-sequencing data
  publication-title: Genome Biology
  doi: 10.1186/s13059-022-02622-0
– volume: 35
  start-page: 953
  issue: 6
  year: 2019
  end-page: 961
  ident: 2024.02.11.579839v1.34
  article-title: Parallel clustering of single cell transcriptomic data with split-merge sampling on dirichlet process mixtures
  publication-title: Bioinformatics
– volume: 14
  issue: 128
  year: 2013
  ident: 2024.02.11.579839v1.57
  article-title: Enrichr: interactive and collaborative html5 gene list enrichment analysis tool
  publication-title: BMC Bioinformatics
  doi: 10.1186/1471-2105-14-128
– volume: 108
  start-page: 71
  issue: 1
  year: 2014
  end-page: 77
  ident: 2024.02.11.579839v1.69
  article-title: Lipid-laden bronchoalveolar macrophages in asthma and chronic cough
  publication-title: Respiratory Medicine
  doi: 10.1016/j.rmed.2013.10.005
– volume: 9
  start-page: 383
  issue: 4
  year: 2019
  end-page: 392
  ident: 2024.02.11.579839v1.26
  article-title: Valid post-clustering differential analysis for single-cell rna-seq
  publication-title: Cell Systems
– volume: 10
  start-page: 2019
  issue: 1
  year: 1649
  ident: 2024.02.11.579839v1.33
  article-title: A bayesian mixture model for clustering droplet-based single-cell transcriptomic data from population studies
  publication-title: Nature communications
– volume: 8
  start-page: 275
  issue: 2
  year: 1996
  end-page: 285
  ident: 2024.02.11.579839v1.70
  article-title: Regulation of cd40 ligand expression on naive cd4 t cells: a role for tcr but not co-stimulatory signals
  publication-title: International Immunology
  doi: 10.1093/intimm/8.2.275
– volume: 17
  start-page: e1009754
  issue: 8
  year: 2021
  ident: 2024.02.11.579839v1.39
  article-title: Multi-scale inference of genetic trait architecture using biologically annotated neural networks
  publication-title: PLOS Genetics
  doi: 10.1371/journal.pgen.1009754
– start-page: baaa073
  year: 2020
  ident: 2024.02.11.579839v1.44
  article-title: A curated database reveals trends in single-cell transcriptomics
  publication-title: Database: The Journal of Biological Databases and Curation
  doi: 10.1093/database/baaa073
– volume: 19
  issue: 51
  year: 2018
  ident: 2024.02.11.579839v1.77
  article-title: Covariances, robustness and variational bayes
  publication-title: Journal of Machine Learning Research
– volume: 9
  start-page: e1003264
  issue: 2
  year: 2013
  ident: 2024.02.11.579839v1.82
  article-title: Polygenic modeling with Bayesian sparse linear mixed models
  publication-title: PLOS Genetics
– volume: 21
  issue: 1 31
  year: 2020
  ident: 2024.02.11.579839v1.21
  article-title: Eleven grand challenges in single-cell data science
  publication-title: Genome Biology
  doi: 10.1186/s13059-020-1926-6
– volume: 33
  start-page: 495
  year: 2015
  end-page: 502
  ident: 2024.02.11.579839v1.9
  article-title: Spatial reconstruction of single-cell gene expression data
  publication-title: Nature Biotechnology
  doi: 10.1038/nbt.3192
– volume: 185
  start-page: 1661
  issue: 9
  year: 1997
  end-page: 1670
  ident: 2024.02.11.579839v1.64
  article-title: Lipopolysaccharide (lps)-induced macrophage activation and signal transduction in the absence of src-family kinases hck, fgr, and lyn
  publication-title: The Journal of Experimental Medicine
– volume: 525
  start-page: 251
  issue: 75687568
  year: 2015
  end-page: 255
  ident: 2024.02.11.579839v1.16
  article-title: Single-cell messenger rna sequencing reveals rare intestinal cell types
  publication-title: Nature
  doi: 10.1038/nature14966
– volume: 5
  start-page: 2122
  year: 2016
  ident: 2024.02.11.579839v1.4
  article-title: A step-by-step workflow for low-level analysis of single-cell rna-seq data with bioconductor
  publication-title: F1000Research
  doi: 10.12688/f1000research.9501.2
– year: 2013
  ident: 2024.02.11.579839v1.31
  publication-title: Third Edition
– year: 2022
  ident: 2024.02.11.579839v1.23
  article-title: Inference after latent variable estimation for single-cell rna sequencing data
  publication-title: Biostatistics
  doi: 10.1093/biostatistics/kxac047
– volume: 131
  start-page: 352
  issue: 2
  year: 1990
  end-page: 365
  ident: 2024.02.11.579839v1.48
  article-title: Differential expression of cd8α and cd8β associated with mhc-restricted and non-mhc-restricted cytolytic effector cells
  publication-title: Cellular Immunology
  doi: 10.1016/0008-8749(90)90260-X
– volume: 19
  start-page: 335
  issue: 22
  year: 2016
  end-page: 346
  ident: 2024.02.11.579839v1.17
  article-title: Adult mouse cortical cell taxonomy revealed by single cell transcriptomics
  publication-title: Nature Neuroscience
  doi: 10.1038/nn.4216
– volume: 9
  start-page: 75
  issue: 1
  year: 2017
  ident: 2024.02.11.579839v1.5
  article-title: A practical guide to single-cell rna-sequencing for biomedical research and clinical applications
  publication-title: Genome Medicine
  doi: 10.1186/s13073-017-0467-4
– volume: 39
  start-page: btac757
  issue: 1
  year: 2023
  ident: 2024.02.11.579839v1.55
  article-title: Gseapy: a comprehensive package for performing gene set enrichment analysis in python
  publication-title: Bioinformatics
  doi: 10.1093/bioinformatics/btac757
– volume: 20
  start-page: 1196
  issue: 88
  year: 2023
  end-page: 1202
  ident: 2024.02.11.579839v1.24
  article-title: Significance analysis for clustering with single-cell rna-sequencing data
  publication-title: Nature Methods
  doi: 10.1038/s41592-023-01933-9
– volume: 12
  year: 2021
  ident: 2024.02.11.579839v1.53
  article-title: Pnoc expressed by b cells in cholangio-carcinoma was survival related and lair2 could be a t cell exhaustion biomarker in tumor microenvironment: Characterization of immune microenvironment combining single-cell and bulk sequencing technology
  publication-title: Frontiers in Immunology
– year: 2020
  ident: 2024.02.11.579839v1.13
  publication-title: Umap: Uniform manifold approximation and projection for dimension reduction
– volume: 209
  start-page: 565
  issue: 3
  year: 2012
  end-page: 580
  ident: 2024.02.11.579839v1.62
  article-title: Neutrophil depletion impairs natural killer cell maturation, function, and homeostasis
  publication-title: The Journal of Experimental Medicine
  doi: 10.1084/jem.20111908
– volume: 9
  start-page: 2018
  issue: 1
  year: 4361
  ident: 2024.02.11.579839v1.74
  article-title: Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes
  publication-title: Nature Communications
– volume: 17
  start-page: 710
  issue: 11
  year: 2021
  end-page: 724
  ident: 2024.02.11.579839v1.1
  article-title: Multi-omics integration in the age of million single-cell data
  publication-title: Nature Reviews Nephrology
  doi: 10.1038/s41581-021-00463-x
– volume: 21
  start-page: 1
  issue: 1
  year: 2020
  end-page: 51
  ident: 2024.02.11.579839v1.46
  article-title: A rank-based marker selection method for high through-put scrna-seq data
  publication-title: BMC bioinformatics
– volume: 14
  start-page: 414
  issue: 44
  year: 2017
  end-page: 416
  ident: 2024.02.11.579839v1.11
  article-title: Visualization and analysis of single-cell rna-seq data by kernel-based similarity learning
  publication-title: Nature Methods
  doi: 10.1038/nmeth.4207
– volume: 25
  start-page: 25
  issue: 11
  year: 2000
  end-page: 29
  ident: 2024.02.11.579839v1.58
  article-title: Gene ontology: tool for the unification of biology
  publication-title: Nature Genetics
  doi: 10.1038/75556
SubjectTerms Bayesian analysis
Bioinformatics
Transcriptomics
Title Scalable nonparametric clustering with unified marker gene selection for single-cell RNA-seq data
URI https://www.ncbi.nlm.nih.gov/pubmed/38405697
https://www.proquest.com/docview/2925283694
https://www.proquest.com/docview/2932023820
https://www.biorxiv.org/content/10.1101/2024.02.11.579839