LFSC: A linear fast semi-supervised clustering algorithm that integrates reference-bulk and single-cell transcriptomes

Bibliographic Details
Published in: Frontiers in Genetics, Vol. 13; p. 1068075
Main Authors: Liu, Qiaoming; Liang, Yingjian; Wang, Dong; Li, Jie
Format: Journal Article
Language: English
Published: Switzerland: Frontiers Media S.A., 01.12.2022
ISSN: 1664-8021
Description
Summary: The identification of cell types in complex tissues is an important step in research into cellular heterogeneity in disease. We present a linear fast semi-supervised clustering (LFSC) algorithm that uses reference samples generated from bulk RNA sequencing data to identify cell types from single-cell transcriptomes. An anchor graph is constructed to describe the relationship between reference samples and cells. By applying a connectivity constraint to the learned graph, LFSC preserves the underlying cluster structure. Moreover, the overall complexity of LFSC is linear in the size of the data, which greatly improves effectiveness and efficiency. Applied to real single-cell RNA sequencing datasets, LFSC outperforms existing baseline methods in clustering accuracy and robustness. An application to infiltrating T cells in liver cancer demonstrates that LFSC can find new cell types, discover differentially expressed genes, and explore new cancer-associated biomarkers.
Reviewed by: Ran Su, Tianjin University, China
This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics
Edited by: Quan Zou, University of Electronic Science and Technology of China, China
These authors have contributed equally to this work
Yijie Ding, University of Electronic Science and Technology of China, China
DOI: 10.3389/fgene.2022.1068075
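
Illustrative sketch: the summary describes an anchor graph that relates reference samples to cells, built at a cost linear in the number of cells. The Python below is a minimal, generic illustration of that idea and is not the published LFSC implementation. The function name `anchor_graph`, the parameter `k`, the Gaussian weighting, and the toy data are all assumptions made for this example. The connectivity constraint mentioned in the summary is not reproduced here; cells are simply assigned to their highest-weight reference.

```python
import numpy as np


def anchor_graph(cells, anchors, k=2):
    """Build a row-stochastic (n_cells, n_anchors) anchor graph Z.

    Hypothetical helper for illustration only: each cell is linked to its
    k nearest reference samples (anchors) with Gaussian weights.
    """
    cells = np.asarray(cells, dtype=float)
    anchors = np.asarray(anchors, dtype=float)
    n, m = cells.shape[0], anchors.shape[0]
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and the number of anchors")

    # Squared Euclidean distances, shape (n, m). With a fixed number of
    # anchors and genes, the cost grows linearly with the number of cells.
    d2 = ((cells[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)

    # Indices of the k nearest anchors for every cell.
    nearest = np.argsort(d2, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    dk = d2[rows, nearest]

    # Gaussian weights with a per-cell bandwidth (an assumed choice).
    # Subtracting the row minimum keeps the exponent numerically stable.
    sigma = dk.mean(axis=1, keepdims=True) + 1e-12
    w = np.exp(-(dk - dk.min(axis=1, keepdims=True)) / sigma)

    # Scatter the normalized weights into a matrix that is zero elsewhere.
    Z = np.zeros((n, m))
    Z[rows, nearest] = w / w.sum(axis=1, keepdims=True)
    return Z


# Toy data (made up): three reference profiles over three "genes",
# and 30 cells drawn as noisy copies of those references.
rng = np.random.default_rng(0)
anchors = np.array([[5.0, 0.0, 0.0],
                    [0.0, 5.0, 0.0],
                    [0.0, 0.0, 5.0]])
true_labels = rng.integers(0, 3, size=30)
cells = anchors[true_labels] + rng.normal(scale=0.3, size=(30, 3))

Z = anchor_graph(cells, anchors, k=2)
predicted = Z.argmax(axis=1)  # naive label: the highest-weight reference
print(Z.shape, int((predicted == true_labels).sum()))
```

Each row of `Z` sums to one and has exactly `k` nonzero entries, so the graph stays sparse as the number of cells grows. A real analysis would use normalized expression matrices rather than synthetic points.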