LncDC: a machine learning-based tool for long non-coding RNA detection from RNA-Seq data
Long non-coding RNAs (lncRNAs) play an essential role in diverse biological processes and disease development. Accurate classification of lncRNAs and mRNAs is important for the identification of tissue- or disease-specific lncRNAs. Here, we present our tool LncDC (Long non-coding RNA detection) that...
Saved in:
| Published in: | Scientific reports Vol. 12; no. 1; pp. 19083 - 18 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: |
London
Nature Publishing Group UK
09.11.2022
Nature Portfolio |
| Subjects: | |
| ISSN: | 2045-2322, 2045-2322 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Long non-coding RNAs (lncRNAs) play an essential role in diverse biological processes and disease development. Accurate classification of lncRNAs and mRNAs is important for the identification of tissue- or disease-specific lncRNAs. Here, we present our tool LncDC (Long non-coding RNA detection) that is able to accurately predict lncRNAs with an XGBoost model using features extracted from RNA sequences, secondary structures, and translated proteins. Benchmarking experiments showed that LncDC consistently outperformed six state-of-the-art tools in distinguishing lncRNAs from mRNAs. Notably, the use of sequence and secondary structure (SASS) k-mer score features and flexible ORF features improved the classification capability of LncDC. We anticipate that LncDC will definitely promote the discovery of more and novel disease-specific lncRNAs. LncDC is implemented in Python and freely available at
https://github.com/lim74/LncDC
. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 2045-2322 2045-2322 |
| DOI: | 10.1038/s41598-022-22082-7 |