BioSeq-BLM: a platform for analyzing DNA, RNA and protein sequences based on biological language models

Bibliographic details
Published in: Nucleic Acids Research, Volume 49, Issue 22, p. e129
Main authors: Li, Hong-Liang; Pang, Yi-He; Liu, Bin
Format: Journal Article
Language: English
Published: England: Oxford University Press, 16 December 2021
ISSN: 0305-1048, 1362-4962
Description
Summary: In order to uncover the meaning of the ‘book of life’, this study discusses 155 different biological language models (BLMs) for DNA, RNA and protein sequence analysis, which are able to extract the linguistic properties of the ‘book of life’. We also extend the BLMs into a system called BioSeq-BLM for automatically representing and analyzing sequence data. Experimental results show that the predictors generated by BioSeq-BLM achieve comparable or even clearly better performance than the existing state-of-the-art predictors published in the literature, indicating that BioSeq-BLM will provide new approaches for biological sequence analysis based on natural language processing technologies and contribute to the development of this very important field. To help readers use BioSeq-BLM for their own experiments, the corresponding web server and stand-alone package have been established and released, and can be freely accessed at http://bliulab.net/BioSeq-BLM/.
DOI: 10.1093/nar/gkab829