SSL-VQ: vector-quantized variational autoencoders for semi-supervised prediction of therapeutic targets across diverse diseases

Bibliographic Details
Published in:	Bioinformatics (Oxford, England) Vol. 41; no. 2
Main Authors:	Namba, Satoko; Li, Chen; Yuyama Otani, Noriko; Yamanishi, Yoshihiro
Format:	Journal Article
Language:	English
Published:	England: Oxford University Press, 04.02.2025
Subjects:	Autoencoder; Computational Biology - methods; Disease - genetics; Drug Discovery - methods; Humans; Original Paper; Supervised Machine Learning
ISSN:	1367-4811, 1367-4803

Description
Summary:	Motivation: Identifying effective therapeutic targets poses a challenge in drug discovery, especially for uncharacterized diseases without known therapeutic targets (e.g. rare diseases, intractable diseases). Results: This study presents a novel machine learning approach using multimodal vector-quantized variational autoencoders (VQ-VAEs) for predicting therapeutic target molecules across diseases. To address the lack of known therapeutic target–disease associations, we incorporate the information on uncharacterized diseases without known targets or uncharacterized proteins without known indications (applicable diseases) in the semi-supervised learning (SSL) framework. The method integrates disease-specific and protein perturbation profiles with genetic perturbations (e.g. gene knockdowns and gene overexpressions) at the transcriptome level. Cross-cell representation learning, facilitated by VQ-VAEs, was performed to extract informative features from protein perturbation profiles across diverse human cell types. Concurrently, cross-disease representation learning was performed, leveraging VQ-VAE, to extract informative features reflecting disease states from disease-specific profiles. The model’s applicability to uncharacterized diseases or proteins is enhanced by considering the consistency between disease-specific and patient-specific signatures. The efficacy of the method is demonstrated across three practical scenarios for 79 diseases: target repositioning for target–disease pairs, new target prediction for uncharacterized diseases, and new indication prediction for uncharacterized proteins. This method is expected to be valuable for identifying therapeutic targets across various diseases. Availability and implementation: Code: github.com/YamanishiLab/SSL-VQ and Data: 10.5281/zenodo.14644837.
DOI:	10.1093/bioinformatics/btaf039
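
Note on vector quantization: The summary names vector-quantized variational autoencoders as the feature extractor for perturbation and disease profiles. As a rough illustration of the quantization step only, the sketch below maps each latent vector to its nearest codebook entry. It is not taken from the SSL-VQ code; the function names, the toy codebook, and the toy latent vectors are all hypothetical, and a real VQ-VAE would learn both the encoder and the codebook during training.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    latents: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Replace each latent vector with its nearest codebook entry.

    Returns the chosen codebook indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for z in latents:
        # Pick the codebook row with the smallest squared distance to z.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(z, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Toy values chosen for illustration only (not from the paper or its data).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
latents = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]

indices, quantized = quantize(latents, codebook)
print(indices)
print(quantized)
```

In a trained model the discrete indices act as a compact code for each profile; the authors' actual architecture and training procedure are described in the article and in the repository listed in the summary.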