Grade: Generative graph contrastive learning for multimodal recommendation

Bibliographic Details
Published in:Neurocomputing (Amsterdam) Vol. 657; p. 131630
Main Authors: Ping, Yu-Chao, Wang, Shu-Qin, Yang, Zi-Yi, Dong, Yong-Quan, Hu, Meng-Xiang, Zhang, Pei-Lin
Format: Journal Article
Language:English
Published: Elsevier B.V 07.12.2025
Subjects:
ISSN:0925-2312
Description
Summary:Multimodal recommender systems based on graph convolutional networks have made significant progress by integrating multiple modal data for item recommendation. While most existing approaches learn user and item representations through modality-related interaction graphs, they still encounter a challenge inherent to graph convolutional networks: over-smoothing. To address this challenge, we propose a model named Grade (Generative Graph Contrastive Learning for Multimodal Recommendation), which combines generative models with contrastive learning and designs four task losses. In particular, the generative graph contrastive task generates inter-modal contrastive views through variational graph reconstruction, effectively aligning modal features to improve user and item representations. In addition, the feature perturbation contrastive task generates noisy multimodal views for intra-modal contrast through noise-based self-supervised learning, effectively enhancing the robustness of modality-specific representations. Finally, we incorporate a Variational Graph Autoencoder (VGAE) task and a Bayesian Personalized Ranking (BPR) task. The combination of these four task losses effectively mitigates over-smoothing. Extensive experiments conducted on three publicly available datasets confirm the superiority of our model. The related code is available at https://github.com/Ricardo-Ping/Grade.
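The abstract names two of Grade's four task losses that are standard in the recommendation literature: a BPR ranking loss and contrastive terms computed between an embedding and a noise-perturbed view of it. A minimal sketch of those two pieces, assuming an InfoNCE-style contrastive objective with in-batch negatives and a SimGCL-style sign-aligned noise perturbation (the function names, noise scheme, and temperature here are illustrative assumptions, not the paper's actual implementation):

```python
import numpy as np

def bpr_loss(user_emb, pos_item_emb, neg_item_emb):
    """Bayesian Personalized Ranking: score positive items above sampled negatives."""
    pos_scores = np.sum(user_emb * pos_item_emb, axis=1)
    neg_scores = np.sum(user_emb * neg_item_emb, axis=1)
    # -log(sigmoid(pos - neg)), averaged over the batch
    return -np.mean(np.log(1.0 / (1.0 + np.exp(-(pos_scores - neg_scores)))))

def perturbed_view(emb, noise_scale=0.1, rng=None):
    """Noise-based augmentation: add small sign-aligned random noise to embeddings."""
    rng = rng if rng is not None else np.random.default_rng(0)
    noise = rng.normal(size=emb.shape)
    noise = noise / np.linalg.norm(noise, axis=1, keepdims=True)
    return emb + noise_scale * np.sign(emb) * noise

def info_nce(anchor, positive, temperature=0.2):
    """InfoNCE contrastive loss between two views, using in-batch negatives."""
    a = anchor / np.linalg.norm(anchor, axis=1, keepdims=True)
    p = positive / np.linalg.norm(positive, axis=1, keepdims=True)
    logits = a @ p.T / temperature  # (N, N); matching pairs lie on the diagonal
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Joint objective over the batch: BPR plus an intra-modal contrastive term on a
# noisy view; the full model would add the inter-modal (VGAE-reconstructed)
# contrastive loss and the VGAE loss, with weights tuned per dataset.
rng = np.random.default_rng(1)
users = rng.normal(size=(8, 16))
pos_items = rng.normal(size=(8, 16))
neg_items = rng.normal(size=(8, 16))
total = bpr_loss(users, pos_items, neg_items) \
        + 0.1 * info_nce(pos_items, perturbed_view(pos_items, rng=rng))
```

The weights on the auxiliary losses (0.1 above) are hyperparameters; the paper's actual coefficients and the VGAE reconstruction/KL terms are not reproduced here.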
DOI:10.1016/j.neucom.2025.131630