Dual Contrastive Prediction for Incomplete Multi-View Representation Learning

In this article, we propose a unified framework to solve the following two challenging problems in incomplete multi-view representation learning: i) how to learn a consistent representation unifying different views, and ii) how to recover the missing views. To address the challenges, we provide an i...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on pattern analysis and machine intelligence Vol. 45; no. 4; pp. 4447 - 4461
Main Authors: Lin, Yijie, Gou, Yuanbiao, Liu, Xiaotian, Bai, Jinfeng, Lv, Jiancheng, Peng, Xi
Format: Journal Article
Language:English
Published: United States IEEE 01.04.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:0162-8828, 1939-3539, 2160-9292, 1939-3539
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this article, we propose a unified framework to solve the following two challenging problems in incomplete multi-view representation learning: i) how to learn a consistent representation unifying different views, and ii) how to recover the missing views. To address the challenges, we provide an information theoretical framework under which the consistency learning and data recovery are treated as a whole. With the theoretical framework, we propose a novel objective function which jointly solves the aforementioned two problems and achieves a provable sufficient and minimal representation. In detail, the consistency learning is performed by maximizing the mutual information of different views through contrastive learning, and the missing views are recovered by minimizing the conditional entropy through dual prediction. To the best of our knowledge, this is one of the first works to theoretically unify the cross-view consistency learning and data recovery for representation learning. Extensive experimental results show that the proposed method remarkably outperforms 20 competitive multi-view learning methods on six datasets in terms of clustering, classification, and human action recognition. The code could be accessed from https://pengxi.me .
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:0162-8828
1939-3539
2160-9292
1939-3539
DOI:10.1109/TPAMI.2022.3197238