On the Fundamental Limits of Matrix Completion: Leveraging Hierarchical Similarity Graphs

Bibliographic Details
Published in:	IEEE Transactions on Information Theory, Vol. 70, no. 3, pp. 2039-2075
Main Authors:	Ahn, Junhyung; Elmahdy, Adel; Mohajer, Soheil; Suh, Changho
Format:	Journal Article
Language:	English
Published:	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.03.2024
Subjects:	Algorithms; Clustering algorithms; Clusters; Collaborative filtering; Complexity; Complexity theory; Filtering; Graph side information; Graphs; Information theory; Lower bounds; Matrix completion problem; Recommender systems; Similarity; Sparse matrices
ISSN:	0018-9448, 1557-9654

Description
Summary:	We study a matrix completion problem that leverages the hierarchical structure of social similarity graphs as side information in the context of recommender systems. We assume that users are categorized into clusters, each of which comprises sub-clusters (which we call "groups"). We consider a hierarchical stochastic block model that captures practically relevant social graphs and follows a low-rank rating matrix model. Under this setting, we characterize the information-theoretic limit on the number of observed matrix entries (i.e., the optimal sample complexity) as a function of the quality of the graph side information (to be detailed) by proving sharp upper and lower bounds on the sample complexity. One important consequence of this result is that leveraging the hierarchical structure of similarity graphs yields a substantial gain in sample complexity relative to an approach that simply identifies different groups without exploiting the relational structure across them. Another implication is that when the graph information is rich, the optimal sample complexity is proportional to the number of clusters, while it stays nearly constant as the number of groups in a cluster increases. We empirically demonstrate through extensive experiments that the proposed algorithm achieves the optimal sample complexity.
DOI:	10.1109/TIT.2023.3345902
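
The summary describes users organized into clusters, each made up of groups, with a social graph drawn from a hierarchical stochastic block model. As a minimal sketch of what such a graph model can look like, the Python snippet below samples an undirected graph whose edge probability depends on whether two users share a group, share only a cluster, or belong to different clusters. The function name, the parameter names (`p_group`, `p_cluster`, `p_cross`), and the equal group sizes are assumptions made for illustration; they are not taken from this record, and the sketch covers neither the rating matrix model nor the authors' completion algorithm.

```python
import random
from itertools import combinations


def sample_hierarchical_sbm(num_clusters, groups_per_cluster, users_per_group,
                            p_group, p_cluster, p_cross, seed=0):
    """Sample an undirected graph from an illustrative hierarchical SBM.

    All parameter names are hypothetical:
      p_group   -- edge probability for two users in the same group
      p_cluster -- edge probability for same cluster, different groups
      p_cross   -- edge probability for users in different clusters
    """
    rng = random.Random(seed)

    # labels[u] = (cluster index, group index within that cluster)
    labels = [(c, g)
              for c in range(num_clusters)
              for g in range(groups_per_cluster)
              for _ in range(users_per_group)]

    edges = set()
    for u, v in combinations(range(len(labels)), 2):
        (cu, gu), (cv, gv) = labels[u], labels[v]
        if cu != cv:
            p = p_cross        # different clusters
        elif gu != gv:
            p = p_cluster      # same cluster, different groups
        else:
            p = p_group        # same group
        # Draw each edge independently with the chosen probability.
        if rng.random() < p:
            edges.add((u, v))
    return labels, edges


if __name__ == "__main__":
    labels, edges = sample_hierarchical_sbm(
        num_clusters=2, groups_per_cluster=3, users_per_group=10,
        p_group=0.6, p_cluster=0.2, p_cross=0.02)
    print(len(labels), "users,", len(edges), "edges")
```

In this sketch, choosing `p_group > p_cluster > p_cross` makes the hierarchy visible in the graph: the gap between the first two probabilities separates groups within a cluster, and the gap between the last two separates clusters.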