Provable Active Multi-Task Representation Learning

Bibliographic Details
Published in:	IEEE Transactions on Signal Processing, pp. 1-15
Main Authors:	Lin, Jiabin; Moothedath, Shana; Le, Tuan Anh
Format:	Journal Article
Language:	English
Published:	IEEE 2025
Subjects:	active learning; alternating gradient descent and minimization algorithm; Complexity theory; Machine learning algorithms; Minimization; Multi-task representation learning; Multitasking; Noise; Representation learning; Signal processing algorithms; Training; transfer learning; Vectors
ISSN:	1053-587X, 1941-0476

Description
Summary:	Multi-task representation learning is an emerging machine learning paradigm that integrates data from multiple sources, harnessing task similarities to enhance overall model performance. The application of multi-task learning to real-world settings is hindered by data scarcity, along with challenges related to scalability and computational resources. To address these challenges, we develop a fast and sample-efficient approach for multi-task active learning with linear representation when the amount of data from source tasks and target tasks is limited. Leveraging techniques from active learning, we propose an adaptive sampling-based alternating projected gradient descent (GD) and minimization algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance. We present convergence guarantees and the sample and time complexities of our algorithm. We evaluate the effectiveness of our algorithm experimentally and compare it with four benchmark algorithms on synthetic data and the real-world MNIST-C and MovieLens-100K datasets.
DOI:	10.1109/TSP.2025.3623098