LLMAir: Adaptive Reprogramming Large Language Model for Air Quality Prediction

Bibliographic Details
Published in:	Proceedings - International Conference on Parallel and Distributed Systems, pp. 423-430
Main Authors:	Fan, Jinxiao; Chu, Haolin; Liu, Liang; Ma, Huadong
Format:	Conference Proceeding
Language:	English
Published:	IEEE 10.10.2024
Subjects:	Air quality; Air quality prediction; Analytical models; Atmospheric modeling; Data models; Large language models; Monitoring; Predictive models; Regulators; Semantics; Spatiotemporal phenomena; Urban data analysis
ISSN:	2690-5965

Description
Summary:	Accurate and timely air quality prediction is crucial for cities and individuals to take effective precautions against potential air pollution. Existing studies typically build prediction models on large-scale monitoring data, and these models are often designed for specific tasks. Recently, pre-trained large language models (LLMs) have achieved significant progress in various time series analysis tasks due to their powerful representation and inference capabilities. However, their application to air quality data with spatiotemporal features remains largely unexplored. In this work, we propose LLMAir, an adaptive reprogramming approach that adapts pre-trained LLMs for air quality prediction. We first construct spatiotemporal tokens based on monitoring stations by integrating value, node, and time embeddings. Next, we design an adaptive semantic-enhanced reprogramming module that computes similarity matching scores between our spatiotemporal tokens and pre-trained word embeddings for alignment. We employ a semantic regulator to generate the optimal length of word prototypes, which serve as prompt prefixes for adaptive reprogramming and guide the spatiotemporal token embeddings into the frozen LLM. Additionally, we jointly optimize predictive error and alignment loss to train our model. Experimental results demonstrate that LLMAir achieves state-of-the-art performance in air quality prediction and few-shot forecasting across two real-world datasets.
DOI:	10.1109/ICPADS63350.2024.00062
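
The pipeline outlined in the summary (spatiotemporal tokens, similarity matching against word embeddings, prototype prefixes, and a joint loss) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the additive combination of value, node, and time embeddings, cosine similarity as the matching score, a fixed `k` in place of the semantic regulator's adaptive prototype length, the weight `lam`, and the random tables standing in for learned and pre-trained embeddings.

```python
import math
import random

def embed_table(n, dim, rng):
    """Random stand-in for a learned embedding table (n rows, dim columns)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

def spatiotemporal_token(value, node_id, hour, w_value, node_emb, time_emb):
    """Assumed form: value projection + station (node) embedding + time embedding."""
    return [value * wv + n + t
            for wv, n, t in zip(w_value, node_emb[node_id], time_emb[hour])]

def cosine(a, b):
    """Cosine similarity, used here as a hypothetical matching score."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def top_k_prototypes(token, word_emb, k):
    """Return indices of the k word embeddings most similar to the token, plus all scores."""
    scores = [cosine(token, w) for w in word_emb]
    order = sorted(range(len(word_emb)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores

def joint_loss(pred, target, token, prototypes, lam=0.1):
    """Assumed objective: MSE prediction error + lam * (1 - mean cosine similarity)."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    align = 1.0 - sum(cosine(token, p) for p in prototypes) / len(prototypes)
    return mse + lam * align

rng = random.Random(0)
dim = 8
node_emb = embed_table(4, dim, rng)    # 4 hypothetical monitoring stations
time_emb = embed_table(24, dim, rng)   # one embedding per hour of day
word_emb = embed_table(50, dim, rng)   # stand-in for frozen LLM word embeddings
w_value = [rng.gauss(0.0, 1.0) for _ in range(dim)]

token = spatiotemporal_token(0.7, node_id=2, hour=13, w_value=w_value,
                             node_emb=node_emb, time_emb=time_emb)
idx, scores = top_k_prototypes(token, word_emb, k=3)  # fixed k; the paper adapts the length
prefix = [word_emb[i] for i in idx]                   # prototypes used as a prompt prefix
loss = joint_loss([0.6], [0.7], token, prefix)
```

In this sketch the selected prototypes would be prepended to the token sequence before it enters a frozen LLM; that step, and the training loop that would update the embedding tables, are omitted.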