Stochastic human motion prediction using a quantized conditional diffusion model
| Published in: | Knowledge-Based Systems, Vol. 309, p. 112823 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 30.01.2025 |
| Subjects: | |
| ISSN: | 0950-7051 |
| Online access: | Full text |
| Summary: | Human motion prediction is a fundamental task in computer vision, aiming to forecast future human poses based on observed motion sequences. Existing deterministic methods generate a single future motion sequence, neglecting the inherent stochasticity and diversity of human behaviors. To address this limitation, we propose a novel two-stage stochastic human motion prediction framework, termed the Quantized Conditional Diffusion Model (QCDM), which combines a Discrete Motion Quantization Module and a Conditional Motion Generation Module. Specifically, we first design a discrete motion quantization module that leverages Graph Convolutional Networks (GCNs) and one-dimensional temporal convolutions to encode motion sequences into continuous latent representations. These representations are then quantized into discrete latent variables using a learnable codebook. A decoder reconstructs the motion sequence from these discrete variables, preserving key motion patterns while eliminating redundancies. Next, we develop a conditional motion generation module that integrates GCNs and Transformers for denoising spatio-temporal features. The diffusion process iteratively refines noisy motion data by reversing a gradual noising procedure, modeling the distribution of plausible future motions. Action category information and observed historical motion segments are incorporated as conditions into the denoising process, enabling controllable generation of specific motions. Additionally, we introduce a diversity enhancement strategy that penalizes overly similar samples. This encourages the model to explore a wider range of plausible motions, thereby improving the diversity and richness of the prediction results. Extensive experiments demonstrate that the QCDM framework outperforms state-of-the-art methods in stochastic human motion prediction tasks, offering both accuracy and diversity in generated motion sequences. |
|---|---|
| Highlights: | • Combines motion quantization with a conditional diffusion model for motion prediction. • Utilizes GCNs and temporal convolutions for efficient motion feature extraction. • Integrates action category information for controllable, diverse motion generation. • Implements diversity enhancement to reduce similarity between prediction samples. |
| DOI: | 10.1016/j.knosys.2024.112823 |
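Two ideas from the abstract can be illustrated in a few lines: mapping continuous latents to their nearest entries in a learnable codebook, and penalizing prediction samples that are too similar to one another. The sketch below is a minimal numpy illustration under assumed shapes; the function names `quantize` and `diversity_penalty` are hypothetical, and the paper's actual modules use GCN/Transformer encoders and end-to-end training rather than this toy nearest-neighbor lookup.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each continuous latent vector to its nearest codebook entry.

    latents:  (T, D) continuous per-frame embeddings (assumed shape)
    codebook: (K, D) discrete codes (learnable in the real model)
    Returns the quantized latents and the chosen code indices.
    """
    # Pairwise squared distances between every latent and every code
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)          # index of the nearest code per frame
    return codebook[idx], idx

def diversity_penalty(samples):
    """Toy penalty that is large when prediction samples are too similar.

    samples: (S, T, D) candidate future motion sequences.
    Minimizing this term pushes the S samples apart from each other.
    """
    S = samples.shape[0]
    flat = samples.reshape(S, -1)
    # Mean pairwise squared distance between distinct samples
    d2 = ((flat[:, None, :] - flat[None, :, :]) ** 2).sum(-1)
    mean_dist = d2.sum() / (S * (S - 1))
    return 1.0 / (1.0 + mean_dist)   # small when samples are well spread
```

In training, the quantization step would use a straight-through gradient estimator and codebook/commitment losses, and the diversity term would be added to the generation objective; those details are omitted here.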