On-Line Learning and Optimization for Wireless Video Transmission


Bibliographic details
Published in: IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3108-3124
Main authors: Yu Zhang, Fangwen Fu, Mihaela van der Schaar
Format: Journal Article
Language: English
Published: New York, NY: IEEE, 1 June 2010
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1053-587X, 1941-0476
Online access: Full text
Description
Abstract: In this paper, we address the problem of how to optimize the cross-layer transmission policy for delay-sensitive video streaming over slow-varying flat-fading wireless channels on-line, at transmission time, when the environment dynamics are unknown. We first formulate the cross-layer optimization using a systematic layered Markov decision process (MDP) framework, which complies with the layered architecture of the OSI stack. Subsequently, considering the unknown dynamics of the video sources and the underlying wireless channels, we propose a layered real-time dynamic programming (LRTDP) algorithm, which requires no a priori knowledge of the source and network dynamics. LRTDP allows each layer to learn the dynamics on the fly and to adjust its policy autonomously, based on its own experienced dynamics and on limited message exchanges with other layers. Unlike existing cross-layer methods, LRTDP optimizes the cross-layer policy in a layered and on-line fashion, exhibits low computational complexity, requires limited message exchanges among layers, and is capable of adapting on the fly to the experienced environment dynamics. Finally, we prove that LRTDP converges to the optimal cross-layer policy asymptotically. Our numerical experiments show that LRTDP provides performance comparable to that of the idealized optimal cross-layer solutions based on complete knowledge.
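The general idea of learning an MDP policy on-line, without prior knowledge of the dynamics, can be illustrated with a minimal single-layer sketch. This is not the authors' LRTDP: there are no layers and no message exchanges, and it only follows the spirit of adaptive real-time dynamic programming, in which an empirical model is updated from experience and a Bellman backup is applied only at the state just visited. The toy channel (two states, two transmission rates), the reward values, the exploration rate, and all function names are assumptions made for illustration.

```python
import random


def toy_step(s, a, rng):
    """Hypothetical slow-varying channel: state 0 = bad, 1 = good.
    Action 0 = low rate, 1 = high rate. Rewards are made-up numbers."""
    reward = [[1.0, -2.0], [1.0, 3.0]][s][a]
    # The channel keeps its state with probability 0.9, whatever the action.
    s_next = s if rng.random() < 0.9 else 1 - s
    return s_next, reward


def adaptive_rtdp(step, n_states, n_actions, gamma=0.9, epsilon=0.1,
                  n_steps=5000, seed=0):
    """Learn a model from experience; back up V only at the visited state."""
    rng = random.Random(seed)
    counts = [[[0] * n_states for _ in range(n_actions)]
              for _ in range(n_states)]
    reward_sum = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    V = [0.0] * n_states

    def q(s, a, untried):
        """Q-value under the empirical model; `untried` if (s, a) is unseen."""
        n = visits[s][a]
        if n == 0:
            return untried
        r_hat = reward_sum[s][a] / n
        exp_v = sum(c * V[s2] for s2, c in enumerate(counts[s][a])) / n
        return r_hat + gamma * exp_v

    s = 0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)  # occasional exploration
        else:
            # Greedy choice; untried actions look optimistic (+inf).
            a = max(range(n_actions), key=lambda b: q(s, b, float("inf")))
        s_next, r = step(s, a, rng)
        counts[s][a][s_next] += 1
        reward_sum[s][a] += r
        visits[s][a] += 1
        # Bellman backup at the current state only, over tried actions.
        V[s] = max(q(s, b, float("-inf")) for b in range(n_actions))
        s = s_next

    policy = [max(range(n_actions), key=lambda b: q(x, b, float("-inf")))
              for x in range(n_states)]
    return V, policy


V, policy = adaptive_rtdp(toy_step, n_states=2, n_actions=2)
print("value estimates:", V)
print("greedy policy per channel state:", policy)
```

In this toy setting the learned policy should favor the low rate in the bad channel state and the high rate in the good one. The paper's contribution, decomposing such learning across OSI layers with limited message exchange, is not captured here.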
DOI: 10.1109/TSP.2010.2046040