Online learning for wireless video transmission with limited information

Bibliographic Details
Published in:	2009 17th International Packet Video Workshop pp. 1 - 10
Main Authors:	Yu Zhang; Fangwen Fu; M. van der Schaar
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01.05.2009
Subjects:	Decision making; Delay; Dynamic programming; Heuristic algorithms; layered markov decision process; online learning; Open systems; Physical layer; Real time systems; real-time learning; Scheduling algorithm; State estimation; Streaming media; wireless video transmission
ISBN:	9781424446513, 1424446511
ISSN:	2167-969X

Description
Summary:	In this paper, we address the problem of joint packet scheduling at the application layer and power and rate allocation at the physical layer for delay-sensitive video streaming over slowly varying flat-fading wireless channels. Our goal is to find the optimal cross-layer policy that maximizes the cumulative received video quality while minimizing the total transmission energy. We first formulate the cross-layer optimization using a systematic layered Markov Decision Process (MDP) framework. We then propose a layered real-time dynamic programming (RTDP) algorithm that solves this cross-layer optimization problem by combining policy updates with real-time decision making. This approach reduces the high complexity of conventional offline dynamic programming methods. Moreover, to accommodate cases in which the network environment dynamics (e.g., the state transition probabilities) are unknown or non-stationary (e.g., the state transition probabilities change over time), we further improve our RTDP method by collecting the required network information and estimating the dynamics online, using a model-free approach. Based on this information, a user (a transmitter-receiver pair) can adaptively change its policy to cope in real time with the experienced environment dynamics. We also prove the convergence of this RTDP method (which complies with the layered architecture of the OSI stack). Finally, our numerical experiments show that the proposed RTDP solutions outperform the conventional offline DP methods for real-time video streaming.
DOI:	10.1109/PACKET.2009.5152166
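
The summary describes real-time dynamic programming in which value updates are interleaved with decisions and the transition probabilities are estimated online. The following is a minimal, generic sketch of that idea only, not the layered algorithm from the paper: it assumes a hypothetical two-state channel (0 = bad, 1 = good), two actions (0 = idle, 1 = transmit), invented reward values (quality minus energy), and simple count-based probability estimates with epsilon-greedy exploration. All names (`rtdp_online`, `toy_reward`, `toy_step`) and numbers are illustrative assumptions.

```python
import random


def rtdp_online(n_states, n_actions, reward, step,
                gamma=0.9, n_steps=5000, eps=0.1, seed=0):
    """Generic RTDP sketch with transition probabilities estimated from counts."""
    rng = random.Random(seed)
    V = [0.0] * n_states
    # counts[s][a][s2]: observed transitions; start at 1 so estimates are defined
    counts = [[[1] * n_states for _ in range(n_actions)]
              for _ in range(n_states)]

    def q(s, a):
        # One-step lookahead using the current empirical transition estimates
        total = sum(counts[s][a])
        expected_next = sum(counts[s][a][s2] / total * V[s2]
                            for s2 in range(n_states))
        return reward(s, a) + gamma * expected_next

    s = 0
    for _ in range(n_steps):
        qs = [q(s, a) for a in range(n_actions)]
        V[s] = max(qs)  # Bellman backup at the visited state only
        if rng.random() < eps:
            a = rng.randrange(n_actions)  # occasional exploration
        else:
            a = qs.index(V[s])            # greedy real-time decision
        s2 = step(s, a, rng)              # observe the environment
        counts[s][a][s2] += 1             # update the online estimate
        s = s2

    policy = [max(range(n_actions), key=lambda a: q(s, a))
              for s in range(n_states)]
    return V, policy


def toy_reward(s, a):
    # Assumed values: idle earns 0; transmitting yields quality minus energy,
    # and a bad channel (s == 0) needs more energy than a good one (s == 1).
    if a == 0:
        return 0.0
    return 0.8 if s == 1 else -0.5


def toy_step(s, a, rng):
    # Assumed slowly varying channel: stays in its state with probability 0.9,
    # independent of the action taken.
    return s if rng.random() < 0.9 else 1 - s


V, policy = rtdp_online(2, 2, toy_reward, toy_step)
print("values:", V)
print("policy (action per channel state):", policy)
```

Because only the currently visited state is backed up at each step, the per-decision cost stays small compared with sweeping the full state space offline, which is the complexity argument the summary makes. The count-based estimator here is a stand-in chosen for brevity; the record does not specify how the paper estimates the dynamics.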