Counterfactual Reward Estimation for Credit Assignment in Multi-agent Deep Reinforcement Learning over Wireless Video Transmission

Bibliographic details
Published in: Proceedings of the International Conference on Distributed Computing Systems, pp. 1177-1189
Main authors: Wenhan, Y.; Qian, Liangxin; Chua, Terence Jie; Zhao, Jun
Format: Conference paper
Language: English
Published: IEEE, 23 July 2024
ISSN: 2575-8411
Abstract This study investigates frame-wise optimization in Mobile Edge Computing (MEC) for video transmission, emphasizing dynamic adaptation to diverse frame complexities and efficient resource utilization. The comprehensive system model captures the complexities of joint optimizations in MEC for real-time video transmission, addressing challenges associated with error concealment techniques, and enhancing the user experience by addressing successive frame losses. To handle credit assignment in multi-agent scenarios, we integrate counterfactual reward shaping, introducing a counterfactual reward multi-agent proximal policy optimization (CRMAPPO). Results reveal the impact of the credit assignment parameter (β) on algorithm performance, demonstrating a trade-off between accurate credit assignment and policy bias. The study emphasizes CRMAPPO's performance, surpassing traditional MAPPO under optimal β choices, marking a substantial 109.18% improvement in total rewards. This research significantly contributes to optimizing resource allocation in video transmission within MEC frameworks, addressing challenges associated with frame-wise optimization and providing a nuanced understanding of credit assignment dynamics in multi-agent environments.
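The abstract names counterfactual reward shaping and a credit assignment parameter β but does not state the formula. As an illustration only, the sketch below uses a generic difference-reward scheme, which is an assumption and not the paper's CRMAPPO definition: each agent's marginal contribution is the team reward minus the team reward obtained when only that agent's action is replaced by a default action, and β blends this with the shared reward. The names `counterfactual_rewards`, `toy_team_reward`, and `default_action` are hypothetical.

```python
from typing import Callable, List, Sequence


def counterfactual_rewards(
    joint_action: Sequence[int],
    team_reward: Callable[[Sequence[int]], float],
    default_action: int,
    beta: float,
) -> List[float]:
    """Blend the shared team reward with a per-agent difference reward.

    Assumed shaping rule (not taken from the paper):
        shaped_i = (1 - beta) * R(a) + beta * (R(a) - R(a with a_i := default))
    """
    base = team_reward(joint_action)
    shaped = []
    for i in range(len(joint_action)):
        # Counterfactual: replace only agent i's action with the default action.
        cf_action = list(joint_action)
        cf_action[i] = default_action
        marginal = base - team_reward(cf_action)
        # beta = 0 keeps the shared reward; beta = 1 uses only the marginal term.
        shaped.append((1.0 - beta) * base + beta * marginal)
    return shaped


def toy_team_reward(actions: Sequence[int]) -> float:
    # Hypothetical stand-in for the environment's shared reward.
    return float(sum(actions))


rewards = counterfactual_rewards([1, 0, 1], toy_team_reward, default_action=0, beta=0.5)
print(rewards)  # [1.5, 1.0, 1.5]
```

In this toy setting the idle agent (action 0) receives less reward than the two active agents once β is above zero, which is the kind of per-agent credit signal that a shared reward alone cannot provide. A larger β sharpens that signal but moves the training objective further from the true team reward, one plausible reading of the trade-off between credit assignment accuracy and policy bias that the abstract reports.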
Author_xml – sequence: 1
  givenname: Y.
  surname: Wenhan
  fullname: Wenhan, Y.
  email: wenhan002@e.ntu.edu.sg
  organization: Graduate College, Nanyang Technological University
– sequence: 2
  givenname: Liangxin
  surname: Qian
  fullname: Qian, Liangxin
  email: qian0080@e.ntu.edu.sg
  organization: College of Computing and Data Science, Nanyang Technological University
– sequence: 3
  givenname: Terence Jie
  surname: Chua
  fullname: Chua, Terence Jie
  email: terencej001@e.ntu.edu.sg
  organization: Graduate College, Nanyang Technological University
– sequence: 4
  givenname: Jun
  orcidid: 0000-0002-3004-7091
  surname: Zhao
  fullname: Zhao, Jun
  email: junzhao@ntu.edu.sg
  organization: College of Computing and Data Science, Nanyang Technological University
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/ICDCS60910.2024.00112
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
Discipline Computer Science
EISBN 9798350386059
EISSN 2575-8411
EndPage 1189
ExternalDocumentID 10631064
Genre orig-research
GrantInformation_xml – fundername: Nanyang Technological University
  funderid: 10.13039/501100001475
ISICitedReferencesCount 1
IsPeerReviewed false
IsScholarly true
Language English
ORCID 0000-0002-3004-7091
PageCount 13
PublicationCentury 2000
PublicationDate 2024-July-23
PublicationDateYYYYMMDD 2024-07-23
PublicationDate_xml – month: 07
  year: 2024
  text: 2024-July-23
  day: 23
PublicationDecade 2020
PublicationTitle Proceedings of the International Conference on Distributed Computing Systems
PublicationTitleAbbrev ICDCS
PublicationYear 2024
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1177
SubjectTerms Complexity theory
credit assignment
Dynamic scheduling
multi-agent deep reinforcement learning
Resource management
Streaming media
Training
User experience
Video transmission
Wireless communication
Title Counterfactual Reward Estimation for Credit Assignment in Multi-agent Deep Reinforcement Learning over Wireless Video Transmission
URI https://ieeexplore.ieee.org/document/10631064