Distributionally Robust Policy Learning via Adversarial Environment Generation

Bibliographic details
Published in: IEEE robotics and automation letters Volume 7; issue 2; pp. 1379 - 1386
Main authors: Ren, Allen Z., Majumdar, Anirudha
Format: Journal Article
Language: English
Publication details: Piscataway IEEE 01.04.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2377-3766
Abstract Our goal is to train control policies that generalize well to unseen environments. Inspired by the Distributionally Robust Optimization (DRO) framework, we propose DRAGEN - Distributionally Robust policy learning via Adversarial Generation of ENvironments - for iteratively improving robustness of policies to realistic distribution shifts by generating adversarial environments. The key idea is to learn a generative model for environments whose latent variables capture cost-predictive and realistic variations in environments. We perform DRO with respect to a Wasserstein ball around the empirical distribution of environments by generating realistic adversarial environments via gradient ascent on the latent space. We demonstrate strong Out-of-Distribution (OoD) generalization in simulation for (i) swinging up a pendulum with onboard vision and (ii) grasping realistic 3D objects. Grasping experiments on hardware demonstrate better sim2real performance compared to domain randomization.
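The latent-space gradient ascent mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the quadratic `cost` function, the penalty weight `gamma`, the step size, and all function names are hypothetical stand-ins chosen for illustration. The sketch assumes the Wasserstein-ball constraint is relaxed into a squared-distance penalty on the latent variable, which is one common way to make such a constraint tractable; the record does not state whether the paper does this.

```python
import numpy as np

def cost(z, theta):
    """Hypothetical cost of policy `theta` in the environment decoded from latent `z`.

    A toy quadratic stands in for a learned, differentiable cost predictor.
    """
    return float(np.sum((z - theta) ** 2))

def cost_grad_z(z, theta):
    """Gradient of the toy cost with respect to the latent variable z."""
    return 2.0 * (z - theta)

def adversarial_latent(z0, theta, gamma=5.0, step=0.05, n_steps=50):
    """Gradient ascent on cost(z, theta) - gamma * ||z - z0||^2.

    The penalty (an assumed relaxation of the distance constraint) keeps the
    perturbed latent close to the original one, so the generated environment
    stays near the training data while becoming harder for the policy.
    """
    z = np.array(z0, dtype=float)
    z_ref = z.copy()
    for _ in range(n_steps):
        grad = cost_grad_z(z, theta) - 2.0 * gamma * (z - z_ref)
        z = z + step * grad
    return z

z0 = np.array([1.0, 0.0])      # latent of a training environment (assumed)
theta = np.array([0.0, 0.0])   # toy policy parameter (assumed)
z_adv = adversarial_latent(z0, theta)
print(cost(z0, theta), cost(z_adv, theta))
```

In this toy setting the perturbed latent has a higher cost than the original one while remaining close to it. A larger `gamma` keeps the adversarial environment nearer the training environment, and a smaller one allows a larger shift.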
Author Majumdar, Anirudha
Ren, Allen Z.
Author_xml – sequence: 1
  givenname: Allen Z.
  orcidid: 0000-0001-5306-2844
  surname: Ren
  fullname: Ren, Allen Z.
  email: allen.ren@princeton.edu
  organization: Mechanical and Aerospace Engineering Department, Princeton University, Princeton, NJ, USA
– sequence: 2
  givenname: Anirudha
  surname: Majumdar
  fullname: Majumdar, Anirudha
  email: ani.majumdar@princeton.edu
  organization: Mechanical and Aerospace Engineering Department, Princeton University, Princeton, NJ, USA
CODEN IRALC6
CitedBy_id crossref_primary_10_1109_TASE_2025_3535945
crossref_primary_10_1177_02783649251352000
crossref_primary_10_1109_LRA_2021_3139949
crossref_primary_10_3390_drones8080368
crossref_primary_10_1109_TSG_2025_3571349
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/LRA.2021.3139949
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Discipline Engineering
EISSN 2377-3766
EndPage 1386
ExternalDocumentID 10_1109_LRA_2021_3139949
9669029
Genre orig-research
GrantInformation_xml – fundername: Toyota Research Institute
  funderid: 10.13039/100015599
– fundername: Office of Naval Research
  grantid: N00014-18-1-2873
  funderid: 10.13039/100000006
ISICitedReferencesCount 8
ISSN 2377-3766
IsPeerReviewed true
IsScholarly true
Issue 2
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0001-5306-2844
PageCount 8
PublicationCentury 2000
PublicationDate 2022-04-01
PublicationDateYYYYMMDD 2022-04-01
PublicationDate_xml – month: 04
  year: 2022
  text: 2022-04-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2022
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 1379
SubjectTerms continual learning
Costs
data sets for robot learning
generalization
Grasping
Optimization
Policies
Reinforcement learning
Robots
Robustness
Task analysis
Training
Title Distributionally Robust Policy Learning via Adversarial Environment Generation
URI https://ieeexplore.ieee.org/document/9669029
https://www.proquest.com/docview/2619017782
Volume 7