Distributionally Robust Policy Learning via Adversarial Environment Generation
| Published in: | IEEE Robotics and Automation Letters, Vol. 7, No. 2, pp. 1379-1386 |
|---|---|
| Main authors: | Ren, Allen Z.; Majumdar, Anirudha |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Piscataway: IEEE, 1 April 2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 2377-3766 |
| Abstract | Our goal is to train control policies that generalize well to unseen environments. Inspired by the Distributionally Robust Optimization (DRO) framework, we propose DRAGEN - Distributionally Robust policy learning via Adversarial Generation of ENvironments - for iteratively improving robustness of policies to realistic distribution shifts by generating adversarial environments. The key idea is to learn a generative model for environments whose latent variables capture cost-predictive and realistic variations in environments. We perform DRO with respect to a Wasserstein ball around the empirical distribution of environments by generating realistic adversarial environments via gradient ascent on the latent space. We demonstrate strong Out-of-Distribution (OoD) generalization in simulation for (i) swinging up a pendulum with onboard vision and (ii) grasping realistic 3D objects. Grasping experiments on hardware demonstrate better sim2real performance compared to domain randomization. |
|---|---|
| Author | Ren, Allen Z. (ORCID: 0000-0001-5306-2844; allen.ren@princeton.edu); Majumdar, Anirudha (ani.majumdar@princeton.edu). Affiliation of both authors: Mechanical and Aerospace Engineering Department, Princeton University, Princeton, NJ, USA |
| CODEN | IRALC6 |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| DOI | 10.1109/LRA.2021.3139949 |
| Discipline | Engineering |
| EISSN | 2377-3766 |
| EndPage | 1386 |
| Genre | orig-research |
| Funding | Toyota Research Institute (funder ID 10.13039/100015599); Office of Naval Research, grant N00014-18-1-2873 (funder ID 10.13039/100000006) |
| ISICitedReferencesCount | 8 |
| ISSN | 2377-3766 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0001-5306-2844 |
| PageCount | 8 |
| PublicationDate | 2022-04-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE robotics and automation letters |
| PublicationTitleAbbrev | LRA |
| PublicationYear | 2022 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1379 |
| SubjectTerms | continual learning; Costs; data sets for robot learning; generalization; Grasping; Optimization; Policies; Reinforcement learning; Robots; Robustness; Task analysis; Training |
| Title | Distributionally Robust Policy Learning via Adversarial Environment Generation |
| URI | https://ieeexplore.ieee.org/document/9669029 https://www.proquest.com/docview/2619017782 |
| Volume | 7 |
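
The abstract describes generating adversarial environments via gradient ascent on the latent space of a generative model, within a Wasserstein ball around the empirical distribution of environments. The sketch below illustrates that idea under stated assumptions and is not the authors' implementation: the names `toy_cost_gradient` and `perturb_latent`, the linear toy cost, and the parameters `gamma`, `step`, and `iters` are all hypothetical, and the Wasserstein constraint is approximated here by a squared-distance penalty on the latent variable, a common Lagrangian relaxation.

```python
from typing import Callable, List

Vector = List[float]


def toy_cost_gradient(z: Vector) -> Vector:
    """Gradient of a hypothetical predicted cost c(z) = 1.0*z[0] - 0.5*z[1].

    In a real system this would come from a learned cost predictor
    differentiated with respect to the latent variable (assumption).
    """
    weights = [1.0, -0.5]
    return list(weights)


def perturb_latent(
    z0: Vector,
    cost_grad: Callable[[Vector], Vector],
    gamma: float = 2.0,
    step: float = 0.1,
    iters: int = 200,
) -> Vector:
    """Gradient ascent on c(z) - gamma * ||z - z0||^2, starting at z0.

    The penalty keeps the adversarial latent close to the original one,
    so the generated environment stays near the training data.
    """
    z = list(z0)
    for _ in range(iters):
        g = cost_grad(z)
        z = [
            zi + step * (gi - 2.0 * gamma * (zi - z0i))
            for zi, gi, z0i in zip(z, g, z0)
        ]
    return z


if __name__ == "__main__":
    # For a linear cost the maximizer is z0 + w / (2 * gamma),
    # so this prints values close to [0.25, -0.125].
    print(perturb_latent([0.0, 0.0], toy_cost_gradient))
```

A larger `gamma` keeps the perturbed latent closer to its starting point, which loosely corresponds to a smaller ball around the empirical distribution; a smaller `gamma` allows more aggressive shifts.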