LEMMA: Learning Language-Conditioned Multi-Robot Manipulation

Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop...

Bibliographic Details
Published in: IEEE robotics and automation letters Vol. 8; no. 10; pp. 6835-6842
Main Authors: Gong, Ran, Gao, Xiaofeng, Gao, Qiaozi, Shakiah, Suhaila, Thattai, Govind, Sukhatme, Gaurav S.
Format: Journal Article
Language: English
Published: Piscataway IEEE 01.10.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 2377-3766
Abstract Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedurally generated tasks with varying degrees of complexity, some of which require the robots to use tools and pass tools to each other. For each task, we provide 800 expert demonstrations and human instructions for training and evaluation. LEMMA poses greater challenges than existing benchmarks, as it requires the system to identify each manipulator's limitations and assign sub-tasks accordingly while also handling strong temporal dependencies in each task. To address these challenges, we propose a modular hierarchical planning approach as a baseline. Our results highlight the potential of LEMMA for developing future language-conditioned multi-robot systems.
Author Thattai, Govind
Gao, Qiaozi
Gong, Ran
Gao, Xiaofeng
Sukhatme, Gaurav S.
Shakiah, Suhaila
Author_xml – sequence: 1
  givenname: Ran
  orcidid: 0009-0000-9365-9143
  surname: Gong
  fullname: Gong, Ran
  email: nikepupu@ucla.edu
  organization: Center for Vision, Cognition, Learning, and Autonomy, UCLA, Los Angeles, CA, USA
– sequence: 2
  givenname: Xiaofeng
  orcidid: 0000-0003-3331-9846
  surname: Gao
  fullname: Gao, Xiaofeng
  email: xfgao@g.ucla.edu
  organization: Amazon Alexa AI, San Jose, CA, USA
– sequence: 3
  givenname: Qiaozi
  orcidid: 0000-0002-5403-0796
  surname: Gao
  fullname: Gao, Qiaozi
  email: qiaozikl@gmail.com
  organization: Amazon Alexa AI, San Jose, CA, USA
– sequence: 4
  givenname: Suhaila
  orcidid: 0000-0002-1891-7058
  surname: Shakiah
  fullname: Shakiah, Suhaila
  email: ssshakia@amazon.com
  organization: Amazon Alexa AI, San Jose, CA, USA
– sequence: 5
  givenname: Govind
  orcidid: 0009-0005-1010-8896
  surname: Thattai
  fullname: Thattai, Govind
  email: gowin.thattai@gmail.com
  organization: Amazon Alexa AI, San Jose, CA, USA
– sequence: 6
  givenname: Gaurav S.
  orcidid: 0000-0003-2408-474X
  surname: Sukhatme
  fullname: Sukhatme, Gaurav S.
  email: gaurav@usc.edu
  organization: Amazon Alexa AI, San Jose, CA, USA
CODEN IRALC6
CitedBy_id crossref_primary_10_3390_app14114696
crossref_primary_10_1016_j_rcim_2025_103113
crossref_primary_10_1109_JAS_2025_125552
Cites_doi 10.1007/978-3-030-95459-8_13
10.1007/978-3-030-58601-0_39
10.1177/02783649211056967
10.1109/LRA.2022.3180108
10.1177/0278364913496484
10.1162/isal_a_00269
10.15607/RSS.2022.XVIII.032
10.15607/RSS.2021.XVII.044
10.1109/LRA.2022.3145964
10.1109/IROS47612.2022.9981802
10.1109/CVPR.2018.00387
10.1109/ICCV48922.2021.01564
10.1109/TASE.2018.2791478
10.1016/j.robot.2012.07.005
10.15607/RSS.2021.XVII.047
10.1007/978-3-030-58558-7_28
10.1109/HUMANOIDS47582.2021.9555672
10.1109/IROS47612.2022.9981280
10.15607/RSS.2020.XVI.003
10.1109/CVPR.2019.00685
10.1109/CVPR42600.2020.01075
10.1109/ICRA48891.2023.10161317
10.1109/LRA.2022.3193254
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/LRA.2023.3313058
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 6842
ExternalDocumentID 10_1109_LRA_2023_3313058
10243083
Genre orig-research
GrantInformation_xml – fundername: Amazon Alexa AI
GroupedDBID 0R~
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFS
AGQYO
AGSQL
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
IFIPE
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
ID FETCH-LOGICAL-c207t-794d6be452da7445982632775431f87da84242afdc9fa1d7d5a6d15e02e9d53b3
IEDL.DBID RIE
ISSN 2377-3766
IngestDate Mon Jun 30 06:24:41 EDT 2025
Tue Nov 18 21:00:09 EST 2025
Sat Nov 29 06:03:27 EST 2025
Wed Aug 27 02:24:55 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 10
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c207t-794d6be452da7445982632775431f87da84242afdc9fa1d7d5a6d15e02e9d53b3
ORCID 0009-0005-1010-8896
0000-0003-3331-9846
0000-0003-2408-474X
0000-0002-5403-0796
0009-0000-9365-9143
0000-0002-1891-7058
PQID 2865091638
PQPubID 4437225
PageCount 8
ParticipantIDs crossref_primary_10_1109_LRA_2023_3313058
ieee_primary_10243083
proquest_journals_2865091638
crossref_citationtrail_10_1109_LRA_2023_3313058
PublicationCentury 2000
PublicationDate 2023-10-01
PublicationDateYYYYMMDD 2023-10-01
PublicationDate_xml – month: 10
  year: 2023
  text: 2023-10-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID ssj0001527395
Score 2.298679
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 6835
SubjectTerms Benchmark testing
Benchmarks
Collaboration
Data Sets for Robot Learning
Multi-robot systems
Multiple robots
Multitasking
Natural Dialog for HRI
Planning
Robot kinematics
Robots
Task analysis
Task complexity
Title LEMMA: Learning Language-Conditioned Multi-Robot Manipulation
URI https://ieeexplore.ieee.org/document/10243083
https://www.proquest.com/docview/2865091638
Volume 8
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: RIE
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: M~E
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre