Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN
| Published in: | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 859-868 |
|---|---|
| Main authors: | Acuna, David; Ling, Huan; Kar, Amlan; Fidler, Sanja |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 1 June 2018 |
| Subject: | |
| ISSN: | 1063-6919 |
| Abstract | Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN [4] to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) significantly increase the output resolution using a Graph Neural Network, allowing the model to accurately annotate high-resolution objects in images. Extensive evaluation on the Cityscapes dataset [8] shows that our model, which we refer to as Polygon-RNN++, significantly outperforms the original model in both automatic (10% absolute and 16% relative improvement in mean IoU) and interactive modes (requiring 50% fewer clicks by annotators). We further analyze the cross-domain scenario in which our model is trained on one dataset, and used out of the box on datasets from varying domains. The results show that Polygon-RNN++ exhibits powerful generalization capabilities, achieving significant improvements over existing pixel-wise methods. Using simple online fine-tuning we further achieve a high reduction in annotation time for new datasets, moving a step closer towards an interactive annotation tool to be used in practice. |
|---|---|
| Author | Ling, Huan; Kar, Amlan; Acuna, David; Fidler, Sanja |
| Author_xml | 1. David Acuna (davidj@cs.toronto.edu, NVIDIA); 2. Huan Ling (linghuan@cs.toronto.edu, Vector Institute); 3. Amlan Kar (amlan@cs.toronto.edu, Vector Institute); 4. Sanja Fidler (fidler@cs.toronto.edu, Vector Institute) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPR.2018.00096 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| Database_xml | 1. RIE: IEEE Electronic Library (IEL), https://ieeexplore.ieee.org/ (source type: Publisher) |
| Discipline | Applied Sciences |
| EISBN | 9781538664209; 1538664208 |
| EISSN | 1063-6919 |
| EndPage | 868 |
| ExternalDocumentID | 8578194 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 10 |
| ParticipantIDs | ieee_primary_8578194 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-Jun |
| PublicationDateYYYYMMDD | 2018-06-01 |
| PublicationDate_xml | – month: 06 year: 2018 text: 2018-Jun |
| PublicationDecade | 2010 |
| PublicationTitle | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2018 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 859 |
| SubjectTerms | Computer architecture; Decoding; Labeling; Neural networks; Predictive models; Training |
| Title | Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN |
| URI | https://ieeexplore.ieee.org/document/8578194 |
| isPrint | |
| link | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LSwMxEA5t8eDJRyu-ycGja7OPbJKj1BYRWZZapLeSZynorrRbwX9vkg3txYu3JIcQJo-Zb2a-CQB3BGPNhUUnFnipKFMyjgRSKuI4zwzniabIE4VfSVHQ-ZyVHXC_48JorX3ymX5wTR_LV7XcOlfZkNrjZUF3F3QJIS1Xa-dPSXKa0hAhc_3UIpuc0VDNJ0ZsOHovpy6XyyVPIl-kf_-ditcmk6P_reMYDPa0PFjuFM4J6OjqFBwFOxKGW7rpg5exrwth54De4cf9mwYfq6puA--wNvBNLz8D76iCT7yx2qzZQOeWhWX98bOsq2haFAMwm4xno-co_JkQrRhqohQxQxVJeR6rJMNEZ1xJYjRLJcEmNYlAIjYKOSsEYZEpIShWec50nGhnnZyBXlVX-hxAF55hTMpEZsyCTClsK85kaiQyHBt8AfpOMouvtirGIgjl8u_hK3DoRN8mWV2DXrPe6htwIL-b1WZ967fyF5oPn7Y |
| linkProvider | IEEE |
| linkToHtml | http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV07T8MwELZKQYKpQIt444GRUCexE3tEpVWBEkWlQt0qx4-qEiSoTZH499hO1C4sbLYHy_Lr7ru77w6A25gQxTODTgzwkh6WwvcyJKXHSYQ154GiyBGFR3GS0OmUpQ1wt-HCKKVc8Jm6t03ny5eFWFtTWZea62VA9w7YJRgHfsXW2lhUgoiGtPaR2X5osE3EaJ3Px0es23tPxzaay4ZPIpemf1tQxcmTQet_KzkEnS0xD6YbkXMEGio_Bq1ak4T1O121wXPfZYYwc0Bn8uPuV4MPeV5UrndYaPim5p818yiHj7w08qxcQWuYhWnx8TMvcm-cJB0wGfQnvaFXV03wFgyVXoiYpjIOeeTLAJNYYS5FrBULRUx0qIMMZb6WyOohiGRYZhklMoqY8gNl9ZMT0MyLXJ0CaB00jAkRCMwMzBSZaflYhFogzYkmZ6Btd2b2VeXFmNWbcv738A3YH05eR7PRU_JyAQ7sMVQhV5egWS7X6grsie9ysVpeu2P9BRkPov0 |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2018+IEEE%2FCVF+Conference+on+Computer+Vision+and+Pattern+Recognition&rft.atitle=Efficient+Interactive+Annotation+of+Segmentation+Datasets+with+Polygon-RNN&rft.au=Acuna%2C+David&rft.au=Ling%2C+Huan&rft.au=Kar%2C+Amlan&rft.au=Fidler%2C+Sanja&rft.date=2018-06-01&rft.pub=IEEE&rft.eissn=1063-6919&rft.spage=859&rft.epage=868&rft_id=info:doi/10.1109%2FCVPR.2018.00096&rft.externalDocID=8578194 |
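
The abstract reports gains in mean IoU as both an absolute and a relative improvement. The following is a minimal sketch of how intersection over union (IoU) between a predicted and a ground-truth binary mask can be computed, and how the two kinds of improvement differ. The function names `mask_iou` and `improvements`, the toy masks, and the scores 0.5 and 0.6 are illustrative assumptions; they are not taken from the paper and do not reproduce its evaluation code or results.

```python
def mask_iou(pred, target):
    """IoU of two equally sized binary masks given as lists of 0/1 rows."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0   # pixel set in both masks
            union += 1 if (p or t) else 0    # pixel set in at least one mask
    # Two empty masks are treated as a perfect match (an assumption).
    return inter / union if union else 1.0


def improvements(baseline, improved):
    """Return (absolute, relative) improvement of a score over a nonzero baseline."""
    absolute = improved - baseline
    relative = absolute / baseline
    return absolute, relative


# Toy 3x3 masks (hypothetical data, not from Cityscapes).
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 0],
          [0, 1, 0]]

iou = mask_iou(pred, target)            # 3 shared pixels out of 5 in the union
abs_gain, rel_gain = improvements(0.5, 0.6)  # hypothetical mean-IoU scores
print(iou, abs_gain, rel_gain)
```

Mean IoU would then be the average of such per-instance (or per-class) scores; the exact averaging used in the paper is not stated in this record.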