Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN
Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN [4] to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder a...
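The full abstract in this record reports its gains as mean IoU, quoted as both an absolute and a relative improvement. As a rough illustration of how those quantities relate, the sketch below rasterizes two polygons onto a pixel grid, computes their intersection over union, and then separates an absolute gain from a relative one. It is a minimal sketch, not the authors' evaluation code. The function names, the grid size, and the example scores of 0.625 and 0.725 are assumptions chosen for illustration; those two scores were picked only because they differ by 10 points and about 16%, and they are not figures taken from the paper.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test of a point against a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, width, height):
    """Return the set of (col, row) pixels whose centers fall inside the polygon."""
    return {
        (c, r)
        for r in range(height)
        for c in range(width)
        if point_in_polygon(c + 0.5, r + 0.5, polygon)
    }


def mask_iou(pred, gt):
    """Intersection over union of two pixel sets (1.0 if both are empty)."""
    union = len(pred | gt)
    return len(pred & gt) / union if union else 1.0


def improvements(baseline, new):
    """Return (absolute gain, relative gain) of `new` over `baseline`."""
    absolute = new - baseline
    return absolute, absolute / baseline


# Hypothetical polygons on an 8x8 grid: two 4x4 squares offset by two pixels.
gt_mask = rasterize([(0, 0), (4, 0), (4, 4), (0, 4)], 8, 8)
pred_mask = rasterize([(2, 0), (6, 0), (6, 4), (2, 4)], 8, 8)
iou = mask_iou(pred_mask, gt_mask)

# Illustrative scores only (not reported results).
abs_gain, rel_gain = improvements(0.625, 0.725)
```

With these inputs the two squares share 8 of 24 covered pixels, so the IoU is one third. The second part shows why one result can be quoted as both "10% absolute" and "16% relative": the relative figure divides the same gain by the baseline score.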
| Published in: | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 859-868 |
|---|---|
| Main Authors: | Acuna, David; Ling, Huan; Kar, Amlan; Fidler, Sanja |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.06.2018 |
| Subjects: | Computer architecture; Decoding; Labeling; Neural networks; Predictive models; Training |
| ISSN: | 1063-6919 |
| Abstract | Manually labeling datasets with object masks is extremely time consuming. In this work, we follow the idea of Polygon-RNN [4] to produce polygonal annotations of objects interactively using humans-in-the-loop. We introduce several important improvements to the model: 1) we design a new CNN encoder architecture, 2) show how to effectively train the model with Reinforcement Learning, and 3) significantly increase the output resolution using a Graph Neural Network, allowing the model to accurately annotate high-resolution objects in images. Extensive evaluation on the Cityscapes dataset [8] shows that our model, which we refer to as Polygon-RNN++, significantly outperforms the original model in both automatic (10% absolute and 16% relative improvement in mean IoU) and interactive modes (requiring 50% fewer clicks by annotators). We further analyze the cross-domain scenario in which our model is trained on one dataset, and used out of the box on datasets from varying domains. The results show that Polygon-RNN++ exhibits powerful generalization capabilities, achieving significant improvements over existing pixel-wise methods. Using simple online fine-tuning we further achieve a high reduction in annotation time for new datasets, moving a step closer towards an interactive annotation tool to be used in practice. |
|---|---|
| Author | Ling, Huan; Kar, Amlan; Acuna, David; Fidler, Sanja |
| Author_xml | 1. David Acuna (davidj@cs.toronto.edu), NVIDIA; 2. Huan Ling (linghuan@cs.toronto.edu), Vector Institute; 3. Amlan Kar (amlan@cs.toronto.edu), Vector Institute; 4. Sanja Fidler (fidler@cs.toronto.edu), Vector Institute |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CVPR.2018.00096 |
| Discipline | Applied Sciences |
| EISBN | 9781538664209; 1538664208 |
| EISSN | 1063-6919 |
| EndPage | 868 |
| ExternalDocumentID | 8578194 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-Jun |
| PublicationDateYYYYMMDD | 2018-06-01 |
| PublicationDate_xml | – month: 06 year: 2018 text: 2018-Jun |
| PublicationDecade | 2010 |
| PublicationTitle | 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2018 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 859 |
| SubjectTerms | Computer architecture; Decoding; Labeling; Neural networks; Predictive models; Training |
| Title | Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN |
| URI | https://ieeexplore.ieee.org/document/8578194 |