CIDEr: Consensus-based image description evaluation
| Published in: | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4566-4575 |
|---|---|
| Main authors: | Ramakrishna Vedantam, C. Lawrence Zitnick, Devi Parikh |
| Format: | Conference paper; journal article |
| Language: | English |
| Published: | IEEE, 1 June 2015 |
| ISSN: | 1063-6919 |
| Abstract | Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classification, action recognition, etc., there is renewed interest in this area. However, evaluating the quality of descriptions has proven to be challenging. We propose a novel paradigm for evaluating image descriptions that uses human consensus. This paradigm consists of three main parts: a new triplet-based method of collecting human annotations to measure consensus, a new automated metric that captures consensus, and two new datasets: PASCAL-50S and ABSTRACT-50S that contain 50 sentences describing each image. Our simple metric captures human judgment of consensus better than existing metrics across sentences generated by various sources. We also evaluate five state-of-the-art image description approaches using this new protocol and provide a benchmark for future comparisons. A version of CIDEr named CIDEr-D is available as a part of MS COCO evaluation server to enable systematic evaluation and benchmarking. |
|---|---|
| Author | Vedantam, Ramakrishna; Zitnick, C. Lawrence; Parikh, Devi |
| Author_xml | 1. Vedantam, Ramakrishna (vrama91@vt.edu), Virginia Tech, Blacksburg, VA, USA; 2. Zitnick, C. Lawrence (larryz@microsoft.com), Microsoft Res., Redmond, WA, USA; 3. Parikh, Devi (parikh@vt.edu), Virginia Tech, Blacksburg, VA, USA |
| ContentType | Conference Proceeding; Journal Article |
| DOI | 10.1109/CVPR.2015.7299087 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional |
| Discipline | Applied Sciences; Computer Science |
| EISBN | 1467369640; 9781467369640 |
| EISSN | 1063-6919 |
| EndPage | 4575 |
| ExternalDocumentID | 7299087 |
| Genre | orig-research |
| ISICitedReferencesCount | 2825 |
| ISSN | 1063-6919 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 10 |
| PublicationDate | 2015-06-01 |
| PublicationTitle | 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2015 |
| Publisher | IEEE |
| StartPage | 4566 |
| SubjectTerms | Accuracy; Benchmarking; Cider; Computer vision; Conferences; Correlation; Human; Measurement; Pattern recognition; Protocols; Sentences; Silicon; Testing; Training |
| Title | CIDEr: Consensus-based image description evaluation |
| URI | https://ieeexplore.ieee.org/document/7299087 ; https://www.proquest.com/docview/1770347889 |