Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations as opposed to hand-designed low-level f...
| Published in: | 2014 IEEE Conference on Computer Vision and Pattern Recognition pp. 1717 - 1724 |
|---|---|
| Main Authors: | Oquab, Maxime; Bottou, Leon; Laptev, Ivan; Sivic, Josef |
| Format: | Conference Proceeding Journal Article |
| Language: | English |
| Published: | IEEE, 01.06.2014 |
| ISSN: | 1063-6919 |
| Abstract | Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations as opposed to hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representation for images in the PASCAL VOC dataset. We show that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on Pascal VOC 2007 and 2012 datasets. We also show promising results for object and action localization. |
|---|---|
| Author | Bottou, Leon; Oquab, Maxime; Laptev, Ivan; Sivic, Josef |
| Author_xml | 1. Oquab, Maxime (INRIA, Paris, France); 2. Bottou, Leon (MSR, New York, NY, USA); 3. Laptev, Ivan (INRIA, Paris, France); 4. Sivic, Josef (INRIA, Paris, France) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding Journal Article |
| DOI | 10.1109/CVPR.2014.222 |
| Discipline | Applied Sciences Computer Science |
| EISBN | 9781479951185 1479951188 |
| EISSN | 1063-6919 |
| EndPage | 1724 |
| ExternalDocumentID | 6909618 |
| Genre | orig-research |
| ISICitedReferencesCount | 2088 |
| ISSN | 1063-6919 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 20140601 |
| PublicationDateYYYYMMDD | 2014-06-01 |
| PublicationDate_xml | – month: 06 year: 2014 text: 20140601 day: 01 |
| PublicationDecade | 2010 |
| PublicationTitle | 2014 IEEE Conference on Computer Vision and Pattern Recognition |
| PublicationTitleAbbrev | CVPR |
| PublicationYear | 2014 |
| Publisher | IEEE |
| StartPage | 1717 |
| SubjectTerms | Computer vision; Image recognition; Image representation; Neural networks; Object recognition; Pascal (programming language); Pattern recognition; Representations; Training; Training data; Visual; Visualization; Volatile organic compounds |
| Title | Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks |
| URI | https://ieeexplore.ieee.org/document/6909618 https://www.proquest.com/docview/1677905608 |