Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks

Bibliographic Details
Published in: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1717-1724
Main Authors: Oquab, Maxime; Bottou, Leon; Laptev, Ivan; Sivic, Josef
Format: Conference Proceeding, Journal Article
Language: English
Published: IEEE, 2014-06-01
ISSN: 1063-6919
Abstract: Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations as opposed to hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with a limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representations for images in the PASCAL VOC dataset. We show that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on the PASCAL VOC 2007 and 2012 datasets. We also show promising results for object and action localization.
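The transfer idea in the abstract (reuse layers learned on a large source dataset, then learn only a small amount on the target data) can be illustrated with a toy sketch. This is not the authors' network or training procedure: the hand-picked `FROZEN_WEIGHTS`, the logistic-regression head, the made-up `TARGET_X`/`TARGET_Y` points, and all function names are assumptions introduced purely for illustration, and the sketch further assumes that the reused layers are kept fixed while only the new classifier is trained.

```python
import math

# Hypothetical stand-in for layers pre-trained on a large source dataset.
# In a real setting these weights would come from a CNN trained on ImageNet.
FROZEN_WEIGHTS = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]


def extract_features(frozen, x):
    """Apply the frozen linear layer followed by ReLU; weights are only read."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in frozen]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_head(frozen, inputs, labels, epochs=200, lr=0.1):
    """Fit a logistic-regression head on frozen features with plain SGD."""
    feats = [extract_features(frozen, x) for x in inputs]
    w = [0.0] * len(frozen)
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)
            g = p - y  # gradient of the log loss w.r.t. the pre-activation
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(frozen, head, x):
    """Probability of the positive class for input x."""
    w, b = head
    f = extract_features(frozen, x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)


# Tiny made-up target dataset (illustrative only, not from the paper).
TARGET_X = [(2, 1), (1, 2), (2, 2), (1.5, 0.5),
            (-2, -1), (-1, -2), (-2, -2), (-0.5, -1.5)]
TARGET_Y = [1, 1, 1, 1, 0, 0, 0, 0]

# Only the head is trained; FROZEN_WEIGHTS is never modified.
head = train_head(FROZEN_WEIGHTS, TARGET_X, TARGET_Y)
```

Calling `predict(FROZEN_WEIGHTS, head, x)` then scores a new input using the fixed features plus the small trained head. The point of the design is that the number of parameters fitted on the small target set is tiny compared with the feature extractor, which is why limited training data can suffice.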
Author Affiliations:
Oquab, Maxime (INRIA, Paris, France)
Bottou, Leon (MSR, New York, NY, USA)
Laptev, Ivan (INRIA, Paris, France)
Sivic, Josef (INRIA, Paris, France)
DOI: 10.1109/CVPR.2014.222
EISBN: 9781479951185, 1479951188
ISI Cited References Count: 2088
Subject Terms:
Computer vision
Image recognition
Image representation
Neural networks
Object recognition
Pascal (programming language)
Pattern recognition
Representations
Training
Training data
Visual
Visualization
Volatile organic compounds
URI: https://ieeexplore.ieee.org/document/6909618
URI: https://www.proquest.com/docview/1677905608