Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks

Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations as opposed to hand-designed low-level f...

Bibliographic details
Published in: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1717-1724
Main authors: Oquab, Maxime; Bottou, Leon; Laptev, Ivan; Sivic, Josef
Format: Conference paper, Journal Article
Language: English
Published: IEEE, 1 June 2014
ISSN: 1063-6919
Abstract: Convolutional neural networks (CNN) have recently shown outstanding image classification performance in the large-scale visual recognition challenge (ILSVRC2012). The success of CNNs is attributed to their ability to learn rich mid-level image representations as opposed to hand-designed low-level features used in other image classification methods. Learning CNNs, however, amounts to estimating millions of parameters and requires a very large number of annotated image samples. This property currently prevents application of CNNs to problems with limited training data. In this work we show how image representations learned with CNNs on large-scale annotated datasets can be efficiently transferred to other visual recognition tasks with limited amount of training data. We design a method to reuse layers trained on the ImageNet dataset to compute mid-level image representation for images in the PASCAL VOC dataset. We show that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification, outperforming the current state of the art on PASCAL VOC 2007 and 2012 datasets. We also show promising results for object and action localization.
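The transfer idea described in the abstract (reuse layers learned on a large source dataset, then train only a small new part on the limited target data) can be illustrated with a toy sketch. This is not the authors' network or training procedure: the fixed matrix `FROZEN_W`, the helper names, the logistic-regression head, and the six-sample target set are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import math

# Hypothetical stand-in for layers learned on a large source dataset.
# These weights are treated as frozen: nothing below ever updates them.
FROZEN_W = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]]


def frozen_features(x, weights):
    """Compute the transferred representation ReLU(W x) with fixed weights."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def sigmoid(z):
    """Logistic function, with the input clamped to avoid overflow."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, w, b):
    """Probability of the positive class given frozen features."""
    return sigmoid(sum(wi * fi for wi, fi in zip(w, features)) + b)


def train_head(features, labels, epochs=200, lr=0.1):
    """Train only the new classifier head on the small target dataset."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            grad = predict(f, w, b) - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * grad * fi for wi, fi in zip(w, f)]
            b -= lr * grad
    return w, b


# Tiny "target task" with few labeled samples: label 1 when x[0] > x[1].
target_x = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [1.0, 2.0], [3.0, 0.5], [0.5, 3.0]]
target_y = [1, 0, 1, 0, 1, 0]

target_features = [frozen_features(x, FROZEN_W) for x in target_x]
w, b = train_head(target_features, target_y)
```

Only `w` and `b` are estimated from the target samples, which is why a handful of labeled examples can suffice in this sketch; in a real system the frozen part would be the pre-trained convolutional layers rather than a hand-written matrix.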
Authors and affiliations (in publication order)
1. Oquab, Maxime (INRIA, Paris, France)
2. Bottou, Leon (MSR, New York, NY, USA)
3. Laptev, Ivan (INRIA, Paris, France)
4. Sivic, Josef (INRIA, Paris, France)
CODEN IEEPAD
ContentType Conference Proceeding
Journal Article
DOI 10.1109/CVPR.2014.222
Discipline Applied Sciences
Computer Science
EISBN 9781479951185
1479951188
EISSN 1063-6919
EndPage 1724
ExternalDocumentID 6909618
Genre orig-research
ISICitedReferencesCount 2088
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 8
PublicationCentury 2000
PublicationDate 2014-06-01
PublicationDecade 2010
PublicationTitle 2014 IEEE Conference on Computer Vision and Pattern Recognition
PublicationTitleAbbrev CVPR
PublicationYear 2014
Publisher IEEE
StartPage 1717
SubjectTerms Computer vision; Image recognition; Image representation; Neural networks; Object recognition; Pattern recognition; Representations; Training; Training data; Visual; Visualization
Title Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
URI https://ieeexplore.ieee.org/document/6909618
https://www.proquest.com/docview/1677905608