UntrimmedNets for Weakly Supervised Action Recognition and Detection

Current action recognition methods heavily rely on trimmed videos for model training. However, it is expensive and time-consuming to acquire a large-scale trimmed video dataset. This paper presents a new weakly supervised architecture, called UntrimmedNet, which is able to directly learn action reco...

Bibliographic Details
Published in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6402-6411
Main Authors: Wang, Limin; Xiong, Yuanjun; Lin, Dahua; Van Gool, Luc
Format: Conference Proceeding
Language: English
Published: IEEE 01.07.2017
Subjects:
ISSN: 1063-6919
Abstract: Current action recognition methods rely heavily on trimmed videos for model training. However, acquiring a large-scale trimmed video dataset is expensive and time-consuming. This paper presents a new weakly supervised architecture, called UntrimmedNet, which learns action recognition models directly from untrimmed videos without requiring temporal annotations of action instances. UntrimmedNet couples two components, the classification module and the selection module, which learn the action models and reason about the temporal duration of action instances, respectively. Both components are implemented with feed-forward networks, so UntrimmedNet is an end-to-end trainable architecture. We exploit the learned models for weakly supervised action recognition (WSR) and detection (WSD) on the untrimmed video datasets THUMOS14 and ActivityNet. Although UntrimmedNet employs only weak supervision, it achieves performance superior or comparable to that of strongly supervised approaches on these two datasets.
Author Xiong, Yuanjun
Van Gool, Luc
Lin, Dahua
Wang, Limin
Author_xml – sequence: 1
  givenname: Limin
  surname: Wang
  fullname: Wang, Limin
  organization: Comput. Vision Lab., ETH Zurich, Zurich, Switzerland
– sequence: 2
  givenname: Yuanjun
  surname: Xiong
  fullname: Xiong, Yuanjun
  organization: Dept. of Inf. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
– sequence: 3
  givenname: Dahua
  surname: Lin
  fullname: Lin, Dahua
  organization: Dept. of Inf. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
– sequence: 4
  givenname: Luc
  surname: Van Gool
  fullname: Van Gool, Luc
  organization: Computer Vision Laboratory, ETH Zurich, Switzerland
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/CVPR.2017.678
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Applied Sciences
Computer Science
EISBN 1538604574
9781538604571
EISSN 1063-6919
EndPage 6411
ExternalDocumentID 8100161
Genre orig-research
GroupedDBID 23M
29F
29O
6IE
6IH
6IK
ABDPE
ACGFS
ALMA_UNASSIGNED_HOLDINGS
CBEJK
IPLJI
M43
RIE
RIO
RNS
ID FETCH-LOGICAL-i175t-98d6617341893ecb5f290b9e05ce2bd7f5d47b1e472074ab973fd2b7cffb90623
IEDL.DBID RIE
ISICitedReferencesCount 423
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000418371406053&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1063-6919
IngestDate Wed Aug 27 06:13:56 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i175t-98d6617341893ecb5f290b9e05ce2bd7f5d47b1e472074ab973fd2b7cffb90623
PageCount 10
ParticipantIDs ieee_primary_8100161
PublicationCentury 2000
PublicationDate 2017-July
PublicationDateYYYYMMDD 2017-07-01
PublicationDate_xml – month: 07
  year: 2017
  text: 2017-July
PublicationDecade 2010
PublicationTitle 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
PublicationTitleAbbrev CVPR
PublicationYear 2017
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0023720
ssj0003211698
Score 2.5893545
SourceID ieee
SourceType Publisher
StartPage 6402
SubjectTerms Adaptation models
Feature extraction
Motion pictures
Proposals
Training
Videos
Visualization
Title UntrimmedNets for Weakly Supervised Action Recognition and Detection
URI https://ieeexplore.ieee.org/document/8100161
WOSCitedRecordID wos000418371406053&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE