PixelSieve: Towards Efficient Activity Analysis From Compressed Video Streams

Bibliographic Details
Published in:	2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 811-816
Main Authors:	Wang, Yongchen; Wang, Ying; Li, Huawei; Li, Xiaowei
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 2021-12-05
Subjects:	Decoding; Design automation; Metadata; Neural networks; Redundancy; Spatiotemporal phenomena; Streaming media
Online Access:	Full Text

Abstract	Pixel-level data redundancy in video induces additional memory and computing overhead when neural networks are employed to mine spatiotemporal patterns, e.g., activity and event labels, from video streams. This work proposes PixelSieve to enable highly efficient CNN-based activity analysis directly from video data in compressed formats. Instead of recovering the original RGB frames from compressed video, PixelSieve uses the built-in metadata in compressed video streams to distill only the critical pixels that render relevant spatiotemporal features, and then conducts efficient CNN inference on the condensed inputs. PixelSieve removes the overhead of video decoding and improves the performance of CNN-based video analysis by 4.5x on average.
Author	Wang, Ying; Wang, Yongchen; Li, Xiaowei; Li, Huawei
Author_xml	– sequence: 1 givenname: Yongchen surname: Wang fullname: Wang, Yongchen email: wangyongchen@ict.ac.cn organization: SKLCA, Institute of Computing Technology, Chinese Academy of Sciences – sequence: 2 givenname: Ying surname: Wang fullname: Wang, Ying email: wangying2009@ict.ac.cn organization: SKLCA, Institute of Computing Technology, Chinese Academy of Sciences – sequence: 3 givenname: Huawei surname: Li fullname: Li, Huawei email: lihuawei@ict.ac.cn organization: SKLCA, Institute of Computing Technology, Chinese Academy of Sciences – sequence: 4 givenname: Xiaowei surname: Li fullname: Li, Xiaowei email: lxw@ict.ac.cn organization: SKLCA, Institute of Computing Technology, Chinese Academy of Sciences
ContentType	Conference Proceeding
DBID	6IE 6IH CBEJK RIE RIO
DOI	10.1109/DAC18074.2021.9586310
DatabaseName	IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
EISBN	9781665432740 1665432748
EndPage	816
ExternalDocumentID	9586310
Genre	orig-research
GrantInformation_xml	– fundername: National Key Research and Development Program of China funderid: 10.13039/501100012166
GroupedDBID	6IE 6IH ACM ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIO
ID	FETCH-LOGICAL-a217t-5e548fe50dacf7f76f5c6c2ed8c194f395cde480d445f0eb6161c16b491218a3
IEDL.DBID	RIE
ISICitedReferencesCount	0
ISICitedReferencesURI	http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000766079700136&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate	Wed Aug 27 02:28:29 EDT 2025
IsPeerReviewed	false
IsScholarly	true
Language	English
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-a217t-5e548fe50dacf7f76f5c6c2ed8c194f395cde480d445f0eb6161c16b491218a3
PageCount	6
ParticipantIDs	ieee_primary_9586310
PublicationCentury	2000
PublicationDate	2021-Dec.-5
PublicationDateYYYYMMDD	2021-12-05
PublicationDate_xml	– month: 12 year: 2021 text: 2021-Dec.-5 day: 05
PublicationDecade	2020
PublicationTitle	2021 58th ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev	DAC
PublicationYear	2021
Publisher	IEEE
Publisher_xml	– name: IEEE
SSID	ssib060584060
Score	2.1648688
SourceID	ieee
SourceType	Publisher
StartPage	811
SubjectTerms	Decoding; Design automation; Metadata; Neural networks; Redundancy; Spatiotemporal phenomena; Streaming media
Title	PixelSieve: Towards Efficient Activity Analysis From Compressed Video Streams
URI	https://ieeexplore.ieee.org/document/9586310
WOSCitedRecordID	wos000766079700136&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
link	http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3Pa8MgFJau7LDTNtqx33jYcWk10Rh3K13LLiuFldFbMfqEwNqMNi3786cm7RjsspsIKryony_vfd9D6MFhMDUqZZEGCV5U292DMbdRzpKMawfpOamLTYjJJJvP5bSFHg9cGAAIyWfQ880Qyzel3vpfZX3JszTxfKojIUTN1drvHR_dc9hEGpIOJbL_PBhSL_XinMCY9pqxv4qoBAwZn_5v9TPU_SHj4ekBZs5RC1Yd9DotvuDjrYAdPOFZSH3d4FGQg3CT4IGui0LgveYIHq_LJfaHP4iFG_xeGCixj0mr5aaLZuPRbPgSNZURIuVciCri4BwNC5wYpa2wIrVcpzoGk2kqmU0k1wZYRgxj3BLIU_eu0zTNmaQO0lVygdqrcgWXCAd9daa5irViXCRKCS0ouLljd_1wcoU63hKLz1r7YtEY4frv7ht04o0d0j34LWpX6y3coWO9q4rN-j58sG_b8paf
linkProvider	IEEE
linkToHtml	http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3PS8MwFA5jCnpS2cTf5uDRbk2btI23MTcmbmNgkd1GmrxAwa2yX_jnm6TdRPDiLQSSwGuSL6_vfd9D6MFgMFEiop4EDlZU29yDAdNeRsOESQPpmV8Wm4jH42Q65ZMaetxzYQDAJZ9ByzZdLF8VcmN_lbU5S6LQ8qkOGKUBKdlau91j43sGnfyKpkN83n7udIkVezFuYEBa1ehfZVQcivRP_rf-KWr-0PHwZA80Z6gGiwYaTfIv-HjLYQtPOHXJryvcc4IQZhLckWVZCLxTHcH9ZTHH9vg7uXCF33MFBbZRaTFfNVHa76XdgVfVRvCEcSLWHgPjamhgvhJSxzqONJORDEAlknCqQ86kApr4ilKmfcgi87KTJMooJwbURXiO6otiARcIO4V1KpkIpKAsDoWIZUzAzB2YC4j5l6hhLTH7LNUvZpURrv7uvkdHg3Q0nA1fxq_X6Nga3iV_sBtUXy83cIsO5Xadr5Z37uN9A-JRmeY
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2021+58th+ACM%2FIEEE+Design+Automation+Conference+%28DAC%29&rft.atitle=PixelSieve%3A+Towards+Efficient+Activity+Analysis+From+Compressed+Video+Streams&rft.au=Wang%2C+Yongchen&rft.au=Wang%2C+Ying&rft.au=Li%2C+Huawei&rft.au=Li%2C+Xiaowei&rft.date=2021-12-05&rft.pub=IEEE&rft.spage=811&rft.epage=816&rft_id=info:doi/10.1109%2FDAC18074.2021.9586310&rft.externalDocID=9586310