High-performance video content recognition with long-term recurrent convolutional network for FPGA

Bibliographic details
Published in: International Conference on Field-programmable Logic and Applications, pp. 1-4
Main authors: Xiaofan Zhang, Xinheng Liu, Anand Ramachandran, Chuanhao Zhuge, Shibin Tang, Peng Ouyang, Zuofu Cheng, Kyle Rupnow, Deming Chen
Format: Conference paper
Language: English
Publication details: Ghent University, 1 September 2017
ISSN:1946-1488
Abstract: FPGA is a promising candidate for the acceleration of Deep Neural Networks (DNN) with improved latency and energy consumption compared to CPU and GPU-based implementations. DNNs use sequences of layers of regular computation that are well suited for HLS-based design for FPGA. However, optimizing large neural networks under resource constraints is still a key challenge. HLS must manage on-chip computation, buffering resources, and off-chip memory accesses to minimize the total latency. In this paper, we present a design framework for DNNs that uses highly configurable IPs for neural network layers together with a new design space exploration engine for Resource Allocation Management (REALM). We also carry out efficient memory subsystem design and fixed-point weight re-training to further improve our FPGA solution. We demonstrate our design framework on the Long-term Recurrent Convolution Network for video inputs. Our implementation on a Xilinx VC709 board achieves 3.1X speedup compared to an NVIDIA K80 and 4.75X speedup compared to an Intel Xeon with 17.5X lower energy per image.
DOI: 10.23919/FPL.2017.8056833
EISBN: 9090304282, 9789090304281
Subject terms: Field programmable gate arrays; IP networks; Mathematical model; Neural networks; Optimization; Quantization (signal); Resource management
URI https://ieeexplore.ieee.org/document/8056833