Efficient parallel implementation of three-point Viterbi decoding algorithm on CPU, GPU, and FPGA

Summary: In wireless communication, the Viterbi decoding algorithm (VDA) is one of the most popular channel decoding algorithms and is widely used in WLAN, WiMAX, and 3G communications. However, the throughput of a Viterbi decoder is constrained by its convolutional characteristic. Recently, the three‐poi...

Detailed bibliography
Published in:	Concurrency and computation, Volume 26, Issue 3, pp. 821-840
Main authors:	Li, Rongchun; Dou, Yong; Zou, Dan
Format:	Journal Article
Language:	English
Publication details:	Blackwell Publishing Ltd, 10 March 2014
Subjects:	Algorithms; Central processing units; CUDA; Decoding; Field programmable gate arrays; FPGA; GPU; OpenMP; Optimization; Platforms; SDR; SSE; Viterbi; Viterbi decoding; Wireless communication
ISSN:	1532-0626, 1532-0634

Abstract	In wireless communication, the Viterbi decoding algorithm (VDA) is one of the most popular channel decoding algorithms and is widely used in WLAN, WiMAX, and 3G communications. However, the throughput of a Viterbi decoder is constrained by its convolutional characteristic. Recently, the three-point VDA (TVDA) was proposed to solve this problem. In TVDA, the whole procedure is divided into three phases: the forward, trace-back, and decoding phases. In this paper, we analyze the parallelism of TVDA and propose parallel TVDA implementations on the multi-core CPU, graphics processing unit (GPU), and field programmable gate array (FPGA). We demonstrate approaches that fully exploit its performance potential on each of these computing platforms. For CPU platforms, we apply two optimization methods, single instruction multiple data (SIMD) and multithreading, to gain over 145× speedup over the naive CPU version on a quad-core CPU platform. For GPU platforms, we propose a combination of cached memory optimization, coalesced global memory accesses, a codeword packing scheme, and asynchronous data transition, achieving a throughput of 404.65 Mbps, a 12× speedup over the initial GPU version on an NVIDIA GeForce GTX580 card, and a 7× speedup over an Intel quad-core i5-2300 CPU, with both devices from the same manufacturing year and both fully optimized. In addition, for FPGA platforms, we customize a radix-4 pipelined architecture for the TVDA on a 45-nm FPGA chip from Xilinx (XC6VLX760). At a clock rate of 209.15 MHz, it achieves a throughput of 418.30 Mbps. Finally, we discuss the performance evaluation and efficiency comparison of different flexible architectures for real-time Viterbi decoding in terms of decoding throughput, power consumption, optimization schemes, programming cost, and price. Copyright © 2013 John Wiley & Sons, Ltd.
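The three phases named in the abstract (forward, trace-back, and decoding) can be illustrated with a minimal sketch of a conventional block Viterbi decoder. This is not the paper's TVDA, nor any of its CPU, GPU, or FPGA implementations. The code rate (1/2), the constraint length (K = 3), the generator polynomials (7 and 5 octal), the hard-decision Hamming metric, and the function names `encode` and `decode` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: a plain hard-decision Viterbi decoder for an
# ASSUMED rate-1/2, K=3 convolutional code. It is not the paper's TVDA.

G = (0b111, 0b101)        # assumed generator polynomials (7, 5 octal)
K = 3                     # assumed constraint length
N_STATES = 1 << (K - 1)   # 4 trellis states


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutionally encode bits, starting from the all-zero state."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state          # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def decode(received):
    """Decode a hard-decision symbol stream produced by encode()."""
    n_out = len(G)
    n_steps = len(received) // n_out
    inf = float("inf")
    metrics = [0] + [inf] * (N_STATES - 1)    # encoder starts in state 0
    survivors = []                            # survivors[t][s] = predecessor of s

    # Forward phase: add-compare-select over every trellis stage.
    for t in range(n_steps):
        sym = received[t * n_out:(t + 1) * n_out]
        new_metrics = [inf] * N_STATES
        prev = [0] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue                      # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                nxt = reg >> 1
                expected = [parity(reg & g) for g in G]
                cost = metrics[state] + sum(e != r for e, r in zip(expected, sym))
                if cost < new_metrics[nxt]:
                    new_metrics[nxt] = cost
                    prev[nxt] = state
        survivors.append(prev)
        metrics = new_metrics

    # Trace-back phase: follow survivor pointers from the best final state.
    state = min(range(N_STATES), key=metrics.__getitem__)
    path = [state]
    for prev in reversed(survivors):
        state = prev[state]
        path.append(state)
    path.reverse()

    # Decoding phase: the input bit of each transition is the MSB of the
    # state that the transition enters.
    return [s >> (K - 2) for s in path[1:]]
```

In this sketch the forward phase is sequential across trellis stages because each stage's path metrics depend on the previous stage's, which is one way to read the throughput constraint the abstract mentions. Encoding a message with `encode` and passing the result to `decode` should return the original bits.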
Author	Zou, Dan Dou, Yong Li, Rongchun
Author_xml	– sequence: 1 givenname: Rongchun surname: Li fullname: Li, Rongchun email: Correspondence to: Rongchun Li, National Laboratory for Parallel and Distribution Processing, National University of Defense Technology, Changsha, China., rongchunli@nudt.edu.cn organization: National Laboratory for Parallel and Distribution Processing, National University of Defense Technology, Changsha, China – sequence: 2 givenname: Yong surname: Dou fullname: Dou, Yong organization: National Laboratory for Parallel and Distribution Processing, National University of Defense Technology, Changsha, China – sequence: 3 givenname: Dan surname: Zou fullname: Zou, Dan organization: National Laboratory for Parallel and Distribution Processing, National University of Defense Technology, Changsha, China
CitedBy_id	crossref_primary_10_1016_j_jnca_2016_08_020 crossref_primary_10_1145_3470642 crossref_primary_10_1007_s00607_017_0557_6 crossref_primary_10_1109_TCSI_2018_2825362 crossref_primary_10_1002_cpe_3488 crossref_primary_10_1002_cpe_3833 crossref_primary_10_1109_TCSS_2021_3059318 crossref_primary_10_1002_cpe_5437 crossref_primary_10_1109_ACCESS_2018_2882455
Cites_doi	10.4218/etrij.08.0208.0196 10.1109/MCOM.2010.5434388 10.1109/TIT.1967.1054010 10.1109/VTCF.2006.176 10.1109/ICCT.2006.341948 10.1109/PROC.1973.9030 10.1109/ICISE.2009.265 10.1007/s10470-011-9764-9 10.1109/26.221067 10.1007/978-3-642-11515-8_26 10.1109/wicom.2011.6036680 10.1109/WCSP.2011.6096781 10.1109/TVLSI.2004.842930 10.1109/SOCDC.2009.5423923 10.1109/SIPS.2009.5336249 10.1002/cpe.1913 10.1109/ICEEE.2006.251908
ContentType	Journal Article
Copyright	Copyright © 2013 John Wiley & Sons, Ltd.
Copyright_xml	– notice: Copyright © 2013 John Wiley & Sons, Ltd.
DBID	BSCLL AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D
DOI	10.1002/cpe.3093
DatabaseName	Istex CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional
DatabaseTitle	CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional
DatabaseTitleList	CrossRef Technology Research Database
DeliveryMethod	fulltext_linktorsrc
Discipline	Computer Science
EISSN	1532-0634
EndPage	840
ExternalDocumentID	10_1002_cpe_3093 CPE3093 ark_67375_WNG_7X5XF5KS_8
Genre	article
GrantInformation_xml	– fundername: National Science Foundation of China funderid: 61125201
ID	FETCH-LOGICAL-c4023-d89f5dc8240c0797f389781d683a8aab5f6c6e0c47ea78ad7ba150c41b1f0783
IEDL.DBID	DRFUL
ISICitedReferencesCount	17
ISICitedReferencesURI	http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000331020300012&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN	1532-0626
IngestDate	Sun Nov 09 08:58:16 EST 2025 Sat Nov 29 01:41:13 EST 2025 Tue Nov 18 21:16:21 EST 2025 Tue Nov 11 03:12:19 EST 2025 Tue Nov 11 03:33:39 EST 2025
IsPeerReviewed	true
IsScholarly	true
Issue	3
Language	English
License	http://onlinelibrary.wiley.com/termsAndConditions#vor
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-c4023-d89f5dc8240c0797f389781d683a8aab5f6c6e0c47ea78ad7ba150c41b1f0783
Notes	National Science Foundation of China - No. 61125201 istex:74029AA1651A9C4BEA3F73C2E1521182656D7303 ArticleID:CPE3093 ark:/67375/WNG-7X5XF5KS-8 ObjectType-Article-2 SourceType-Scholarly Journals-1 ObjectType-Feature-1 content type line 23
PQID	1520948390
PQPubID	23500
PageCount	20
ParticipantIDs	proquest_miscellaneous_1520948390 crossref_primary_10_1002_cpe_3093 crossref_citationtrail_10_1002_cpe_3093 wiley_primary_10_1002_cpe_3093_CPE3093 istex_primary_ark_67375_WNG_7X5XF5KS_8
PublicationCentury	2000
PublicationDate	10 March 2014
PublicationDateYYYYMMDD	2014-03-10
PublicationDate_xml	– month: 03 year: 2014 text: 10 March 2014 day: 10
PublicationDecade	2010
PublicationTitle	Concurrency and computation
PublicationTitleAlternate	Concurrency Computat.: Pract. Exper
PublicationYear	2014
Publisher	Blackwell Publishing Ltd
Publisher_xml	– name: Blackwell Publishing Ltd
SSID	ssj0011031
Score	2.1109731
SourceID	proquest crossref wiley istex
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	821
SubjectTerms	Algorithms Central processing units CUDA Decoding Field programmable gate arrays FPGA GPU OpenMP Optimization Platforms SDR SSE viterbi Viterbi decoding Wireless communication
Title	Efficient parallel implementation of three-point Viterbi decoding algorithm on CPU, GPU, and FPGA
URI	https://api.istex.fr/ark:/67375/WNG-7X5XF5KS-8/fulltext.pdf https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fcpe.3093 https://www.proquest.com/docview/1520948390
Volume	26
WOSCitedRecordID	wos000331020300012&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVWIB databaseName: Wiley Online Library Full Collection 2020 customDbUrl: eissn: 1532-0634 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0011031 issn: 1532-0626 databaseCode: DRFUL dateStart: 20010101 isFulltext: true titleUrlDefault: https://onlinelibrary.wiley.com providerName: Wiley-Blackwell
linkProvider	Wiley-Blackwell
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Efficient+parallel+implementation+of+three-point+viterbi+decoding+algorithm+on+CPU%2C+GPU%2C+and+FPGA&rft.jtitle=Concurrency+and+computation&rft.au=Li%2C+Rongchun&rft.au=Dou%2C+Yong&rft.au=Zou%2C+Dan&rft.date=2014-03-10&rft.issn=1532-0626&rft.eissn=1532-0634&rft.volume=26&rft.issue=3&rft.spage=821&rft.epage=840&rft_id=info:doi/10.1002%2Fcpe.3093&rft.externalDBID=NO_FULL_TEXT