Acceleration of computer based simulation, image processing, and data analysis using computer clusters with heterogeneous accelerators

Bibliographic Details
Title: Acceleration of computer based simulation, image processing, and data analysis using computer clusters with heterogeneous accelerators
Authors: Chen, Chong
Source: Graduate Theses and Dissertations
Publisher Information: eCommons
Publication Year: 2016
Collection: University of Dayton: eCommons
Subject Terms: Heterogeneous distributed computing systems, Parallel computers, Multiprocessors, Computer Engineering, parallel computing, distributed computing, GPGPU, Xeon Phi, Preconditioned Iterative Solver, ALS, bilateral filtering
Description: With frequency scaling in microprocessors limited by power constraints, many-core and multi-core architectures have become the norm over the past decade. The goal of this work is to accelerate key computer simulation tools and data processing and data analysis algorithms on multi-core and many-core computer clusters, and to analyze the resulting performance. The main contributions of this dissertation are: 1. Acceleration of vector bilateral filtering for hyperspectral imaging with GPGPU: a GPGPU-based implementation of vector bilateral filtering, called vBF_GPU, was developed in this dissertation. vBF_GPU uses multiple threads to process each pixel of a hyperspectral image, which improves cache efficiency. Its memory access operations were optimized to reduce the data transfer cost of the GPGPU program. The experimental results indicate that vBF_GPU provides up to 19x speedup over a multi-core CPU implementation and up to 3x speedup over a naive GPGPU implementation of vector bilateral filtering. vBF_GPU can process hyperspectral images with up to 266 spectral bands, and it places no limit on the window size of the bilateral filter. 2. Optimized acceleration of the alternating least squares algorithm on a GPGPU cluster: this study presents an optimized implementation of the alternating least squares (ALS) algorithm for large-scale, matrix-factorization-based recommendation systems. A GPGPU-optimized implementation was developed for the batch solver in the ALS algorithm, and a mathematically equivalent form of the equations was used to reduce the algorithm's computational complexity. A distributed version of this implementation was also developed and tested on a cluster of GPGPUs. The experimental results indicate that the application running on a single GPGPU achieves up to 3.8x speedup over an 8-core CPU. The distributed implementation showed excellent scalability on a computer cluster with multiple ...
Document Type: text
Language: unknown
Relation: https://ecommons.udayton.edu/graduate_theses/1207; http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682
Availability: https://ecommons.udayton.edu/graduate_theses/1207
http://rave.ohiolink.edu/etdc/view?acc_num=dayton148036732102682
Rights: Copyright © 2016, author
Accession Number: edsbas.F44DCDCD
Database: BASE
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.F44DCDCD
RecordInfo BibRecord:
  BibEntity:
    Languages:
      – Text: unknown
    Subjects:
      – SubjectFull: Heterogeneous distributed computing systems
        Type: general
      – SubjectFull: Parallel computers
        Type: general
      – SubjectFull: Multiprocessors
        Type: general
      – SubjectFull: Computer Engineering
        Type: general
      – SubjectFull: parallel computing
        Type: general
      – SubjectFull: distributed computing
        Type: general
      – SubjectFull: GPGPU
        Type: general
      – SubjectFull: Xeon Phi
        Type: general
      – SubjectFull: Preconditioned Iterative Solver
        Type: general
      – SubjectFull: ALS
        Type: general
      – SubjectFull: bilateral filtering
        Type: general
    Titles:
      – TitleFull: Acceleration of computer based simulation, image processing, and data analysis using computer clusters with heterogeneous accelerators
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Chen, Chong
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2016
          Identifiers:
            – Type: issn-locals
              Value: edsbas
          Titles:
            – TitleFull: Graduate Theses and Dissertations
              Type: main