DALiuGE: A graph execution framework for harnessing the astronomical data deluge

Bibliographic Details
Title: DALiuGE: A graph execution framework for harnessing the astronomical data deluge
Authors: Wu, C., Tobar, R., Vinsen, K., Wicenec, A., Pallot, D., Lao, B., Wang, R., An, T., Boulton, M., Cooper, I., Dodson, R., Dolensky, M., Mei Y(梅盈), Wang F(王锋)
Publisher Information: ELSEVIER SCIENCE BV
Publication Year: 2017
Collection: Yunnan Observatories: YNAO OpenIR (Chinese Academy of Sciences, CAS) / 中国科学院云南天文台机构知识库
Subject Terms: Dataflow, Graph Execution Engine, Data Driven, Square Kilometre Array, Many-task Computing, 天文学, 天文学::天文学其他学科, 计算机科学技术, 计算机科学技术::计算机应用, Astronomy & Astrophysics, Computer Science, Interdisciplinary Applications, Pipeline processing systems, 619.1 Pipe, Piping and Pipelines - 722.4 Digital Computers and Systems - 723.2 Data Processing and Image Processing, Science & Technology, Physical Sciences, Technology, 理学, 理学::天文学, 工学, 工学::计算机科学与技术(可授工学、理学学位)
Description: The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for processing large astronomical datasets at a scale required by the Square Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex data reduction pipelines consisting of both datasets and algorithmic components and an implementation run-time to execute such pipelines on distributed resources. By mapping the logical view of a pipeline to its physical realisation, DALiuGE separates the concerns of multiple stakeholders, allowing them to collectively optimise large-scale data processing solutions in a coherent manner. The execution in DALiuGE is data-activated, where each individual data item autonomously triggers the processing on itself. Such decentralisation also makes the execution framework very scalable and flexible, supporting pipeline sizes ranging from less than ten tasks running on a laptop to tens of millions of concurrent tasks on the second fastest supercomputer in the world. DALiuGE has been used in production for reducing interferometry datasets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide Spectral Radioheliograph; and is being developed as the execution framework prototype for the Science Data Processor (SDP) consortium of the Square Kilometre Array (SKA) telescope. This paper presents a technical overview of DALiuGE and discusses case studies from the CHILES and MUSER projects that use DALiuGE to execute production pipelines. In a companion paper, we provide in-depth analysis of DALiuGE's scalability to very large numbers of tasks on two supercomputing facilities. © 2017 Elsevier B.V. All rights reserved.
Document Type: article in journal/newspaper; report
Language: English
Relation: ASTRONOMY AND COMPUTING; http://ir.ynao.ac.cn/handle/114a53/10042; http://www.sciencedirect.com/science/article/pii/S2213133716301214
DOI: 10.1016/j.ascom.2017.03.007
Availability: http://ir.ynao.ac.cn/handle/114a53/10042
http://www.sciencedirect.com/science/article/pii/S2213133716301214
https://doi.org/10.1016/j.ascom.2017.03.007
Accession Number: edsbas.91FC2F3
Database: BASE
FullText Text:
  Availability: 0
CustomLinks:
  – Url: http://ir.ynao.ac.cn/handle/114a53/10042#
    Name: EDS - BASE (s4221598)
    Category: fullText
    Text: View record from BASE
  – Url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=EBSCO&SrcAuth=EBSCO&DestApp=WOS&ServiceName=TransferToWoS&DestLinkType=GeneralSearchSummary&Func=Links&author=Wu%20C
    Name: ISI
    Category: fullText
    Text: Find this article in Web of Science
    Icon: https://imagesrvr.epnet.com/ls/20docs.gif
    MouseOverText: Find this article in Web of Science
Header DbId: edsbas
DbLabel: BASE
An: edsbas.91FC2F3
RelevancyScore: 786
AccessLevel: 3
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 786.106628417969
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: DALiuGE: A graph execution framework for harnessing the astronomical data deluge
– Name: Author
  Label: Authors
  Group: Au
  Data: Wu, C.; Tobar, R.; Vinsen, K.; Wicenec, A.; Pallot, D.; Lao, B.; Wang, R.; An, T.; Boulton, M.; Cooper, I.; Dodson, R.; Dolensky, M.; Mei Y(梅盈); Wang F(王锋)
– Name: Publisher
  Label: Publisher Information
  Group: PubInfo
  Data: ELSEVIER SCIENCE BV
– Name: DatePubCY
  Label: Publication Year
  Group: Date
  Data: 2017
– Name: Subset
  Label: Collection
  Group: HoldingsInfo
  Data: Yunnan Observatories: YNAO OpenIR (Chinese Academy of Sciences, CAS) / 中国科学院云南天文台机构知识库
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: Dataflow; Graph Execution Engine; Data Driven; Square Kilometre Array; Many-task Computing; 天文学; 天文学::天文学其他学科; 计算机科学技术; 计算机科学技术::计算机应用; Astronomy & Astrophysics; Computer Science; Interdisciplinary Applications; Pipeline processing systems; 619.1 Pipe; Piping and Pipelines - 722.4 Digital Computers and Systems - 723.2 Data Processing and Image Processing; Science & Technology; Physical Sciences; Technology; 理学; 理学::天文学; 工学; 工学::计算机科学与技术(可授工学、理学学位)
– Name: Abstract
  Label: Description
  Group: Ab
  Data: The Data Activated Liu Graph Engine - DALiuGE - is an execution framework for processing large astronomical datasets at a scale required by the Square Kilometre Array Phase 1 (SKA1). It includes an interface for expressing complex data reduction pipelines consisting of both datasets and algorithmic components and an implementation run-time to execute such pipelines on distributed resources. By mapping the logical view of a pipeline to its physical realisation, DALiuGE separates the concerns of multiple stakeholders, allowing them to collectively optimise large-scale data processing solutions in a coherent manner. The execution in DALiuGE is data-activated, where each individual data item autonomously triggers the processing on itself. Such decentralisation also makes the execution framework very scalable and flexible, supporting pipeline sizes ranging from less than ten tasks running on a laptop to tens of millions of concurrent tasks on the second fastest supercomputer in the world. DALiuGE has been used in production for reducing interferometry datasets from the Karl E. Jansky Very Large Array and the Mingantu Ultrawide Spectral Radioheliograph; and is being developed as the execution framework prototype for the Science Data Processor (SDP) consortium of the Square Kilometre Array (SKA) telescope. This paper presents a technical overview of DALiuGE and discusses case studies from the CHILES and MUSER projects that use DALiuGE to execute production pipelines. In a companion paper, we provide in-depth analysis of DALiuGE's scalability to very large numbers of tasks on two supercomputing facilities. © 2017 Elsevier B.V. All rights reserved.
– Name: TypeDocument
  Label: Document Type
  Group: TypDoc
  Data: article in journal/newspaper; report
– Name: Language
  Label: Language
  Group: Lang
  Data: English
– Name: NoteTitleSource
  Label: Relation
  Group: SrcInfo
  Data: ASTRONOMY AND COMPUTING; http://ir.ynao.ac.cn/handle/114a53/10042; http://www.sciencedirect.com/science/article/pii/S2213133716301214
– Name: DOI
  Label: DOI
  Group: ID
  Data: 10.1016/j.ascom.2017.03.007
– Name: URL
  Label: Availability
  Group: URL
  Data: http://ir.ynao.ac.cn/handle/114a53/10042; http://www.sciencedirect.com/science/article/pii/S2213133716301214; https://doi.org/10.1016/j.ascom.2017.03.007
– Name: AN
  Label: Accession Number
  Group: ID
  Data: edsbas.91FC2F3
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.91FC2F3
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1016/j.ascom.2017.03.007
    Languages:
      – Text: English
    Subjects:
      – SubjectFull: Dataflow
        Type: general
      – SubjectFull: Graph Execution Engine
        Type: general
      – SubjectFull: Data Driven
        Type: general
      – SubjectFull: Square Kilometre Array
        Type: general
      – SubjectFull: Many-task Computing
        Type: general
      – SubjectFull: 天文学
        Type: general
      – SubjectFull: 天文学::天文学其他学科
        Type: general
      – SubjectFull: 计算机科学技术
        Type: general
      – SubjectFull: 计算机科学技术::计算机应用
        Type: general
      – SubjectFull: Astronomy & Astrophysics
        Type: general
      – SubjectFull: Computer Science
        Type: general
      – SubjectFull: Interdisciplinary Applications
        Type: general
      – SubjectFull: Pipeline processing systems
        Type: general
      – SubjectFull: 619.1 Pipe
        Type: general
      – SubjectFull: Piping and Pipelines - 722.4 Digital Computers and Systems - 723.2 Data Processing and Image Processing
        Type: general
      – SubjectFull: Science & Technology
        Type: general
      – SubjectFull: Physical Sciences
        Type: general
      – SubjectFull: Technology
        Type: general
      – SubjectFull: 理学
        Type: general
      – SubjectFull: 理学::天文学
        Type: general
      – SubjectFull: 工学
        Type: general
      – SubjectFull: 工学::计算机科学与技术(可授工学、理学学位)
        Type: general
    Titles:
      – TitleFull: DALiuGE: A graph execution framework for harnessing the astronomical data deluge
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Wu, C.
      – PersonEntity:
          Name:
            NameFull: Tobar, R.
      – PersonEntity:
          Name:
            NameFull: Vinsen, K.
      – PersonEntity:
          Name:
            NameFull: Wicenec, A.
      – PersonEntity:
          Name:
            NameFull: Pallot, D.
      – PersonEntity:
          Name:
            NameFull: Lao, B.
      – PersonEntity:
          Name:
            NameFull: Wang, R.
      – PersonEntity:
          Name:
            NameFull: An, T.
      – PersonEntity:
          Name:
            NameFull: Boulton, M.
      – PersonEntity:
          Name:
            NameFull: Cooper, I.
      – PersonEntity:
          Name:
            NameFull: Dodson, R.
      – PersonEntity:
          Name:
            NameFull: Dolensky, M.
      – PersonEntity:
          Name:
            NameFull: Mei Y(梅盈)
      – PersonEntity:
          Name:
            NameFull: Wang F(王锋)
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2017
          Identifiers:
            – Type: issn-locals
              Value: edsbas
ResultId 1