An Efficient Cross-Platform Workflow Optimization Method

Bibliographic Details
Published in: Ji suan ji gong cheng, Vol. 48, No. 7, pp. 13-21, 28
Authors: DU Qinghua, ZHANG Kai
Format: Journal Article
Language: Chinese, English
Published: Editorial Office of Computer Engineering, 1 July 2022
ISSN: 1000-3428
Description
Summary: To manage complex data analysis tasks, cross-platform data processing systems that combine multiple platforms are being developed. Selecting the platform for each operator in a cross-platform workflow is critical to system performance, because the same operator can perform very differently on different platforms. Cross-platform workflow optimization currently relies mainly on cost-based methods for platform selection; however, existing cost models cannot mine the latent information in cross-platform workflows, which leads to inaccurate cost estimates. This paper therefore proposes a more efficient cross-platform workflow optimization method. The method uses the GAT-BiGRU-FC Network (GGFN) as its cost model, taking both operator features and workflow features as inputs. The model uses a graph attention mechanism to capture the structure of the Directed Acyclic Graph (DAG)-type cross-platform workflow and the information of each operator's neighbor nodes, and a gated recurrent unit to memorize the operation timing information of operators, so as to achieve accurate cost estimation. An enumeration algorithm for operator implementation platforms is then designed and implemented based on the characteristics of cross-platform workflows. The algorithm uses the GGFN-based cost model and a delay-greedy pruning method to perform the enumeration and select an appropriate implementation platform for each operator. Experiments show that the method improves the execution performance of cross-platform workflows by 3x and reduces runtime by more than 60%.
DOI: 10.19678/j.issn.1000-3428.0064163
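
The platform-enumeration idea in the summary can be illustrated with a small sketch. This is not the paper's algorithm: the record does not describe the GGFN model or the delay-greedy pruning in detail, so the sketch assumes a hand-written toy cost table in place of the learned cost model and a plain keep-the-k-cheapest (beam) pruning in place of delay-greedy pruning. The operator names, the platform labels "spark" and "java", the cost values, and the functions plan_cost and enumerate_plans are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical execution cost of each operator on each platform.
# In the paper this estimate would come from the learned GGFN cost model.
EXEC_COST = {
    "source": {"spark": 4.0, "java": 1.0},
    "map": {"spark": 2.0, "java": 3.0},
    "join": {"spark": 3.0, "java": 9.0},
    "sink": {"spark": 2.0, "java": 1.0},
}

# Hypothetical penalty for moving data between two different platforms.
MOVE_COST = 2.0

# Workflow DAG: each operator maps to the set of its predecessors.
DAG = {
    "source": set(),
    "map": {"source"},
    "join": {"map", "source"},
    "sink": {"join"},
}


def plan_cost(assignment, dag):
    """Toy cost: execution cost plus a penalty per cross-platform edge."""
    total = 0.0
    for op, platform in assignment.items():
        total += EXEC_COST[op][platform]
        for pred in dag[op]:
            if pred in assignment and assignment[pred] != platform:
                total += MOVE_COST
    return total


def enumerate_plans(dag, platforms, beam_width=2):
    """Assign a platform to each operator in topological order.

    After each operator, only the beam_width cheapest partial plans are
    kept (simple beam pruning, assumed here for illustration).
    """
    order = list(TopologicalSorter(dag).static_order())
    partial = [{}]
    for op in order:
        expanded = [{**plan, op: p} for plan in partial for p in platforms]
        expanded.sort(key=lambda plan: plan_cost(plan, dag))
        partial = expanded[:beam_width]
    best = partial[0]
    return best, plan_cost(best, dag)


if __name__ == "__main__":
    print(enumerate_plans(DAG, ["spark", "java"], beam_width=2))
    print(enumerate_plans(DAG, ["spark", "java"], beam_width=16))
```

With these toy numbers, a beam width of 2 yields a plan of cost 12, while a beam wide enough to keep all 16 assignments finds the all-"spark" plan of cost 11. The gap shows why the choice of pruning strategy matters: aggressive pruning shrinks the search but can discard the plan that would have been cheapest overall.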