CODC-pyParaQC: A design and implementation of parallel quality control for ocean observation big data
High-quality ocean observation is essential for research and applications in ocean exploration and climate change. With the arrival of the big-data era in recent years, it has become crucial to process massive volumes of raw observations accurately and efficiently. This paper addresses issues encountered in...
| Published in: | Proceedings of the ... International Symposium on Parallel and Distributed Processing with Applications (Print), pp. 1863-1870 |
|---|---|
| Main authors: | Yuan, Huifeng; Li, Tianyan; Jin, Zhong; Cheng, Lijing; Tan, Zhetao; Zhang, Bin; Wang, Yanjun |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 30 October 2024 |
| Subjects: | |
| ISSN: | 2158-9208 |
| Online access: | Full text |
| Abstract | High-quality ocean observation is essential for research and applications in ocean exploration and climate change. With the arrival of the big-data era in recent years, it has become crucial to process massive volumes of raw observations accurately and efficiently. This paper addresses issues encountered when processing ocean big data in traditional delayed-mode quality control systems, including substantial serial I/O workloads and frequent context switching. A parallel quality control scheme named CODC-pyParaQC is proposed, built around computing process groups. It retains the advantages of the existing delayed-mode quality control system (e.g. CODC-QC) while improving the efficiency of the quality control procedure, demonstrating the feasibility of large-scale parallel computation for the quality control scheme, and enabling (near) real-time quality control of massive ocean observation profiles. The results show that the efficiency of single-node quality control improved by about 10 times. By leveraging the computing power of supercomputers and employing multiple process groups for cross-node parallel computation, we have developed a fast and efficient (near) real-time quality control procedure. The system processed approximately 22,548,733 temperature profiles from the World Ocean Database (1940-2023) in about 6.5 hours. The new quality control scheme can provide the computing capability necessary for establishing a high-quality ocean observation profile database. |
|---|---|
| Author | Li, Tianyan; Wang, Yanjun; Jin, Zhong; Zhang, Bin; Yuan, Huifeng; Cheng, Lijing; Tan, Zhetao |
| Author_xml | – sequence: 1 givenname: Huifeng surname: Yuan fullname: Yuan, Huifeng email: hfyuan@cnic.cn organization: University of Chinese Academy of Sciences,Computer Internet Information Center, Chinese Academy of Sciences,Beijing,China – sequence: 2 givenname: Tianyan surname: Li fullname: Li, Tianyan email: tyli@cnic.cn organization: Chinese Academy of Sciences,Computer Internet Information Center,Beijing,China – sequence: 3 givenname: Zhong surname: Jin fullname: Jin, Zhong email: zjin@sccas.cn organization: Chinese Academy of Sciences,Computer Internet Information Center,Beijing,China – sequence: 4 givenname: Lijing surname: Cheng fullname: Cheng, Lijing email: chenglij@mail.iap.ac.cn organization: Chinese Academy of Sciences,Institute of Atmospheric Physics,Beijing,China – sequence: 5 givenname: Zhetao surname: Tan fullname: Tan, Zhetao email: tanzhetao19@mails.ucas.ac.cn organization: Chinese Academy of Sciences,Institute of Atmospheric Physics,Beijing,China – sequence: 6 givenname: Bin surname: Zhang fullname: Zhang, Bin email: zhangbin@qdio.ac.cn organization: Chinese Academy of Sciences,Institute of Oceanography,Qingdao,China – sequence: 7 givenname: Yanjun surname: Wang fullname: Wang, Yanjun email: yjwang@qdio.ac.cn organization: Chinese Academy of Sciences,Institute of Oceanography,Qingdao,China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ISPA63168.2024.00254 |
| Discipline | Computer Science; Oceanography |
| EISBN | 9798331509712 |
| EISSN | 2158-9208 |
| EndPage | 1870 |
| ExternalDocumentID | 10885311 |
| Genre | orig-research |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Oct.-30 |
| PublicationDateYYYYMMDD | 2024-10-30 |
| PublicationDate_xml | – month: 10 year: 2024 text: 2024-Oct.-30 day: 30 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings of the ... International Symposium on Parallel and Distributed Processing with Applications (Print) |
| PublicationTitleAbbrev | ISPA |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1863 |
| SubjectTerms | Big Data; Climate change; Computational efficiency; Control systems; Database systems; Observers; ocean observation; Ocean temperature; Oceanography; Oceans; parallel computation; Process control; Quality control; Real-time systems; Supercomputers |
| Title | CODC-pyParaQC: A design and implementation of parallel quality control for ocean observation big data |
| URI | https://ieeexplore.ieee.org/document/10885311 |
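
The abstract describes the parallel scheme only at a high level: profiles are distributed across computing process groups so that quality control no longer runs as one serial stream. This record does not give the implementation, so the following Python sketch only illustrates the general idea on a single node under stated assumptions. The function names (`range_check`, `split_into_groups`, `qc_group`, `run_parallel_qc`), the temperature limits, and the use of `concurrent.futures` are all hypothetical and are not taken from CODC-pyParaQC or CODC-QC.

```python
"""Sketch: split profiles into groups and quality-control each group in its own process."""
from concurrent.futures import ProcessPoolExecutor

# Hypothetical plausible-range limits for sea temperature in degrees Celsius.
# Illustrative only; these are not the thresholds used by CODC-QC.
T_MIN, T_MAX = -2.5, 40.0


def range_check(profile):
    """Return one flag per level: 0 = within range, 1 = outside range."""
    return [0 if T_MIN <= t <= T_MAX else 1 for t in profile]


def split_into_groups(profiles, n_groups):
    """Split profiles into n_groups contiguous chunks of nearly equal size."""
    base, extra = divmod(len(profiles), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)  # first `extra` chunks get one more
        groups.append(profiles[start:start + size])
        start += size
    return groups


def qc_group(group):
    """Quality-control every profile in one group (runs inside one worker)."""
    return [range_check(p) for p in group]


def run_parallel_qc(profiles, n_groups=4):
    """Run qc_group on each chunk in a separate process, preserving input order."""
    groups = split_into_groups(profiles, n_groups)
    with ProcessPoolExecutor(max_workers=n_groups) as pool:
        results = pool.map(qc_group, groups)  # map yields results in group order
        return [flags for group_flags in results for flags in group_flags]
```

In a script, `run_parallel_qc(profiles, n_groups=8)` would be called under an `if __name__ == "__main__":` guard so that worker processes can start cleanly on every platform. Handing each worker a whole chunk rather than a single profile keeps scheduling overhead low, which is one plausible way to reduce the context switching mentioned in the abstract. For scale, the figures reported in the abstract (22,548,733 profiles in about 6.5 hours) correspond to roughly 960 profiles per second.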