Parallel clustering for visualizing large scientific line data
| Published in: | 2011 IEEE Symposium on Large Data Analysis and Visualization, pp. 47-55 |
|---|---|
| Main authors: | Jishang Wei, Hongfeng Yu, J. H. Chen, Kwan-Liu Ma |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 1 October 2011 |
| ISBN: | 9781467301565, 1467301566 |
| Abstract | Scientists often need to extract, visualize and analyze lines from vast amounts of data to understand dynamic structures and interactions. The effectiveness of such a visual validation and analysis process mainly relies on a good strategy to categorize and visualize the lines. However, the sheer size of line data produced by state-of-the-art scientific simulations poses great challenges to preparing the data for visualization. In this paper, we present a parallelization design of regression model-based clustering to categorize large line data derived from detailed scientific simulations by leveraging the power of heterogeneous computers. This parallel clustering method employs the Expectation Maximization algorithm to iteratively approximate the optimal data partitioning. First, we use a sorted-balance algorithm to partition and distribute the lines with various lengths among multiple compute nodes. During the following iterative clustering process, regression model parameters are recovered based on the local lines on each individual node, with only a few inter-node message exchanges involved. Meanwhile, the workload of regression model computing is well balanced across the nodes. The experimental results demonstrate that our approach can effectively categorize large line data in a scalable manner to concisely convey dynamic structures and interactions, leading to a visualization that captures salient features and suppresses visual clutter to facilitate scientific exploration of large line data. |
|---|---|
| Author | Jishang Wei; Hongfeng Yu; Kwan-Liu Ma; Chen, J. H. |
| Author_xml | 1. Jishang Wei (jswei@ucdavis.edu), Univ. of California Davis, Davis, CA, USA; 2. Hongfeng Yu (hyu@sandia.gov), Sandia Nat. Labs., Albuquerque, NM, USA; 3. Chen, J. H. (jhchen@sandia.gov), Sandia Nat. Labs., Albuquerque, NM, USA; 4. Kwan-Liu Ma (ma@cs.ucdavis.edu), Univ. of California Davis, Davis, CA, USA |
| ContentType | Conference Proceeding |
| DOI | 10.1109/LDAV.2011.6092316 |
| EISBN | 1467301558 9781467301558 |
| EndPage | 55 |
| Genre | orig-research |
| ISBN | 9781467301565 1467301566 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| PageCount | 9 |
| PublicationCentury | 2000 |
| PublicationDate | 2011-Oct. |
| PublicationDateYYYYMMDD | 2011-10-01 |
| PublicationDate_xml | – month: 10 year: 2011 text: 2011-Oct. |
| PublicationDecade | 2010 |
| PublicationTitle | 2011 IEEE Symposium on Large Data Analysis and Visualization |
| PublicationTitleAbbrev | LDAV |
| PublicationYear | 2011 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 47 |
| SubjectTerms | Clustering algorithms; Computational modeling; Data models; Data visualization; Graphics processing unit; Mathematical model; Vectors |
| Title | Parallel clustering for visualizing large scientific line data |
| URI | https://ieeexplore.ieee.org/document/6092316 |
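
The abstract mentions a sorted-balance algorithm that distributes lines of varying length among compute nodes. The record does not spell the algorithm out, so the following is only a minimal sketch under one common interpretation: sort the lines by length, longest first, and give each line to the node with the smallest load so far. The function name `sorted_balance` and the sample lengths are hypothetical and do not come from the paper.

```python
import heapq

def sorted_balance(line_lengths, num_nodes):
    """Greedy partition (assumed interpretation): longest line first,
    each line goes to the currently least-loaded node."""
    # Min-heap of (current_load, node_id) so the lightest node pops first.
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_nodes)]

    # Visit line indices in order of decreasing length.
    order = sorted(range(len(line_lengths)),
                   key=lambda i: line_lengths[i], reverse=True)
    for i in order:
        load, node = heapq.heappop(heap)
        assignment[node].append(i)
        heapq.heappush(heap, (load + line_lengths[i], node))

    loads = [sum(line_lengths[i] for i in part) for part in assignment]
    return assignment, loads

# Hypothetical example: six lines, lengths in number of points.
lengths = [90, 10, 40, 60, 30, 70]
parts, loads = sorted_balance(lengths, 3)
print(parts, loads)
```

Using the point count as the load is itself an assumption; any per-line cost estimate could be substituted without changing the structure.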

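The abstract also describes Expectation Maximization over regression models, with parameters recovered from each node's local lines and only a few inter-node messages. As a hedged illustration, not the authors' implementation, the sketch below assumes each line is a list of `(t, y)` points, each cluster is a straight-line model `y = a + b*t` with Gaussian noise, and the "nodes" are simply Python lists in one process. Each node contributes six weighted sums per cluster, and summing them stands in for the inter-node reduction. All names (`local_stats`, `fit`, `em`) and the test data are hypothetical.

```python
import math

def local_stats(lines, resp, k):
    """One node's weighted sums for cluster k: Sw, St, Stt, Sy, Sty, Syy."""
    s = [0.0] * 6
    for line, r in zip(lines, resp):
        w = r[k]
        for t, y in line:
            for j, v in enumerate((1.0, t, t * t, y, t * y, y * y)):
                s[j] += w * v
    return s

def fit(s):
    """Weighted least squares for y = a + b*t from the summed statistics.
    Assumes the points are not all at the same t (non-zero determinant)."""
    sw, st, stt, sy, sty, syy = s
    b = (sw * sty - st * sy) / (sw * stt - st * st)
    a = (sy - b * st) / sw
    rss = (syy - 2 * a * sy - 2 * b * sty
           + a * a * sw + 2 * a * b * st + b * b * stt)
    return a, b, max(rss / sw, 1e-6)  # floor the variance

def responsibilities(line, params, pis):
    """E-step for one line: posterior cluster probabilities (log-sum-exp)."""
    logs = []
    for pi, (a, b, s2) in zip(pis, params):
        rss = sum((y - a - b * t) ** 2 for t, y in line)
        logs.append(math.log(max(pi, 1e-300))
                    - 0.5 * len(line) * math.log(2 * math.pi * s2)
                    - 0.5 * rss / s2)
    m = max(logs)
    w = [math.exp(v - m) for v in logs]
    z = sum(w)
    return [x / z for x in w]

def em(nodes, params, iters=5):
    k_count = len(params)
    pis = [1.0 / k_count] * k_count
    n_lines = sum(len(lines) for lines in nodes)
    for _ in range(iters):
        # E-step: purely local, no communication needed.
        resp = [[responsibilities(l, params, pis) for l in lines]
                for lines in nodes]
        # M-step: sum per-node statistics (stand-in for a reduction).
        new_params = []
        for k in range(k_count):
            per_node = [local_stats(ls, rs, k) for ls, rs in zip(nodes, resp)]
            total = [sum(col) for col in zip(*per_node)]
            new_params.append(fit(total) if total[0] > 1e-12 else params[k])
            pis[k] = sum(r[k] for rs in resp for r in rs) / n_lines
        params = new_params
    labels = [[max(range(k_count), key=lambda k: r[k]) for r in rs]
              for rs in resp]
    return params, labels

def make_line(slope, intercept, n):
    return [(float(j), intercept + slope * j) for j in range(n)]

rising = [make_line(1, 0.0, 5), make_line(1, 0.1, 8), make_line(1, -0.1, 6)]
falling = [make_line(-1, 2.0, 7), make_line(-1, 2.1, 5), make_line(-1, 1.9, 9)]
nodes = [[rising[0], falling[0], rising[1]], [falling[1], rising[2], falling[2]]]
# Initialise each cluster from one seed line (an assumption of this sketch).
init = [fit(local_stats([seed], [[1.0]], 0)) for seed in (rising[0], falling[0])]
params, labels = em(nodes, init)
print(labels)
```

Because only the fixed-size sums cross node boundaries in each iteration, the message volume is independent of the number of lines, which is one plausible reading of the abstract's "only a few inter-node message exchanges".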
