Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems
| Published in: | IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 268-280 |
|---|---|
| Main authors: | Liu, Jinyang; Jiang, Zhihan; Gu, Jiazhen; Huang, Junjie; Chen, Zhuangbin; Feng, Cong; Yang, Zengyin; Yang, Yongqiang; Lyu, Michael R. |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 11 September 2023 |
| ISSN: | 2643-1572 |
| Abstract | Ensuring the reliability of cloud systems is critical for both cloud vendors and customers. Cloud systems often rely on virtualization techniques to create instances of hardware resources, such as virtual machines. However, virtualization hinders the observability of cloud systems, making it challenging to diagnose platform-level issues. To improve system observability, we propose to infer functional clusters of instances, i.e., groups of instances having similar functionalities. We first conduct a pilot study on a large-scale cloud system, Huawei Cloud, demonstrating that instances having similar functionalities share similar communication and resource usage patterns. Motivated by these findings, we formulate the identification of functional clusters as a clustering problem and propose a non-intrusive solution called Prism. Prism adopts a coarse-to-fine clustering strategy. It first partitions instances into coarse-grained chunks based on communication patterns. Within each chunk, Prism further groups instances with similar resource usage patterns to produce fine-grained functional clusters. This design reduces noise in the data and allows Prism to process a massive number of instances efficiently. We evaluate Prism on two datasets collected from the real-world production environment of Huawei Cloud. Our experiments show that Prism achieves a v-measure of approximately 0.95, surpassing existing state-of-the-art solutions. Additionally, we illustrate the integration of Prism within monitoring systems for enhanced cloud reliability through two real-world use cases. |
|---|---|
| Author | Liu, Jinyang; Feng, Cong; Chen, Zhuangbin; Yang, Yongqiang; Lyu, Michael R.; Gu, Jiazhen; Yang, Zengyin; Huang, Junjie; Jiang, Zhihan |
| Author_xml | 1. Liu, Jinyang (jyliu@cse.cuhk.edu.hk), The Chinese University of Hong Kong, Hong Kong SAR, China; 2. Jiang, Zhihan (zhjiang22@cse.cuhk.edu.hk), The Chinese University of Hong Kong, Hong Kong SAR, China; 3. Gu, Jiazhen (jzgu@cse.cuhk.edu.hk), The Chinese University of Hong Kong, Hong Kong SAR, China; 4. Huang, Junjie (jjhuang23@cse.cuhk.edu.hk), The Chinese University of Hong Kong, Hong Kong SAR, China; 5. Chen, Zhuangbin (chenzhb36@mail.sysu.edu.cn), School of Software Engineering, Sun Yat-sen University, Zhuhai, China; 6. Feng, Cong (fengcong5@huawei.com), Huawei Cloud Computing Technology Co., Ltd, Computing and Networking Innovation Lab, China; 7. Yang, Zengyin (yangzengyin@huawei.com), Huawei Cloud Computing Technology Co., Ltd, Computing and Networking Innovation Lab, China; 8. Yang, Yongqiang (yangyongqiang@huawei.com), Huawei Cloud Computing Technology Co., Ltd, Computing and Networking Innovation Lab, China; 9. Lyu, Michael R. (lyu@cse.cuhk.edu.hk), The Chinese University of Hong Kong, Hong Kong SAR, China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ASE56229.2023.00077 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Xplore POP ALL IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350329964 |
| EISSN | 2643-1572 |
| EndPage | 280 |
| ExternalDocumentID | 10298583 |
| Genre | orig-research |
| GrantInformation_xml | Funder: Shenzhen Science and Technology Innovation Commission; grant ID: JCYJ20200109113403826; funder ID: 10.13039/501100010877 |
| ISICitedReferencesCount | 6 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001103357200022&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 11 September 2023 |
| PublicationDateYYYYMMDD | 2023-09-11 |
| PublicationDate_xml | – month: 09 year: 2023 text: 2023-Sept.-11 day: 11 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE/ACM International Conference on Automated Software Engineering : [proceedings] |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 268 |
| SubjectTerms | Cloud computing; cloud observability; cloud systems; functional clusters; Hardware; instances; Observability; Production; Reliability engineering; Software reliability; Virtual machining |
| Title | Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems |
| URI | https://ieeexplore.ieee.org/document/10298583 |
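
The coarse-to-fine strategy described in the abstract, and the v-measure used to evaluate it, can be illustrated with a small standard-library Python sketch. This is not Prism's actual implementation. It assumes, purely for illustration, that the coarse step chunks instances by connected components of a communication graph and that the fine step links instances whose resource-usage vectors lie within a Euclidean distance `threshold`. The names `coarse_to_fine`, `connected_components`, `usage`, `edges`, and `threshold` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def connected_components(nodes, edges):
    """Group nodes that are linked by edges, using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for n in nodes:
        groups[find(n)].append(n)
    return list(groups.values())


def coarse_to_fine(usage, edges, threshold):
    """Return {instance: cluster_id}.

    usage: {instance: [metric values]} (hypothetical resource-usage vectors)
    edges: (instance, instance) communication pairs (assumed input)
    threshold: max distance for two instances to be linked (assumed knob)
    """
    labels, next_id = {}, 0
    # Coarse step: instances that communicate end up in the same chunk.
    for chunk in connected_components(list(usage), edges):
        # Fine step: single-linkage grouping, restricted to this chunk.
        close_pairs = [
            (a, b)
            for i, a in enumerate(chunk)
            for b in chunk[i + 1:]
            if math.dist(usage[a], usage[b]) <= threshold
        ]
        for cluster in connected_components(chunk, close_pairs):
            for inst in cluster:
                labels[inst] = next_id
            next_id += 1
    return labels


def _entropy(counts, n):
    return -sum(c / n * math.log(c / n) for c in counts)


def v_measure(truth, pred):
    """Harmonic mean of homogeneity and completeness (beta = 1)."""
    n = len(truth)
    joint = Counter(zip(truth, pred))
    truth_sizes, pred_sizes = Counter(truth), Counter(pred)
    h_c = _entropy(truth_sizes.values(), n)
    h_k = _entropy(pred_sizes.values(), n)
    h_c_given_k = -sum(
        v / n * math.log(v / pred_sizes[k]) for (c, k), v in joint.items()
    )
    h_k_given_c = -sum(
        v / n * math.log(v / truth_sizes[c]) for (c, k), v in joint.items()
    )
    homogeneity = 1.0 if h_c == 0 else 1 - h_c_given_k / h_c
    completeness = 1.0 if h_k == 0 else 1 - h_k_given_c / h_k
    total = homogeneity + completeness
    return 0.0 if total == 0 else 2 * homogeneity * completeness / total
```

Restricting the fine step to one chunk at a time keeps the pairwise comparison cost bounded by chunk size rather than by the total number of instances, and it prevents instances that never communicate from being merged just because their usage looks alike. A production system would likely replace the threshold rule with a more robust clustering method.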