Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data
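| Published in: | IEEE International Conference on Industrial Informatics (INDIN), Volume 1, pp. 461-468 |
|---|---|
| Main authors: | Wang, Xiaohan; Zhang, Lin; Zhang, Xuesong; Xie, Kunyu |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 20.07.2020 |
| Subjects: | Clustering algorithm; Clustering algorithms; Conferences; DBSCAN; Degradation; Faces; Industrial fault text data; Informatics; Production facilities; Text clustering |
| ISSN: | 2378-363X |
| Online access: | https://ieeexplore.ieee.org/document/9442093 |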
| Abstract | Industrial fault text data are a special type of short text drawn from factory fault records. Clustering these data can reduce redundancy and uncover hidden information, which is of great significance for improving their utilization. Because industrial fault text data are unstructured and irregular, clustering them faces quite a few challenges. This paper introduces some existing algorithms for short-text clustering and briefly analyzes their shortcomings. It indicates that the main problem in clustering industrial fault text data is the contradiction between the requirements and the parameter setup, which leads to low accuracy when clustering corpora of different sizes. To increase clustering accuracy, an improved clustering algorithm is proposed that can resolve this contradiction. Comparative experiments show that the improved algorithm performs better than DBSCAN on industrial fault text corpora of different sizes. |
|---|---|
| Author | Zhang, Xuesong; Wang, Xiaohan; Xie, Kunyu; Zhang, Lin |
| Author_xml | 1. Wang, Xiaohan (xiaohanwang@buaa.edu.cn), School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; 2. Zhang, Lin (johnlin9999@163.com), School of Automation Science and Electrical Engineering, Beihang University, Beijing, China; 3. Zhang, Xuesong (xs_zhang@126.com), College of Computer Science and Technology, Jilin University, Changchun, China; 4. Xie, Kunyu (1285589283@qq.com), School of Automation Science and Electrical Engineering, Beihang University, Beijing, China |
| ContentType | Conference Proceeding |
| DOI | 10.1109/INDIN45582.2020.9442093 |
| Discipline | Engineering |
| EISBN | 9781728149646 1728149649 |
| EISSN | 2378-363X |
| EndPage | 468 |
| ExternalDocumentID | 9442093 |
| Genre | orig-research |
| ISICitedReferencesCount | 2 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-July-20 |
| PublicationDateYYYYMMDD | 2020-07-20 |
| PublicationDate_xml | – month: 07 year: 2020 text: 2020-July-20 day: 20 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE International Conference on Industrial Informatics (INDIN) |
| PublicationTitleAbbrev | INDIN |
| PublicationYear | 2020 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 461 |
| SubjectTerms | Clustering algorithm; Clustering algorithms; Conferences; DBSCAN; Degradation; Faces; Industrial fault text data; Informatics; Production facilities; Text clustering |
| Title | Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data |
| URI | https://ieeexplore.ieee.org/document/9442093 |
| Volume | 1 |