Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data

Bibliographic details
Published in: IEEE International Conference on Industrial Informatics (INDIN), Volume 1, pp. 461-468
Main authors: Wang, Xiaohan; Zhang, Lin; Zhang, Xuesong; Xie, Kunyu
Format: Conference paper
Language: English
Published: IEEE, 20 July 2020
ISSN: 2378-363X
Abstract Industrial fault text data are a special type of short text drawn from the fault records of a factory. Clustering these data can reduce redundancy and uncover hidden information, which is of great significance for improving their utilization. Because industrial fault text data are unstructured and irregular, clustering them faces several challenges. This paper introduces some existing algorithms for short-text clustering and briefly analyzes their shortcomings. It identifies the main problem in clustering industrial fault text data as the contradiction between the requirements and the parameter settings, which leads to low accuracy when clustering corpora of different sizes. To increase clustering accuracy, an improved clustering algorithm is proposed that resolves this contradiction. Comparative experiments show that the improved algorithm performs better than DBSCAN on industrial fault text corpora of different sizes.
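For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of plain DBSCAN on short texts, not the improved algorithm from the paper, whose details this record does not give. The bag-of-words representation, the cosine distance, the sample records, and the values of `eps` and `min_pts` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one short text (whitespace tokens)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(texts, eps, min_pts):
    """Plain DBSCAN. Returns one label per text; -1 marks noise."""
    vecs = [vectorize(t) for t in texts]
    n = len(vecs)

    def region(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(n) if cosine_distance(vecs[i], vecs[j]) <= eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
    return labels


# Hypothetical fault records, invented for this example.
records = [
    "motor overheating on line 3",
    "motor overheating line 3",
    "overheating motor on line 3",
    "conveyor belt jammed",
    "belt jammed on conveyor",
    "conveyor belt jammed again",
    "sensor calibration drift",
]

labels = dbscan(records, eps=0.4, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

With the same `eps` but `min_pts=4`, every record in this small sample is labeled noise, since no record has four neighbors within range. That shows only how sensitive fixed DBSCAN parameters can be to corpus size; it is not a reproduction of the paper's analysis or results.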
Author Zhang, Xuesong
Wang, Xiaohan
Xie, Kunyu
Zhang, Lin
Author_xml – sequence: 1
  givenname: Xiaohan
  surname: Wang
  fullname: Wang, Xiaohan
  email: xiaohanwang@buaa.edu.cn
  organization: School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
– sequence: 2
  givenname: Lin
  surname: Zhang
  fullname: Zhang, Lin
  email: johnlin9999@163.com
  organization: School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
– sequence: 3
  givenname: Xuesong
  surname: Zhang
  fullname: Zhang, Xuesong
  email: xs_zhang@126.com
  organization: College of Computer Science and Technology, Jilin University, Changchun, China
– sequence: 4
  givenname: Kunyu
  surname: Xie
  fullname: Xie, Kunyu
  email: 1285589283@qq.com
  organization: School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
ContentType Conference Proceeding
DOI 10.1109/INDIN45582.2020.9442093
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Discipline Engineering
EISBN 9781728149646
1728149649
EISSN 2378-363X
EndPage 468
ExternalDocumentID 9442093
Genre orig-research
ISICitedReferencesCount 2
IsPeerReviewed false
IsScholarly true
Language English
PageCount 8
PublicationCentury 2000
PublicationDate 2020-July-20
PublicationDateYYYYMMDD 2020-07-20
PublicationDate_xml – month: 07
  year: 2020
  text: 2020-July-20
  day: 20
PublicationDecade 2020
PublicationTitle IEEE International Conference on Industrial Informatics (INDIN)
PublicationTitleAbbrev INDIN
PublicationYear 2020
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 461
SubjectTerms Clustering algorithm
Clustering algorithms
Conferences
DBSCAN
Degradation
Faces
Industrial fault text data
Informatics
Production facilities
Text clustering
Title Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data
URI https://ieeexplore.ieee.org/document/9442093
Volume 1