Application of Improved DBSCAN Clustering Algorithm on Industrial Fault Text Data

Bibliographic details
Published in: IEEE International Conference on Industrial Informatics (INDIN), Volume 1, pp. 461-468
Main authors: Wang, Xiaohan; Zhang, Lin; Zhang, Xuesong; Xie, Kunyu
Format: Conference paper
Language: English
Publication details: IEEE, 20 July 2020
ISSN: 2378-363X
Abstract Industrial fault text data are a special type of short text drawn from fault records in factories. Clustering these data can reduce redundancy and uncover hidden information, which is of great significance for improving their utilization. Because industrial fault text data are unstructured and irregular, clustering them faces quite a few challenges. This paper introduces some existing algorithms for clustering short texts and briefly analyzes their shortcomings. It indicates that the main problem in clustering industrial fault text data is the contradiction between the requirements and the setup of parameters, which leads to low accuracy when clustering corpora of different sizes. To increase clustering accuracy, an improved clustering algorithm is proposed that can resolve this contradiction. The results of comparative experiments show that the improved algorithm performs better than DBSCAN on industrial fault text corpora of different sizes.
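For orientation, the baseline that the abstract compares against, plain DBSCAN, can be sketched as below. This is only an illustrative sketch of standard DBSCAN, not the improved algorithm proposed in the paper, whose details the abstract does not give. The Jaccard distance over word sets, the sample fault records, and the values of `eps` and `min_pts` are all assumptions chosen for the example.

```python
from collections import deque


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two token sets (assumed text distance)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dbscan(points, eps, min_pts, dist):
    """Plain DBSCAN: one label per point, 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood; a point counts as its own neighbor.
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
                continue
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep expanding the cluster
                queue.extend(reachable)
    return labels


# Hypothetical fault records, invented for illustration only.
records = [
    "pump motor overheated",
    "pump motor overheated again",
    "motor of pump overheated",
    "conveyor belt slipped",
    "conveyor belt slipped off",
    "belt of conveyor slipped",
    "unknown alarm on panel",
]
tokens = [set(r.split()) for r in records]

print(dbscan(tokens, eps=0.3, min_pts=3, dist=jaccard_distance))
print(dbscan(tokens, eps=0.2, min_pts=3, dist=jaccard_distance))
```

On this toy corpus the first call groups the pump records and the conveyor records into two clusters and marks the last record as noise, while the slightly smaller `eps` in the second call marks every record as noise. That sensitivity to fixed parameters is one plausible reading of the parameter-setup problem the abstract describes.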
Author_xml – sequence: 1
  givenname: Xiaohan
  surname: Wang
  fullname: Wang, Xiaohan
  email: xiaohanwang@buaa.edu.cn
  organization: School of Automation Science and Electrical Engineering, Beihang University,Beijing,China
– sequence: 2
  givenname: Lin
  surname: Zhang
  fullname: Zhang, Lin
  email: johnlin9999@163.com
  organization: School of Automation Science and Electrical Engineering, Beihang University,Beijing,China
– sequence: 3
  givenname: Xuesong
  surname: Zhang
  fullname: Zhang, Xuesong
  email: xs_zhang@126.com
  organization: College of Computer Science and Technology, Jilin University,Changchun,China
– sequence: 4
  givenname: Kunyu
  surname: Xie
  fullname: Xie, Kunyu
  email: 1285589283@qq.com
  organization: School of Automation Science and Electrical Engineering, Beihang University,Beijing,China
DOI 10.1109/INDIN45582.2020.9442093
Discipline Engineering
EISBN 9781728149646
1728149649
EISSN 2378-363X
SubjectTerms Clustering algorithm
Clustering algorithms
Conferences
DBSCAN
Degradation
Faces
Industrial fault text data
Informatics
Production facilities
Text clustering
URI https://ieeexplore.ieee.org/document/9442093