Recognition Method of Component Names in Patent Documents Based on the Algorithm of Word Frequency Difference and Library of Left-segmentation Words

Mechanical patent literature contains a large amount of domain knowledge in which component names serve as information units. Because their word formation is flexible and changeable, component names tend to be unique, complex, and unfamiliar in expression. The challenge of accura...

Published in: Ji suan ji ke xue, Volume 50, Issue 7, pp. 229-236
Main authors: Kong, Jiabin; Lyu, Jianwen; Liu, Jiangnan; Du, Wenxuan
Format: Journal Article
Language: Chinese
Publication details: Chongqing: Guojia Kexue Jishu Bu, 01.07.2023
Editorial office of Computer Science
Subjects:
ISSN: 1002-137X
Online access: Get full text
Abstract Mechanical patent literature contains a large amount of domain knowledge in which component names serve as information units. Because their word formation is flexible and changeable, component names tend to be unique, complex, and unfamiliar in expression. The challenge of recognizing component names accurately by computer has become an obstacle to patent knowledge mining. To obtain an efficient method for recognizing component names, the word-formation features of patent text statements are analyzed and extracted. Starting from the external words associated with component names, the characters to the left of the appended drawing reference signs (ADRS) are identified. Candidate names are then retrieved automatically from the text, and a set of candidate names is constructed. An algorithm based on word frequency difference is proposed to filter redundant characters from the set of candidate names. By building a left-segmentation library (LSL) dynamically, redundant characters which are not filtered ...
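The pipeline outlined in the abstract (candidates taken from the left of reference signs, frequency-based trimming, then a dynamically built left-segmentation library) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract is truncated in this record and gives no formulas, so the trimming rule (extend a name leftwards only while the longer suffix is exactly as frequent as the shorter one) and the way the LSL is filled (with the word found at each cut) are assumptions made for illustration. The function names, the `window` parameter, and the English toy sentence are likewise hypothetical; the paper itself works on Chinese patent text.

```python
import re
from collections import Counter


def extract_candidates(text, window=3):
    """Take up to `window` words immediately left of each numeric reference sign."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower())
    candidates = []
    for i, tok in enumerate(tokens):
        if not tok.isdigit():
            continue
        left = []
        j = i - 1
        while j >= 0 and len(left) < window and not tokens[j].isdigit():
            left.append(tokens[j])
            j -= 1
        if left:
            candidates.append(tuple(reversed(left)))
    return candidates


def trim_by_frequency(candidates):
    """Assumed rule: stop extending leftwards where the suffix frequency drops."""
    # Count every right-anchored suffix of every candidate.
    freq = Counter(c[k:] for c in candidates for k in range(len(c)))
    names, lsl = [], set()
    for c in candidates:
        k = len(c) - 1  # start from the word nearest the reference sign
        while k > 0 and freq[c[k - 1:]] == freq[c[k:]]:
            k -= 1
        if k > 0:
            lsl.add(c[k - 1])  # the word at the cut joins the library
        names.append(c[k:])
    return names, lsl


def trim_by_lsl(names, lsl):
    """Cut each name after its rightmost library word (never the last word)."""
    result = []
    for n in names:
        cut = max((i + 1 for i, w in enumerate(n[:-1]) if w in lsl), default=0)
        result.append(" ".join(n[cut:]))
    return result


def recognize(text, window=3):
    names, lsl = trim_by_frequency(extract_candidates(text, window))
    return sorted(set(trim_by_lsl(names, lsl)))


sample = (
    "The drive shaft 12 engages the gear 14. Said drive shaft 12 rotates "
    "the gear 14 and a drive shaft 12 moves the bearing seat 16."
)
print(recognize(sample))  # ['bearing seat', 'drive shaft', 'gear']
```

In this toy run the frequency step alone leaves "the gear" and "the bearing seat" untouched, because "the" always precedes them; the library learned from the "drive shaft" candidates then removes the leftover "the". A candidate that occurs only once, with no library word to its left, passes through untrimmed.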
Author Liu, Jiangnan
Du, Wenxuan
Lyu, Jianwen
Kong, Jiabin
Author_xml – sequence: 1
  givenname: Jiabin
  surname: Kong
  fullname: Kong, Jiabin
– sequence: 2
  givenname: Jianwen
  surname: Lyu
  fullname: Lyu, Jianwen
– sequence: 3
  givenname: Jiangnan
  surname: Liu
  fullname: Liu, Jiangnan
– sequence: 4
  givenname: Wenxuan
  surname: Du
  fullname: Du, Wenxuan
ContentType Journal Article
Copyright Copyright Guojia Kexue Jishu Bu 2023
Copyright_xml – notice: Copyright Guojia Kexue Jishu Bu 2023
DBID 7SC
8FD
JQ2
L7M
L~C
L~D
DOA
DOI 10.11896/jsjkx.220500068
DatabaseName Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DOAJ Directory of Open Access Journals
DatabaseTitle Computer and Information Systems Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Advanced Technologies Database with Aerospace
ProQuest Computer Science Collection
Computer and Information Systems Abstracts Professional
DatabaseTitleList Computer and Information Systems Abstracts

Database_xml – sequence: 1
  dbid: DOA
  name: DOAJ Directory of Open Access Journals
  url: https://www.doaj.org/
  sourceTypes: Open Website
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EndPage 236
ExternalDocumentID oai_doaj_org_article_fbf2b330ae9748b188b1bfe2ad1b7ebb
GroupedDBID -0Y
5XA
5XJ
7SC
8FD
92H
92I
ABJNI
ACGFS
ALMA_UNASSIGNED_HOLDINGS
CCEZO
CUBFJ
CW9
GROUPED_DOAJ
JQ2
L7M
L~C
L~D
TCJ
TGT
U1G
U5S
ID FETCH-LOGICAL-d941-1e79e6f31d0530fda2cccbdc0486abcdbed4568ec2bd86cd25f4e73b7bff9f7a3
IEDL.DBID DOA
ISSN 1002-137X
IngestDate Fri Oct 03 12:43:52 EDT 2025
Mon Jun 30 03:13:47 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Issue 7
Language Chinese
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-d941-1e79e6f31d0530fda2cccbdc0486abcdbed4568ec2bd86cd25f4e73b7bff9f7a3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
OpenAccessLink https://doaj.org/article/fbf2b330ae9748b188b1bfe2ad1b7ebb
PQID 2834501131
PQPubID 2048282
PageCount 8
ParticipantIDs doaj_primary_oai_doaj_org_article_fbf2b330ae9748b188b1bfe2ad1b7ebb
proquest_journals_2834501131
PublicationCentury 2000
PublicationDate 2023-07-01
PublicationDateYYYYMMDD 2023-07-01
PublicationDate_xml – month: 07
  year: 2023
  text: 2023-07-01
  day: 01
PublicationDecade 2020
PublicationPlace Chongqing
PublicationPlace_xml – name: Chongqing
PublicationTitle Ji suan ji ke xue
PublicationYear 2023
Publisher Guojia Kexue Jishu Bu
Editorial office of Computer Science
Publisher_xml – name: Guojia Kexue Jishu Bu
– name: Editorial office of Computer Science
SSID ssib023646461
ssib051375750
ssib001164759
ssj0057673
Score 2.34388
SourceID doaj
proquest
SourceType Open Website
Aggregation Database
StartPage 229
SubjectTerms Algorithms
Documents
Libraries
patent text|redundant characters|appended drawing reference signs|word frequency difference|left-segmentation words
Segmentation
Words (language)
Title Recognition Method of Component Names in Patent Documents Based on the Algorithm of Word Frequency Difference and Library of Left-segmentation Words
URI https://www.proquest.com/docview/2834501131
https://doaj.org/article/fbf2b330ae9748b188b1bfe2ad1b7ebb
Volume 50
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAON
  databaseName: DOAJ Directory of Open Access Journals
  issn: 1002-137X
  databaseCode: DOA
  dateStart: 20210101
  customDbUrl:
  isFulltext: true
  dateEnd: 99991231
  titleUrlDefault: https://www.doaj.org/
  omitProxy: false
  ssIdentifier: ssj0057673
  providerName: Directory of Open Access Journals
linkProvider Directory of Open Access Journals