Mining Concepts from Wikipedia for Ontology Construction

Bibliographic Details
Published in: Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03, Vol. 3, pp. 287-290
Main authors: Cui, Gaoying; Lu, Qin; Li, Wenjie; Chen, Yirong
Format: Conference proceeding
Language: English
Published: Washington, DC, USA: IEEE Computer Society, 15 September 2009
IEEE
Series: ACM Conferences
ISBN: 0769538010, 9780769538013
Online access: Full text
Abstract An ontology is a structured knowledge base of concepts organized by the relations among them. In the corpora used for knowledge extraction, however, concepts are usually mixed with their instances. Concepts and their corresponding instances share similar features and are difficult to distinguish. This paper proposes a novel approach that comprehensively obtains concepts with the help of definition sentences and Category Labels in Wikipedia pages. N-gram statistics and other NLP knowledge are used to help extract appropriate concepts. The proposed method identified nearly 50,000 concepts from about 700,000 Wiki pages. With a precision of 78.5%, it is an effective approach to mining concepts from Wikipedia for ontology construction.
Author Cui, Gaoying
Lu, Qin
Li, Wenjie
Chen, Yirong
ContentType Conference Proceeding
DOI 10.1109/WI-IAT.2009.284
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
Discipline Computer Science
Statistics
EISBN 9781424453313
1424453313
EndPage 290
ExternalDocumentID 5285031
Genre orig-research
ISBN 0769538010
9780769538013
IsPeerReviewed false
IsScholarly false
Keywords Concept
Wikipedia
Ontology Construction
Language English
PublicationCentury 2000
PublicationDate 20090915
2009-Sept.
PublicationDateYYYYMMDD 2009-09-15
2009-09-01
PublicationDecade 2000
PublicationPlace Washington, DC, USA
PublicationSeriesTitle ACM Conferences
PublicationTitle Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
PublicationTitleAbbrev WIIAT
PublicationYear 2009
Publisher IEEE Computer Society
IEEE
SourceID ieee
acm
SourceType Publisher
StartPage 287
SubjectTerms Collaboration
Computing methodologies -- Artificial intelligence -- Knowledge representation and reasoning
Computing methodologies -- Artificial intelligence -- Knowledge representation and reasoning -- Semantic networks
Computing methodologies -- Artificial intelligence -- Natural language processing
Computing methodologies -- Machine learning -- Machine learning approaches -- Rule learning
Concept
Conferences
Information science
Information systems -- Information retrieval
Information systems -- Information retrieval -- Evaluation of retrieval results
Information systems -- Information systems applications -- Data mining
Intelligent agent
Intelligent structures
Ontologies
Ontology Construction
Search engines
Statistics
Taxonomy
Wikipedia
Title Mining Concepts from Wikipedia for Ontology Construction
URI https://ieeexplore.ieee.org/document/5285031
Volume 3