Improved Hypernymy Detection Algorithm Based on Heterogeneous Graph Neural Network

Saved in:
Detailed bibliography
Title: Improved Hypernymy Detection Algorithm Based on Heterogeneous Graph Neural Network
Authors: Li Ren, Jing Huang, Hai-Tao Jia, Shu-Bao Sun, Kai-Shi Wang, Yi-Le Wu
Source: International Journal of Computational Intelligence Systems, Vol 18, Iss 1, Pp 1-27 (2025)
Publisher information: Springer, 2025.
Publication year: 2025
Collection: LCC:Electronic computers. Computer science
Subjects: Heterogeneous graph neural network, Hypernymy detection, Hierarchical structure construction, Semantic relation understanding, Electronic computers. Computer science, QA75.5-76.95
Description: Abstract: Concept mapping is a knowledge representation method used to represent and understand concepts, entities, and the relationships between them, which are referred to as hyponymy–hypernymy semantic relations. These relations are primarily used to describe the hierarchical and categorical relationships between different concepts or entities. The detection of hyponymy–hypernymy semantic relations is an important task in the field of natural language processing, crucial to many downstream tasks such as information extraction, automatic reasoning, and personalized recommendation. These tasks often require understanding the semantic relations between concepts or entities in text for more accurate analysis and reasoning. Currently, algorithms for identifying and detecting hyponymy–hypernymy semantic relations face two main challenges: first, candidate hyponymy–hypernymy relation tuples do not occur in the same contextual sentence, failing to meet the co-occurrence requirement; second, distributional algorithms suffer from lexical memorization. To address these issues, this paper proposes an improved algorithm for detecting hyponymy–hypernymy relations based on heterogeneous graph neural networks, aiming to detect hyponymy–hypernymy relations in various candidate word sets and then construct a hierarchical system. To meet the co-occurrence requirement of candidate word pairs, an open-source large language model is used to generate contextual sentences for the candidate pairs. Sub-word features are adopted to capture the intrinsic semantic connections between nested phrases, thus alleviating the issue of reversed predictions for nested phrases. The representation of relation nodes is modified by encoding relation definitions through a pre-trained model, enabling the model to understand the semantic relations between concept nodes. To address the problem of overfitting in traditional graph attention networks, the calculation order of adjacent-node aggregation is changed in the heterogeneous graph to capture dynamic attention features. In addition, a pipeline for hierarchical system construction is designed and implemented, combining a divide-and-conquer approach with loop detection algorithms. Compared with the baselines, the proposed method achieves a 4.14% improvement in accuracy and a 0.62% increase in F1 score on the EVALution dataset; a 4.89% increase in accuracy and a 0.71% improvement in F1 score on the Bansal dataset; and a 1.05% increase in accuracy, a 2.21% increase in recall, and a 1.79% improvement in F1 score on a self-annotated Chinese dictionary dataset.
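The abstract's change to the graph attention layer (reordering the adjacent-node aggregation calculation so the model captures dynamic attention) is consistent with a GATv2-style scoring, where the nonlinearity is applied before the learned attention vector. The sketch below is a minimal illustration of that idea only; the class name, tensor layout, and use of PyTorch are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumed, not the paper's code): dynamic attention over a
# node's neighbors, GATv2-style. The key difference from static GAT attention
# is that LeakyReLU is applied BEFORE the attention vector, so the ranking of
# neighbors can depend on the target node.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicAttentionLayer(nn.Module):
    """Single-head dynamic-attention aggregation over a node's neighbors."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.w_src = nn.Linear(in_dim, out_dim, bias=False)   # transforms the target node
        self.w_nbr = nn.Linear(in_dim, out_dim, bias=False)   # transforms each neighbor
        self.attn = nn.Linear(out_dim, 1, bias=False)         # attention vector a

    def forward(self, h_i: torch.Tensor, h_neighbors: torch.Tensor) -> torch.Tensor:
        # h_i: (in_dim,) target node; h_neighbors: (num_neighbors, in_dim)
        z_i = self.w_src(h_i)                                   # (out_dim,)
        z_j = self.w_nbr(h_neighbors)                           # (N, out_dim)

        # Dynamic attention: nonlinearity before the attention vector.
        scores = self.attn(F.leaky_relu(z_i + z_j)).squeeze(-1)  # (N,)
        alpha = torch.softmax(scores, dim=0)                     # (N,)

        # Attention-weighted aggregation of the transformed neighbors.
        return (alpha.unsqueeze(-1) * z_j).sum(dim=0)            # (out_dim,)
```

In a heterogeneous graph as described in the abstract, separate transformations would typically be kept per node or edge type (concept nodes vs. relation nodes); the single-type layer above only shows where the reordered calculation sits.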
Document type: article
File description: electronic resource
Language: English
ISSN: 1875-6883
Relation: https://doaj.org/toc/1875-6883
DOI: 10.1007/s44196-025-00828-1
Access URL: https://doaj.org/article/9a6e2adc651441a3bbc099d3155cc26c
Accession number: edsdoj.9a6e2adc651441a3bbc099d3155cc26c
Database: Directory of Open Access Journals