Bibliographic Details
| Title: |
Integration Named Entity Recognition and Latent Dirichlet Allocation to Enhance Topic Modeling. |
| Authors: |
Taher, Hawraa Ali; Alabid, Noralhuda N.; Hasan, Bushra Mahdi |
| Source: |
Annals of Emerging Technologies in Computing (AETiC); 4/1/2025, Vol. 9 Issue 2, p20-30, 11p |
| Subject Terms: |
NATURAL language processing, TEXT summarization, CONTENT analysis, DATA mining, DOCUMENT clustering, INFORMATION resources management, CLASSIFICATION |
| Abstract: |
Topic modeling from texts is one of the important topics in natural language processing (NLP), as it plays a fundamental role in summarizing texts, understanding their content, and facilitating access to the main ideas, especially in light of the vast quantity of unstructured texts available today. Extracting titles is used in a variety of fields, such as news archiving, document classification, and content analysis in social media, making it an essential tool for improving information management and effective presentation. In this research, we focused on improving the methodology for extracting titles from texts by integrating two leading techniques: the topic assignment model using Latent Dirichlet Allocation (LDA) and the named entity recognition technique (NER). This combination aims to achieve a balance between identifying general topics of texts via LDA and extracting important information and key entities using NER, ensuring the generation of accurate and understandable titles that better reflect the actual content of the texts. The results of the study showed that the combined methodology achieved an accuracy of 71.97%, outperforming the performance of each technique separately, where the accuracy of NER alone was 29.78% and the accuracy of LDA alone was 67.80%. These results underscore the importance of integrating different techniques into NLP to improve headline extraction performance. This approach contributes to the development of more efficient text analysis methods, which enhances NLP applications in areas such as news analysis, content management, and document summarization, highlighting the importance of the topic in improving the handling of large texts and presenting them in a clearer and more appropriate way. [ABSTRACT FROM AUTHOR] |
|
Copyright of Annals of Emerging Technologies in Computing (AETiC) is the property of International Association for Educators & Researchers (IAER) and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: |
Complementary Index |