Query-based automatic text summarization using query expansion approach

Bibliographic details
Published in: Data & Knowledge Engineering, Vol. 162, p. 102531
Main author: Azad, Hiteshwar Kumar
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 March 2026
ISSN: 0169-023X
Description
Summary: The amount of information available on the Web has grown dramatically and continues to grow daily. This volume of data poses significant challenges to the reliability and accuracy of current information retrieval systems. The purpose of information retrieval is to find, within a large collection, the documents whose contents match a user-initiated query. Because most users struggle to formulate well-defined queries, query expansion is critical for retrieving the most relevant information. Presenting relevant results concisely is a further challenge in this setting. Automatic text summarization can condense a lengthy document while retaining its informative content and key concepts, making it a potential solution to information overload. This paper proposes a query-based automatic text summarization technique that employs query expansion to improve summarization and deliver relevant information concisely. The method is extractive: words are scored by their score in the expanded query, and sentences are selected using four features extracted from each sentence, namely sentence terms, position, similarity to the first sentence, and proper nouns. Extensive experiments on the DUC 2007 dataset, using different ROUGE variants and the evaluation metrics precision, recall, and F-score, show gains of approximately 44%, 46%, and 45%, respectively, in the best scenario. The proposed approach outperforms both the DUC participating systems and state-of-the-art approaches in summary generation.
DOI: 10.1016/j.datak.2025.102531
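
The four-feature sentence scoring described in the summary can be illustrated with a minimal sketch. This record does not give the paper's formulas, so everything concrete here is an assumption: the function names (`score_sentences`, `summarize`), the representation of the expanded query as a term-to-weight dictionary, the way each feature is normalized, and the equal weighting of the four features. Query expansion itself is not implemented; the expanded terms are supplied by hand.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[A-Za-z0-9']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (sum(v * v for v in ca.values()) * sum(v * v for v in cb.values())) ** 0.5
    return dot / norm if norm else 0.0


def score_sentences(text, expanded_query):
    """Score each sentence by four features (assumed definitions):
    1. sentence terms: mean expanded-query weight of the sentence's words
    2. position: earlier sentences score higher
    3. similarity to the first sentence: cosine over term counts
    4. proper nouns: share of capitalized words, skipping the first word
    The features are summed with equal weight, which is an assumption.
    """
    sentences = split_sentences(text)
    first_tokens = tokenize(sentences[0]) if sentences else []
    n = len(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        tokens = tokenize(sentence)
        words = sentence.split()
        if not tokens:
            scores.append(0.0)
            continue
        term = sum(expanded_query.get(t, 0.0) for t in tokens) / len(tokens)
        position = 1.0 - i / n
        first_sim = cosine(tokens, first_tokens)
        proper = sum(1 for w in words[1:] if w[:1].isupper()) / len(words)
        scores.append(term + position + first_sim + proper)
    return sentences, scores


def summarize(text, expanded_query, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences, scores = score_sentences(text, expanded_query)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document_text, {"solar": 1.0, "panels": 0.8}, k=2)` would return the two sentences that best combine query relevance with the three structural features. In a fuller pipeline the weights in the dictionary would come from the query expansion step, and the feature weights would likely be tuned rather than left equal.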