Real-Time Semantic Indexing for High-Volume Data Streams

Bibliographic Details
Title: Real-Time Semantic Indexing for High-Volume Data Streams
Authors: Raj, Yeshwanth; Mahdi, Hassan Mohamed; Abraham, Benjamin Jones; Rama Sree, S; Kiruthika, R; Ugli, Khusainov Ilyos Jamoliddin
Source: Indian Journal of Information Sources and Services; Vol. 15 No. 3 (2025): July-September 2025; 423-431 ; 2231-6094 ; 10.51983/ijiss-2025.IJISS.15.3
Publisher Information: The Research Publication
Publication Year: 2025
Keywords: Real-Time Data Handling, High-Rate Data Stream Processing, Semantic Indexing, Natural Language Understanding, Knowledge Graphs, Highly Scalable Systems
Description: High-volume data streams from sources such as social media, IoT devices, and financial markets accumulate rapidly and pose substantial challenges for real-time data processing, storage, and retrieval. Traditional indexing and search approaches cannot keep pace with the velocity, volume, and variety of these streams while keeping results conceptually relevant. This paper proposes a model for real-time semantic indexing (RTSI) that aims to improve information retrieval and analytic capabilities by incorporating semantics into the indexing process during data ingestion. Contextual meaning is assigned to data items in real time using lightweight natural language processing (NLP), entity recognition, topic modeling, and knowledge embedding. The distributed architecture, built on scalable stream processing engines such as Apache Flink or Kafka Streams, provides low-latency performance for practical implementations. We applied the proposed system to multiple high-throughput datasets consisting of news feeds, social media posts, and sensor logs. Experimental results indicate that RTSI outperforms keyword-based indexing on search and analytic tasks in terms of real-time relevance and accuracy. Additionally, the semantic layer enables context-aware alerting, anomaly detection, and trend monitoring. The system is also adaptive, supporting continuous refinement of its semantic representations as new data arrives. The results suggest that incorporating semantic techniques into real-time stream indexing can improve the responsiveness, intelligence, and scalability of increasingly important data-driven applications.
Publication Type: article in journal/newspaper
File Description: application/pdf
Language: English
Relation: https://ojs.trp.org.in/index.php/ijiss/article/view/5283/7845; https://ojs.trp.org.in/index.php/ijiss/article/view/5283
DOI: 10.51983/ijiss-2025.IJISS.15.3.47
Availability: https://ojs.trp.org.in/index.php/ijiss/article/view/5283
https://doi.org/10.51983/ijiss-2025.IJISS.15.3.47
Rights: Copyright (c) 2025 The Research Publication ; https://creativecommons.org/licenses/by-nc-nd/4.0
Document Code: edsbas.C6FF8D98
Database: BASE
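
Illustrative note: the description above says that semantics are attached to each item during ingestion rather than at query time. The record gives no implementation details, so the following is only a minimal sketch of that general idea and not the authors' system. It assumes a toy in-memory inverted index, a hypothetical entity gazetteer (ENTITIES), and a hypothetical topic lexicon (TOPICS) standing in for real entity recognition and topic modeling; a production pipeline would presumably run the enrichment step inside a stream processor.

```python
import re
from collections import defaultdict

# Hypothetical gazetteer: surface form -> canonical entity ID (illustrative only).
ENTITIES = {
    "fed": "ORG:FederalReserve",
    "federal reserve": "ORG:FederalReserve",
    "ecb": "ORG:ECB",
}

# Hypothetical topic lexicon: topic label -> trigger words (illustrative only).
TOPICS = {
    "monetary_policy": {"rates", "inflation", "interest"},
    "sensors": {"temperature", "humidity"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def enrich(text):
    """Return (tokens, entity IDs, topic labels) for one incoming item."""
    tokens = tokenize(text)
    # Pad with spaces so surface forms only match on whole-word boundaries.
    padded = " " + " ".join(tokens) + " "
    entities = {eid for surface, eid in ENTITIES.items() if f" {surface} " in padded}
    topics = {label for label, words in TOPICS.items() if set(tokens) & words}
    return tokens, entities, topics


class SemanticIndex:
    """Inverted index holding keyword postings and semantic postings side by side."""

    def __init__(self):
        self.postings = defaultdict(set)

    def ingest(self, doc_id, text):
        # Enrichment happens here, at ingestion time, before the item is indexed.
        tokens, entities, topics = enrich(text)
        for tok in tokens:
            self.postings["tok:" + tok].add(doc_id)
        for eid in entities:
            self.postings["ent:" + eid].add(doc_id)
        for label in topics:
            self.postings["topic:" + label].add(doc_id)

    def lookup(self, key):
        return sorted(self.postings.get(key, set()))


index = SemanticIndex()
stream = [
    (1, "The Fed raised interest rates again."),
    (2, "Federal Reserve signals a pause."),
    (3, "Sensor 7 reports temperature 81C."),
]
for doc_id, text in stream:
    index.ingest(doc_id, text)

print(index.lookup("tok:fed"))                 # keyword match only: [1]
print(index.lookup("ent:ORG:FederalReserve"))  # entity match: [1, 2]
print(index.lookup("topic:sensors"))           # topic match: [3]
```

In this sketch the keyword lookup finds only the item that literally contains "fed", while the entity lookup also finds the item that says "Federal Reserve", which is the kind of relevance gain over keyword-based indexing that the description refers to.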