A Unified Framework for Text Extraction and Plagiarism Detection in Image-Based Content Using OCR and NLP.

Bibliographic Details
Title: A Unified Framework for Text Extraction and Plagiarism Detection in Image-Based Content Using OCR and NLP.
Authors: Kumar, Palvadi Srinivas; Prasad, Krishna
Source: Cuestiones de Fisioterapia; 2025, Vol. 54 Issue 1, p111-120, 10p
Subject Terms: OPTICAL character recognition, MACHINE learning, NATURAL language processing, ARTIFICIAL intelligence, DEEP learning
Abstract: In today's digital landscape, images frequently contain valuable textual information, including numbers, symbols, and other critical data. Accurate extraction and verification of this embedded text are essential, especially in academic and content-rich fields where originality is paramount. This paper introduces a novel approach to detecting plagiarism in text embedded within images. Our method utilizes state-of-the-art Optical Character Recognition (OCR) techniques, combined with advanced Natural Language Processing (NLP) and deep learning algorithms, to extract and analyze the text content. By comparing the extracted text against a vast repository of existing sources, our system can effectively identify potential plagiarism while accurately distinguishing between original and copied content. This innovative approach ensures that not only traditional text documents but also image-based content is rigorously examined for authenticity, significantly enhancing the reliability of plagiarism detection across various content formats. The proposed solution offers a robust and automated pipeline for image-based text extraction and plagiarism detection, with the potential to revolutionize academic integrity, legal proceedings, and content creation practices. [ABSTRACT FROM AUTHOR]
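Note: The abstract does not specify how the extracted text is compared against the repository. As a rough illustration only, the comparison step could be sketched as below, assuming the OCR stage has already produced a plain string. The sketch substitutes a simple word-shingle Jaccard similarity for the NLP and deep learning models the authors mention. All names (`shingles`, `jaccard`, `find_matches`), the shingle size, and the threshold are hypothetical and do not come from the paper.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    tokens = normalize(text)
    if len(tokens) < n:
        # Very short inputs yield a single shingle, or none if empty.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_matches(extracted_text: str, corpus: dict[str, str],
                 threshold: float = 0.5) -> list[tuple[str, float]]:
    """Score OCR-extracted text against every reference document.

    `corpus` maps a document id to its text (a stand-in for the
    "repository of existing sources"). Returns (id, score) pairs at or
    above the assumed threshold, highest score first.
    """
    query = shingles(extracted_text)
    scored = [(doc_id, jaccard(query, shingles(doc)))
              for doc_id, doc in corpus.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    corpus = {
        "a": "The quick brown fox jumps over the lazy dog",
        "b": "Completely unrelated sentence about physiotherapy journals",
    }
    # Punctuation and case differences, as OCR output often has, are ignored.
    print(find_matches("The quick, brown fox jumps over the lazy dog!", corpus))
    # → [('a', 1.0)]
```

Exact shingle overlap only catches near-verbatim copying. Detecting paraphrase would need a semantic similarity model, which this sketch does not attempt.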
Database: Biomedical Index