Bibliographic Details
| Title: |
Implementation of the Image Text to Speech Conversion in the Desired Language by Translating with Raspberry Pi. |
| Authors: |
MALLISHWARI, N., KUMAR, K. NAVEEN, LAXMAN, B., ARTHISHA, B., BHARATH, CH., RAJU, M. |
| Source: |
International Scientific Journal of Engineering & Management; Jun 2025, Vol. 4, Issue 6, pp. 1-9 (9 pages) |
| Subject Terms: |
OPTICAL character recognition, APPLICATION program interfaces, RASPBERRY Pi, SPEECH, ASSISTIVE technology |
| Abstract: |
A major obstacle to communication is the language barrier between speakers. This device is intended for people who do not know English and want English text translated into their native language. The novel component of this work is the speech output, which is available in 53 languages translated from English. The paper describes a prototype that lets the user hear the contents of text images in the desired language: text is extracted from the image, translated, and converted to speech in the language the user selects. This is done with a Raspberry Pi and a camera module, using the Tesseract OCR [optical character recognition] engine, the Google Speech API [application program interface] as the text-to-speech engine, and the Microsoft translator. The device helps travelers, who can use it to hear English text in their own language, and it can also be used by the visually impaired. Image Text to Speech (ITTS) conversion is an assistive technology that bridges the gap between visual and auditory information, making printed text accessible to visually impaired individuals. This project develops a system that captures an image of text with a camera module, extracts the text using Optical Character Recognition (OCR), and converts the recognized text into audible speech using Text-to-Speech (TTS) synthesis. The implementation uses a Raspberry Pi as the core processing unit, integrated with a webcam for image capture and a speaker for audio output. The Python-based system employs Tesseract OCR for text extraction and pyttsx3 or gTTS for speech generation. [ABSTRACT FROM AUTHOR] |
| Database: |
Complementary Index |
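
The abstract above describes a three-stage pipeline: OCR, translation, then speech synthesis. The following is a minimal sketch of how such a pipeline might be wired together in Python, not the authors' implementation. It assumes the third-party packages `pytesseract`, `Pillow`, and `gTTS` are installed along with the Tesseract binary. The function names (`ocr_image`, `clean_text`, `speak_to_file`, `image_to_speech`) are hypothetical, and the translation step is left as a caller-supplied function because the abstract does not say how the Microsoft translator is called.

```python
from typing import Callable


def ocr_image(path: str, lang: str = "eng") -> str:
    """Extract text from an image file with Tesseract (assumed: pytesseract + Pillow)."""
    import pytesseract          # third-party wrapper; needs the tesseract binary
    from PIL import Image       # third-party (Pillow)
    return pytesseract.image_to_string(Image.open(path), lang=lang)


def speak_to_file(text: str, lang: str, out_path: str) -> str:
    """Synthesize speech to an MP3 file with gTTS (assumed choice; needs network access)."""
    from gtts import gTTS       # third-party
    gTTS(text=text, lang=lang).save(out_path)
    return out_path


def clean_text(raw: str) -> str:
    """Collapse OCR line breaks and stray whitespace into single spaces."""
    return " ".join(raw.split())


def image_to_speech(
    image_path: str,
    target_lang: str,
    translate: Callable[[str, str], str],   # hypothetical: (text, target_lang) -> translated text
    out_path: str = "speech.mp3",
    ocr: Callable[[str], str] = ocr_image,
    tts: Callable[[str, str, str], str] = speak_to_file,
) -> str:
    """Run OCR -> translate -> text-to-speech and return the audio file path."""
    text = clean_text(ocr(image_path))
    if not text:
        raise ValueError("no text recognised in image")
    translated = translate(text, target_lang)
    return tts(translated, target_lang, out_path)
```

Passing the OCR, translation, and speech stages in as functions keeps the sketch independent of any one service, so each stage can be swapped (for example, pyttsx3 in place of gTTS) or replaced with a stub when testing without a camera or network connection.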