Deep neural architectures for Kashmiri-English machine translation.

Bibliographic Details
Title: Deep neural architectures for Kashmiri-English machine translation.
Authors: Ul Qumar, Syed Matla (AUTHOR) syed1909832@st.jmi.ac.in; Azim, Muzaffar (AUTHOR); Quadri, S. M. K. (AUTHOR); Alkanan, Mohannad (AUTHOR); Mir, Mohammad Shuaib (AUTHOR); Gulzar, Yonis (AUTHOR) ygulzar@kfu.edu.sa
Source: Scientific Reports. 8/16/2025, Vol. 15 Issue 1, p1-21. 21p.
Subject Terms: *MACHINE translating, *TRANSFORMER models, *DEEP learning, *LINGUISTICS
Abstract: This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an attention-enhanced model, and a Transformer-based model. All models are trained from scratch using byte-pair encoded vocabularies and evaluated using BLEU, GLEU, ROUGE, and chrF++ metrics. The Transformer architecture outperforms RNN-based baselines, achieving a BLEU-4 score of 0.2965 and demonstrating superior handling of long-range dependencies and Kashmiri's morphological complexity. We further provide a structured linguistic error analysis and validate the significance of performance differences through bootstrap resampling. This work establishes the first NMT benchmark for Kashmiri-English translation and contributes a reusable dataset, baseline models, and evaluation methodology for future research in low-resource neural translation. [ABSTRACT FROM AUTHOR]
Database: Academic Search Index
ISSN: 2045-2322
DOI: 10.1038/s41598-025-14177-8
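
Note: The abstract states that all models are trained from scratch on byte-pair encoded vocabularies. The following is a minimal sketch of how such a subword vocabulary could be built with the sentencepiece library; the file names, vocabulary size, and other settings are illustrative assumptions, not details reported in the paper.

# Sketch only: training a BPE subword model for the Kashmiri side of the
# parallel corpus with sentencepiece. Paths and vocab_size are assumed
# values; the paper does not specify its exact BPE configuration.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="train.ks",         # hypothetical one-sentence-per-line Kashmiri text
    model_prefix="ks_bpe",    # writes ks_bpe.model and ks_bpe.vocab
    vocab_size=8000,          # assumed size, typical for low-resource setups
    model_type="bpe",
    character_coverage=1.0,   # retain full coverage of the Perso-Arabic script
)

sp = spm.SentencePieceProcessor(model_file="ks_bpe.model")
pieces = sp.encode("...", out_type=str)  # sentence -> list of BPE tokens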
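
Note: The abstract also reports BLEU/chrF++ evaluation and bootstrap resampling to test whether score differences between systems are significant. A minimal sketch of both steps follows, using sacrebleu as an assumed tooling choice (the paper does not name its implementation); the hypothesis and reference lists are placeholders, not data from the paper.

# Sketch only: corpus-level BLEU and chrF++ scoring plus paired bootstrap
# resampling. sacrebleu is an assumed tooling choice; the toy sentences
# below are placeholders.
import random
import sacrebleu

refs  = ["the boy is reading a book", "snow fell on the valley"]
sys_a = ["the boy reads a book", "snow fell over the valley"]   # e.g. Transformer outputs
sys_b = ["boy book reading", "valley the snow fell"]            # e.g. RNN baseline outputs

# sacrebleu expects a list of reference streams, hence [refs].
print(sacrebleu.corpus_bleu(sys_a, [refs]).score)
print(sacrebleu.corpus_chrf(sys_a, [refs], word_order=2).score)  # word_order=2 -> chrF++

def paired_bootstrap(a_hyps, b_hyps, refs, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A outscores system B."""
    rng, n, wins = random.Random(seed), len(refs), 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample sentences with replacement
        a = [a_hyps[i] for i in idx]
        b = [b_hyps[i] for i in idx]
        r = [refs[i] for i in idx]
        if sacrebleu.corpus_bleu(a, [r]).score > sacrebleu.corpus_bleu(b, [r]).score:
            wins += 1
    return wins / n_samples  # values near 1.0 indicate a consistent advantage for A

print(paired_bootstrap(sys_a, sys_b, refs))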