Deep neural architectures for Kashmiri-English machine translation.

Bibliographic Details
Title: Deep neural architectures for Kashmiri-English machine translation.
Authors: Ul Qumar, Syed Matla (AUTHOR) syed1909832@st.jmi.ac.in; Azim, Muzaffar (AUTHOR); Quadri, S. M. K. (AUTHOR); Alkanan, Mohannad (AUTHOR); Mir, Mohammad Shuaib (AUTHOR); Gulzar, Yonis (AUTHOR) ygulzar@kfu.edu.sa
Source: Scientific Reports. 8/16/2025, Vol. 15 Issue 1, p1-21. 21p.
Subject Terms: *MACHINE translating, *TRANSFORMER models, *DEEP learning, *LINGUISTICS
Abstract: This paper presents the first comprehensive deep learning-based Neural Machine Translation (NMT) framework for the Kashmiri-English language pair. We introduce a high-quality parallel corpus of 270,000 sentence pairs and evaluate three NMT architectures: a basic encoder-decoder model, an attention-enhanced model, and a Transformer-based model. All models are trained from scratch using byte-pair encoded vocabularies and evaluated using BLEU, GLEU, ROUGE, and chrF++ metrics. The Transformer architecture outperforms RNN-based baselines, achieving a BLEU-4 score of 0.2965 and demonstrating superior handling of long-range dependencies and Kashmiri's morphological complexity. We further provide a structured linguistic error analysis and validate the significance of performance differences through bootstrap resampling. This work establishes the first NMT benchmark for Kashmiri-English translation and contributes a reusable dataset, baseline models, and evaluation methodology for future research in low-resource neural translation. [ABSTRACT FROM AUTHOR]
Database: Academic Search Index
ISSN: 2045-2322
DOI: 10.1038/s41598-025-14177-8
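
Note: The abstract states that all models are trained from scratch on byte-pair encoded vocabularies. The following is a minimal sketch of how such a subword vocabulary could be built with the sentencepiece library; the file names, vocabulary size, and other settings are illustrative assumptions, not details reported in the paper.

# Sketch only: training a BPE subword model for the Kashmiri side of the
# parallel corpus with sentencepiece. Paths and vocab_size are assumed
# values; the paper does not specify its exact BPE configuration.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="train.ks",         # hypothetical one-sentence-per-line Kashmiri text
    model_prefix="ks_bpe",    # writes ks_bpe.model and ks_bpe.vocab
    vocab_size=8000,          # assumed size, typical for low-resource setups
    model_type="bpe",
    character_coverage=1.0,   # retain full coverage of the Perso-Arabic script
)

sp = spm.SentencePieceProcessor(model_file="ks_bpe.model")
pieces = sp.encode("...", out_type=str)  # sentence -> list of BPE tokens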
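
Note: The abstract also reports BLEU/chrF++ evaluation and bootstrap resampling to test whether score differences between systems are significant. A minimal sketch of both steps follows, using sacrebleu as an assumed tooling choice (the paper does not name its implementation); the hypothesis and reference lists are placeholders, not data from the paper.

# Sketch only: corpus-level BLEU and chrF++ scoring plus paired bootstrap
# resampling. sacrebleu is an assumed tooling choice; the toy sentences
# below are placeholders.
import random
import sacrebleu

refs  = ["the boy is reading a book", "snow fell on the valley"]
sys_a = ["the boy reads a book", "snow fell over the valley"]   # e.g. Transformer outputs
sys_b = ["boy book reading", "valley the snow fell"]            # e.g. RNN baseline outputs

# sacrebleu expects a list of reference streams, hence [refs].
print(sacrebleu.corpus_bleu(sys_a, [refs]).score)
print(sacrebleu.corpus_chrf(sys_a, [refs], word_order=2).score)  # word_order=2 -> chrF++

def paired_bootstrap(a_hyps, b_hyps, refs, n_samples=1000, seed=0):
    """Fraction of resampled test sets on which system A outscores system B."""
    rng, n, wins = random.Random(seed), len(refs), 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample sentences with replacement
        a = [a_hyps[i] for i in idx]
        b = [b_hyps[i] for i in idx]
        r = [refs[i] for i in idx]
        if sacrebleu.corpus_bleu(a, [r]).score > sacrebleu.corpus_bleu(b, [r]).score:
            wins += 1
    return wins / n_samples  # values near 1.0 indicate a consistent advantage for A

print(paired_bootstrap(sys_a, sys_b, refs))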