Evaluating Terminology Translation in Machine Translation Systems via Metamorphic Testing

Bibliographic details
Published in: IEEE/ACM International Conference on Automated Software Engineering : [proceedings], pp. 758-769
Main authors: Xu, Yihui; Li, Yanhui; Wang, Jun; Zhang, Xiaofang
Format: Conference paper
Language: English
Publication details: ACM, 27 October 2024
ISSN: 2643-1572
Description
Summary: Machine translation has become an integral part of daily life, with terminology translation playing a crucial role in ensuring the accuracy of translation results. However, existing translation systems, such as Google Translate, have been shown to occasionally produce errors in terminology translation. Current metrics for assessing terminology translation rely on reference translations and bilingual dictionaries, limiting their effectiveness in large-scale automated MT system testing. To address this challenge, we propose a novel method: Metamorphic Testing for Terminology Translation (TermMT), which achieves effective and efficient testing for terminology translation in MT systems without relying on reference translations or bilingual terminology dictionaries. Our approach involves constructing metamorphic relations based on the characteristics of terms: (a) adding an appropriate reference of the term in the given context would not change the translation of the term; (b) if we modify part of a multi-word term, the translation of the revised word combination would change. To evaluate the effectiveness of TermMT, we tested the terminology translation capabilities of three machine translation systems, Google Translate, Bing Microsoft Translator, and mBART, using the English portion of the bilingual UM-corpus dataset. The results show that TermMT detected a total of 3,765 translation errors on Google Translate, 2,351 on Bing Microsoft Translator, and 6,011 on mBART, with precisions of 82.33%, 83.00%, and 86.33%, respectively.
DOI: 10.1145/3691620.3695069
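
Illustrative sketch: the two metamorphic relations named in the summary can be approximated as simple checks over a translation function. This is not the TermMT implementation; the summary does not say how references are generated, how terms are modified, or how term translations are compared. Every name here (`Translate`, `coverage`, `violates_mr_a`, `violates_mr_b`) and the 0.8 threshold are assumptions made for illustration, and the sentence-level comparison is only a coarse stand-in for comparing the translation of the term itself.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical interface: any function mapping a source sentence to its translation.
Translate = Callable[[str], str]


def coverage(original: str, follow_up: str) -> float:
    """Fraction of `original` characters found in matching blocks of `follow_up`."""
    if not original:
        return 1.0
    matcher = SequenceMatcher(None, original, follow_up, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(original)


def violates_mr_a(translate: Translate, sentence: str, term: str,
                  reference: str, threshold: float = 0.8) -> bool:
    """Relation (a): adding a reference next to the term should not change
    how the term is translated. Assumed proxy: the original translation
    should survive largely intact inside the follow-up translation."""
    follow_up = sentence.replace(term, f"{term} ({reference})", 1)
    return coverage(translate(sentence), translate(follow_up)) < threshold


def violates_mr_b(translate: Translate, sentence: str, term: str,
                  revised_term: str) -> bool:
    """Relation (b): modifying part of a multi-word term should change the
    translation. Identical output for both inputs is flagged as suspicious."""
    mutated = sentence.replace(term, revised_term, 1)
    return translate(sentence) == translate(mutated)
```

In use, `translate` would wrap a call to the system under test. A flagged case is a candidate error for inspection rather than a confirmed mistranslation, which is consistent with the summary reporting precision rather than assuming every detection is correct.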