Transfer learning and subword sampling for asymmetric-resource one-to-many neural translation
| Published in: | Machine Translation, Volume 34, Issue 4, pp. 251–286 |
|---|---|
| Main authors: | Stig-Arne Grönroos; Sami Virpioja; Mikko Kurimo |
| Medium: | Journal Article |
| Language: | English |
| Publication details: | Dordrecht: Springer Netherlands (Springer Nature B.V.), 01.12.2020 |
| ISSN: | 0922-6567, 1573-0573 |
| Abstract | There are several approaches for improving neural machine translation for low-resource languages: monolingual data can be exploited via pretraining or data augmentation; parallel corpora on related language pairs can be used via parameter sharing or transfer learning in multilingual models; subword segmentation and regularization techniques can be applied to ensure high coverage of the vocabulary. We review these approaches in the context of an asymmetric-resource one-to-many translation task, in which the pair of target languages are related, with one being a very low-resource and the other a higher-resource language. We test various methods on three artificially restricted translation tasks—English to Estonian (low-resource) and Finnish (high-resource), English to Slovak and Czech, English to Danish and Swedish—and one real-world task, Norwegian to North Sámi and Finnish. The experiments show positive effects especially for scheduled multi-task learning, denoising autoencoder, and subword sampling. |
|---|---|
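The subword sampling mentioned in the abstract draws a fresh segmentation of each word on every training pass instead of committing to the single best one, so the model also sees sub-optimal splits of rare words. A minimal sketch under an assumed toy unigram vocabulary (`VOCAB` and the function names are illustrative; a trained unigram model such as SentencePiece would supply real piece probabilities):

```python
import math
import random

# Hypothetical toy unigram subword probabilities; a trained unigram
# model (e.g. SentencePiece) would supply real values.
VOCAB = {"low": 0.3, "lower": 0.2, "er": 0.2,
         "l": 0.05, "o": 0.05, "w": 0.05, "e": 0.05, "r": 0.05}

def segmentations(word):
    """Enumerate every segmentation of `word` into in-vocabulary pieces."""
    if not word:
        return [[]]
    segs = []
    for i in range(1, len(word) + 1):
        piece = word[:i]
        if piece in VOCAB:
            segs += [[piece] + rest for rest in segmentations(word[i:])]
    return segs

def sample_segmentation(word, alpha=0.2, rng=random):
    """Sample a segmentation with probability proportional to
    (product of piece probabilities) ** alpha; alpha < 1 flattens the
    distribution so rarer segmentations also appear in training."""
    segs = segmentations(word)
    weights = [math.prod(VOCAB[p] for p in seg) ** alpha for seg in segs]
    return rng.choices(segs, weights=weights, k=1)[0]
```

Called once per word per epoch, this acts as a regularizer: the translation model sees "lower" sometimes as a single piece and sometimes as "low" + "er".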
| Authors: | Stig-Arne Grönroos (ORCID 0000-0002-3750-6924), Department of Signal Processing and Acoustics, Aalto University, stig-arne.gronroos@aalto.fi; Sami Virpioja (ORCID 0000-0002-3568-150X), Department of Digital Humanities, University of Helsinki / Utopia Analytics; Mikko Kurimo, Department of Signal Processing and Acoustics, Aalto University |
| Copyright: | The Author(s), under exclusive licence to Springer Nature B.V. part of Springer Nature 2021 |
| DOI | 10.1007/s10590-020-09253-x |
| Discipline: | Languages & Literatures; Computer Science |
| Funding: | European Research Council (grant 771113); European Union’s Horizon 2020 Research and Innovation Programme (grant 780069) |
| Open access: | Yes |
| Peer reviewed: | Yes |
| Keywords: | Low-resource languages; Subword segmentation; Multilingual machine translation; Denoising sequence autoencoder; Transfer learning; Multi-task learning |
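The "denoising sequence autoencoder" keyword refers to an auxiliary training task in which the decoder must reconstruct a clean sentence from a corrupted copy of it, letting monolingual target-language data contribute to training. A sketch of two common corruption operations (the function name and parameters are illustrative assumptions, not taken from the paper):

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_dist=3, rng=random):
    """Corrupt a token sequence for denoising-autoencoder training.

    Two standard noise types: word dropout (delete each token with
    probability drop_prob, never dropping all of them) and a local
    shuffle that keeps each surviving token within roughly
    shuffle_dist positions of its original slot.
    """
    kept = [t for t in tokens if rng.random() >= drop_prob]
    if not kept:                      # never emit an empty sequence
        kept = [rng.choice(tokens)]
    # Sorting by (index + uniform noise) yields a permutation in which
    # each token stays close to where it started.
    keys = [i + rng.uniform(0, shuffle_dist) for i in range(len(kept))]
    order = sorted(range(len(kept)), key=lambda i: keys[i])
    return [kept[i] for i in order]
```

During multi-task training, pairs of (noisy sentence, clean sentence) from monolingual data are mixed into the batches alongside genuine translation pairs.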
| OpenAccessLink | https://aaltodoc.aalto.fi/handle/123456789/102739 |
http://hdl.handle.net/11858/00-097C-0000-0006-AADA-9, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University – reference: DempsterAPLairdNMRubinDBMaximum likelihood from incomplete data via the EM algorithmJ R Stat Soc19773911385015370364.62022 – reference: Grönroos SA, Virpioja S, Kurimo M (2018) Cognate-aware morphological segmentation for multilingual neural translation. In: Proceedings of the third conference on machine translation, Association for Computational Linguistics, Brussels – reference: Edunov S, Ott M, Auli M, Grangier D (2018) Understanding back-translation at scale. In: Proceedings of the 2018 conference on empirical methods in natural language processing (EMNLP), pp 489–500 – reference: GoldsmithJUnsupervised learning of the morphology of a natural languageComput Linguist2001272153198184150810.1162/089120101750300490 – reference: Sachan DS, Neubig G (2018) Parameter sharing methods for multilingual self-attentional translation models. In: Proceedings of the third conference on machine translation (WMT): research papers, pp 261–271, https://www.aclweb.org/anthology/W18-6327 – reference: Sriram A, Jun H, Satheesh S, Coates A (2017) Cold fusion: Training seq2seq models together with language models. In: Proceedings of the Interspeech 2018, https://arxiv.org/abs/1708.06426 – reference: Virpioja S, Turunen VT, Spiegler S, Kohonen O, Kurimo M (2011) Empirical comparison of evaluation methods for unsupervised learning of morphology. 
Traitement Automatique des Langues 52(2):45–90, http://www.atala.org/Empirical-Comparison-of-Evaluation – reference: ForcadaMLGinestí-RosellMNordfalkJO’ReganJOrtiz-RojasSPérez-OrtizJASánchez-MartínezFRamírez-SánchezGTyersFMApertium: a free/open-source platform for rule-based machine translationMach Transl201125212714410.1007/s10590-011-9090-0 – reference: KarakantaADehdariJvan GenabithJNeural machine translation for low-resource languages without parallel corporaMach Transl2018321–216718910.1007/s10590-017-9203-5 – reference: Song K, Tan X, Qin T, Lu J, Liu TY (2019) Mass: Masked sequence to sequence pre-training for language generation. In: Proceedings of the international conference on machine learning (ICML), pp 5926–5936 – reference: Costa-jussà MR, Fonollosa JAR (2016) Character-based neural machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics (ACL) (Volume 2: Short Papers), Association for Computational Linguistics, Berlin, Germany, pp 357–361, https://doi.org/10.18653/v1/P16-2058, https://www.aclweb.org/anthology/P16-2058 – reference: Firat O, Cho K, Bengio Y (2016) Multi-way, multilingual neural machine translation with a shared attention mechanism. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies, pp 866–875, http://arxiv.org/abs/1601.01073 – reference: Virpioja S, Väyrynen JJ, Creutz M, Sadeniemi M (2007) Morphology-aware statistical machine translation based on morphs induced in an unsupervised manner. Machine Translation Summit XI, Copenhagen, Denmark 2007:491–498 – reference: Chu C, Dabre R, Kurohashi S (2017) An empirical comparison of domain adaptation methods for neural machine translation. 
In: Proceedings of the 55th annual meeting of the association for computational linguistics (ACL) (Volume 2: Short Papers), Association for Computational Linguistics, Vancouver, pp 385–391, https://doi.org/10.18653/v1/P17-2061, https://www.aclweb.org/anthology/P17-2061 – reference: Salesky E, Runge A, Coda A, Niehues J, Neubig G (2020) Optimizing segmentation granularity for neural machine translation. Mach Transl pp 1–19 – reference: Luong MT (2016) Neural machine translation. PhD Thesis, Stanford University – reference: SrivastavaNHintonGKrizhevskyASutskeverISalakhutdinovRDropout: a simple way to prevent neural networks from overfittingJ Mach Learn Res20141511929195832315921318.68153 – reference: Kurimo M, Virpioja S, Turunen V, Lagus K (2010) Morpho challenge 2005-2010: Evaluations and results. In: Heinz J, Cahill L, Wicentowski R (eds) Proceedings of the 11th meeting of the ACL special interest group on computational morphology and phonology, association for computational linguistics, Uppsala, Sweden, pp 87–95 – reference: Gulcehre C, Firat O, Xu K, Cho K, Barrault L, Lin HC, Bougares F, Schwenk H, Bengio Y (2015) On using monolingual corpora in neural machine translation. http://arxiv.org/abs/1503.03535 – reference: Sennrich R, Haddow B, Birch A (2015) Neural machine translation of rare words with subword units. In: Proceedings of the annual meeting of the association for computational linguistics (ACL), http://arxiv.org/abs/1508.07909 – reference: Grönroos SA, Virpioja S, Kurimo M (2020) Morfessor EM+Prune: improved subword segmentation with expectation maximization and pruning. In: Proceedings of the 12th language resources and evaluation conference, ELRA, Marseilles – reference: Kudo T (2018) Subword regularization: Improving neural network translation models with multiple subword candidates. 
In: Proceedings of the 56th annual meeting of the association for computational linguistics (ACL) (Volume 1: Long Papers), pp 66–75, http://arxiv.org/abs/1804.10959 – reference: BourlardHKampYAuto-association by multilayer perceptrons and singular value decompositionBiol Cybern1988594–529129496112010.1007/BF00332918 – reference: Lison P, Tiedemann J (2016) OpenSubtitles2016: Extracting large parallel corpora from movie and tv subtitles. In: Proceedings of the 10th international conference on language resources and evaluation (LREC 2016), European Language Resources Association – reference: KoehnPEuroparl: a parallel corpus for statistical machine translationMT Summit200557986 – reference: Bojar O, Dušek O, Kocmi T, Libovický J, Novák M, Popel M, Sudarikov R, Variš D (2016) CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered. In: Sojka P, Horák A, Kopeček I, Pala K (eds) Text, speech, and dialogue: 19th international conference, TSD 2016, Masaryk University, Springer International Publishing, Cham/Heidelberg/New York/Dordrecht/London, no. 9924 in Lecture Notes in Artificial Intelligence, pp 231–238 – reference: Vincent P, Larochelle H, Bengio Y, Manzagol PA (2008) Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on machine learning (ICML), pp 1096–1103 – reference: KoponenMSalmiLNikulinMA product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation outputMach Transl2019331–2619010.1007/s10590-019-09228-7 – reference: Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. 
In: 40th annual meeting of the association for computational linguistics, Association for Computational Linguistics, Philadelphia, pp 311–318, https://doi.org/10.3115/1073083.1073135, http://portal.acm.org/citation.cfm?doid=1073083.1073135 – reference: HammarströmHBorinLUnsupervised learning of morphologyComput Linguist2011372309350184150810.1162/COLI_a_00050 – reference: Zoph B, Yuret D, May J, Knight K (2016) Transfer learning for low-resource neural machine translation. In: Proceedings of the 2016 conference on empirical methods in natural language processing (EMNLP), pp 1568–1575 – reference: Vaibhav V, Singh S, Stewart C, Neubig G (2019) Improving robustness of machine translation with synthetic noise. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (Long and Short Papers), pp 1916–1920, https://www.aclweb.org/anthology/N19-1190/ – reference: Lee YS (2004) Morphological analysis for statistical machine translation. In: Proceedings of HLT-NAACL 2004: short papers, Association for Computational Linguistics, pp 57–60 – reference: Lewis M, Liu Y, Goyal N, Ghazvininejad M, Mohamed A, Levy O, Stoyanov V, Zettlemoyer L (2019) BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. ArXiv:1910.13461 [cs.CL], arXiv:1910.13461 – reference: Goodfellow IJ, Mirza M, Xiao D, Courville A, Bengio Y (2014) An empirical investigation of catastrophic forgetting in gradient-based neural networks. In: Proceedings of international conference on learning representations (ICLR), Citeseer, https://arxiv.org/abs/1312.6211 – reference: Chen Z, Badrinarayanan V, Lee CY, Rabinovich A (2018) Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. 
In: Proceedings of the international conference on machine learning (ICML), pp 794–803 – reference: Conneau A, Lample G (2019) Cross-lingual language model pretraining. In: Proceedings of advances in neural information processing systems (NIPS), pp 7059–7069, http://papers.nips.cc/paper/8928-cross-lingual-language-model-pretraining – reference: Lample G, Conneau A, Denoyer L, Ranzato M (2018a) Unsupervised machine translation using monolingual corpora only. In: International conference on learning representations (ICLR), http://arxiv.org/abs/1711.00043 – reference: Zhang J, Zong C (2016) Exploiting source-side monolingual data in neural machine translation. In: Proceedings of the 2016 conference on empirical methods in natural language processing (EMNLP), pp 1535–1545 – reference: Chung J, Cho K, Bengio Y (2016) A character-level decoder without explicit segmentation for neural machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics (ACL) (Volume 1: Long Papers), pp 1693–1703, http://arxiv.org/abs/1603.06147 – reference: Thompson B, Khayrallah H, Anastasopoulos A, McCarthy AD, Duh K, Marvin R, McNamee P, Gwinnup J, Anderson T, Koehn P (2018) Freezing subnetworks to analyze domain adaptation in neural machine translation. In: Proceedings of the third conference on machine translation (WMT): research papers, pp 124–132 – reference: Kudo T, Richardson J (2018) SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In: Proceedings of the 2018 conference on empirical methods in natural language processing: system demonstrations, association for computational linguistics, Brussels, Belgium, pp 66–71, https://doi.org/10.18653/v1/D18-2012, https://www.aclweb.org/anthology/D18-2012 – reference: Tu Z, Liu Y, Shang L, Liu X, Li H (2017) Neural machine translation with reconstruction. 
In: Thirty-first AAAI conference on artificial intelligence, http://arxiv.org/abs/1611.01874 – reference: Arivazhagan N, Bapna A, Firat O, Lepikhin D, Johnson M, Krikun M, Chen MX, Cao Y, Foster G, Cherry C, Macherey W, Chen Z, Wu Y (2019) Massively multilingual neural machine translation in the wild: Findings and challenges. arXiv:1907.05019 [cs.CL] – ident: 9253_CR44 doi: 10.3115/v1/P15-1162 – ident: 9253_CR76 doi: 10.1007/s10590-019-09243-8 – ident: 9253_CR10 – ident: 9253_CR2 – ident: 9253_CR33 – ident: 9253_CR85 doi: 10.18653/v1/W18-6321 – ident: 9253_CR79 – ident: 9253_CR105 – volume: 37 start-page: 309 issue: 2 year: 2011 ident: 9253_CR41 publication-title: Comput Linguist doi: 10.1162/COLI_a_00050 – ident: 9253_CR56 – ident: 9253_CR50 doi: 10.18653/v1/P17-4012 – ident: 9253_CR88 doi: 10.1109/CVPR.2016.308 – volume: 27 start-page: 153 issue: 2 year: 2001 ident: 9253_CR34 publication-title: Comput Linguist doi: 10.1162/089120101750300490 – ident: 9253_CR95 doi: 10.1109/ASRU.2013.6707697 – ident: 9253_CR27 – volume: 59 start-page: 291 issue: 4–5 year: 1988 ident: 9253_CR7 publication-title: Biol Cybern doi: 10.1007/BF00332918 – ident: 9253_CR65 – ident: 9253_CR82 – ident: 9253_CR11 doi: 10.18653/v1/P16-1185 – ident: 9253_CR36 doi: 10.18653/v1/W19-5205 – volume: 39 start-page: 1 issue: 1 year: 1977 ident: 9253_CR25 publication-title: J R Stat Soc doi: 10.1111/j.2517-6161.1977.tb01600.x – ident: 9253_CR38 – ident: 9253_CR59 – volume: 97 start-page: 337 issue: 457 year: 2002 ident: 9253_CR77 publication-title: J Am Stat Assoc doi: 10.1198/016214502753479464 – ident: 9253_CR70 doi: 10.3115/1073083.1073135 – ident: 9253_CR9 doi: 10.18653/v1/W19-5206 – ident: 9253_CR24 – ident: 9253_CR104 doi: 10.18653/v1/D16-1163 – ident: 9253_CR51 doi: 10.18653/v1/W18-6325 – ident: 9253_CR21 doi: 10.18653/v1/W17-4715 – volume: 5 start-page: 79 year: 2005 ident: 9253_CR53 publication-title: MT Summit – ident: 9253_CR67 doi: 10.1016/S0079-7421(08)60536-8 – ident: 9253_CR64 – ident: 
9253_CR4 – ident: 9253_CR101 doi: 10.18653/v1/D18-1100 – ident: 9253_CR69 doi: 10.3115/1626355.1626359 – ident: 9253_CR89 doi: 10.18653/v1/W18-6313 – ident: 9253_CR14 doi: 10.18653/v1/P16-1160 – volume: 15 start-page: 1929 issue: 1 year: 2014 ident: 9253_CR84 publication-title: J Mach Learn Res – ident: 9253_CR86 – ident: 9253_CR103 doi: 10.18653/v1/D16-1160 – ident: 9253_CR1 – ident: 9253_CR92 – ident: 9253_CR19 – ident: 9253_CR30 doi: 10.18653/v1/N16-1101 – volume: 31 start-page: 190 issue: 2 year: 1955 ident: 9253_CR42 publication-title: Language doi: 10.2307/411036 – ident: 9253_CR52 doi: 10.18653/v1/W18-6325 – ident: 9253_CR97 doi: 10.1145/1390156.1390294 – ident: 9253_CR47 – ident: 9253_CR75 – ident: 9253_CR29 doi: 10.18653/v1/D18-1045 – ident: 9253_CR22 – volume-title: Stochastic complexity in statistical inquiry year: 1989 ident: 9253_CR74 – volume: 33 start-page: 61 issue: 1–2 year: 2019 ident: 9253_CR55 publication-title: Mach Transl doi: 10.1007/s10590-019-09228-7 – ident: 9253_CR61 doi: 10.18653/v1/D18-1549 – ident: 9253_CR81 – ident: 9253_CR60 doi: 10.18653/v1/D18-1549 – ident: 9253_CR94 doi: 10.18653/v1/N19-1190 – ident: 9253_CR73 doi: 10.18653/v1/D17-1039 – ident: 9253_CR98 – ident: 9253_CR57 doi: 10.18653/v1/P18-1007 – ident: 9253_CR23 – ident: 9253_CR39 doi: 10.18653/v1/D18-1398 – ident: 9253_CR71 doi: 10.18653/v1/D18-1039 – volume: 25 start-page: 127 issue: 2 year: 2011 ident: 9253_CR31 publication-title: Mach Transl doi: 10.1007/s10590-011-9090-0 – ident: 9253_CR63 – ident: 9253_CR3 – start-page: 95 volume-title: Learning to learn year: 1998 ident: 9253_CR8 doi: 10.1007/978-1-4615-5529-2_5 – volume: 5 start-page: 339 year: 2017 ident: 9253_CR45 publication-title: Trans Assoc Comput Linguist doi: 10.1162/tacl_a_00065 – ident: 9253_CR80 doi: 10.18653/v1/P16-1009 – ident: 9253_CR58 doi: 10.18653/v1/D18-2012 – ident: 9253_CR90 – ident: 9253_CR13 doi: 10.18653/v1/P17-2061 – volume: 4 start-page: 405 issue: 1 year: 2007 ident: 9253_CR20 
publication-title: ACM Trans Speech Lang Process doi: 10.1145/1187415.1187418 – volume: 6 start-page: 225 year: 2018 ident: 9253_CR49 publication-title: Trans Assoc Comput Linguist doi: 10.1162/tacl_a_00017 – ident: 9253_CR102 doi: 10.18653/v1/P18-1005 – ident: 9253_CR18 doi: 10.3115/1118647.1118650 – volume: 32 start-page: 167 issue: 1–2 year: 2018 ident: 9253_CR48 publication-title: Mach Transl doi: 10.1007/s10590-017-9203-5 – ident: 9253_CR66 – ident: 9253_CR83 – ident: 9253_CR87 – volume: 12 start-page: 23 issue: 2 year: 1994 ident: 9253_CR32 publication-title: C Users J – ident: 9253_CR35 – ident: 9253_CR6 doi: 10.18653/v1/W18-6401 – ident: 9253_CR54 – ident: 9253_CR91 doi: 10.18653/v1/E17-1100 – ident: 9253_CR12 doi: 10.18653/v1/D18-1461 – ident: 9253_CR62 doi: 10.3115/1613984.1613999 – ident: 9253_CR72 doi: 10.18653/v1/W15-3049 – ident: 9253_CR15 – ident: 9253_CR37 doi: 10.18653/v1/W18-6410 – ident: 9253_CR40 – ident: 9253_CR16 doi: 10.18653/v1/P16-2058 – ident: 9253_CR93 doi: 10.1609/aaai.v31i1.10950 – volume: 8 start-page: 64 year: 2020 ident: 9253_CR46 publication-title: Trans Assoc Comput Linguist doi: 10.1162/tacl_a_00300 – ident: 9253_CR96 – ident: 9253_CR99 – ident: 9253_CR100 – ident: 9253_CR78 doi: 10.18653/v1/P19-1021 – ident: 9253_CR28 doi: 10.18653/v1/D17-1158 – ident: 9253_CR26 – ident: 9253_CR43 – ident: 9253_CR5 – ident: 9253_CR68 – ident: 9253_CR17 doi: 10.18653/v1/W17-4123 |
| StartPage | 251 |
| SubjectTerms | Artificial Intelligence; Asymmetry; Computational Linguistics; Computer Science; Czech language; Danish language; Denoising; English language; Estonian language; Experiments; Finnish language; Languages; Learning; Learning transfer; Machine translation; Monolingualism; Natural Language Processing (NLP); Noise reduction; Norwegian language; Parallel corpora; Pretraining; Regularization; Sami languages; Sampling; Segmentation; Slovak language; Swedish language; Translation; Translation methods and strategies; Vocabulary |
| Title | Transfer learning and subword sampling for asymmetric-resource one-to-many neural translation |
| URI | https://link.springer.com/article/10.1007/s10590-020-09253-x https://www.proquest.com/docview/2493492904 |
| Volume | 34 |