Text-to-Speech for Low-Resource Agglutinative Language With Morphology-Aware Language Model Pre-Training

Text-to-Speech (TTS) aims to convert input text into human-like speech. With the development of deep learning, encoder-decoder based TTS models achieve superior performance, in terms of naturalness, in mainstream languages such as Chinese and English. Note that the linguistic information learni...

Bibliographic Details
Published in:	IEEE/ACM transactions on audio, speech, and language processing Vol. 32; pp. 1 - 13
Main Authors:	Liu, Rui; Hu, Yifan; Zuo, Haolin; Luo, Zhaojie; Wang, Longbiao; Gao, Guanglai
Format:	Journal Article
Language:	English
Published:	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2024
Subjects:	Acoustics; agglutinative; Agglutinative languages; Chinese languages; Data models; Decoding; Deep learning; Effectiveness; Encoders-Decoders; Language; Language modeling; Languages; Linguistics; Masking; Morphology; Naturalness; pre-training; Prosody; Speech processing; Speech recognition; Speech synthesis; Synthesis; Text-to-speech; Text-to-speech (TTS); Training
ISSN:	2329-9290, 2329-9304