Detailed bibliography
| Title: |
Speech Time-Scale Modification With GANs. |
| Authors: |
Cohen, Eyal; Kreuk, Felix; Keshet, Joseph |
| Source: |
IEEE Signal Processing Letters; May 2022, Vol. 29, pp. 1067-1071 (5 pages) |
| Subjects: |
AUTOMATIC speech recognition, GENERATIVE adversarial networks, MUSICAL perception, MACHINE learning, SIGNAL processing, ARTIFICIAL neural networks, SPEECH perception, SPEECH |
| Abstract: |
While listening to spoken content, it is often desired to vary the speech rate while preserving the speaker’s timbre and pitch. To date, advanced signal processing techniques are used to address this task, but it still remains a challenge to maintain a high speech quality at all time-scales. Inspired by the success of speech generation using Generative Adversarial Networks (GANs), we propose a novel unsupervised learning algorithm for time-scale modification (TSM) of speech, called ScalerGAN. The model is trained using a set of speech utterances, where no time-scales are provided. The ScalerGAN algorithm is composed of a generator that gets as input speech with the desired rate and outputs a time-adjusted speech; a discriminator that works on various spectrum scales; and a decoder that converts the time-adjusted signal back to the original rate to maintain consistency. Using an A/B test and conditional A/B test, human listeners were asked to compare ScalerGAN with other state-of-the-art TSM methods. The results showed that the speech quality of ScalerGAN outperforms all other methods. [ABSTRACT FROM AUTHOR] |
| Database: |
Complementary Index |