Utterance Style Transfer Using Deep Models
| Published in: | Procedia computer science Vol. 192; pp. 2132 - 2141 |
|---|---|
| Main Authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V. 2021 |
| Subjects: | |
| ISSN: | 1877-0509 |
| Summary: | The paper describes an approach to utterance style transfer that exchanges the speaker’s identity and emotional tone while preserving the utterance’s content. Using deep generative neural networks, we developed two models that differ in how they introduce information about the speaker’s identity and an additional variable representing the expected emotional category. The emotion embedding is taken from a convolutional network trained to classify emotional categories. A Siamese network is responsible for learning the content embeddings. The models can perform style transfer between any two speakers with satisfactory results. This assessment is based on a survey that considers the degree of content retention, the quality of transferring identity-related voice features, and the degree to which emotional features are converted into the desired category. |
| DOI: | 10.1016/j.procs.2021.08.226 |