Utterance Style Transfer Using Deep Models
| Published in: | Procedia Computer Science, Vol. 192, pp. 2132-2141 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 2021 |
| Keywords: | |
| ISSN: | 1877-0509 |
| Online access: | Full text |
| Abstract: | The paper describes a solution to utterance style transfer that exchanges the speaker’s identity and emotional tone while maintaining the utterance’s content. Using deep generative neural networks, we developed two models that differ in how they introduce information related to the speaker’s identity and an additional variable representing the expected emotional category. The emotion embedding is taken from a convolutional network trained to classify emotional categories. A Siamese network is responsible for learning the content embeddings. The models can perform style transfer between any two speakers with satisfactory results. This assessment is based on a survey that considers the degree of content retention, the quality of transferring voice features related to identity, and the degree to which emotional features are converted into the desired category. |
|---|---|
| DOI: | 10.1016/j.procs.2021.08.226 |
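
The abstract states that a Siamese network learns the content embeddings but does not say which training objective is used. As an illustration only, the sketch below shows the classic contrastive loss often paired with Siamese networks: embeddings of two utterances with the same content are pulled together, and embeddings of different content are pushed at least a margin apart. The function names, the `margin` parameter, and the choice of this loss are assumptions made for the example, not details taken from the paper.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, same_content, margin=1.0):
    """Contrastive loss for one pair of embeddings (hypothetical helper).

    same_content=True  -> penalize any distance between the embeddings.
    same_content=False -> penalize only pairs closer than `margin`.
    """
    d = euclidean_distance(emb_a, emb_b)
    if same_content:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Two utterances with the same text but different speakers should map
# to nearby content embeddings, so a large distance is penalized.
loss_same = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_content=True)

# Utterances with different text that are already far apart incur no loss.
loss_diff = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_content=False)
```

In a real system the embeddings would come from the two weight-sharing branches of the network and the loss would be averaged over a batch; plain lists are used here to keep the example self-contained.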