Truly Unsupervised Acoustic Word Embeddings Using Weak Top-down Constraints in Encoder-decoder Models

We investigate unsupervised models that can map a variable-duration speech segment to a fixed-dimensional representation. In settings where unlabelled speech is the only available resource, such acoustic word embeddings can form the basis for "zero-resource" speech search, discovery and in...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Proceedings of the ... IEEE International Conference on Acoustics, Speech and Signal Processing (1998) S. 6535 - 6539
1. Verfasser:	Kamper, Herman
Format:	Tagungsbericht
Sprache:	Englisch
Veröffentlicht:	IEEE 01.05.2019
Schlagworte:	Acoustic word embeddings Acoustics Decoding query-by-example Recurrent neural networks Speech processing Task analysis Training unsupervised learning zero-resource speech processing
ISSN:	2379-190X
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We investigate unsupervised models that can map a variable-duration speech segment to a fixed-dimensional representation. In settings where unlabelled speech is the only available resource, such acoustic word embeddings can form the basis for "zero-resource" speech search, discovery and indexing systems. Most existing unsupervised embedding methods still use some supervision, such as word or phoneme boundaries. Here we propose the encoder-decoder correspondence autoencoder (EncDec-CAE), which, instead of true word segments, uses automatically discovered segments: an unsupervised term discovery system finds pairs of words of the same unknown type, and the EncDec-CAE is trained to reconstruct one word given the other as input. We compare it to a standard encoder-decoder autoencoder (AE), a variational AE with a prior over its latent embedding, and downsampling. EncDec-CAE outperforms its closest competitor by 29% relative in average precision on two languages in a word discrimination task.
ISSN:	2379-190X
DOI:	10.1109/ICASSP.2019.8683639