Optimized Discriminative Kernel for SVM Scoring and Its Application to Speaker Verification

Bibliographic details
Published in: IEEE Transactions on Neural Networks, Vol. 22, No. 2, pp. 173-185
Main authors: Zhang, Shi-Xiong; Mak, Man-Wai
Format: Journal Article
Language: English
Publication details: New York, NY: IEEE, 01.02.2011
Institute of Electrical and Electronics Engineers
ISSN: 1045-9227, 1941-0093
Description
Summary: The decision-making process of many binary classification systems is based on the likelihood ratio (LR) scores of test patterns. This paper shows that LR scores can be expressed in terms of the similarity between the supervectors (SVs) formed by stacking the mean vectors of Gaussian mixture models corresponding to the test patterns, the target model, and the background model. By interpreting the support vector machine (SVM) kernels as a specific similarity (or discriminant) function between SVs, this paper shows that LR scoring is a special case of SVM scoring and that most sequence kernels can be obtained by assuming a specific form for the similarity function of SVs. This paper further shows that this assumption can be relaxed to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate the positive and negative classes using either regression analysis or SVM training. The idea was applied to both high- and low-level speaker verification. In both cases, results show that the proposed kernels achieve better performance than several state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.
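Illustrative sketch: the summary describes a kernel formed as a linear combination of base kernels evaluated on supervectors. The Python below is a minimal illustration of that general idea only, not the authors' implementation. The function names, the choice of a linear and an RBF base kernel, the `gamma` value, and the hand-picked weights are all assumptions made for this example. In the paper the weights are learned by regression analysis or SVM training, which is not reproduced here.

```python
import math

def linear_kernel(x, y):
    """Inner product of two supervectors (plain lists of floats)."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel; gamma is an arbitrary value chosen for illustration."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def combined_kernel(x, y, weights, kernels):
    """Weighted sum of base kernels.

    With non-negative weights, a sum of valid kernels is itself a valid kernel.
    The weights here are supplied by the caller (hypothetical values), not learned.
    """
    return sum(w * k(x, y) for w, k in zip(weights, kernels))

def gram_matrix(vectors, weights, kernels):
    """Gram matrix of the combined kernel over a list of supervectors."""
    return [[combined_kernel(u, v, weights, kernels) for v in vectors]
            for u in vectors]

if __name__ == "__main__":
    # Toy two-dimensional "supervectors"; real ones stack many GMM mean vectors.
    target = [1.0, 0.0]
    test = [0.0, 1.0]
    base_kernels = [linear_kernel, rbf_kernel]
    weights = [0.5, 0.5]  # hypothetical, hand-picked combination weights
    for row in gram_matrix([target, test], weights, base_kernels):
        print(row)
```

A Gram matrix built this way could be passed to any SVM trainer that accepts precomputed kernels. How the weights are chosen is the part this sketch leaves open.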
DOI: 10.1109/TNN.2010.2090893