Nonnegative Matrix Factorization With Basis Clustering Using Cepstral Distance Regularization
| Published in: | IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 6, pp. 1029-1040 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 1 June 2018 |
| Subjects: | |
| ISSN: | 2329-9290, 2329-9304 |
| Summary: | One successful approach for audio source separation involves applying nonnegative matrix factorization (NMF) to a magnitude spectrogram regarded as a nonnegative matrix. This can be interpreted as approximating the observed spectra at each time frame as the linear sum of the basis spectra scaled by time-varying amplitudes. This paper deals with the problem of the unsupervised instrument-wise source separation of polyphonic signals based on an extension of the NMF approach. We focus on the fact that each piece of music is typically played on a handful of musical instruments, which allows us to assume that the spectra of the underlying audio events in a polyphonic signal can be grouped into a reasonably small number of clusters in the mel-frequency cepstral coefficient (MFCC) domain. Based on this assumption, we propose formulating factorization of a magnitude spectrogram and clustering of the basis spectra in the MFCC domain as a joint optimization problem and derive a novel optimization algorithm based on the majorization-minimization principle. Experimental results revealed that our method was superior to a two-stage algorithm that consists of performing factorization followed by clustering the basis spectra, thus showing the advantage of the joint optimization approach. |
|---|---|
| DOI: | 10.1109/TASLP.2018.2795746 |
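
The summary describes NMF as approximating each observed spectrum by a nonnegative sum of basis spectra scaled by time-varying amplitudes. A minimal sketch of that plain factorization step follows; it is not the joint factorization-and-clustering method of the paper. The function names `nmf_kl` and `kl_divergence`, the choice of the generalized Kullback-Leibler divergence as the cost, and all parameter defaults are assumptions made for illustration. The multiplicative updates used here are the standard ones, which can themselves be derived with a majorization-minimization argument.

```python
import numpy as np


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence between V and its approximation WH."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))


def nmf_kl(V, n_basis, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix V (frequency x time) as W @ H.

    Columns of W play the role of basis spectra; rows of H are the
    time-varying amplitudes. The KL cost is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Random strictly positive initialization (hypothetical choice).
    W = rng.random((n_freq, n_basis)) + eps
    H = rng.random((n_basis, n_time)) + eps
    for _ in range(n_iter):
        # Update amplitudes with the basis spectra held fixed.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update basis spectra with the amplitudes held fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (not real audio).
    rng = np.random.default_rng(1)
    V = rng.random((20, 3)) @ rng.random((3, 30))
    W, H = nmf_kl(V, n_basis=3)
    print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

In the two-stage baseline mentioned in the summary, the columns of `W` would be clustered afterwards in the MFCC domain; the proposed method instead optimizes factorization and clustering jointly, which this sketch does not attempt.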