A DenseU-Net framework for Music Source Separation using Spectrogram Domain Approach

Saved in:
Detailed bibliography
Title: A DenseU-Net framework for Music Source Separation using Spectrogram Domain Approach
Authors: Vinitha George E
Source: International Journal of Intelligent Systems and Applications in Engineering; Vol. 12 No. 4 (2024); 77-85
Publisher information: International Journal of Intelligent Systems and Applications in Engineering, 2024.
Year of publication: 2024
Subjects: Autoencoder, Convolutional Neural Network, Deep learning, DenseNet, Music source separation, ResNet, U-Net architecture
Description: Audio source separation has been intensively explored by the research community. Deep learning algorithms aid in creating neural network models that isolate the different sources present in a music mixture. In this paper, we propose an algorithm to separate the constituent sources of a music signal mixture using a DenseU-Net framework. Converting an audio signal into a spectrogram, akin to an image, accentuates valuable attributes concealed in the time-domain signal; hence, a spectrogram-based model is chosen for extracting the target signal. The model incorporates a dense block into the layers of the U-Net structure. The proposed system is trained to extract individual source spectrograms from the mixture spectrogram. An ablation study, in which the dense block was replaced with plain convolution filters, was performed to assess the dense block's effectiveness. The proposed method proves more effective than other state-of-the-art methods: separating vocals, bass, drums, and other sources yields an average SDR of 6.59 dB on the MUSDB database.
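The spectrogram conversion the abstract describes can be sketched minimally as a framed-FFT magnitude spectrogram. This is an illustrative sketch only; the frame length, hop size, and window below are assumptions, not the authors' actual STFT parameters:

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=1024, hop=256):
    """Convert a 1-D audio signal into a magnitude spectrogram
    (an image-like array of frequency bins x time frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping windowed frames
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    # rfft keeps only the non-redundant half of the spectrum
    return np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, n_frames)

# Example: one second of a 440 Hz tone at a 16 kHz sample rate
sr = 16000
t = np.arange(sr) / sr
spec = magnitude_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (513, 59)
```

A separation model such as the one described would take the mixture's spectrogram as input and be trained to emit one such spectrogram (or a mask over it) per target source.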
Document type: Article
File description: application/pdf
Language: English
ISSN: 2147-6799
Access URL: https://www.ijisae.org/index.php/IJISAE/article/view/6175
Rights: CC BY-SA
Accession number: edsair.issn21476799..2777c9b7a9e731cc4820acbb1d0afcaa
Database: OpenAIRE