A Multi-Feature Fusion Approach for Dialect Identification using 1D CNN

Bibliographic details
Published in:	JOIV : international journal on informatics visualization Online Vol. 8; no. 3; p. 1246
Main authors:	Karim, Sarkhel H.Taher; J. Ghafoor, Karzan; O. Abdulrahman, Ayub; M. Hama Rawf, Karwan
Format:	Journal Article
Language:	English
Publication details:	30.09.2024
ISSN:	2549-9610, 2549-9904

Description
Summary:	Kurdish has several dialects, and their phonological variety makes automatic dialect identification difficult. This study evaluates several acoustic features for identifying three Kurdish dialects: Badini, Hawrami, and Sorani. We used a dataset of 6,000 samples and a model that combines 1D convolutional neural network (CNN) layers with fully connected layers. The aim was to assess how effectively different acoustic features identify the dialects. We extracted Mel-frequency cepstral coefficients (MFCC) along with the Mel spectrogram, spectral contrast, and polynomial features. We trained and tested our models on both individual features and a composite of all features. The identification task reached high accuracy: MFCC combined with the Mel spectrogram achieved 95.75%. Including spectral contrast in the MFCC feature extraction gave 91.42%, and using poly_features resulted in 90.83%. Accuracy peaked at 96.5% when all features were combined.
DOI:	10.62527/joiv.8.3.2146
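
The summary above does not say how the individual features are combined into one model input. The following is a minimal sketch of one common approach, assuming each feature is a matrix of coefficients over time frames, that each row is averaged over time, and that the resulting vectors are concatenated in a fixed order. The names `FEATURE_ORDER`, `time_average`, and `fuse_features` are hypothetical and are not taken from the article.

```python
from statistics import fmean

# Assumed fusion order; the article's actual ordering is not stated in the summary.
FEATURE_ORDER = ("mfcc", "mel", "contrast", "poly")


def time_average(matrix):
    """Reduce a (n_coeffs x n_frames) matrix, given as a list of rows, to one mean per row."""
    return [fmean(row) for row in matrix]


def fuse_features(features):
    """Concatenate the time-averaged vectors of the available features into one 1D vector.

    `features` maps a feature name to its matrix. Names missing from the
    mapping are skipped, so the same function covers single-feature and
    all-feature experiments.
    """
    fused = []
    for name in FEATURE_ORDER:
        if name in features:
            fused.extend(time_average(features[name]))
    return fused


# Toy input: 2 MFCC rows, 1 contrast row, 1 polynomial row, 2 frames each.
toy = {
    "mfcc": [[1.0, 3.0], [2.0, 4.0]],
    "contrast": [[0.0, 2.0]],
    "poly": [[5.0, 5.0]],
}
fused_vector = fuse_features(toy)
```

A fused vector of this kind could then be fed to a 1D CNN as a single-channel sequence. Whether the authors averaged over time or kept frame-level features is an assumption here, not something the record states.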