Bibliographic Details
| Title: |
基于多分辨率特征和时频注意力的环境声音分类. (Chinese) |
| Alternate Title: |
Environmental sound classification based on multi-resolution features and time-frequency attention. (English) |
| Authors: |
刘慧, 李小霞, 何宏森 |
| Source: |
Application Research of Computers / Jisuanji Yingyong Yanjiu; Dec2021, Vol. 38 Issue 12, p3569-3573, 5p |
| Subject Terms: |
CONVOLUTIONAL neural networks, SPECTROGRAMS, LINEAR complementarity problem, NOISE |
| Abstract (English): |
This paper proposes a convolutional neural network method for environmental sound classification (ESC) based on multi-resolution features and a time-frequency attention module. First, compared with a single-resolution spectrogram, multi-channel, multi-resolution features enrich the feature information, let the different feature resolutions complement one another, and strengthen the expressive power of the features. Second, the paper proposes a time-frequency attention module designed for sound signals. The module first applies one-dimensional convolutions of different sizes to attend to the useful information in the time domain and the frequency domain separately, and then applies a two-dimensional convolution to fuse the two. This suppresses background noise in environmental sounds and removes the redundant information introduced by the multi-channel, multi-resolution features. Experimental results show classification accuracies of 98.50% and 88.46% on the ESC-10 and ESC-50 benchmark datasets, which are 2.70% and 0.76% higher, respectively, than the latest existing methods. [ABSTRACT FROM AUTHOR] |
| Abstract (Chinese): |
针对环境声音分类(ESC),提出了一种基于多分辨率特征和时频注意力的卷积神经网络环境声音分类方法。首先,相较单一分辨率的谱图,多通道多分辨率特征可以丰富特征信息,实现不同特征分辨率之间信息互补,增强特征的表达能力;其次,针对声信号提出了一种时频注意力模块,该模块先利用不同大小的一维卷积分别关注时域和频域有效信息,再用二维卷积将两者进行融合,从而抑制环境声中背景噪声并消除由多通道多分辨率带来的冗余信息干扰。实验结果表明,在ESC-10和ESC-50两个基准数据集上的分类准确率达到了98.50%和88.46%,与现有的最新方法相比分别提高了2.70%和0.76%。 [ABSTRACT FROM AUTHOR] |
| Database: |
Complementary Index |
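
The abstract describes the time-frequency attention module only in outline: one-dimensional convolutions of different sizes over the time and frequency axes, followed by a two-dimensional convolution that fuses the two. The NumPy sketch below illustrates that general idea and is not the authors' implementation. The pooling step, the additive broadcast, the sigmoid gate, the fixed averaging kernels, and every function and parameter name are assumptions made for illustration; in a trained network the kernels would be learned.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, kernel):
    """Zero-padded 'same' 2D cross-correlation of an (F, T) map with an odd-sized kernel."""
    kh, kw = kernel.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out


def time_frequency_attention(features, freq_kernel, time_kernel, fuse_kernel):
    """Hypothetical attention gate over features of shape (channels, freq_bins, time_frames).

    Assumes each 1D kernel is no longer than the axis it is applied to,
    so that mode="same" preserves the axis length.
    """
    # Assumption: average over channels, then summarize each axis separately.
    pooled = features.mean(axis=0)          # (F, T)
    freq_profile = pooled.mean(axis=1)      # (F,)
    time_profile = pooled.mean(axis=0)      # (T,)

    # 1D convolutions of different sizes along frequency and time.
    freq_att = np.convolve(freq_profile, freq_kernel, mode="same")
    time_att = np.convolve(time_profile, time_kernel, mode="same")

    # Assumption: broadcast both profiles to (F, T), then fuse with a 2D convolution.
    combined = freq_att[:, None] + time_att[None, :]
    mask = sigmoid(conv2d_same(combined, fuse_kernel))  # values in (0, 1)

    # Re-weight every channel with the same time-frequency mask.
    return features * mask[None, :, :], mask


# Example: three feature channels (e.g. three spectrogram resolutions), 64 bins, 128 frames.
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 64, 128))
weighted, mask = time_frequency_attention(
    features,
    freq_kernel=np.ones(3) / 3,        # shorter kernel along frequency (assumed size)
    time_kernel=np.ones(7) / 7,        # longer kernel along time (assumed size)
    fuse_kernel=np.full((3, 3), 1 / 9),
)
print(weighted.shape, mask.shape)
```

Because the mask lies strictly between 0 and 1, it can only attenuate time-frequency regions, which matches the stated goal of suppressing background noise and redundant information.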