Urban sound classification method based on improved dual-channel 1DCNN

Affiliation:

1. College of Electronic Information and Optical Engineering, Taiyuan University of Technology; 2. Shanxi Academy of Advanced Research and Innovation

Fund Project:

Natural Science Foundation of Shanxi Province (20210302124544); Applied Basic Research Project of Shanxi Province (201901D111094)

    Abstract:

    To improve the accuracy of urban sound classification and make the model easier to deploy, an urban sound classification method based on an improved dual-channel one-dimensional convolutional neural network (1DCNN) is proposed. First, the Fbank features of the audio are flattened along two different directions, time frame and Mel frequency band, to obtain one-dimensional data. Second, the two-dimensional convolutions in the AlexNet model are replaced with one-dimensional convolutions and the model structure is modified: for each flattening method, the receptive field of the first convolution is enlarged accordingly and the convolution stride is increased to reduce the amount of feature data. Finally, a dual-channel convolutional neural network model is built from the improved AlexNet model using decision fusion. To verify the effectiveness of the method, urban sound classification experiments were carried out on the UrbanSound8K dataset. The results show that the method achieves a classification accuracy of 96.76% and effectively reduces model size, which makes it suitable for scenarios with limited storage and computing resources.
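The three steps summarized in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the Fbank matrix is stored with time frames as rows and Mel bands as columns, it assumes decision fusion means averaging the softmax probabilities of the two channels (the abstract does not state the fusion rule), and all function names, kernel sizes, strides, and logits below are hypothetical values chosen only for illustration.

```python
import math


def flatten_by_time(fbank):
    """Frame-major flattening: all Mel bands of frame 0, then frame 1, and so on."""
    return [value for frame in fbank for value in frame]


def flatten_by_band(fbank):
    """Band-major flattening: Mel band 0 across all frames, then band 1, and so on."""
    return [value for band in zip(*fbank) for value in band]


def conv1d_output_length(length, kernel, stride, padding=0):
    """Standard output length of a 1D convolution (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(logits_a, logits_b):
    """Assumed fusion rule: average the class probabilities of both channels."""
    probs_a, probs_b = softmax(logits_a), softmax(logits_b)
    fused = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    return fused.index(max(fused)), fused


# Toy Fbank matrix: 3 time frames (rows) x 2 Mel bands (columns).
fbank = [[1, 2],
         [3, 4],
         [5, 6]]
time_major = flatten_by_time(fbank)   # [1, 2, 3, 4, 5, 6]
band_major = flatten_by_band(fbank)   # [1, 3, 5, 2, 4, 6]

# A wider first kernel with a larger stride shrinks the feature sequence
# (hypothetical sizes, not taken from the paper).
small_kernel_len = conv1d_output_length(1000, kernel=3, stride=1)   # 998
large_kernel_len = conv1d_output_length(1000, kernel=11, stride=4)  # 248

# Hypothetical logits from the two channels for a 3-class problem.
label, fused = fuse_decisions([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
```

In this sketch each flattened sequence would feed its own 1D network, and only the two sets of class scores are combined at the end, which is what distinguishes decision-level fusion from concatenating features before classification.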

History
  • Received: 2023-12-05
  • Last revised: 2024-01-06
  • Accepted: 2024-02-22