基于VGGNet改进网络结构的多尺度大熊猫面部检测
CLC number: TP183; TP391.4

Fund program: Sichuan Science and Technology Program (2019YFG0299); Sichuan Science and Technology Innovation Seedling Project (2019027); Fundamental Research Project of China West Normal University (19B045).


Multi-scale giant panda face detection based on the improved VGGNet architecture
    摘要:

    大熊猫个体识别对研究大熊猫的种群数量非常重要,大熊猫面部检测是基于面部图像的大熊猫个体识别方法中的首要关键步骤。针对现有的大熊猫面部检测方法精确度不高的问题,提出基于VGGNet-16改进网络结构的多尺度大熊猫面部检测方法。首先,以VGGNet-16网络结构为基础,通过增加残差结构与BN层,降低卷积层通道数,并采用LeakyReLU激活函数等改进,构建一个新的特征提取主干网络。其次,将一个3尺度的特征金字塔网络结构与SPP结构结合用于目标检测。最后,使用深度可分离卷积结构替代常规卷积结构。实验结果表明,提出的大熊猫面部检测方法在测试集上能够达到99.48%的mAP,检测性能优于YOLOv4。
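One of the backbone changes above is swapping the ReLU activation for LeakyReLU, which keeps a small gradient for negative inputs instead of zeroing them. A minimal sketch of the activation (the negative slope `alpha` here is illustrative; the abstract does not state the value the paper uses):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # Positive inputs pass through unchanged; negative inputs are
    # scaled by alpha rather than clamped to zero, so gradients
    # still flow for x < 0 (unlike plain ReLU).
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(leaky_relu(x))  # elementwise: [-0.2, -0.05, 0.0, 1.0, 3.0]
```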

    Abstract:

    Individual identification of giant pandas is very important for studying their population. Giant panda face detection is the first key step in giant panda individual identification methods based on facial images. To solve the problem that the precision of existing giant panda face detection methods is low, a multi-scale giant panda face detection method based on an improved VGGNet-16 architecture was proposed in this paper. Firstly, based on the VGGNet-16 network architecture, a new feature extraction backbone network was constructed through several improvements: adding residual blocks and BN (Batch Normalization) layers, reducing the channel dimensionality of the convolution layers, and adopting the LeakyReLU activation function. Secondly, a 3-scale feature pyramid network structure was combined with an SPP (Spatial Pyramid Pooling) structure for object detection. Finally, the conventional convolution architecture was replaced with a depthwise separable convolution architecture. Experimental results show that the proposed method achieves 99.48% mAP (mean average precision) on the test dataset, and its detection performance is better than that of YOLOv4 (You Only Look Once version 4).
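The final step replaces standard convolutions with depthwise separable ones, whose main payoff is a large reduction in weights (and FLOPs). A quick sketch of the weight-count arithmetic (the channel sizes and kernel size here are illustrative, not taken from the paper):

```python
def conv_params(c_in, c_out, k):
    # A standard k x k convolution learns one k x k x c_in filter
    # per output channel (biases ignored for simplicity).
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage: one k x k filter per input channel,
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = conv_params(256, 256, 3)                 # 589824 weights
sep = depthwise_separable_params(256, 256, 3)  # 2304 + 65536 = 67840 weights
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

The general reduction factor is 1/c_out + 1/k², which for a 3 x 3 kernel works out to roughly an 8-9x saving; this is what makes the substitution attractive for a lightweight detector.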

    References
    [1] Li B V, Pimm S L. China's endemic vertebrates sheltering under the protective umbrella of the giant panda[J]. Conservation Biology, 2016, 30(2):329-339.
    [2] Li B V, Alibhai S, Jewell Z, et al. Using footprints to identify and sex giant pandas[J]. Biological Conservation, 2018, 218:83-90.
    [3] Zhang J D, Hull V, Huang J Y, et al. Activity patterns of the giant panda (Ailuropoda melanoleuca)[J]. Journal of Mammalogy, 2015, 96(6):1116-1127.
    [4] Zheng X, Owen M A, Nie Y, et al. Individual identification of wild giant pandas from camera trap photos-a systematic and hierarchical approach[J]. Journal of Zoology, 2016, 300(4):247-256.
    [5] 史雪威, 张晋东, 欧阳志云. 野生大熊猫种群数量调查方法研究进展[J]. 生态学报, 2016, 36(23):7528-7537. SHI Xuewei, ZHANG Jindong, OUYANG Zhiyun. Research progress on population investigation methods for wild giant panda[J]. Acta Ecologica Sinica, 2016, 36(23):7528-7537. (in Chinese)
    [6] 唐小平, 贾建生, 王志臣, 等. 全国第四次大熊猫调查方案设计及主要结果分析[J]. 林业资源管理, 2015(1):11-16. TANG Xiaoping, JIA Jiansheng, WANG Zhichen, et al. Scheme design and main result analysis of the fourth national survey on giant pandas[J]. Forest Resources Management, 2015(1):11-16. (in Chinese)
    [7] Hou J, He Y X, Yang H B, et al. Identification of animal individuals using deep learning:a case study of giant panda[J]. Biological Conservation, 2020, 242:108414.
    [8] Chen J, Wen Q, Qu W M, et al. Panda facial region detection based on topology modelling[C]//2012 5th International Congress on Image and Signal Processing. Piscataway, NJ:IEEE, 2012:911-915.
    [9] Chen J, Wen Q, Zhuo C L, et al. A novel approach towards head detection of giant pandas in the free-range environment[C]//2012 5th International Congress on Image and Signal Processing. Piscataway, NJ:IEEE, 2012:814-818.
    [10] Zhang W W, Sun J, Tang X O. From tiger to panda:animal head detection[J]. IEEE Transactions on Image Processing, 2011, 20(6):1696-1708.
    [11] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the International Conference on Neural Information Processing Systems. New York, USA:ACM, 2012:1097-1105.
    [12] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J/OL]. Computer Science, 2014[2020-09-29]. https://arxiv.org/abs/1409.1556
    [13] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2016:770-778.
    [14] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. Los Alamitos:IEEE Computer Society Press, 2015:1-9.
    [15] Ren S, He K, Girshick R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149.
    [16] Redmon J, Farhadi A. YOLOv3: an incremental improvement[J/OL]. Computer Vision and Pattern Recognition, 2018[2020-09-29]. https://arxiv.org/abs/1804.02767
    [17] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4:Optimal speed and accuracy of object detection[J/OL]. Computer Vision and Pattern Recognition, 2020[2020-09-29]. https://arxiv.org/abs/2004.10934
    [18] Liu W, Anguelov D, Erhan D, et al. SSD:single shot MultiBox detector[M]. Cham:Springer International Publishing, 2016:21-37.
    [19] Lin T Y, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2017:2117-2125.
    [20] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO:common objects in context[M]. Cham:Springer International Publishing, 2014:740-755.
    [21] Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3):211-252.
    [22] Everingham M, van Gool L, Williams C K I, et al. The pascal visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2):303-338.
    [23] Ioffe S, Szegedy C. Batch normalization:accelerating deep network training by reducing internal covariate shift[J/OL]. Machine Learning, 2015[2020-09-29]. https://arxiv.org/abs/1502.03167
    [24] He K M, Zhang X Y, Ren S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916.
    [25] Chollet F. Xception:deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2017:1251-1258.
    [26] Gencay R, Qi M. Pricing and hedging derivative securities with neural networks:Bayesian regularization, early stopping, and bagging[J]. IEEE Transactions on Neural Networks, 2001, 12(4):726-734.
    [27] Dollar P, Wojek C, Schiele B, et al. Pedestrian detection:an evaluation of the state of the art[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4):743-761.
Cite this article:

何育欣,郑伯川,谭代伦,刘丹,蔡前舟.基于VGGNet改进网络结构的多尺度大熊猫面部检测[J].重庆大学学报,2020,43(11):63-71.

History
  • Received: 2020-07-11
  • Published online: 2020-12-02
  • Publication date: 2020-11-30