基于VGGNet改进网络结构的多尺度大熊猫面部检测
CLC number: TP183; TP391.4

Fund program: Sichuan Science and Technology Program (2019YFG0299); Sichuan Science and Technology Innovation Seedling Project (2019027); Fundamental Research Project of China West Normal University (19B045).


Multi-scale giant panda face detection based on the improved VGGNet architecture
    摘要:

    大熊猫个体识别对研究大熊猫的种群数量非常重要,大熊猫面部检测是基于面部图像的大熊猫个体识别方法中的首要关键步骤。针对现有的大熊猫面部检测方法精确度不高的问题,提出基于VGGNet-16改进网络结构的多尺度大熊猫面部检测方法。首先,以VGGNet-16网络结构为基础,通过增加残差结构与BN层,降低卷积层通道数,并采用LeakyReLU激活函数等改进,构建一个新的特征提取主干网络。其次,将一个3尺度的特征金字塔网络结构与SPP结构结合用于目标检测。最后,使用深度可分离卷积结构替代常规卷积结构。实验结果表明,提出的大熊猫面部检测方法在测试集上能够达到99.48%的mAP,检测性能优于YOLOv4。
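One of the backbone changes above is swapping the ReLU activation for LeakyReLU, which keeps a small gradient for negative inputs instead of zeroing them. A minimal sketch of the activation (the negative slope `alpha` here is illustrative; the abstract does not state the value the paper uses):

```python
import numpy as np

def leaky_relu(x, alpha=0.1):
    # Positive inputs pass through unchanged; negative inputs are
    # scaled by alpha rather than clamped to zero, so gradients
    # still flow for x < 0 (unlike plain ReLU).
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(leaky_relu(x))  # elementwise: [-0.2, -0.05, 0.0, 1.0, 3.0]
```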

    Abstract:

    Individual identification of giant pandas is very important for studying their population. Giant panda face detection is the first key step in giant panda individual identification methods based on facial images. To solve the problem that the precision of existing giant panda face detection methods is low, a multi-scale giant panda face detection method based on an improved VGGNet-16 architecture was proposed in this paper. Firstly, based on the VGGNet-16 network architecture, a new feature extraction backbone network was constructed through several improvements: adding residual blocks and BN (Batch Normalization) layers, reducing the channel dimensionality of the convolution layers, and adopting the LeakyReLU activation function. Secondly, a 3-scale feature pyramid network structure was combined with an SPP (Spatial Pyramid Pooling) structure for object detection. Finally, the conventional convolution architecture was replaced with a depthwise separable convolution architecture. Experimental results show that the proposed method achieves 99.48% mAP (mean average precision) on the test dataset, and its detection performance is better than that of YOLOv4 (You Only Look Once version 4).
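The final step replaces standard convolutions with depthwise separable ones, whose main payoff is a large reduction in weights (and FLOPs). A quick sketch of the weight-count arithmetic (the channel sizes and kernel size here are illustrative, not taken from the paper):

```python
def conv_params(c_in, c_out, k):
    # A standard k x k convolution learns one k x k x c_in filter
    # per output channel (biases ignored for simplicity).
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage: one k x k filter per input channel,
    # followed by a 1 x 1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = conv_params(256, 256, 3)                 # 589824 weights
sep = depthwise_separable_params(256, 256, 3)  # 2304 + 65536 = 67840 weights
print(f"standard: {std}, separable: {sep}, ratio: {std / sep:.1f}x")
```

The general reduction factor is 1/c_out + 1/k², which for a 3 x 3 kernel works out to roughly an 8-9x saving; this is what makes the substitution attractive for a lightweight detector.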

    References
    [1] Li B V, Pimm S L. China's endemic vertebrates sheltering under the protective umbrella of the giant panda[J]. Conservation Biology, 2016, 30(2):329-339.
    [2] Li B V, Alibhai S, Jewell Z, et al. Using footprints to identify and sex giant pandas[J]. Biological Conservation, 2018, 218:83-90.
    [3] Zhang J D, Hull V, Huang J Y, et al. Activity patterns of the giant panda (Ailuropoda melanoleuca)[J]. Journal of Mammalogy, 2015, 96(6):1116-1127.
    [4] Zheng X, Owen M A, Nie Y, et al. Individual identification of wild giant pandas from camera trap photos-a systematic and hierarchical approach[J]. Journal of Zoology, 2016, 300(4):247-256.
    [5] 史雪威, 张晋东, 欧阳志云. 野生大熊猫种群数量调查方法研究进展[J]. 生态学报, 2016, 36(23):7528-7537. SHI Xuewei, ZHANG Jindong, OUYANG Zhiyun. Research progress on population investigation methods for wild giant panda[J]. Acta Ecologica Sinica, 2016, 36(23):7528-7537. (in Chinese)
    [6] 唐小平, 贾建生, 王志臣, 等. 全国第四次大熊猫调查方案设计及主要结果分析[J]. 林业资源管理, 2015(1):11-16. TANG Xiaoping, JIA Jiansheng, WANG Zhichen, et al. Scheme design and main result analysis of the fourth national survey on giant pandas[J]. Forest Resources Management, 2015(1):11-16. (in Chinese)
    [7] Hou J, He Y X, Yang H B, et al. Identification of animal individuals using deep learning:a case study of giant panda[J]. Biological Conservation, 2020, 242:108414.
    [8] Chen J, Wen Q, Qu W M, et al. Panda facial region detection based on topology modelling[C]//2012 5th International Congress on Image and Signal Processing. Piscataway, NJ:IEEE, 2012:911-915.
    [9] Chen J, Wen Q, Zhuo C L, et al. A novel approach towards head detection of giant pandas in the free-range environment[C]//2012 5th International Congress on Image and Signal Processing. Piscataway, NJ:IEEE, 2012:814-818.
    [10] Zhang W W, Sun J, Tang X O. From tiger to panda:animal head detection[J]. IEEE Transactions on Image Processing, 2011, 20(6):1696-1708.
    [11] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of the International Conference on Neural Information Processing Systems. New York, USA:ACM, 2012:1097-1105.
    [12] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J/OL]. Computer Science, 2014[2020-09-29]. https://arxiv.org/abs/1409.1556
    [13] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2016:770-778.
    [14] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. Los Alamitos:IEEE Computer Society Press, 2015:1-9.
    [15] Ren S, He K, Girshick R, et al. Faster R-CNN:Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149.
    [16] Redmon J, Farhadi A. YOLOv3: an incremental improvement[J/OL]. Computer Vision and Pattern Recognition, 2018[2020-09-29]. https://arxiv.org/abs/1804.02767
    [17] Bochkovskiy A, Wang C Y, Liao H Y M. YOLOv4:Optimal speed and accuracy of object detection[J/OL]. Computer Vision and Pattern Recognition, 2020[2020-09-29]. https://arxiv.org/abs/2004.10934
    [18] Liu W, Anguelov D, Erhan D, et al. SSD:single shot MultiBox detector[M]. Cham:Springer International Publishing, 2016:21-37.
    [19] Lin T Y, Dollar P, Girshick R, et al. Feature pyramid networks for object detection[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2017:2117-2125.
    [20] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO:common objects in context[M]. Cham:Springer International Publishing, 2014:740-755.
    [21] Russakovsky O, Deng J, Su H, et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 2015, 115(3):211-252.
    [22] Everingham M, van Gool L, Williams C K I, et al. The pascal visual object classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2):303-338.
    [23] Ioffe S, Szegedy C. Batch normalization:accelerating deep network training by reducing internal covariate shift[J/OL]. Machine Learning, 2015[2020-09-29]. https://arxiv.org/abs/1502.03167
    [24] He K M, Zhang X Y, Ren S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1904-1916.
    [25] Chollet F. Xception:deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ:IEEE, 2017:1251-1258.
    [26] Gencay R, Qi M. Pricing and hedging derivative securities with neural networks:Bayesian regularization, early stopping, and bagging[J]. IEEE Transactions on Neural Networks, 2001, 12(4):726-734.
    [27] Dollar P, Wojek C, Schiele B, et al. Pedestrian detection:an evaluation of the state of the art[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4):743-761.
Cite this article:

何育欣,郑伯川,谭代伦,刘丹,蔡前舟.基于VGGNet改进网络结构的多尺度大熊猫面部检测[J].重庆大学学报,2020,43(11):63-71.

History
  • Received: 2020-07-11
  • Published online: 2020-12-02
  • Publication date: 2020-11-30