Shared Knowledge Guidance and Modal Consistency Learning for Visible-Infrared Person Re-identification
Author:
Affiliation:

Kunming University of Science and Technology

CLC number:

U448.213

Fund Project:

Supported by the Science and Technology Planning Project of the Yunnan Science and Technology Department (General Project) (202101AT070136).


    Abstract:

    In visible-infrared person re-identification (VI-ReID), extracting discriminative features that are unaffected by the modality discrepancy is critical to improving recognition performance. A common solution is to learn a shared feature representation of the two modalities with a dual-stream network; however, these methods do not mine further shared knowledge between the modalities, and the inter-modality discrepancy still exists. To address this, Shared Knowledge guidance and Modal Consistency Learning (SKMCL) is proposed. The method consists of cross-modal shared knowledge guidance (SKG) and modal consistency learning (MCL). The former fully mines the knowledge shared between modalities through a cross-modal attention mechanism and uses it as guidance to help the model extract discriminative features; the latter narrows the modality discrepancy through adversarial learning between a designed modality classifier and the dual-stream network. The two modules cooperate to strengthen feature learning. To further reduce the modality discrepancy, a feature mixing strategy is introduced to enhance the dual-stream network's ability to extract modality-consistent features. The proposed method clearly outperforms related work on the two public datasets SYSU-MM01 and RegDB, reaching Rank-1/mAP accuracies of 58.38%/56.10% and 87.41%/80.75%, respectively, which demonstrates its effectiveness. The source code is available at https://github.com/lhf12278/SKMCL.
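    The three mechanisms named in the abstract — cross-modal attention for mining shared knowledge, adversarial modality-consistency learning via a gradient-reversed modality classifier, and feature mixing — can be sketched as follows. This is a minimal illustrative NumPy sketch, not the authors' implementation (see the linked repository for that); the function names, feature shapes, and the Beta(α, α) mixing coefficient are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(f_vis, f_ir):
    """Mine modality-shared knowledge: visible queries attend over infrared
    keys/values and vice versa, so each stream is guided by what the other
    modality also observed (illustrative form of cross-modal attention)."""
    d = f_vis.shape[-1]
    attn_v2i = softmax(f_vis @ f_ir.T / np.sqrt(d))  # (N_vis, N_ir)
    attn_i2v = softmax(f_ir @ f_vis.T / np.sqrt(d))  # (N_ir, N_vis)
    # shared-knowledge-guided features for each stream
    return attn_v2i @ f_ir, attn_i2v @ f_vis

def reverse_gradient(grad, lam=1.0):
    """Gradient reversal used in adversarial modality learning: identity in
    the forward pass; in the backward pass the gradient flowing from the
    modality-classification loss into the dual-stream network is negated,
    so the network learns features the modality classifier cannot separate."""
    return -lam * grad

def mix_features(f_vis, f_ir, alpha=0.5, rng=None):
    """Feature mixing: interpolate paired visible/infrared features so the
    dual-stream network is also trained on modality-intermediate samples."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)  # mixing coefficient (assumed Beta prior)
    return lam * f_vis + (1.0 - lam) * f_ir
```

In a full training loop the mixed and attention-guided features would feed the identity loss, while `reverse_gradient` would sit between the backbone and the modality classifier (in PyTorch this is typically a custom autograd function).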

History
  • Received: 2021-12-29
  • Revised: 2021-12-29
  • Accepted: 2022-03-02