Complementary fusion RGB-D food image segmentation based on attention mechanism

Affiliations: 1. Tianjin University; 2. Loughborough University

Funding: National Natural Science Foundation of China (Grant No. 61771338)

Abstract:

Food image segmentation plays a crucial role in food volume estimation, but the fine structure of food and difficult capture conditions, such as blurred boundaries and overexposure, limit the performance of many segmentation algorithms. To address these problems, an attention-based complementary fusion network for RGB-D food image segmentation (RGB-D ABCFNet) is proposed. The network adopts an overall U-shaped structure consisting of an encoding stage and a decoding stage. In the encoding stage, the proposed Expand Head Channel Attention Module (EHCAM) extracts the channel features of the depth map that are most useful for segmentation, and layer-by-layer addition lets the depth features and the RGB features complement each other. In the decoding stage, the proposed Multi-Head Spatial Attention Module (MHSAM) better recovers detail and location information, so that the extracted semantic features map more accurately to the segmentation result. In addition, a multi-class food semantic segmentation dataset, Nutrition-Pix, is constructed, and extensive comparison and ablation experiments on it show that the proposed model outperforms existing methods, reaching a mean intersection over union (mIoU) of 87.5%.
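
For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU is commonly computed for a pair of label maps. It is not taken from the paper: the function name `mean_iou`, the toy label maps, and the class ids are hypothetical, and skipping classes that appear in neither map is one common convention assumed here. The paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int) -> float:
    """Mean intersection-over-union across classes for two label maps.

    Assumption: classes absent from both maps are skipped rather than
    counted as IoU = 0 or IoU = 1.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counted once in both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # Mismatch: the pixel enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class present in prediction or ground truth")
    return sum(ious) / len(ious)


# Toy 2x3 label maps with hypothetical class ids:
# 0 = background, 1 = rice, 2 = vegetables.
pred = [[0, 1, 1],
        [0, 2, 2]]
target = [[0, 1, 2],
          [0, 2, 2]]

# Per-class IoU is 1, 1/2, and 2/3, so the mean is 13/18.
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In practice the intersection and union counts are usually accumulated over the whole test set before dividing, rather than averaged per image; the sketch evaluates a single image pair for brevity.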

History
  • Received: 2024-04-13
  • Last revised: 2024-04-27
  • Accepted: 2024-05-31