一种基于自注意力机制的文本图像生成对抗网络
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

TP311

基金项目:

重庆市自然科学基金资助项目(cstc2014jcyjA40030)。


A generative adversarial network based on self-attention mechanism for text-to-image generation
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    图像自动生成一直以来都是计算机视觉领域的一项重要挑战,其中的文本到图像的生成更是图像生成领域的重要分支。随着深度学习技术迅猛发展,生成对抗网络的出现使得图像生成领域焕发生机,借助生成对抗网络能够生成较为生动且多样的图像。本文将自注意力机制引入生成对抗网络,提出GAN-SelfAtt以提升生成图像的质量。同时,使用WGAN、WGAN-GP 2种生成对抗网络框架对GAN-SelfAtt进行实现。实验结果表明,自注意力机制的引入能够提高生成图像的清晰度,这归功于自注意力机制弥补了卷积运算中只能计算局部像素区域内的相关性的缺陷。除此之外,GAN-SelfAtt在训练时有着更好的稳定性,避免了原始生成对抗网络中的模式坍塌问题。

    Abstract:

    Automatic image generation is a challenging problem in computer vision for a long time. As a branch of this area, there are also challenges in text-to-image generation. With the fast development of deep learning, generative adversarial networks (GANs) give a new inspiration to the image generation because it can generate highly compelling images of various categories. In this paper, we introduce the self-attention mechanism to GAN and propose GAN-SelfAtt to improve the quality of images. Meanwhile, we implement GAN-SelfAtt using two different GAN frameworks, i.e., WGAN and WGAN-GP. The experimental results show that self-attention mechanism improves the resolution of generated images. The reason of this improvement is that the self-attention mechanism fixes the defect of convolution computation which only calculates the correlation in the local pixel region. In addition, our results show that the stability of GAN-SelfAtt during the training process is improved. This fixes the problem of mode collapse which appears in the original GANs.

    参考文献
    相似文献
    引证文献
引用本文

黄宏宇,谷子丰.一种基于自注意力机制的文本图像生成对抗网络[J].重庆大学学报,2020,43(3):55-61.

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2019-03-27
  • 最后修改日期:
  • 录用日期:
  • 在线发布日期: 2020-03-31
  • 出版日期: