A generative adversarial network based on self-attention mechanism for text-to-image generation

doi:10.11835/j.issn.1000-582X.2020.03.006

Volume 43, Issue 3, 2020, 43(3): 55-61.

Abstract:

Automatic image generation has long been a challenging problem in computer vision, and text-to-image generation, as a branch of this area, shares these challenges. With the rapid development of deep learning, generative adversarial networks (GANs) have brought new inspiration to image generation because they can generate highly compelling images across various categories. In this paper, we introduce the self-attention mechanism into GANs and propose GAN-SelfAtt to improve the quality of generated images. We implement GAN-SelfAtt using two different GAN frameworks, namely WGAN and WGAN-GP. The experimental results show that the self-attention mechanism improves the resolution of generated images. The reason for this improvement is that self-attention remedies a defect of convolution, which computes correlations only within a local pixel region. In addition, our results show that GAN-SelfAtt is more stable during training, which addresses the mode collapse problem that appears in the original GANs.
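
To illustrate why self-attention can relate pixels beyond a local convolution window, the following is a minimal NumPy sketch of a self-attention layer over a feature map. It is not the GAN-SelfAtt implementation: the abstract does not specify the layer, so the query/key/value projections, the scaled dot-product scores, the residual weight `gamma`, and all shapes and names (`self_attention_2d`, `w_q`, `w_k`, `w_v`) are assumptions chosen for illustration.

```python
import numpy as np

def self_attention_2d(x, w_q, w_k, w_v, gamma=1.0):
    """Illustrative self-attention over an (H, W, C) feature map.

    Hypothetical layer: every spatial position attends to every other
    position, unlike a convolution, which only sees a local window.
    Returns the output map and the (N, N) attention matrix, N = H * W.
    """
    h, w, c = x.shape
    flat = x.reshape(h * w, c)                # one row per spatial position
    q = flat @ w_q                            # queries, shape (N, d)
    k = flat @ w_k                            # keys,    shape (N, d)
    v = flat @ w_v                            # values,  shape (N, C)

    scores = q @ k.T / np.sqrt(q.shape[1])    # pairwise similarity, (N, N)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)         # row-wise softmax

    out = gamma * (attn @ v) + flat           # residual connection (assumed)
    return out.reshape(h, w, c), attn

# Toy usage with random weights (assumed sizes: 8x8 map, 16 channels).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 16))
w_q = rng.normal(size=(16, 4)) * 0.1
w_k = rng.normal(size=(16, 4)) * 0.1
w_v = rng.normal(size=(16, 16)) * 0.1
out, attn = self_attention_2d(x, w_q, w_k, w_v)
```

In this sketch each row of `attn` is a distribution over all 64 positions, so the output at one corner of the map can depend on features at the opposite corner in a single layer. Setting `gamma=0.0` reduces the layer to the identity, which is one common way to let a network start from purely local features.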

CLC Number: TP311