Abstract: Automatic image generation has long been a challenging problem in computer vision, and text-to-image generation, as a branch of this area, faces challenges of its own. With the rapid development of deep learning, generative adversarial networks (GANs) have given new inspiration to image generation because they can generate highly compelling images of various categories. In this paper, we introduce the self-attention mechanism into GANs and propose GAN-SelfAtt to improve the quality of generated images. We implement GAN-SelfAtt using two different GAN frameworks, namely WGAN and WGAN-GP. The experimental results show that the self-attention mechanism improves the resolution of generated images. The reason for this improvement is that self-attention compensates for a limitation of convolution, which only computes correlations within a local pixel region. In addition, our results show that GAN-SelfAtt is more stable during training, which addresses the mode collapse problem that appears in the original GANs.
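To illustrate the contrast between local convolution and self-attention described in the abstract, the following is a minimal NumPy sketch of a self-attention layer over a feature map. It is not the GAN-SelfAtt implementation: the function name `self_attention_2d`, the projection matrices `w_q`, `w_k`, `w_v` (standing in for 1x1 convolutions), and the residual scale `gamma` are assumptions chosen for illustration, following a form commonly used for self-attention in GANs.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_2d(feat, w_q, w_k, w_v, gamma=0.0):
    """Hypothetical self-attention over a feature map (illustrative only).

    feat: (C, H, W) feature map.
    w_q, w_k: (C_qk, C) query/key projections (assumed 1x1 convolutions).
    w_v: (C, C) value projection (assumed 1x1 convolution).
    gamma: assumed residual scale; 0.0 leaves the input unchanged.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)        # flatten spatial positions: (C, N)
    q = w_q @ x                       # queries: (C_qk, N)
    k = w_k @ x                       # keys:    (C_qk, N)
    v = w_v @ x                       # values:  (C, N)
    # attn[i, j] is the weight position i gives to position j,
    # computed over ALL N positions rather than a local window.
    attn = softmax(q.T @ k, axis=-1)  # (N, N)
    out = v @ attn.T                  # out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + x).reshape(c, h, w), attn

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 3, 3))
    w_q = rng.standard_normal((2, 4))
    w_k = rng.standard_normal((2, 4))
    w_v = rng.standard_normal((4, 4))
    out, attn = self_attention_2d(feat, w_q, w_k, w_v, gamma=1.0)
    print(out.shape, attn.shape)
```

Each row of `attn` spans every spatial position, so a single output pixel can depend on distant regions of the feature map, whereas a convolution kernel only sees its local neighborhood.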