Improving GAN Equilibrium
by Raising Spatial Awareness
Jianyuan Wang1,2, Ceyuan Yang1, Yinghao Xu1, Yujun Shen1,
Hongdong Li2, Bolei Zhou3
1 The Chinese University of Hong Kong, 2 The Australian National University,
3 University of California, Los Angeles

Interactive editing of the output synthesis of EqGAN-SA by altering spatial heatmaps.
In Generative Adversarial Networks (GANs), a generator (G) and a discriminator (D) are expected to reach an equilibrium where D cannot distinguish the generated images from the real ones. In practice, however, such an equilibrium is difficult to achieve in GAN training; instead, D almost always surpasses G. We attribute this phenomenon to an information asymmetry: D learns its own visual attention when determining whether an image is real or fake, but G has no explicit clue about which regions to focus on.
To alleviate the issue of D dominating the competition in GANs, we aim to raise the spatial awareness of G. We encode randomly sampled multi-level heatmaps into the intermediate layers of G as an inductive bias. We further propose to align the spatial awareness of G with the attention map induced from D. In this way, we effectively lessen the information gap between D and G. Extensive results show that our method pushes the two-player game in GANs closer to the equilibrium, leading to better synthesis performance. As a byproduct, the introduced spatial awareness facilitates interactive editing of the output synthesis.
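The two ingredients described above, sampling multi-level heatmaps and comparing G's heatmap with D's attention map, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The Gaussian form of the heatmaps, the `sigma` value, the 4 × 4 coarsest level with one center, the mean-squared alignment measure, and all function names are assumptions made for illustration; only the 8 × 8 (two centers) and 16 × 16 (four centers) levels are mentioned elsewhere on this page.

```python
import numpy as np


def gaussian_heatmap(size, centers, sigma):
    """Render a size x size map as the max over isotropic Gaussians.

    `centers` is an (N, 2) array of (row, col) positions in [0, 1].
    The Gaussian shape is an assumption, not taken from the source.
    """
    coords = (np.arange(size) + 0.5) / size  # pixel centers in [0, 1]
    rows, cols = np.meshgrid(coords, coords, indexing="ij")
    heatmap = np.zeros((size, size))
    for r, c in centers:
        dist2 = (rows - r) ** 2 + (cols - c) ** 2
        heatmap = np.maximum(heatmap, np.exp(-dist2 / (2 * sigma ** 2)))
    return heatmap


def sample_multilevel_heatmaps(rng, levels=((4, 1), (8, 2), (16, 4)), sigma=0.2):
    """Sample one random heatmap per (resolution, number of centers) pair.

    The (4, 1) level and sigma=0.2 are hypothetical choices.
    """
    heatmaps = {}
    for size, n_centers in levels:
        centers = rng.uniform(0.0, 1.0, size=(n_centers, 2))
        heatmaps[size] = gaussian_heatmap(size, centers, sigma)
    return heatmaps


def alignment_loss(g_heatmap, d_attention):
    """Mean squared difference after rescaling each map to [0, 1].

    A stand-in for the alignment objective; the actual loss may differ.
    """
    def normalize(x):
        x = x - x.min()
        return x / (x.max() + 1e-8)

    return float(np.mean((normalize(g_heatmap) - normalize(d_attention)) ** 2))


rng = np.random.default_rng(0)
heatmaps = sample_multilevel_heatmaps(rng)
```

In a real model, each sampled heatmap would be fed to the generator layer of matching resolution, and `d_attention` would come from an attention map computed on the discriminator (for example, a class-activation-style map), resized to the heatmap resolution before the comparison.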

We build an interactive interface to show that EqGAN-SA, although not designed for this purpose, enables interactive spatial editing of the output image.
Qualitative Results
We first show several generated samples of an EqGAN-SA model on the LSUN Cat dataset, as below.
Then, we demonstrate the spatial awareness of the EqGAN-SA generator by varying the spatial heatmaps. Specifically, we keep the latent codes unchanged and move the spatial heatmap at the coarsest feature level. The arrows indicate the movement direction; the cat moves along with the varied heatmap.
To further show its capacity for hierarchical manipulation, we move the heatmaps at the finer levels. Unlike the body movement above, a change in the 8 × 8 heatmap (two centers) mainly moves the cat eyes, and a change in the 16 × 16 heatmap (four centers) leads to subtle movement of the cat ears. As highlighted in the rightmost column, the cat ears subtly turn right while other parts, even the cat whiskers, remain unchanged.
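The editing procedure described above, where the latent code is held fixed and only a heatmap center is moved, can be sketched as follows. This is an illustrative NumPy example rather than the released interface. The Gaussian heatmap, the 4 × 4 resolution for the coarsest level, the latent dimension of 512, and the helper names are all assumptions, and no real generator is called.

```python
import numpy as np


def gaussian_heatmap(size, center, sigma=0.15):
    """Single-center Gaussian heatmap; `center` is (row, col) in [0, 1]."""
    coords = (np.arange(size) + 0.5) / size  # pixel centers in [0, 1]
    rows, cols = np.meshgrid(coords, coords, indexing="ij")
    dist2 = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-dist2 / (2 * sigma ** 2))


def shift_center(center, delta):
    """Move a heatmap center by `delta`, clipped to stay inside the image."""
    moved = np.asarray(center, dtype=float) + np.asarray(delta, dtype=float)
    return np.clip(moved, 0.0, 1.0)


def peak_location(heatmap):
    """Return the (row, col) index of the strongest heatmap response."""
    idx = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return tuple(int(i) for i in idx)


# The latent code is sampled once and reused for both heatmaps
# (hypothetical dimension; a real generator would take z and the heatmap).
z = np.random.default_rng(0).standard_normal(512)

center = np.array([0.375, 0.125])               # left side of the image
moved = shift_center(center, delta=(0.0, 0.5))  # move to the right

before = gaussian_heatmap(4, center)
after = gaussian_heatmap(4, moved)
```

Here `peak_location(before)` and `peak_location(after)` differ only in the column index, which mirrors moving the heatmap horizontally while everything else, including `z`, stays the same. Editing a finer level would follow the same pattern with a larger `size` and more than one center.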
  title   = {Improving GAN Equilibrium by Raising Spatial Awareness},
  author  = {Wang, Jianyuan and Yang, Ceyuan and Xu, Yinghao and Shen, Yujun and Li, Hongdong and Zhou, Bolei},
  journal = {arXiv preprint arXiv:2112.00718},
  year    = {2021}