Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis
Ceyuan Yang*, Yujun Shen*, Bolei Zhou
The Chinese University of Hong Kong
Overview
In this work, we show that a highly structured semantic hierarchy emerges from the generative representations as the variation factors for synthesizing scenes. By probing the layer-wise representations with a broad set of visual concepts at different abstraction levels, we are able to quantify the causality between the layer-wise activations and the semantics occurring in the output image. Both qualitative and quantitative results suggest that the generative representations learned by GANs specialize in synthesizing different hierarchical semantics: the early layers tend to determine the spatial layout and configuration, the middle layers control the categorical objects, and the later layers render the scene attributes as well as the color scheme.
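The probing step described above can be sketched as fitting a linear boundary in the latent space that separates codes by whether a given visual concept appears in the synthesized image. The sketch below is a minimal numpy illustration of that idea, not the paper's implementation: `probe_boundary` is a hypothetical name, and plain logistic regression via gradient descent stands in for whatever linear classifier is actually used.

```python
import numpy as np

def probe_boundary(codes, labels, iters=200, lr=0.1):
    """Fit a linear boundary separating latent codes by a semantic label.

    A stand-in for the linear-probing step: `codes` is an (N, dim) array of
    latent codes, `labels` an (N,) array of 0/1 concept annotations predicted
    on the corresponding synthesized images. Returns the weight vector `w`
    and bias `b`; w / ||w|| is the normal direction of the semantic boundary.
    """
    w = np.zeros(codes.shape[1])
    b = 0.0
    for _ in range(iters):
        logits = codes @ w + b
        # Gradient of the logistic loss, averaged over the batch.
        grad = 1.0 / (1.0 + np.exp(-logits)) - labels
        w -= lr * codes.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b
```

The normal direction `w / np.linalg.norm(w)` can then serve as a candidate manipulation direction for the probed concept.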
Results
Identifying such a set of manipulatable latent variation factors facilitates semantic scene manipulation.
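A minimal sketch of how such a factor could be used for manipulation, assuming a style-based generator that takes one latent code per layer: shift the codes of a chosen layer range along a probed boundary direction, leaving the other layers untouched (so, e.g., editing only early layers changes layout while later-layer semantics are preserved). The function name and signature are illustrative, not the paper's API.

```python
import numpy as np

def manipulate_layerwise(codes, boundary, alpha, layers):
    """Shift per-layer latent codes along a semantic boundary direction.

    codes:    (num_layers, dim) array of layer-wise latent codes
    boundary: (dim,) unit normal of a probed semantic boundary
    alpha:    manipulation strength (sign controls the edit direction)
    layers:   indices of the generator layers to edit
    """
    edited = codes.copy()          # leave the caller's codes untouched
    edited[layers] += alpha * boundary  # move only the selected layers
    return edited
```

Restricting `layers` to the early, middle, or late range is what selects which level of the semantic hierarchy (layout, objects, or attributes) the edit affects.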
More manipulation results on various scenes are shown in the following video.
BibTeX
@article{yang2019semantic,
  title   = {Semantic Hierarchy Emerges in Deep Generative Representations for Scene Synthesis},
  author  = {Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
  journal = {arXiv preprint arXiv:1911.09267},
  year    = {2019}
}
Related Work
B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Object Detectors Emerge in Deep Scene CNNs. ICLR, 2015.
Comment: Studies the interpretable object detectors that emerge inside CNNs trained to classify scenes.
L. Goetschalckx, A. Andonian, A. Oliva, P. Isola. GANalyze: Toward Visual Definitions of Cognitive Image Properties. ICCV, 2019.
Comment: Navigates the manifold in the latent space to make images more or less memorable.
Y. Shen, J. Gu, X. Tang, B. Zhou. Interpreting Latent Space of GANs for Semantic Face Editing. CVPR, 2020.
Comment: Proposes a technique for semantic face editing in latent space.
A. Jahanian, L. Chai, P. Isola. On the "steerability" of generative adversarial networks. ICLR, 2020.
Comment: Shifts the distribution by "steering" the latent code to change camera motion and image color tone.