Unsupervised Image Transformation Learning via Generative Adversarial Networks
Kaiwen Zha1, Yujun Shen2, Bolei Zhou2
1 MIT CSAIL
2 The Chinese University of Hong Kong
Overview
In this work, we study the image transformation problem by learning the underlying transformations from a collection of images using Generative Adversarial Networks (GANs). Specifically, we propose an unsupervised learning framework, termed TrGAN, that projects images onto a transformation space shared by the generator and the discriminator. Any two points in this projected space define a transformation that can guide the image generation process. By projecting a pair of images onto the transformation space, TrGAN is able to extract the semantic variation between them and apply the extracted semantics to image editing, including not only transferring image styles (e.g., changing day to night) but also manipulating image contents (e.g., adding clouds in the sky).
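As a rough illustration of how two points in a projected space can define a transformation, the toy sketch below projects a pair of "images" (short lists of numbers) with a fixed linear map, takes the difference of the two projected points, and applies it to the projection of a new image. Everything here is an assumption made for illustration: the names `project`, `extract_transformation`, and `apply_transformation`, the linear projector, and the use of a vector difference are hypothetical and are not taken from TrGAN, where the projection is learned and a generator synthesizes the edited image.

```python
from typing import List, Sequence

Vector = List[float]


def project(image: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical linear projector: map a flattened image to a low-dimensional space."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]


def extract_transformation(src_code: Sequence[float], dst_code: Sequence[float]) -> Vector:
    """Assumption: the transformation between two projected points is their difference."""
    return [d - s for s, d in zip(src_code, dst_code)]


def apply_transformation(code: Sequence[float], transformation: Sequence[float],
                         strength: float = 1.0) -> Vector:
    """Shift a projected point along the extracted transformation."""
    return [c + strength * t for c, t in zip(code, transformation)]


# Toy projector (2 x 4) and toy "images" with four pixels each; all values are made up.
WEIGHTS = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 0.5, 0.5, 0.0]]
day = [0.9, 0.2, 0.4, 0.1]
night = [0.1, 0.2, 0.4, 0.1]
new_image = [0.8, 0.6, 0.2, 0.3]

# Extract the variation from the input pair, then reuse it on a new image.
t = extract_transformation(project(day, WEIGHTS), project(night, WEIGHTS))
edited_code = apply_transformation(project(new_image, WEIGHTS), t)
print("transformation:", t)
print("edited code:", edited_code)
```

In this toy setting, the pair differs only in the first projected coordinate, so only that coordinate of the new image's code is shifted. The `strength` argument is another illustrative assumption that scales how far the code moves along the extracted direction.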
Results
The transformations extracted from the image pairs are applied to transform new images.
[Figure: input pairs and the corresponding transformed images for three examples: Season, Cloud, Shape.]
Demo video here.
BibTeX
  @article{zha2021unsupervised,
    title   = {Unsupervised Image Transformation Learning via Generative Adversarial Networks},
    author  = {Zha, Kaiwen and Shen, Yujun and Zhou, Bolei},
    journal = {arXiv preprint arXiv:2103.07751},
    year    = {2021}
  }
Related Work
P.Y. Laffont, Z. Ren, X. Tao, C. Qian, J. Hays. Transient Attributes for High-Level Understanding and Editing of Outdoor Scenes. SIGGRAPH 2014.
Comment: Defines 40 transient attributes and proposes an image editing method that can adjust the attributes of a scene based on regressors trained on labeled data.
T. Karras, S. Laine, T. Aila. A Style-Based Generator Architecture for Generative Adversarial Networks. CVPR 2019.
Comment: Proposes a style-based generator for high-quality image synthesis.
R. Abdal, Y. Qin, P. Wonka. Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space? ICCV 2019.
Comment: Explores how to embed images into the latent space of StyleGAN.