VolumeGAN

Yinghao Xu¹, Sida Peng², Ceyuan Yang¹, Yujun Shen³, Bolei Zhou¹

¹ The Chinese University of Hong Kong ² Zhejiang University ³ ByteDance Inc.

Overview

This paper aims at achieving high-fidelity 3D-aware image synthesis. We propose a novel framework, termed VolumeGAN, for synthesizing images under different camera views by explicitly learning a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. Such a design enables independent control of the shape and the appearance. Extensive experiments on a wide range of datasets show that our approach achieves considerably higher image quality and better 3D control than previous methods.
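To make the accumulation step concrete, the following is a minimal sketch of how per-sample features along one camera ray could be composited into a single entry of the 2D feature map. It assumes standard NeRF-style alpha compositing, which this page does not spell out; the function name `accumulate_features` and its arguments are hypothetical and are not taken from the VolumeGAN code.

```python
import math


def accumulate_features(densities, features, deltas):
    """Composite per-sample features along one ray into one feature vector.

    Assumption: NeRF-style alpha compositing. All names are illustrative.
    densities: non-negative floats, one per sample along the ray
    features:  feature vectors (lists of floats), one per sample
    deltas:    distances between adjacent samples along the ray
    """
    dim = len(features[0])
    pixel = [0.0] * dim
    weights = []
    transmittance = 1.0  # fraction of the ray that has not been absorbed yet
    for sigma, feat, delta in zip(densities, features, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weight = transmittance * alpha          # contribution of this sample
        weights.append(weight)
        for c in range(dim):
            pixel[c] += weight * feat[c]
        transmittance *= 1.0 - alpha            # attenuate for later samples
    return pixel, weights


# Toy ray with three samples and two feature channels.
pixel, weights = accumulate_features(
    densities=[0.0, 50.0, 1.0],
    features=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    deltas=[0.1, 0.1, 0.1],
)
print(pixel, weights)
```

In this toy ray, the first sample lies in empty space and receives zero weight, while the dense second sample dominates the result. Repeating the computation for every ray yields a 2D feature map, which a neural renderer would then turn into an image.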

Results

Independent control of structure (shape) and texture (appearance) achieved by VolumeGAN.

Qualitative comparison between our VolumeGAN and existing alternatives.

Demo

We include a demo video, which shows more results with varying camera views. From the video, we can see the continuous 3D control achieved by our VolumeGAN.

BibTeX

@article{xu2021volumegan,
  title   = {3D-aware Image Synthesis via Learning Structural and Textural Representations},
  author  = {Xu, Yinghao and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
  journal = {arXiv preprint arXiv:2112.10759},
  year    = {2021}
}

Related Work

Thu Nguyen-Phuoc, Chuan Li, Lucas Theis, Christian Richardt, Yong-Liang Yang. HoloGAN: Unsupervised learning of 3D representations from natural images. ICCV, 2019.
Comment: Proposes voxelized and implicit 3D representations and then renders them to 2D image space with a reshape operation.

Katja Schwarz, Yiyi Liao, Michael Niemeyer, Andreas Geiger. GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis. NeurIPS, 2020.
Comment: Proposes generative radiance fields for 3D-aware image synthesis.

Michael Niemeyer, Andreas Geiger. GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields. CVPR, 2021.
Comment: Proposes compositional generative neural feature fields for scene synthesis.

Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, Gordon Wetzstein. pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis. CVPR, 2021.
Comment: Proposes periodic implicit generative radiance fields for 3D-aware image synthesis.