FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
Sicheng Mo1*, Fangzhou Mu2*, Kuan Heng Lin1, Yanli Liu3, Bochen Guan3, Yin Li2, Bolei Zhou1
1 UCLA, 2 University of Wisconsin-Madison, 3 Innopeak Technology, Inc.
* Equal contribution
In this work, we present FreeControl, a training-free approach for controllable T2I generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl introduces structure guidance to align the spatial layout of the output with a guidance image, and appearance guidance to share appearance between images generated from the same seed. FreeControl operates in two stages. In the analysis stage, it queries the T2I model to generate as few as one seed image, then constructs a linear feature subspace from the generated images. In the synthesis stage, it applies guidance within this subspace to enforce structure alignment with the guidance image, as well as appearance alignment between images generated with and without control.
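The analysis stage above centers on building a linear feature subspace via PCA. The sketch below illustrates that idea in isolation: it uses random NumPy arrays as stand-ins for the diffusion features (in FreeControl these would come from intermediate U-Net activations of the seed images), and the function names `build_subspace` and `project` are hypothetical, not part of any released codebase.

```python
import numpy as np

# Hypothetical stand-in for diffusion features extracted from seed images:
# N feature vectors of dimension D. Real features would come from
# intermediate U-Net activations; random data is used here for illustration.
rng = np.random.default_rng(0)
features = rng.normal(size=(256, 64))  # (N tokens, D channels)

def build_subspace(feats: np.ndarray, k: int) -> np.ndarray:
    """PCA: return the top-k principal directions (D, k) of the features."""
    centered = feats - feats.mean(axis=0, keepdims=True)
    # SVD of the centered feature matrix; rows of vt are principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # (D, k) orthonormal basis of the linear feature subspace

def project(feats: np.ndarray, basis: np.ndarray) -> np.ndarray:
    """Project features onto the subspace to obtain low-dim coordinates."""
    return feats @ basis  # (N, k)

basis = build_subspace(features, k=8)
coords = project(features, basis)
# Structure guidance could then penalize the distance between these
# coordinates and those of the guidance image during sampling.
```

The key design point is that guidance operates in the low-dimensional subspace coordinates rather than on the raw high-dimensional features, which is what lets structure transfer across images generated from different prompts.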
Qualitative Results
Controllable generation with T2I diffusion models.
More Results
Generation with any condition:
BibTeX
@article{mo2023freecontrol,
  title={FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition},
  author={Mo, Sicheng and Mu, Fangzhou and Lin, Kuan Heng and Liu, Yanli and Guan, Bochen and Li, Yin and Zhou, Bolei},
  journal={arXiv preprint arXiv:2312.07536},
  year={2023}
}
Related Work
Lvmin Zhang, Anyi Rao, Maneesh Agrawala. Adding Conditional Control to Text-to-Image Diffusion Models. ICCV 2023.
Comment: Trains an additional encoder that adds spatial conditioning controls to T2I diffusion models.
Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation. CVPR 2023.
Comment: Training-free method for image-to-image translation via injection of attention and convolutional features.