FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition
Sicheng Mo1*,  Fangzhou Mu2*,  Kuan Heng Lin1,  Yanli Liu3,  Bochen Guan3,  Yin Li2,  Bolei Zhou1
1 UCLA,  2 University of Wisconsin-Madison,  3 Innopeak Technology, Inc.
* Equal contribution
Overview
In this work, we present FreeControl, a training-free approach to controllable T2I generation that supports multiple conditions, architectures, and checkpoints simultaneously. FreeControl operates in two stages. In the analysis stage, it queries a T2I model to generate as few as one seed image and constructs a linear feature subspace from the resulting diffusion features. In the synthesis stage, it applies guidance within this subspace: structure guidance aligns the generated image with the layout of a guidance image, while appearance guidance transfers appearance from a sibling image generated with the same seed but without control.
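For intuition, here is a minimal PyTorch sketch of the two stages. It is a simplification under stated assumptions: the function names (`analysis_stage`, `structure_loss`, `appearance_loss`) are illustrative rather than the paper's API, PCA illustrates one way to build the linear subspace, and the random tensors stand in for features that would in practice be hooked out of the diffusion U-Net at each denoising step.

```python
import torch

def analysis_stage(seed_features: torch.Tensor, k: int = 64) -> torch.Tensor:
    """Construct a linear feature subspace from seed-image features.

    seed_features: (N, C) diffusion features flattened over spatial
    positions, collected from as few as one generated seed image.
    Returns a (C, k) PCA basis whose columns span the subspace.
    """
    # Low-rank PCA; V's columns are the top-k principal directions.
    _, _, v = torch.pca_lowrank(seed_features, q=k)
    return v

def structure_loss(feat_gen, feat_cond, basis):
    """Match generated and guidance-image features after projecting
    both onto the semantic subspace (structure guidance)."""
    return ((feat_gen @ basis - feat_cond @ basis) ** 2).mean()

def appearance_loss(feat_gen, feat_sibling):
    """Pull the controlled sample's feature statistics toward those of a
    sibling image generated from the same seed without control
    (a simplified stand-in for the paper's appearance energy)."""
    return ((feat_gen.mean(0) - feat_sibling.mean(0)) ** 2).mean()

# Schematic denoising step; real features would come from U-Net hooks.
N, C, k = 256, 1280, 64
basis = analysis_stage(torch.randn(N, C), k)            # analysis stage
feats = torch.randn(N, C, requires_grad=True)           # stand-in features
f_cond, f_sib = torch.randn(N, C), torch.randn(N, C)
energy = structure_loss(feats, f_cond, basis) + 0.2 * appearance_loss(feats, f_sib)
grad, = torch.autograd.grad(energy, feats)
feats = (feats - 1.0 * grad).detach()                   # guided update
```

Because the guidance signal is just the gradient of a subspace energy applied during sampling, no model weights are touched, which is what lets the same procedure work across architectures and checkpoints.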
Qualitative Results
Controllable generation with T2I diffusion models.
More Results
Any condition generation:
BibTeX
@article{mo2023freecontrol,
  title={FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition},
  author={Mo, Sicheng and Mu, Fangzhou and Lin, Kuan Heng and Liu, Yanli and Guan, Bochen and Li, Yin and Zhou, Bolei},
  journal={arXiv preprint arXiv:2312.07536},
  year={2023}
}
Related Work
Lvmin Zhang, Anyi Rao, Maneesh Agrawala. Adding Conditional Control to Text-to-Image Diffusion Models. ICCV 2023.
Comment: Trains an auxiliary encoder that adds spatial conditioning controls to T2I diffusion models.
Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel. Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation. CVPR 2023.
Comment: A training-free method for text-driven image-to-image translation via injection of attention and convolutional features.