OMG: Occlusion-friendly Personalized Multi-concept Generation In Diffusion Models

1Sun Yat-sen University, 2Tencent AI Lab, 3International Digital Economy Academy, 4Nanjing University, 5Harbin Institute of Technology, Shenzhen, 6Shenzhen University, 7Tencent, 8The Hong Kong University of Science and Technology
*Corresponding Author

OMG is a framework for multi-concept image generation that supports character and style LoRAs. It can also be combined with InstantID to handle multiple IDs, using a single image for each ID.

Introduction of OMG, a tool for high-quality multi-character image generation.


Personalization is an important topic in text-to-image generation, and multi-concept personalization is especially challenging. Current multi-concept methods struggle with identity preservation, occlusion, and harmony between foreground and background. In this work, we propose OMG, an occlusion-friendly personalized generation framework designed to seamlessly integrate multiple concepts within a single image. We propose a novel two-stage sampling solution. The first stage handles layout generation and collects visual comprehension information for handling occlusions. The second stage uses the acquired visual comprehension information and the designed noise blending to integrate multiple concepts while accounting for occlusions. We also observe that the denoising timestep at which noise blending is initiated is key to identity preservation and layout. Moreover, our method can be combined with various single-concept models, such as LoRA and InstantID, without additional tuning. In particular, LoRA models can be used directly. Extensive experiments demonstrate that OMG exhibits superior performance in multi-concept personalization.
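The idea of noise blending in the second stage can be illustrated with a small sketch. This is not the OMG implementation: it assumes that blending means replacing the base model's noise prediction with a concept model's prediction inside that concept's mask, and that blending is switched on from a chosen step onward. The names `blend_noise`, `sample_with_blending`, and `blend_start`, the non-overlapping binary masks, and the toy update rule are all hypothetical and chosen only for illustration.

```python
import numpy as np


def blend_noise(base_noise, concept_noises, concept_masks):
    """Blend per-concept noise predictions into a base prediction.

    base_noise:     array of shape (C, H, W)
    concept_noises: list of arrays, each of shape (C, H, W)
    concept_masks:  list of binary arrays, each of shape (H, W);
                    assumed (hypothetically) not to overlap
    """
    blended = base_noise.copy()
    for noise, mask in zip(concept_noises, concept_masks):
        region = mask.astype(bool)
        # Inside the mask, use the concept model's prediction.
        blended[:, region] = noise[:, region]
    return blended


def sample_with_blending(latent, base_model, concept_models, masks,
                         num_steps, blend_start):
    """Toy denoising loop that starts blending at step `blend_start`.

    `base_model` and each entry of `concept_models` are stand-in callables
    (latent, step) -> noise prediction. The update rule below is a
    placeholder, not a real diffusion scheduler.
    """
    for step in range(num_steps):
        eps = base_model(latent, step)
        if step >= blend_start:
            concept_eps = [model(latent, step) for model in concept_models]
            eps = blend_noise(eps, concept_eps, masks)
        latent = latent - eps / num_steps
    return latent
```

In this sketch, changing `blend_start` shifts how many steps are driven purely by the base prediction before the concept predictions take over inside their masks, which is one way to picture why the starting timestep could affect both layout and identity.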

Trailer Demo: A short trailer, "Home Defense", created using OMG + SVD.

OMG + LoRA (ID with multiple images)


OMG + InstantID (ID with single image)


More Features

Combine with ControlNet

OMG is versatile and practical: it can be combined with ControlNet to condition generation on inputs such as human pose, Canny edges, and depth maps.


Combine with style LoRAs

OMG can be combined with different style LoRAs.
