Model/Pipeline/Scheduler description

Existing methods for facial identity transfer in diffusion-based image generation models struggle to achieve high-fidelity, detailed identity (ID) consistency, primarily due to insufficient fine-grained control over facial areas and the lack of a comprehensive ID-preservation strategy that fully accounts for intricate facial details. To address these limitations, the authors introduce ConsistentID, a method for diverse identity-preserving portrait generation under fine-grained multimodal facial prompts, using only a single reference image.
ConsistentID comprises three key components:

A fine-tuned IP-Adapter-FaceID-Plus module that captures the overall facial context from the reference image.

Expanded textual descriptions generated from the reference face image using LLaVA-1.5 to further refine facial features.

An ID-preservation network that injects Perceiver-remapped CLIP embeddings of separated facial regions into the embeddings of the expanded text prompt, optimized with a facial attention localization strategy to preserve ID consistency in facial regions.
Together, these components significantly enhance the accuracy of ID preservation by introducing fine-grained multimodal ID information from facial regions.
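The "Perceiver-remapped" step in the third component can be pictured as a small cross-attention resampler: a fixed set of learned latent queries attends to a variable number of per-region CLIP embeddings and emits a fixed number of ID tokens that can be concatenated with the text embeddings. The following is a minimal numpy sketch of that idea only, not ConsistentID's actual implementation; all dimensions, the function name, and the random weights stand in for learned parameters and are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def perceiver_resample(region_embeds, num_queries=4, dim=64, seed=0):
    """Map a variable number of per-region CLIP embeddings to a fixed
    number of ID tokens via a single cross-attention step: latent
    queries (random here, learned in practice) attend to the regions."""
    rng = np.random.default_rng(seed)
    n, d = region_embeds.shape
    queries = rng.standard_normal((num_queries, d))   # stand-in for learned latents
    w_q = rng.standard_normal((d, dim)) / np.sqrt(d)  # stand-in projection weights
    w_k = rng.standard_normal((d, dim)) / np.sqrt(d)
    w_v = rng.standard_normal((d, dim)) / np.sqrt(d)
    q, k, v = queries @ w_q, region_embeds @ w_k, region_embeds @ w_v
    attn = softmax(q @ k.T / np.sqrt(dim))            # (num_queries, n)
    return attn @ v                                   # (num_queries, dim)

# e.g. five facial regions (eyes, nose, mouth, ears, whole face), 768-d CLIP features
regions = np.random.default_rng(1).standard_normal((5, 768))
id_tokens = perceiver_resample(regions)
print(id_tokens.shape)  # (4, 64)
```

The point of the resampler is that the downstream network always receives the same number of ID tokens regardless of how many facial regions were detected in the reference image.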
Open source status
The model implementation is available.
The model weights are available (Only relevant if addition is not a scheduler).
Provide useful links for the implementation
arXiv: https://arxiv.org/pdf/2404.16771
Github: https://github.com/JackAILab/ConsistentID
Contact: @JackAILab