arXiv:2509.17120

Stencil: Subject-Driven Generation with Context Guidance

Published on Sep 21, 2025

AI-generated summary

Stencil combines a lightweight, fine-tuned diffusion model with a large frozen pre-trained one to generate high-fidelity, subject-consistent images efficiently.

Abstract

Recent text-to-image diffusion models can generate striking visuals from text prompts, but they often fail to maintain subject consistency across generations and contexts. A major limitation of current fine-tuning approaches is the inherent trade-off between quality and efficiency: fine-tuning large models improves fidelity but is computationally expensive, while fine-tuning lightweight models improves efficiency but compromises image fidelity. Moreover, fine-tuning a pre-trained model on a small set of subject images can damage its existing priors, yielding suboptimal results. To this end, we present Stencil, a novel framework that jointly employs two diffusion models at inference time. Stencil efficiently fine-tunes a lightweight model on images of the subject, while a large, frozen pre-trained model provides contextual guidance, injecting rich priors into the generation with minimal overhead. Stencil excels at generating high-fidelity, novel renditions of the subject in under a minute, delivering state-of-the-art performance and setting a new benchmark in subject-driven generation.
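
The abstract describes the mechanism only at a high level: a lightweight model is fine-tuned on the subject, and a frozen large model steers each denoising step. Below is a minimal sketch of one plausible reading, written against diffusers-style UNet and scheduler interfaces. The function name, the `context_scale` weight, and the linear blending rule are illustrative assumptions, not the paper's actual formulation.

```python
import torch

@torch.no_grad()
def stencil_style_sample(light_unet, large_unet, scheduler, prompt_emb,
                         num_steps=30, context_scale=0.5,
                         latent_shape=(1, 4, 64, 64)):
    """Hypothetical sketch: denoise with a subject-fine-tuned lightweight
    UNet while a frozen large UNet injects contextual guidance.

    Assumes both UNets share a latent space and text-encoder embedding
    format; the linear blend below is an assumption for illustration,
    not the combination rule from the paper."""
    latents = torch.randn(latent_shape)
    scheduler.set_timesteps(num_steps)
    for t in scheduler.timesteps:
        x = scheduler.scale_model_input(latents, t)
        # Subject fidelity from the cheap, fine-tuned model.
        eps_subject = light_unet(x, t, encoder_hidden_states=prompt_emb).sample
        # Rich priors from the large, frozen model (no fine-tuning cost).
        eps_context = large_unet(x, t, encoder_hidden_states=prompt_emb).sample
        # Assumed combination: pull the subject prediction toward the
        # context prediction by a fixed weight.
        eps = eps_subject + context_scale * (eps_context - eps_subject)
        latents = scheduler.step(eps, t, latents).prev_sample
    return latents
```

Note that only the lightweight model is ever fine-tuned, which is consistent with the claimed per-subject cost of under a minute; the fixed-weight blend above is just one simple way to realize "contextual guidance" at inference time.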
