LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LLM-grounded Diffusion: LMD)

nari95park/LLM-groundedDiffusion

 
 


LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Long Lian, Boyi Li, Adam Yala, Trevor Darrell at UC Berkeley/UCSF.

Paper | Project Page | 5-minute Blog Post | HuggingFace Demo (stage 1 and 2) | Citation

TL;DR: Text Prompt -> LLM as a request parser -> Intermediate Representation (such as an image layout) -> Stable Diffusion -> Image.
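The hand-off between the two stages in the TL;DR can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: the response format, the 512-pixel canvas, and the names `fake_llm`, `parse_layout`, `Box`, and `Layout` are all assumptions made for this example. The LLM call is replaced by a canned string, and the Stable Diffusion stage is omitted; the sketch only shows how a text response could become a structured image layout that a layout-to-image stage might consume.

```python
import ast
from dataclasses import dataclass

CANVAS = 512  # assumed square canvas size in pixels (hypothetical)


@dataclass
class Box:
    caption: str
    x: int       # left edge in pixels
    y: int       # top edge in pixels
    width: int
    height: int

    def normalized(self, canvas=CANVAS):
        """Return (x0, y0, x1, y1) as fractions of the canvas, clipped to [0, 1]."""
        def clip(v):
            return min(max(v / canvas, 0.0), 1.0)
        return (clip(self.x), clip(self.y),
                clip(self.x + self.width), clip(self.y + self.height))


@dataclass
class Layout:
    boxes: list
    background: str


def fake_llm(prompt):
    """Stand-in for stage 1. A real system would query an LLM here."""
    return (
        "Objects: [('a gray cat', [64, 256, 128, 128]), "
        "('an orange dog', [256, 192, 192, 256])]\n"
        "Background prompt: A realistic photo of a grassy area"
    )


def parse_layout(response):
    """Parse the assumed 'Objects: ...' / 'Background prompt: ...' text format."""
    boxes, background = [], ""
    for line in response.strip().splitlines():
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key == "objects":
            # literal_eval only accepts Python literals, so it will not
            # execute arbitrary code found in the LLM output.
            for caption, (x, y, w, h) in ast.literal_eval(value.strip()):
                boxes.append(Box(caption, x, y, w, h))
        elif key == "background prompt":
            background = value.strip()
    return Layout(boxes, background)


layout = parse_layout(fake_llm("A gray cat and an orange dog on the grass"))
for box in layout.boxes:
    print(box.caption, box.normalized())
print("background:", layout.background)
```

Keeping the intermediate representation as plain data makes it easy to inspect or edit a layout before any image is generated.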


Updates

Our HuggingFace demo for stages 1 and 2 has been released.

Our code (stage 1 and stage 2) is also available to run locally, including a version with a Web UI. See the code and instructions to run.

Our code supporting text-to-layout (stage 1) and layout-to-image (stage 2) generation has been released. See the code and instructions to run.

LLM-grounded Diffusion (LMD)

Enhanced Prompt Understanding

Visualizations

Additional Capabilities: Multi-round Scene Specification/Generation from Non-English Prompts


Contact us

Please contact Long (Tony) Lian if you have any questions: [email protected].

Citation

If you use this work or find it helpful, please consider citing it.

@article{lian2023llmgrounded,
    title={LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models}, 
    author={Lian, Long and Li, Boyi and Yala, Adam and Darrell, Trevor},
    journal={arXiv preprint arXiv:2305.13655},
    year={2023}
}


Languages

  • Python 100.0%