SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models

CVPR 2025

Previously StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Draw multiple prompt-masks in a large canvas	Real-time creation

Jaerin Lee · Daniel Sungho Jung · Kanggeon Lee · Kyoung Mu Lee

Project page: https://jaerinlee.com/research/semantic-draw · arXiv: https://arxiv.org/abs/2403.09055

SemanticDraw is a real-time interactive text-to-image generation framework that allows you to draw with meanings 🧠 using semantic brushes 🖌️.

🚀 Quick Start

# Install
conda create -n semdraw python=3.12 && conda activate semdraw
git clone https://github.com/ironjr/semantic-draw
cd semantic-draw
pip install -r requirements.txt

# Run streaming demo
cd demo/stream
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000

# Open http://localhost:8000 in your browser

For SD3 support, additionally run:

pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3

Note: requirements.txt already includes this package by default.

⭐ Features

Interactive Drawing	Prompt Separation	Real-time Editing

Paint with semantic brushes	No unwanted content mixing	Edit photos in real-time

🔧 Installation

Basic Installation

conda create -n semdraw python=3.12 && conda activate semdraw
git clone https://github.com/ironjr/semantic-draw
cd semantic-draw
pip install -r requirements.txt

Stable Diffusion 3 Support

pip install git+https://github.com/initml/diffusers.git@clement/feature/flash_sd3

🎨 Demo Applications

We provide several demo applications with different features and model support:

1. StreamMultiDiffusion (Main Demo)

Real-time streaming interface with semantic drawing capabilities.

cd demo/stream
python app.py --model "your-model" --height 512 --width 512 --port 8000

Options

Option	Description	Default
`--model`	Path to SD1.5 checkpoint (HF or local .safetensors)	None
`--height`	Canvas height	768
`--width`	Canvas width	1920
`--bootstrap_steps`	Number of bootstrapping steps for semantic region separation (1-3 recommended)	1
`--seed`	Random seed	2024
`--device`	GPU device number	0
`--port`	Web server port	8000
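
The options table above can be read as a command-line parser. The following is a minimal sketch of such a parser, written only to make the flags and defaults concrete; it is not the actual parser in demo/stream/app.py, which may define more options or different types. The function name `build_parser` is an assumption made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags and defaults mirror the options table (sketch)."""
    parser = argparse.ArgumentParser(description="Streaming demo options (sketch)")
    parser.add_argument("--model", type=str, default=None,
                        help="SD1.5 checkpoint: Hugging Face ID or local .safetensors")
    parser.add_argument("--height", type=int, default=768, help="Canvas height")
    parser.add_argument("--width", type=int, default=1920, help="Canvas width")
    parser.add_argument("--bootstrap_steps", type=int, default=1,
                        help="Semantic region separation (1-3 recommended)")
    parser.add_argument("--seed", type=int, default=2024, help="Random seed")
    parser.add_argument("--device", type=int, default=0, help="GPU device number")
    parser.add_argument("--port", type=int, default=8000, help="Web server port")
    return parser


# Same flags as the example command: a 512x512 canvas on the default port.
args = build_parser().parse_args(
    ["--model", "your-model", "--height", "512", "--width", "512"]
)
print(args.height, args.width, args.port)
```

Flags that are omitted fall back to the defaults in the table, so running without `--height` and `--width` would give a 768x1920 canvas.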

2. Semantic Palette

Simplified interface for different SD versions:

SD 1.5 Version

cd demo/semantic_palette
python app.py --model "runwayml/stable-diffusion-v1-5" --port 8000

SDXL Version

cd demo/semantic_palette_sdxl
python app.py --model "your-sdxl-model" --port 8000

SD3 Version

cd demo/semantic_palette_sd3
python app.py --port 8000

Using Custom Models (.safetensors)

Place your .safetensors file in the demo's checkpoints folder
Run with: python app.py --model "your-model.safetensors"
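
To make the two steps above concrete, here is a small sketch of how a `--model` value could be resolved: names ending in .safetensors are looked up in the checkpoints folder, and anything else is passed through as a Hugging Face model ID. The helper `resolve_model` and its `checkpoints_dir` parameter are hypothetical; the lookup logic inside the demos may differ.

```python
from pathlib import Path


def resolve_model(model: str, checkpoints_dir: str = "checkpoints") -> str:
    """Return a local checkpoint path for .safetensors names, else the ID unchanged.

    Hypothetical helper for illustration, not the demo's own code.
    """
    if model.endswith(".safetensors"):
        path = Path(checkpoints_dir) / model
        if not path.is_file():
            raise FileNotFoundError(f"Expected checkpoint at {path}")
        return str(path)
    # Anything else is assumed to be a Hugging Face model ID.
    return model


# A Hugging Face ID is returned as-is.
print(resolve_model("runwayml/stable-diffusion-v1-5"))
```

Failing early with a clear error when the file is missing is a design choice of this sketch; it avoids a less readable failure later during model loading.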

💻 Usage Examples

Python API

Basic Generation

import torch
from model import StableMultiDiffusionPipeline

# Initialize
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device, hf_key='runwayml/stable-diffusion-v1-5')

# Generate
image = smd.sample('A photo of the dolomites')
image.save('output.png')

Region-Based Generation

import torch
from model import StableMultiDiffusionPipeline
from util import seed_everything

# Setup
seed_everything(2024)
device = torch.device('cuda:0')
smd = StableMultiDiffusionPipeline(device)

# Define prompts and masks
prompts = ['background: city', 'foreground: a cat', 'foreground: a dog']
masks = load_masks()  # Your mask loading logic

# Generate
image = smd(prompts, masks=masks, height=768, width=768)
image.save('output.png')
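
The `load_masks()` call above is a placeholder. As a rough illustration of what it might return, the sketch below builds one binary mask per prompt, in the same order as `prompts`: two rectangular foreground regions and a background that covers everything else. All helper names are hypothetical, and the masks are plain nested lists because the format the pipeline expects (tensor shape, dtype, device) is not stated here; convert them to whatever your version of the pipeline accepts.

```python
def rect_mask(height, width, top, left, bottom, right):
    """Binary mask (nested lists): 1 inside rows [top, bottom) and columns [left, right)."""
    return [[1 if top <= y < bottom and left <= x < right else 0
             for x in range(width)] for y in range(height)]


def background_mask(foregrounds):
    """Mask that is 1 wherever no foreground mask is set (an assumed convention)."""
    height, width = len(foregrounds[0]), len(foregrounds[0][0])
    return [[0 if any(m[y][x] for m in foregrounds) else 1
             for x in range(width)] for y in range(height)]


def make_example_masks(height=8, width=8):
    """Return [background, cat, dog] masks, matching the order of the prompts."""
    cat = rect_mask(height, width, 2, 0, 6, 3)   # left region
    dog = rect_mask(height, width, 2, 5, 6, 8)   # right region
    return [background_mask([cat, dog]), cat, dog]


masks = make_example_masks()
print(len(masks), len(masks[0]), len(masks[0][0]))
```

In practice the masks would be drawn by hand or loaded from image files at the target resolution; the tiny 8x8 grid here only keeps the example readable.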

Streaming Generation

import torch
from model import StreamMultiDiffusion

# Initialize streaming pipeline
device = torch.device('cuda:0')
smd = StreamMultiDiffusion(device, height=512, width=512)

# Register layers (bg_mask and obj_mask are masks you supply, one per prompt)
smd.update_single_layer(idx=0, prompt='background', mask=bg_mask)
smd.update_single_layer(idx=1, prompt='object', mask=obj_mask)

# Stream generation
while True:
    image = smd()
    display(image)  # Replace with your own display logic
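
The `while True` loop above never exits. When experimenting, it can help to bound the loop and measure throughput. The sketch below wraps any zero-argument callable, such as the `smd` pipeline object, and yields each frame together with the running average frame rate. `stream_frames` and `fake_generate` are hypothetical names introduced for illustration; the stand-in generator lets the code run without a GPU.

```python
import time
from typing import Callable, Iterator, Tuple


def stream_frames(generate: Callable[[], object],
                  max_frames: int) -> Iterator[Tuple[object, float]]:
    """Yield (frame, average_fps) pairs for at most max_frames calls to generate()."""
    start = time.perf_counter()
    for count in range(1, max_frames + 1):
        frame = generate()
        elapsed = time.perf_counter() - start
        # Guard against a zero interval on very fast stand-in generators.
        fps = count / elapsed if elapsed > 0 else float("inf")
        yield frame, fps


def fake_generate() -> str:
    """Stand-in for the pipeline call; pass `smd` instead in real use."""
    return "frame"


for frame, fps in stream_frames(fake_generate, max_frames=3):
    print(frame, f"{fps:.1f} FPS")
```

With the real pipeline, the call would look like `for image, fps in stream_frames(smd, max_frames=100): ...`, assuming `smd()` returns one image per call as in the loop above.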

Jupyter Notebooks

Explore our notebooks directory for interactive examples:

Basic usage tutorial
Advanced region control
SD3 examples
Custom model integration

📖 Documentation

Detailed Guides

Paper

For technical details, see our paper and project page.

🙋 FAQ

What is Semantic Palette?

Semantic Palette lets you paint with text prompts instead of colors. Each brush carries a meaning (prompt) that generates appropriate content in real-time.

Which models are supported?

✅ Stable Diffusion 1.5 and variants
✅ SDXL and variants (with Lightning LoRA)
✅ Stable Diffusion 3
✅ Custom .safetensors checkpoints

What are the hardware requirements?

Minimum: GPU with 8GB VRAM (for 512x512).
Recommended: GPU with 11GB VRAM for larger resolutions (tested on a GTX 1080 Ti).

🚩 Recent Updates

🔥 June 2025: Presented at CVPR 2025
✅ June 2024: SD3 support with Flash Diffusion
✅ April 2024: StreamMultiDiffusion v2 with responsive UI
✅ March 2024: SDXL support with Lightning LoRA
✅ March 2024: First version released

See README_old.md for full history.

🌏 Citation

@inproceedings{lee2025semanticdraw,
    title="{SemanticDraw:} Towards Real-Time Interactive Content Creation from Image Diffusion Models",
    author={Lee, Jaerin and Jung, Daniel Sungho and Lee, Kanggeon and Lee, Kyoung Mu},
    booktitle={CVPR},
    year={2025}
}

🤗 Acknowledgements

Built upon StreamDiffusion, MultiDiffusion, and LCM. Special thanks to the Hugging Face team and the model contributors.

📧 Contact

Please email the authors or open an issue on GitHub.