SyncTweedies: A General Generative Framework Based on Synchronized Diffusions

Jaihoon Kim*, Juil Koo*, Kyeongmin Yeo*, Minhyuk Sung (* Denotes equal contribution)

Introduction

This repository contains the official implementation of SyncTweedies. SyncTweedies can be applied to various downstream applications, including ambiguous image generation, arbitrary-sized image generation, 360° panorama generation, and texturing of 3D meshes and 3D Gaussians. More results can be found at our project webpage.

We introduce a general diffusion synchronization framework for generating diverse visual content, including ambiguous images, panorama images, 3D mesh textures, and 3D Gaussian splats textures, using a pretrained image diffusion model. We first present an analysis of various scenarios for synchronizing multiple diffusion processes through a canonical space. Based on the analysis, we introduce a novel synchronized diffusion method, SyncTweedies, which averages the outputs of Tweedie’s formula while conducting denoising in multiple instance spaces. Compared to previous work that achieves synchronization through finetuning, SyncTweedies is a zero-shot method that does not require any finetuning, preserving the rich prior of diffusion models trained on Internet-scale image datasets without overfitting to specific domains. We verify that SyncTweedies offers the broadest applicability to diverse applications and superior performance compared to the previous state-of-the-art for each application.
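
As a rough illustration of the idea described above, the following pure-Python sketch computes a Tweedie estimate of the clean sample in each instance space and averages the estimates in a shared canonical space. It is not the repository's implementation: the function names, the 1D signals, the identity/flip projections, and the value of `alpha_bar_t` are all assumptions chosen to keep the example small.

```python
import math
import random

def tweedie_x0(x_t, eps_pred, alpha_bar_t):
    """Tweedie estimate of the clean sample x_0 from a noisy sample x_t
    (variance-preserving diffusion with cumulative signal level alpha_bar_t)."""
    a = math.sqrt(alpha_bar_t)
    s = math.sqrt(1.0 - alpha_bar_t)
    return [(x - s * e) / a for x, e in zip(x_t, eps_pred)]

# Hypothetical 1-to-1 projections between a 1D canonical space and two views:
# view 0 is the identity, view 1 is a left-right flip (its own inverse).
def project(z, view):      # canonical -> instance
    return list(z) if view == 0 else list(reversed(z))

def unproject(x, view):    # instance -> canonical
    return list(x) if view == 0 else list(reversed(x))

def synchronize(x0_views):
    """Average the per-view x_0 estimates in the canonical space and
    project the average back to every instance space."""
    in_canonical = [unproject(x0, v) for v, x0 in enumerate(x0_views)]
    canonical = [sum(vals) / len(vals) for vals in zip(*in_canonical)]
    return [project(canonical, v) for v in range(len(x0_views))]

rng = random.Random(0)
alpha_bar_t = 0.5  # toy value, not taken from any real noise schedule
x_t = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(2)]
eps = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(2)]  # stand-in for the model's noise prediction

x0_views = [tweedie_x0(x, e, alpha_bar_t) for x, e in zip(x_t, eps)]
x0_synced = synchronize(x0_views)
```

In a full sampler, each instance space would presumably continue its own denoising step from the synchronized estimate; that part is omitted here.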

Environment Setup

Software Requirements

Python 3.8
CUDA 11.7
PyTorch 2.0.0

git clone https://github.com/KAIST-Visual-AI-Group/SyncTweedies
conda env create -f environment.yml
pip install git+https://github.com/openai/CLIP.git
pip install -e .
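
After installation, a quick sanity check along the following lines may help confirm that the environment matches the versions listed above. This is only a sketch: the helper names `environment_report` and `mismatches` are hypothetical, and the check degrades gracefully when PyTorch is not installed.

```python
import importlib.util
import sys

# Versions listed under "Software Requirements".
EXPECTED = {"python": "3.8", "torch": "2.0.0", "cuda": "11.7"}

def environment_report():
    """Collect the versions of the interpreter and, if present, PyTorch/CUDA."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    if importlib.util.find_spec("torch") is not None:
        import torch
        report["torch"] = torch.__version__.split("+")[0]  # drop local suffix such as "+cu117"
        report["cuda"] = torch.version.cuda                # None for CPU-only builds
    # Optional packages installed by the commands in this section.
    for name in ("clip", "pytorch3d"):
        report[name] = importlib.util.find_spec(name) is not None
    return report

def mismatches(report, expected=EXPECTED):
    """Return {key: (found, expected)} for every entry that differs."""
    return {k: (report.get(k), v) for k, v in expected.items() if report.get(k) != v}

report = environment_report()
print(report)
print("Mismatches:", mismatches(report))
```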

3D Mesh Texturing (PyTorch3D)

pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu117_pyt200/download.html

3D Gaussians Texturing (Differentiable 3D Gaussian Rasterizer - gsplat)

cd synctweedies/renderer/gaussian/gsplat
python setup.py install
pip install .

Data

3D Mesh Texturing

Use 3D mesh and prompt pairs from Text2Tex and TEXTure. Text2Tex uses a subset of the Objaverse dataset.

3D mesh texturing - data/mesh/turtle.obj (TEXTure), data/mesh/clutch_bag.obj (Text2Tex)

For 3D mesh texture editing, use the generated 3D mesh from Luma AI.

3D mesh texture editing (SDEdit) - data/mesh/sdedit/mesh.obj (Luma AI)

360° Panorama Generation

Use depth maps from 360MonoDepth to generate 360° panorama images.

360° panorama generation - data/panorama

3D Gaussians Texturing

Download the Synthetic NeRF dataset and reconstruct 3D scenes using either the 3D Gaussian Splatting framework or gsplat.

Use the reconstructed 3D scene for texturing 3D Gaussians.

3D Gaussians texturing - data/gaussians/chair and data/gaussians/chair.ply.

Inference

Please run the commands below to run each application.

Ambiguous Image

1-to-1 Projection

python main.py --app ambiguous_image --case_num 2 --tag ambiguous_image --save_dir_now

1-to-n Projection

python main.py --app ambiguous_image --case_num 2 --tag ambiguous_image --save_dir_now --views_names identity inner_rotate

n-to-1 Projection

python main.py --app ambiguous_image --case_num 2 --tag ambiguous_image --save_dir_now --optimize_inverse_mapping

--prompts

Text prompts to guide the generation process. (Provide a prompt per view)

--save_top_dir

Directory to save intermediate/final outputs.

--tag

Tag output directory.

--save_dir_now

Save output directory with current time.

--case_num

Denoising case num. Refer to the main paper for other cases. (Case 2 - SyncTweedies)

--seed

Random seed.

--views_names

View transformation applied to each denoising process.

--rotate_angle

Rotation angle for rotation transformations.

--initialize_xt_from_zt

Initialize the initial random noise by projecting from the canonical space.

--optimize_inverse_mapping

Use optimization for projection operation. (n-to-1 projection)

Arbitrary-sized Image

python main.py --app wide_image --prompt "A photo of a mountain range at twilight" --save_top_dir ./output --save_dir_now --tag wide_image --case_num 2 --seed 0 --sampling_method ddim --num_inference_steps 50 --panorama_height 512 --panorama_width 3072 --mvd_end 1.0 --initialize_xt_from_zt

--prompts

Text prompts to guide the generation process.

--save_top_dir

Directory to save intermediate/final outputs.

--tag

Tag output directory.

--save_dir_now

Save output directory with current time.

--case_num

Denoising case num. Refer to the main paper for other cases. (Case 2 - SyncTweedies)

--seed

Random seed.

--sampling_method

Denoising sampling method.

--num_inference_steps

Number of sampling steps.

--panorama_height

The height of the image to generate.

--panorama_width

The width of the image to generate.

--mvd_end

Step at which to stop the synchronization. (1.0 - Synchronize all timesteps, 0.0 - No synchronization)

--initialize_xt_from_zt

Initialize the initial random noise by projecting from the canonical space.
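
To give a feel for how a wide canonical image relates to fixed-size instance views, and for what `--mvd_end` controls, here is a small sketch. Everything in it is an assumption made for illustration rather than the repository's logic: the 8x latent downsampling (3072 px to 384 latent columns), the 64-column window, the stride of 16, and the reading of `--mvd_end` as the fraction of initial steps that are synchronized.

```python
def window_starts(canonical_width, window_width, stride):
    """Left edges of sliding windows that together cover [0, canonical_width)."""
    if window_width > canonical_width:
        raise ValueError("window is wider than the canonical space")
    last = canonical_width - window_width
    starts = list(range(0, last + 1, stride))
    if starts[-1] != last:
        starts.append(last)  # make sure the right edge is covered
    return starts

def coverage(canonical_width, window_width, starts):
    """How many windows overlap each canonical column (the averaging weight)."""
    counts = [0] * canonical_width
    for s in starts:
        for x in range(s, s + window_width):
            counts[x] += 1
    return counts

def synchronized_steps(num_inference_steps, mvd_end):
    """Assumed interpretation: synchronize the first `mvd_end` fraction of steps."""
    cutoff = round(num_inference_steps * mvd_end)
    return [i < cutoff for i in range(num_inference_steps)]

starts = window_starts(canonical_width=384, window_width=64, stride=16)
counts = coverage(384, 64, starts)
print(len(starts), "windows; overlap per column between", min(counts), "and", max(counts))
print(sum(synchronized_steps(50, 1.0)), "of 50 steps synchronized with mvd_end=1.0")
```

Columns covered by several windows are the places where per-view predictions would be averaged; columns covered by a single window are left unchanged by the averaging.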

3D Mesh Texturing

python main.py --app mesh --prompt "A hand carved wood turtle" --save_top_dir ./output --tag mesh  --save_dir_now --case_num 2 --mesh ./data/mesh/turtle.obj --seed 0 --sampling_method ddim --initialize_xt_from_zt

--prompts

Text prompts to guide the generation process.

--save_top_dir

Directory to save intermediate/final outputs.

--tag

Tag output directory.

--save_dir_now

Save output directory with current time.

--case_num

Denoising case num. Refer to the main paper for other cases. (Case 2 - SyncTweedies)

--mesh

Path to input 3D mesh.

--seed

Random seed.

--sampling_method

Denoising sampling method.

--initialize_xt_from_zt

Initialize the initial random noise by projecting from the canonical space.

--steps

Number of sampling steps.
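
To run the same mesh texturing job with several seeds, the command above can be assembled programmatically. The sketch below only uses flags that appear in the example command; the helper name `mesh_command` and the choice of three seeds are illustrative assumptions.

```python
import shlex

def mesh_command(prompt, mesh_path, seed, save_top_dir="./output", case_num=2):
    """Build the argument list for one 3D mesh texturing run."""
    return [
        "python", "main.py",
        "--app", "mesh",
        "--prompt", prompt,
        "--save_top_dir", save_top_dir,
        "--tag", "mesh",
        "--save_dir_now",
        "--case_num", str(case_num),
        "--mesh", mesh_path,
        "--seed", str(seed),
        "--sampling_method", "ddim",
        "--initialize_xt_from_zt",
    ]

commands = [
    mesh_command("A hand carved wood turtle", "./data/mesh/turtle.obj", seed)
    for seed in range(3)
]

for cmd in commands:
    # Print the shell-quoted command; to execute it instead, pass `cmd`
    # to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```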

3D Mesh Texture Editing

python main.py --app mesh --prompt "lantern" --save_top_dir ./output --tag mesh  --save_dir_now --case_num 2 --mesh ./data/mesh/sdedit/mesh.obj --seed 0 --sampling_method ddim --initialize_xt_from_zt --sdedit --sdedit_prompt "A Chinese style lantern" --sdedit_timestep 0.2

--sdedit

Enable 3D mesh texture editing.

--sdedit_prompt

Target editing prompt. This overrides the original prompt.

--sdedit_timestep

Timestep to add noise. (1.0 - x_0, 0.0 - x_T)
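
The sketch below illustrates the general SDEdit idea behind these options: noise an existing sample to an intermediate level, then denoise from there. It is an assumption-laden illustration, not the repository's code. In particular, the mapping from `--sdedit_timestep` to a starting step simply follows the convention stated above (1.0 corresponds to x_0, 0.0 to x_T), and the helper names are hypothetical.

```python
import math
import random

def add_noise(x0, alpha_bar_t, rng):
    """Forward diffusion: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    a = math.sqrt(alpha_bar_t)
    s = math.sqrt(1.0 - alpha_bar_t)
    return [a * x + s * rng.gauss(0, 1) for x in x0]

def sdedit_start_step(sdedit_timestep, num_inference_steps):
    """Assumed mapping: index of the first denoising step to run.
    0.0 -> start from x_T (run every step); 1.0 -> start from x_0 (run none)."""
    return round(sdedit_timestep * num_inference_steps)

rng = random.Random(0)
texture = [0.2, -0.4, 0.9, 0.1]               # stand-in for an existing texture
start = sdedit_start_step(0.2, 50)            # value used in the example command
noisy = add_noise(texture, alpha_bar_t=0.1, rng=rng)  # alpha_bar_t is a toy value
print("skip", start, "steps, run", 50 - start)
```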

360° Panorama

python main.py --app panorama --tag panorama --save_top_dir ./output --save_dir_now --prompt "An old looking library" --depth_data_path ./data/panorama/cf726b6c0144425282245b34fc4efdca_depth.dpt --case_num 2 --average_rgb --initialize_xt_from_zt --model controlnet

--prompts

Text prompts to guide the generation process.

--save_top_dir

Directory to save intermediate/final outputs.

--tag

Tag output directory.

--save_dir_now

Save output directory with current time.

--depth_data_path

Path to depth map image.

--case_num

Denoising case num. Refer to the main paper for other cases. (Case 2 - SyncTweedies)

--mesh

Path to input 3D mesh.

--seed

Random seed.

--sampling_method

Denoising sampling method.

--initialize_xt_from_zt

Initialize the initial random noise by projecting from the canonical space.

--steps

Number of sampling steps.

--canonical_rgb_h

Resolution (height) of the RGB canonical space.

--canonical_rgb_w

Resolution (width) of the RGB canonical space.

--canonical_latent_h

Resolution (height) of the latent canonical space.

--canonical_latent_w

Resolution (width) of the latent canonical space.

--instance_latent_size

Resolution of the latent instance space.

--instance_rgb_size

Resolution of the RGB instance space.

--theta_range

Azimuthal range (0-360)

--theta_interval

Interval of the azimuth.

--FOV

Field of view of each instance view.

--average_rgb

Perform averaging in the RGB domain (Only valid for Case 2 and Case 5).
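
The azimuth options determine how many perspective views cover the panorama. The following sketch shows one plausible way to enumerate view directions and check that neighboring views overlap. The treatment of `--theta_range` as a (start, end) pair in degrees and the values 45 and 72 are assumptions for illustration, not the repository's defaults.

```python
def view_azimuths(theta_range=(0, 360), theta_interval=45):
    """Azimuth angle (degrees) of each instance view, assuming evenly spaced views."""
    start, end = theta_range
    return list(range(start, end, theta_interval))

def horizontal_overlap(fov, theta_interval):
    """Degrees shared by two neighboring views; non-positive means a gap."""
    return fov - theta_interval

azimuths = view_azimuths((0, 360), 45)
print(len(azimuths), "views at", azimuths)
print("overlap between neighbors:", horizontal_overlap(fov=72, theta_interval=45), "degrees")
```

Overlapping regions are where predictions from different views would meet in the canonical space, so a field of view larger than the azimuth interval is needed for the views to constrain one another.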

3D Gaussians Texturing

python main.py --app gs --tag gs --save_dir_now --save_top_dir ./output --prompt "A photo of majestic red throne, adorned with gold accents" --source_path ./data/gaussians/chair --plyfile ./data/gaussians/chair.ply --dataset_type blender --case_num 2 --zt_init --force_clean_composition

--prompts

Text prompts to guide the generation process.

--save_top_dir

Directory to save intermediate/final outputs.

--tag

Tag output directory.

--save_dir_now

Save output directory with current time.

--case_num

Denoising case num. Refer to the main paper for other cases. (Case 2 - SyncTweedies)

--source_path

Path to input dataset (Refer to 3D Gaussian Splatting repo for data format).

--plyfile

Path to 3D Gaussians model plyfile.

--dataset_type

Input dataset type {colmap, blender}.

--zt_init

Initialize the initial random noise by projecting from the canonical space.

--no-antialiased

Used for 3D scenes trained with 3D Gaussian Splatting framework. Do not provide this option when using 3D scenes reconstructed with gsplat.

Citation

@article{kim2024synctweedies,
  title={SyncTweedies: A General Generative Framework Based on Synchronized Diffusions},
  author={Kim, Jaihoon and Koo, Juil and Yeo, Kyeongmin and Sung, Minhyuk},
  journal={arXiv preprint arXiv:2403.14370},
  year={2024}
}

Acknowledgement

This repository is based on Visual Anagrams, SyncMVD, and gsplat. We thank the authors for publicly releasing their code.
