Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model

Linear probes found controllable representations of scene attributes in a text-to-image diffusion model

Project page for "Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model".
Paper (arXiv): https://arxiv.org/abs/2306.05720
[NeurIPS link] [Poster link]

How to generate a short video of a moving foreground object using a pretrained text-to-image generative model?

See application_of_intervention.ipynb for how to use our intervention technique to generate a short video of moving objects.

Some examples:

The GIFs are sampled from the original text-to-image diffusion model without any fine-tuning. All frames are generated with the same prompt, random seed (initial latent vectors), and model weights. We edited the intermediate activations of the latent diffusion model during generation so that its internal representation of the foreground matches our reference mask, as sketched below. See the notebook for implementation details.
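
To make the mechanism concrete, here is a minimal sketch of one way to realize such an edit, assuming a gradient-based update: an intermediate UNet activation is nudged until a trained linear probe's per-pixel foreground prediction matches the reference mask. The names (`probe`, `target_mask`, `unet.mid_block`) and the optimization loop are illustrative assumptions, not the exact implementation; see application_of_intervention.ipynb for the actual procedure.

```python
import torch
import torch.nn.functional as F

def intervene(activation, probe, target_mask, lr=0.05, steps=10):
    """Nudge an intermediate UNet activation so the linear probe's
    per-pixel foreground logits match the reference mask."""
    h = activation.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([h], lr=lr)
    with torch.enable_grad():  # sampling usually runs under torch.no_grad()
        for _ in range(steps):
            opt.zero_grad()
            logits = probe(h)  # probe: activation -> per-pixel foreground logits
            loss = F.binary_cross_entropy_with_logits(logits, target_mask)
            loss.backward()
            opt.step()
    return h.detach()

# Attached as a forward hook on a chosen UNet block, the edited activation
# replaces the block's output during sampling (a forward hook that returns a
# value overrides the module's output):
# handle = unet.mid_block.register_forward_hook(
#     lambda module, inputs, output: intervene(output, probe, target_mask))
```

Varying the reference mask from frame to frame, while keeping the prompt and seed fixed, is what produces the moving-object GIFs above.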

Probe Weights:

Unzip probe_checkpoints.zip to obtain all probe weights we trained. The weights in the unzipped folder are sufficient to run all experiments shown in the paper.
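
As a hypothetical loading sketch (the file name below is a placeholder; match it to the actual layout of the unzipped folder, and note that each probe's input format depends on how it was trained):

```python
import torch

# Placeholder file name: substitute a checkpoint from the unzipped folder.
state = torch.load("probe_checkpoints/foreground_probe.pt", map_location="cpu")

# Assuming the checkpoint is the state dict of a linear probe, infer its
# dimensions from the stored weight matrix rather than hard-coding them.
weight = state["weight"]  # shape: (out_features, in_features)
probe = torch.nn.Linear(weight.shape[1], weight.shape[0])
probe.load_state_dict(state)
probe.eval()
```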

Citation

If you find the source code in this repo helpful, please cite:

@article{chen2023beyond,
  title={Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model},
  author={Chen, Yida and Vi{\'e}gas, Fernanda and Wattenberg, Martin},
  journal={arXiv preprint arXiv:2306.05720},
  year={2023}
}
