We collected a set of natural images and data renderings and compared their latents, then normalized the renderings so that their latents look more like those of natural images, which SD2 was (mostly) trained on.
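The statistics-matching step above can be sketched as a simple affine fit. This is a hypothetical helper for illustration (the function name and the use of plain mean/std matching are assumptions, not the authors' exact procedure): match the mean and standard deviation of rendering latents to those of natural-image latents.

```python
import numpy as np

def fit_latent_normalization(natural_latents, rendering_latents):
    # Hypothetical helper: derive an affine transform (scale, shift) so
    # that rendering latents match natural-image latent statistics.
    mu_n, sd_n = natural_latents.mean(), natural_latents.std()
    mu_r, sd_r = rendering_latents.mean(), rendering_latents.std()
    scale = sd_n / sd_r          # shrink/expand contrast
    shift = mu_n - mu_r * scale  # recenter
    return scale, shift

# Toy stand-ins for VAE latents of the two image sets.
rng = np.random.default_rng(0)
natural = rng.normal(0.0, 1.0, size=(64, 4, 32, 32))
render = rng.normal(0.3, 1.6, size=(64, 4, 32, 32))

scale, shift = fit_latent_normalization(natural, render)
normalized = render * scale + shift
```

After the transform, `normalized` has the same mean and standard deviation as the natural-image latents, which is the sense in which the rendering latents "look more like" natural ones.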
For images, we empirically found that this scaling lets the model converge faster. It also reduces the contrast of rendering latents to the typical level of natural images, which helps the model learn better under 'global timesteps' (the same reason we swap the noise schedule and choose a v-prediction base model).
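Concretely, the scale/unscale pair is just an invertible affine map applied around the denoising loop. A minimal sketch, with placeholder constants (the values below are assumptions for illustration, not necessarily the released ones; see the linked `pipeline.py` for the actual numbers):

```python
# Placeholder constants standing in for the values fit on data;
# the released pipeline defines its own scale_latents/unscale_latents.
LATENT_SCALE = 0.75
LATENT_SHIFT = 0.22

def scale_latents(latents):
    # Lower the contrast of rendering latents toward natural-image level
    # before they enter the diffusion model.
    return (latents - LATENT_SHIFT) * LATENT_SCALE

def unscale_latents(latents):
    # Exact inverse, applied before decoding with the VAE so the decoder
    # sees latents in its original range.
    return latents / LATENT_SCALE + LATENT_SHIFT
```

Because unscaling is the exact inverse of scaling, the transform changes the statistics the diffusion model trains on without losing any information at decode time.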
Dear authors, thanks for releasing zero123++.
May I know why you perform latent and image unscaling, and how you decided on the scaling ratio?
https://huggingface.co/sudo-ai/zero123plus-pipeline/blob/main/pipeline.py#L396
Thank you very much!