
[🌟 New Model] ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation #8414

Bai-YT opened this issue Jun 5, 2024 · 6 comments · May be fixed by #8739

Comments

@Bai-YT

Bai-YT commented Jun 5, 2024

Model/Pipeline/Scheduler description

ConsistencyTTA, introduced in the paper Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation, is an efficient text-to-audio generation model. Compared to an equivalent diffusion-based TTA model, ConsistencyTTA achieves a 400x generation speed-up while retaining generation quality and diversity.

Due to its high generation quality and fast inference, we believe integrating this model into diffusers will make diffusers more appealing to text-to-audio generation researchers and users! Thank you very much.

Open source status

  • The model implementation is available.
  • The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

The open-source code implementation can be found at https://github.com/Bai-YT/ConsistencyTTA.

There is also a simplified implementation for inference only: https://github.com/Bai-YT/ConsistencyTTA/tree/main/easy_inference.

The model checkpoints can be found at https://huggingface.co/Bai-YT/ConsistencyTTA.
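
For reference, here is a minimal sketch of fetching the released checkpoints from the Hub with huggingface_hub (this assumes the repo follows the standard Hub layout; adjust paths to however you organize local files):

```python
# Minimal sketch: download the ConsistencyTTA checkpoints from the Hugging Face Hub.
# Assumes the standard Hub repo layout; the local directory contains whatever files the repo hosts.
from huggingface_hub import snapshot_download

checkpoint_dir = snapshot_download(repo_id="Bai-YT/ConsistencyTTA")
print(f"Checkpoints downloaded to: {checkpoint_dir}")
```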

I am the main author of the code and am more than happy to assist with the integration.

@Bai-YT changed the title ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation [🌟 New Model] ConsistencyTTA: Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation Jun 5, 2024
@sayakpaul
Member

@sanchit-gandhi @Vaibhavs10 FYI.

@a-r-r-o-w
Member

@Bai-YT Thank you for your awesome work! I just finished reading through the paper and think I have a good grasp of the modeling and inference code to convert to diffusers.

@sayakpaul Could I pick this up if no one's working on it?

@sayakpaul
Member

Yeah for sure.

@yiyixuxu
Collaborator

@a-r-r-o-w cool! But let's put it in the community folder to start with.
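
For context, diffusers community pipelines are loaded by passing `custom_pipeline` to `DiffusionPipeline.from_pretrained`. A minimal sketch follows; the pipeline name and the diffusers-format checkpoint repo are assumptions until the linked PR lands:

```python
# Minimal sketch of loading a community pipeline in diffusers.
# The pipeline name "consistency_tta" is hypothetical until the ConsistencyTTA PR is merged,
# and the checkpoint repo may first need conversion to the diffusers format.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Bai-YT/ConsistencyTTA",            # assumed diffusers-format checkpoint repo
    custom_pipeline="consistency_tta",  # hypothetical community pipeline name
    torch_dtype=torch.float16,
)
```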

@a-r-r-o-w
Member

Sure, sounds good.

@Bai-YT
Author

Bai-YT commented Jun 27, 2024

> @Bai-YT Thank you for your awesome work! I just finished reading through the paper and think I have a good grasp of the modeling and inference code to convert to diffusers.
>
> @sayakpaul Could I pick this up if no one's working on it?

I appreciate everyone taking the time to help. Massive thanks!

@a-r-r-o-w linked a pull request Jun 29, 2024 that will close this issue