-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gradio demo for real-time conversations with WebRTC #150
Conversation
@freddyaboulton just curious- why are you using whisper to transcribe? |
Hi @eschmidbauer ! The audio is passed directly to ultravox in my demo but I used whisper to pass the previous audio prompts as text in the |
@freddyaboulton https://github.com/fixie-ai/ultravox/blob/main/ultravox/tools/gradio_helper.py#L15C13-L15C36 |
Thanks @freddyaboulton! Two quick replies to your questions:
AFAIU the We've had other issues with
Yes, the current |
@freddyaboulton Thanks for submitting this PR—it looks great! Regarding your two questions about streaming output and multi-turn conversation, both are supported in the Gradio demo implemented in ultravox/tools/gradio_demo.py (which requires start/stop recording for each user audio input). I was wondering if it might be possible to adapt some of the ideas from that demo into your Gradio demo? |
Hi @zqhuang211 ! Yes that is a great plan - will update my demo this week :) |
Hi @zqhuang211 , @farzadab, @eschmidbauer - I have updated the demo to use the |
2024-12-06.16-15-56.mp4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Thank you!
I will make some minor changes from my end.
@freddyaboulton there are some minor formatting issues. Can you run |
Should be fixed @zqhuang211 -thanks! |
This PR adds a gradio demo for real-time conversations with the latest ultravox model. The gradio demo leverages the WebRTC custom component for low-latency audio streaming both locally and on remote servers like EC2 and huggingface spaces.
You can see the demo running here
ultravox-demo.mp4
Key Features:
Improvements (need help from community/model authors!):