description
AI-powered lip-sync in seconds!

👄 How to use AI Lip Sync Generator?

Gooey.AI offers a simple, no-code solution for lip-syncing in any language. Lip sync animation can be handy in corporate training modules, ad campaigns, the retail and hospitality sectors, and any other video content that you need in multiple languages.

Why Lipsync?

Lipsync can:

  1. Provide multilingual brand information at speed
  2. Reduce production costs for communication and marketing teams
  3. Add a human touch to your content without added production budgets

Who is Lipsync for?

Lipsync is useful across many industries:

  1. In retail and hospitality, for orientation and information videos
  2. In corporate training modules
  3. In advertising campaigns with brand ambassadors
  4. In video content creation

Try it here

{% embed url="https://gooey.ai/lipsync-maker/?example_id=zjtgl707&run_id=ov8pu5q4&uid=fm165fOmucZlpa5YHupPBdcvDR02" %}

VIDEO TUTORIAL:

{% embed url="https://www.youtube.com/watch?v=RRmwQR-IytI" %}

How do you use the Lipsync Animation generator in Gooey.AI?

Step 1

Prep your avatar video or photograph. Here are some pointers when choosing your image:

  1. Make sure the image is high-resolution
  2. Ensure it clearly shows all the features of your talking head
  3. The image must be cropped to bust height
  4. Use only human faces

For this example, we have generated an avatar on Gooey.AI’s Image generator tool.

Try the tool here: https://gooey.ai/compare-ai-image-generators/?run_id=pqikl5mi&uid=fm165fOmucZlpa5YHupPBdcvDR02

Step 2

Prepare the text for your lip sync. Here is an example:

Hi, welcome to Gooey.AI. Where Shared AI Workflows Create Measurable Value.
Discover, customize and deploy low-code AI recipes using the best of private and open source Generative A.I.
Built for developers who code fast and teams that prove ROI. 

{% hint style="info" %} Note: Use shorter pieces of text to ensure a high-quality lip sync with low latency and minimal distortion. {% endhint %}
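If your script is long, one way to follow the note above is to break it into shorter pieces at sentence boundaries and submit each piece as its own run. The sketch below is only an illustration: the function name `split_into_chunks` and the `max_chars` limit are assumptions made for this example and are not Gooey.AI settings.

```python
import re


def split_into_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Group whole sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept intact as its own chunk.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


script = (
    "Hi, welcome to Gooey.AI. Where Shared AI Workflows Create Measurable Value. "
    "Discover, customize and deploy low-code AI recipes using the best of private "
    "and open source Generative A.I. "
    "Built for developers who code fast and teams that prove ROI."
)

for number, chunk in enumerate(split_into_chunks(script, max_chars=120), start=1):
    print(number, chunk)
```

Each printed piece can then be pasted into the text field separately. The 120-character limit here is arbitrary; pick whatever length gives you acceptable quality.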

Our workflow allows for multilingual lip-sync. Try our Spanish example below:

{% embed url="https://gooey.ai/lipsync-maker/?example_id=zjtgl707&run_id=3ns28gi5&uid=fm165fOmucZlpa5YHupPBdcvDR02" %}

Step 3

Hit “Submit” ☄️🚀

{% embed url="https://storage.googleapis.com/dara-c1b52.appspot.com/daras_ai/media/9f22b116-9d7a-11ee-8eac-02420a0001f9/gooey.ai%20lipsync.mp4#t=0.001" %}

Try it here:

{% embed url="https://gooey.ai/lipsync-maker/?example_id=ygblwbc1" %}

Advanced Settings

Face Padding

You can use the “Face Padding” settings to improve the accuracy of the detected face in the image/video. This ensures that the Lip Sync video looks more realistic.
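To build intuition for what padding does, here is a minimal sketch of the general idea: a detected face box is grown by a few pixels on each side and clamped to the image bounds. This is not Gooey.AI's implementation. The `Box` type, the `pad_face_box` function, and the per-side pixel parameters are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned face box in pixel coordinates (hypothetical type)."""
    left: int
    top: int
    right: int
    bottom: int


def pad_face_box(
    box: Box,
    image_width: int,
    image_height: int,
    pad_top: int = 0,
    pad_bottom: int = 0,
    pad_left: int = 0,
    pad_right: int = 0,
) -> Box:
    """Grow the box by the given padding, never leaving the image."""
    return Box(
        left=max(0, box.left - pad_left),
        top=max(0, box.top - pad_top),
        right=min(image_width, box.right + pad_right),
        bottom=min(image_height, box.bottom + pad_bottom),
    )


detected = Box(left=100, top=80, right=300, bottom=320)
padded = pad_face_box(detected, image_width=400, image_height=400, pad_bottom=20)
print(padded)
```

In this example only the bottom edge moves, which is the kind of adjustment you might try if the chin is being cut off in the output.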

Speech Provider

Our Lip Sync Animation generator includes several Speech Provider services. We have detailed each of them below so you can choose the speech provider that best suits your needs.

Google Cloud Text-to-Speech

Google offers a range of voices and accents.

How to use the settings:

  • You can select the voice from the dropdown.
  • To hear the various voice samples you can click on the link that is circled in red.
  • Use the “Speaking rate” and “Pitch” settings to ensure that the voice sounds closest to your brand and character’s personality.

Note: If you are looking for consistent, long-form speech across many languages, Google is an excellent choice. However, the voice will sound a little robotic and may not work for uses that require expressive, emotional speech synthesis.

ElevenLabs

ElevenLabs is currently one of the most popular synthetic voice services. They offer fast, accurate speech synthesis, with very realistic human tones.

How to use the settings:

  • Choose a voice from the “Voice Name” dropdown box
  • Choose a “Voice Model” - we recommend using “Multilingual V2” for more accuracy, more languages covered, more natural sounding voices and more stability
  • Stability setting - A lower stability provides a broader emotional range. A value lower than 0.3 can lead to too much instability.
  • Similarity Boost setting - Dictates how hard the model should try to replicate the original voice.
  • Style Exaggeration setting - This setting attempts to amplify the style of the original speaker. It requires more compute power and also increases the latency. We recommend you keep this setting on 0.
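The recommendations above can be written down as a small checklist. The sketch below is a hedged illustration, not part of Gooey.AI or the ElevenLabs API: the function name `check_voice_settings`, its parameter names, and the 0.0 to 1.0 range for each slider are assumptions made for this example.

```python
def check_voice_settings(
    stability: float,
    similarity_boost: float,
    style: float = 0.0,
) -> list[str]:
    """Return warnings for values that go against the guidance above."""
    warnings: list[str] = []
    values = {
        "stability": stability,
        "similarity_boost": similarity_boost,
        "style": style,
    }
    for name, value in values.items():
        # Assumption: each slider runs from 0.0 to 1.0.
        if not 0.0 <= value <= 1.0:
            warnings.append(f"{name}={value} is outside the assumed 0.0-1.0 range")
    if stability < 0.3:
        warnings.append("stability below 0.3 can lead to too much instability")
    if style > 0.0:
        warnings.append("style exaggeration above 0 needs more compute and adds latency")
    return warnings


print(check_voice_settings(stability=0.5, similarity_boost=0.75))
print(check_voice_settings(stability=0.2, similarity_boost=0.75, style=0.4))
```

The first call follows the guidance and produces no warnings; the second flags both the low stability and the non-zero style value.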

Custom Voice settings

You can learn more about custom voice settings here:

{% embed url="https://gooey.ai/docs/guides/lipsync-videos-with-custom-voices" %}

UberDuck.AI

UberDuck offers low-latency text-to-speech generation.

Bark (Suno.AI)

Bark is also a great service with several voice options. You can find all the various voice samples here.

Speech Provider Samples

Here are some samples of the various speech providers.

Google (English) - https://gooey.ai/compare-text-to-speech-engines/?run_id=n2b8ng36&uid=fm165fOmucZlpa5YHupPBdcvDR02

ElevenLabs (English) - https://gooey.ai/compare-text-to-speech-engines/?run_id=jj3vkot4&uid=fm165fOmucZlpa5YHupPBdcvDR02

UberDuck.AI (English) - https://gooey.ai/compare-text-to-speech-engines/?run_id=mvrb5fco&uid=fm165fOmucZlpa5YHupPBdcvDR02