The Multi-Lingual Speech App uses AI services to automate the flow from audio recording through translation to speech synthesis. It transcribes English speech into text, translates the text into Spanish or French, and synthesizes the translated text back into speech.
- Automated Audio Processing Workflow: Streamlines the process of converting spoken words into another language's speech output.
- Speech-to-Text Transcription: Utilizes Google Speech-to-Text API to accurately transcribe audio recordings into text.
- Multi-Language Translation: Employs DeepL API for high-quality translations from English to Spanish or French.
- Text-to-Speech Synthesis: Uses Google Text-to-Speech API to generate natural-sounding speech in Spanish or French from the translated text.
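As a rough sketch, the three services listed above might be called from Python as follows. It assumes the `google-cloud-speech`, `google-cloud-texttospeech`, and `deepl` client packages (requirements.txt is the authority on what the project actually uses), WAV or FLAC input, and a `DEEPL_API_KEY` environment variable. The `LANGUAGE_CODES` table and the function names are hypothetical and not taken from app.py.

```python
import os

# Hypothetical mapping: dropdown choice -> (DeepL target code, Google TTS voice locale).
LANGUAGE_CODES = {
    "Spanish": ("ES", "es-ES"),
    "French": ("FR", "fr-FR"),
}


def transcribe(audio_bytes):
    """Transcribe English speech with Google Speech-to-Text."""
    from google.cloud import speech  # assumed package: google-cloud-speech

    client = speech.SpeechClient()
    # Encoding and sample rate are read from the header of WAV/FLAC files;
    # other formats need them set explicitly on RecognitionConfig.
    config = speech.RecognitionConfig(language_code="en-US")
    audio = speech.RecognitionAudio(content=audio_bytes)
    response = client.recognize(config=config, audio=audio)
    return " ".join(r.alternatives[0].transcript for r in response.results)


def translate(text, language):
    """Translate English text into the chosen language with DeepL."""
    import deepl  # assumed package: deepl

    deepl_code, _ = LANGUAGE_CODES[language]
    translator = deepl.Translator(os.environ["DEEPL_API_KEY"])
    result = translator.translate_text(text, source_lang="EN", target_lang=deepl_code)
    return result.text


def synthesize(text, language):
    """Return MP3 bytes of the translated text spoken by Google Text-to-Speech."""
    from google.cloud import texttospeech  # assumed package: google-cloud-texttospeech

    _, voice_locale = LANGUAGE_CODES[language]
    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code=voice_locale),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    return response.audio_content
```

The client libraries are imported inside each function so the module loads even when a package is missing. The Google clients pick up credentials from `GOOGLE_APPLICATION_CREDENTIALS` on their own.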
- Choose your target language (Spanish or French) from the dropdown.
- Record your speech through the application interface.
- The app automatically transcribes the speech into text using the Google Speech-to-Text API.
- The app translates the text using the DeepL API.
- Finally, listen to the translated speech, synthesized by the Google Text-to-Speech API.
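The steps above amount to a three-stage pipeline. The sketch below shows only the ordering and hand-off between stages. `run_pipeline`, `PipelineResult`, and the three callables are hypothetical names; in the real app each stage would wrap the corresponding Google or DeepL client.

```python
from collections import namedtuple

SUPPORTED_TARGETS = ("Spanish", "French")

PipelineResult = namedtuple("PipelineResult", ["transcript", "translation", "audio"])


def run_pipeline(recording, target, transcribe, translate, synthesize):
    """Run transcription, translation, and synthesis in that order.

    Each stage is passed in as a callable, so the flow can be exercised
    with stubs instead of live API clients.
    """
    if target not in SUPPORTED_TARGETS:
        raise ValueError("unsupported target language: {!r}".format(target))

    transcript = transcribe(recording)
    if not transcript.strip():
        # Nothing to translate, so stop before calling the later stages.
        raise ValueError("no speech was recognized in the recording")

    translation = translate(transcript, target)
    audio = synthesize(translation, target)
    return PipelineResult(transcript, translation, audio)


if __name__ == "__main__":
    # Stub stages stand in for the real API clients so this runs offline.
    result = run_pipeline(
        b"fake-audio",
        "Spanish",
        transcribe=lambda recording: "good morning",
        translate=lambda text, target: "buenos días",
        synthesize=lambda text, target: text.encode("utf-8"),
    )
    print(result.transcript, "->", result.translation)
```

Validating the target language and the transcript up front avoids calling the translation and synthesis stages for a request that cannot succeed.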
See the section below titled "Installation Guide".
This project is licensed under the MIT License. See the LICENSE file for details.
This section guides you through setting up the Multi-Lingual Speech App on your local machine for development and testing purposes. Follow these steps to get a copy of the project up and running.
Before installing the application, ensure you have the following:
- Python 3.6 or later installed on your system.
- Pip for installing Python packages.
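The two prerequisites above can be checked with a short script such as the following sketch. The names `MIN_PYTHON`, `python_ok`, and `pip_available` are illustrative, and the script deliberately avoids f-strings so that it still parses on interpreters older than 3.6.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the prerequisites


def python_ok(version=None):
    """Return True if the given (or running) interpreter meets MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def pip_available():
    """Return True if pip can be imported by this interpreter."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    if python_ok():
        print("Python version is sufficient")
    else:
        print("Python {}.{} or later is required".format(*MIN_PYTHON))
    if pip_available():
        print("pip is available")
    else:
        print("pip was not found for this interpreter")
```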
- Google Cloud Speech-to-Text and Text-to-Speech APIs:
  - Visit the Google Cloud Console.
  - Create a new project or select an existing one.
  - Enable the Speech-to-Text and Text-to-Speech APIs for your project.
  - Go to the "Credentials" page, create a service account, and download a JSON key file for it. GOOGLE_APPLICATION_CREDENTIALS must point to this file.
- DeepL API:
  - Sign up for an account at DeepL.
  - Access the API subscription page and subscribe to a plan that suits your needs.
  - Obtain your DeepL API key from the account overview or API plan details.
- Clone the Repository:
  - Clone this repository to your local machine:
    git clone https://github.com/Op27/Multi-Lingual-Speech-App.git
- Install Dependencies:
  - Navigate to the project directory and install the required Python packages using:
    pip install -r requirements.txt
- Configure API Keys:
  - For security reasons, it's best to set your API keys as environment variables. On your system, set the following variables:
    - For Windows (Command Prompt):
      set GOOGLE_APPLICATION_CREDENTIALS=path_to_your_google_credentials_json_file
      set DEEPL_API_KEY=your_deepl_api_key
    - For Unix/Linux/Mac:
      export GOOGLE_APPLICATION_CREDENTIALS="path_to_your_google_credentials_json_file"
      export DEEPL_API_KEY="your_deepl_api_key"
  - Replace path_to_your_google_credentials_json_file with the path to the JSON file containing your Google Cloud credentials, and your_deepl_api_key with your actual DeepL API key. In the Windows Command Prompt, leave the values unquoted, because set stores any quotation marks as part of the value.
- Run the Application:
  - With the environment variables set and dependencies installed, you can now run the application. Navigate to the app's directory and execute:
    python app.py
- Accessing the Application:
  - Open your web browser and go to http://localhost:5000 (or whichever port your application runs on) to start using the app.
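If the app fails at startup, a missing or malformed environment variable is a likely cause. The following sketch checks the two variables named in the configuration step before you run app.py. The helper name `check_environment` is hypothetical and not part of the project.

```python
import os

# Variable names as given in the configuration step.
REQUIRED_VARS = ("GOOGLE_APPLICATION_CREDENTIALS", "DEEPL_API_KEY")


def check_environment(env=None):
    """Return a list of human-readable problems; an empty list means all is well."""
    if env is None:
        env = os.environ

    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append("{} is not set".format(name))

    # The Google variable must be a path to an existing JSON credentials file.
    # A value wrapped in literal quotes (which cmd.exe's `set VAR="..."`
    # produces) fails this check, because the quotes become part of the path.
    cred_path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "").strip()
    if cred_path and not os.path.isfile(cred_path):
        problems.append(
            "GOOGLE_APPLICATION_CREDENTIALS does not point to a file: " + cred_path
        )
    return problems


if __name__ == "__main__":
    issues = check_environment()
    if issues:
        for issue in issues:
            print("Configuration problem:", issue)
    else:
        print("Environment looks correctly configured")
```

The function accepts an optional mapping so it can be exercised with a plain dictionary instead of the real process environment.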