Speech-to-Text using Gaussian Mixture Model & Hidden Markov Model
Speech-to-Text is a technology that converts voice data into text, which lets computers understand human language through voice commands. This project applies machine learning to Speech-to-Text, combining a Gaussian Mixture Model (GMM) and a Hidden Markov Model (HMM) to recognize speech sounds and transcribe them as text.
The live demo is currently unavailable.
- Description
- General Info
- Technologies Used
- Features
- Screenshots
- Setup
- Usage
- Project Status
- Room for Improvement
- Acknowledgements
- Contact
- Flask==2.2.2
- hmmlearn==0.2.8
- ipython==8.10.0
- librosa==0.9.2
- numpy==1.23.5
- PySoundFile==0.9.0.post1
- python_speech_features==0.6
- scikit_learn==1.2.1
- scipy==1.10.0
- soundfile==0.11.0
- Werkzeug==2.2.2
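The libraries listed above point to a pipeline of feature extraction followed by GMM-HMM scoring. As a rough illustration of the scoring step only, the sketch below implements the forward algorithm with diagonal-covariance GMM emissions in plain NumPy and picks the word model with the highest log-likelihood. It is not this project's code (which presumably relies on hmmlearn); the function names, the two toy word models "yes" and "no", and the random frames standing in for MFCC features are all assumptions made for illustration.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Per-frame log p(x | state) for one diagonal-covariance GMM.

    frames: (T, D); weights: (M,); means and variances: (M, D).
    """
    diff = frames[:, None, :] - means[None, :, :]              # (T, M, D)
    log_gauss = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )                                                          # (T, M)
    # Mix the components: log sum_m w_m * N(x; mu_m, var_m)
    return np.logaddexp.reduce(log_gauss + np.log(weights), axis=1)


def hmm_log_likelihood(frames, model):
    """Forward algorithm in the log domain: log p(frames | model)."""
    # Emission log-probabilities for every frame and state: (T, S)
    log_b = np.stack(
        [gmm_log_likelihood(frames, *gmm) for gmm in model["gmms"]], axis=1
    )
    log_a = np.log(model["transitions"])                       # (S, S)
    alpha = np.log(model["start"]) + log_b[0]                  # (S,)
    for t in range(1, len(frames)):
        # alpha_j(t) = logsum_i [alpha_i(t-1) + log a_ij] + log b_j(x_t)
        alpha = np.logaddexp.reduce(alpha[:, None] + log_a, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)


def make_model(center):
    """Hypothetical 2-state, 2-mixture model located around `center`."""
    gmms = []
    for offset in (0.0, 1.0):
        weights = np.array([0.5, 0.5])
        means = np.array([[center + offset, center],
                          [center + offset, center + 0.5]])
        variances = np.ones((2, 2))
        gmms.append((weights, means, variances))
    return {
        "start": np.array([0.9, 0.1]),
        "transitions": np.array([[0.7, 0.3], [0.1, 0.9]]),
        "gmms": gmms,
    }


# Toy vocabulary: one model per word (values are made up, not trained).
models = {"yes": make_model(0.0), "no": make_model(5.0)}

# Random 2-D frames standing in for the MFCC features of one utterance.
rng = np.random.default_rng(0)
frames = rng.normal(loc=0.5, scale=1.0, size=(20, 2))

scores = {word: hmm_log_likelihood(frames, m) for word, m in models.items()}
recognised = max(scores, key=scores.get)
print(recognised, scores)
```

In a real system each word (or phoneme) model would be trained on labelled recordings, and the frames would come from an MFCC extractor rather than a random generator.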
- Easy to Use
- Able to perform speech recognition and convert to text automatically
- Currently supports only the English language and the .wav file format
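Since only .wav input is supported, it can help to check a file before passing it to recognition. The following is a minimal sketch using Python's standard `wave` module; the helper name `describe_wav` is hypothetical and the project may validate uploads differently.

```python
import io
import wave


def describe_wav(stream):
    """Return (channels, sample_rate, n_frames), or None if not a readable WAV."""
    try:
        with wave.open(stream, "rb") as wav:
            return wav.getnchannels(), wav.getframerate(), wav.getnframes()
    except (wave.Error, EOFError):
        # wave.Error: not a RIFF/WAVE file or unsupported encoding.
        # EOFError: file is too short to contain a header.
        return None


# Build a tiny in-memory WAV (0.1 s of silence, mono, 16 kHz, 16-bit).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 1600)
buffer.seek(0)

info = describe_wav(buffer)
not_wav = describe_wav(io.BytesIO(b"not a wav file at all"))
print(info, not_wav)
```

`describe_wav` accepts either a file path or a binary file-like object, so the same check could be applied to a file on disk or to an uploaded stream.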
The requirements.txt file lists all Python libraries needed for this project.
Install them with:
pip install -r requirements.txt
Run the following in your CMD or terminal:
- Clone this repository:
git clone https://github.com/daffaarizkyy/STT_GMM-HMM
- Change into the project directory (wherever you cloned it), for example:
cd STT_GMM-HMM
- Install the dependencies:
pip install -r requirements.txt
- Start the application:
python app.py
- Open your browser and go to localhost:5000 or http://127.0.0.1:5000/
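If the page does not load, a quick way to tell whether the app is actually listening is to try a TCP connection to the port. This is a small standard-library sketch, not part of the project; the helper name `is_listening` is hypothetical, and port 5000 is taken from the address above.

```python
import socket


def is_listening(host="127.0.0.1", port=5000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


print("app reachable:", is_listening())
```

A result of `False` usually means `python app.py` is not running, or that it is bound to a different port.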
Project is: complete
Room for improvement:
- Speech recognition processing needs to be faster
To do for future development:
- Add support for more languages and file formats
- This project was inspired by YouTube closed captions and subtitled films.
Many thanks to:
- Irvan Kurniawan: HMM modelling. Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia
- Muhammad Daffa Rizky Fatarah: GMM modelling and UI/UX design. Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia
- Osvari Arsalan, S.Kom., M.T: Lecturer and researcher. Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia
- Rizki Kurniati, M.T.: Lecturer and researcher. Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia
Created by @Wibu x Nolep - feel free to contact us!