Speech-to-Text using Gaussian Mixture Model & Hidden Markov Model

Description

Speech-to-Text is a technology that converts voice data into text data, which lets computers understand human language through voice commands. This project applies two machine learning models to Speech-to-Text, the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), to map speech sounds to text.

The live demo is currently unavailable.

General Information

Speech-to-Text is a technology that converts voice data into text data, which lets computers understand human language through voice commands. This project applies two machine learning models to Speech-to-Text, the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM), to map speech sounds to text.
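To make the GMM-HMM idea concrete, here is a minimal pure-Python sketch of one common way to use these models for isolated-word recognition: each word gets its own HMM, every HMM state emits feature frames through a Gaussian mixture, and the recognized word is the one whose model assigns the audio the highest likelihood. This is not the project's code (the project lists hmmlearn for the modelling). The words "yes" and "no", the helper make_word_model, and the one-dimensional frames standing in for real acoustic features are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def _log(p):
    """log(p) that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )

def log_gmm(x, mixture):
    """Log density of a GMM given as a list of (weight, mean, var) tuples."""
    return log_sum_exp(
        [_log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in mixture]
    )

def forward_log_likelihood(frames, start, trans, emissions):
    """Forward algorithm: log P(frames | HMM), with one GMM per hidden state."""
    n = len(start)
    alpha = [_log(start[s]) + log_gmm(frames[0], emissions[s]) for s in range(n)]
    for x in frames[1:]:
        alpha = [
            log_sum_exp([alpha[p] + _log(trans[p][s]) for p in range(n)])
            + log_gmm(x, emissions[s])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

def make_word_model(first, second):
    """Hypothetical 2-state left-to-right model: frames near `first`, then near `second`."""
    start = [1.0, 0.0]
    trans = [[0.5, 0.5], [0.0, 1.0]]
    emissions = [
        [(0.5, [first], [1.0]), (0.5, [first + 0.5], [1.0])],
        [(0.5, [second], [1.0]), (0.5, [second + 0.5], [1.0])],
    ]
    return start, trans, emissions

# Toy vocabulary (an assumption, not the project's trained models).
MODELS = {"yes": make_word_model(0.0, 5.0), "no": make_word_model(5.0, 0.0)}

def recognize(frames):
    """Return the word whose model gives the frames the highest log-likelihood."""
    scores = {w: forward_log_likelihood(frames, *m) for w, m in MODELS.items()}
    return max(scores, key=scores.get)
```

Calling recognize([[0.1], [0.2], [4.9], [5.1]]) scores the frames under both toy models and selects "yes", because the frames start near 0 and end near 5. In a real system the frames would be multi-dimensional acoustic features and the model parameters would be learned from training recordings rather than written by hand.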

Technologies Used

Flask==2.2.2
hmmlearn==0.2.8
ipython==8.10.0
librosa==0.9.2
numpy==1.23.5
PySoundFile==0.9.0.post1
python_speech_features==0.6
scikit_learn==1.2.1
scipy==1.10.0
soundfile==0.11.0
Werkzeug==2.2.2

Features

Easy to Use
Able to perform speech recognition and convert to text automatically

Limitations

Currently supports only English speech and audio files in .wav format
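Because only .wav input is supported, it can help to check a file before submitting it. The following is a minimal sketch using Python's standard wave module. The function name describe_wav is hypothetical and not part of this project, and the wave module only reads uncompressed PCM .wav files, which may be stricter than what the app itself accepts.

```python
import wave

def describe_wav(path):
    """Return basic properties of a .wav file, or raise ValueError if unsupported."""
    if not path.lower().endswith(".wav"):
        raise ValueError(f"unsupported file format: {path!r} (only .wav is supported)")
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            return {
                "channels": wav.getnchannels(),
                "sample_rate": rate,
                # Duration in seconds = number of frames / frames per second.
                "duration_s": frames / rate if rate else 0.0,
            }
    except (wave.Error, EOFError) as exc:
        raise ValueError(f"not a readable PCM .wav file: {path!r}") from exc
```

For example, describe_wav("recording.wav") returns the channel count, sample rate, and duration of a valid file, while describe_wav("recording.mp3") raises ValueError before the file is opened.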

Screenshots

Setup

The requirements.txt file lists all the Python libraries needed for this project. Install them with:

pip install -r requirements.txt
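After installing, you may want to confirm that the installed versions match the pins. The sketch below is one possible way to do that with the standard importlib.metadata module (Python 3.8+). The function names parse_pins and check_pins are hypothetical, and the parser assumes every relevant line has the simple name==version form used in this project's dependency list.

```python
from importlib import metadata

def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Return {name: (wanted, installed_or_None)} for every pin that is not met."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems
```

A typical use is to read requirements.txt, pass its text to parse_pins, and print whatever check_pins returns. An empty result means every pinned package is installed at the pinned version.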

Usage

Run the following in your Command Prompt or terminal:

Clone this repository:

git clone https://github.com/daffaarizkyy/STT_GMM-HMM

Change into the directory where you cloned the project, for example:

cd STT_GMM-HMM

Install the dependencies with pip install -r requirements.txt
Then start the app with python app.py
Open your browser and go to localhost:5000 or http://127.0.0.1:5000/
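If the page does not load, a quick way to tell whether the app is actually listening is to probe the address from Python. This is a minimal sketch using only the standard library. The function name app_is_up is hypothetical, and the default URL simply mirrors the address given above.

```python
import urllib.error
import urllib.request

def app_is_up(url="http://127.0.0.1:5000/", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False if nothing responds."""
    # An empty ProxyHandler bypasses any system proxy, since the target is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timed out, or otherwise unreachable
```

Run app_is_up() in a second terminal while python app.py is running. A False result suggests the app did not start or is bound to a different port.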

Project Status

Project is: complete

Room for Improvement

Room for improvement:

Speech recognition processing needs to be made faster

To do for future development:

Add support for more languages and file formats

Acknowledgements

This project was inspired by YouTube closed captions and by films with subtitles.

Many thanks to:

Irvan Kurniawan: HMM modelling (Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia)
Muhammad Daffa Rizky Fatarah: GMM modelling and UI/UX design (Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia)
Osvari Arsalan, S.Kom., M.T.: Lecturer and Researcher (Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia)
Rizki Kurniati, M.T.: Lecturer and Researcher (Department of Informatics, Faculty of Computer Sciences, Universitas Sriwijaya, Indonesia)

Contact

Created by @Wibu x Nolep - feel free to contact us!
