Sphinx OPS Aphasia Models

This is the main repository for building a Sphinx acoustic model from the Open Speech Corpus (OPS) Aphasia Corpus.

First, run the script download_word_recordings.py, which fetches all the data from OPS.

Then run the script convert_mp4_to_wav.py. This script requires FFmpeg to be installed and on your PATH.
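
As a rough illustration of what this conversion step involves, the sketch below builds an FFmpeg command that turns one MP4 recording into a WAV file. It is not the contents of convert_mp4_to_wav.py: the function names, the example file names, and the 16 kHz, 16-bit mono output format are assumptions (that format is the usual input for Sphinx acoustic model training, but check what your setup expects).

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(source, target, sample_rate=16000):
    """Return the FFmpeg argument list that converts `source` to a mono WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the target if it already exists
        "-i", str(source),        # input recording (e.g. an MP4 file)
        "-vn",                    # ignore any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample to the requested rate (assumed 16 kHz)
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        str(target),
    ]


def convert(source, target):
    """Run FFmpeg on one file; raises if FFmpeg is missing or the call fails."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(source, target), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_ffmpeg_command(Path("word.mp4"), Path("word.wav"))))
```

Keeping the command construction separate from the call that runs it makes it easy to inspect the exact FFmpeg invocation before converting the whole corpus.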

Next, prepare the Sphinx configuration data. The script configure_sphinx.py configures almost all of the files Sphinx requires; to create a custom language model, you also need to run generate_language_model.sh.

Make sure sphinxtrain is installed on your machine.
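
One quick way to confirm that the command-line tools used in this README are available is to look them up on PATH. The snippet below is a minimal sketch; the helper name missing_tools is hypothetical, and the list of commands is simply the set of tools mentioned in these steps.

```python
import shutil


def missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Command names taken from the steps in this README.
    required = ["ffmpeg", "sphinxtrain", "pocketsphinx_continuous"]
    missing = missing_tools(required)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```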

Now run:

sphinxtrain -t ops_aphasia setup

After this, your etc folder will contain the full structure of what you need for your project.

Please check this link for further information.

In etc/sphinx_train.cfg, search for $CFG_HMM_TYPE and select the '.semi.' (semi-continuous) option. If you are on a multicore machine, change $CFG_QUEUE_TYPE to Queue::POSIX and set $CFG_NPART and $DEC_CFG_NPART to the number of cores on your machine.
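
These edits can be made by hand, but the sketch below shows one way to apply them programmatically to the text of the configuration file. It assumes the settings appear as uncommented Perl assignments of the form `$NAME = value;` and that the semi-continuous value is spelled '.semi.' as in the stock sphinxtrain template; the helper names set_perl_variable and apply_training_settings are hypothetical, and the sample text stands in for the real file.

```python
import os
import re


def set_perl_variable(config_text, name, value):
    """Replace the value in uncommented lines of the form `$NAME = ...;`."""
    pattern = re.compile(
        r"^(\s*\$" + re.escape(name) + r"\s*=\s*)[^;\n]*;",
        flags=re.MULTILINE,
    )
    # A function is used as the replacement so `value` is inserted literally.
    new_text, count = pattern.subn(lambda m: m.group(1) + value + ";", config_text)
    if count == 0:
        raise KeyError(name + " is not set in the configuration text")
    return new_text


def apply_training_settings(config_text, cores):
    """Select semi-continuous models and split the work into `cores` parts."""
    text = set_perl_variable(config_text, "CFG_HMM_TYPE", "'.semi.'")
    text = set_perl_variable(text, "CFG_QUEUE_TYPE", '"Queue::POSIX"')
    text = set_perl_variable(text, "CFG_NPART", str(cores))
    text = set_perl_variable(text, "DEC_CFG_NPART", str(cores))
    return text


if __name__ == "__main__":
    # Stand-in for the relevant lines of the configuration file (assumed layout).
    sample = (
        "$CFG_HMM_TYPE = '.cont.';\n"
        '$CFG_QUEUE_TYPE = "Queue";\n'
        "$CFG_NPART = 1;\n"
        "$DEC_CFG_NPART = 1;\n"
    )
    print(apply_training_settings(sample, os.cpu_count() or 1))
```

Commented-out lines (those starting with #) are left untouched, so review the result and keep a backup of the original file before overwriting it.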

Then run the training:

sphinxtrain run

This could take some time.

To check the results, run:

pocketsphinx_continuous -hmm model_parameters/ops_aphasia.ci_semi/ -lm etc/ops_aphasia.lm.DMP -dict etc/ops_aphasia.dic -inmic yes
