An audio video-based multi-modal fusion approach for speech emotion recognition

Status: Submitted for Review

Authors: S M JISHANUL ISLAM, SAHID HOSSAIN MUSTAKIM, MUSFIRAT HOSSAIN, MYSUN MASHIRA, NUR ISLAM SHOURAV, MD. RAYHAN AHMED, SALEKUL ISLAM, A.K.M. MUZAHIDUL ISLAM, AND SWAKKHAR SHATABDA

Requirements

Prerequisite: A CUDA-enabled GPU is preferred. If you run this code on a CPU, reduce the batch size in the hyperparams.py file to match your hardware capacity before running the notebooks. If the installations fail, kindly refer to: conda instructions.

If problems persist in the local environment, kindly follow the instructions for cloud notebooks and run on any cloud platform.
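The batch-size adjustment described in the prerequisite can be sketched as follows. This is only an illustration: the names BATCH_SIZE_GPU, BATCH_SIZE_CPU, cuda_available and pick_batch_size, as well as the values 32 and 4, are assumptions and are not taken from the repository's hyperparams.py.

```python
# Sketch only: these names and values are hypothetical and do not
# reflect the actual contents of hyperparams.py in this repository.
BATCH_SIZE_GPU = 32
BATCH_SIZE_CPU = 4


def cuda_available() -> bool:
    """Return True if PyTorch is installed and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_batch_size(has_cuda: bool) -> int:
    """Use the larger batch on a GPU and the smaller one on a CPU."""
    return BATCH_SIZE_GPU if has_cuda else BATCH_SIZE_CPU


if __name__ == "__main__":
    has_cuda = cuda_available()
    print("device:", "cuda" if has_cuda else "cpu")
    print("batch size:", pick_batch_size(has_cuda))
```

If memory errors still occur on CPU, lowering the smaller value further is the usual next step.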

Step-2: Clone this repository:

git clone https://github.com/S-M-J-I/Multimodal-Emotion-Recognition

If you have SSH configured:

git clone git@github.com:S-M-J-I/Multimodal-Emotion-Recognition.git

Step-3: Install pipenv. Skip this step if it is already installed on your system.

pip3 install --user pipenv

Step-4: Install the modules. Run the following command in the terminal:

pipenv install -r requirements.txt

Run the pipelines

To run the notebooks on SAVEE and RAVDESS, we recommend you download each dataset and unpack it in this directory. Then set the path to the dataset directory in the respective notebook.
Note: while setting the file path, ensure the extra '/' is added to the end. Example: /path_to_dir/
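The trailing-slash requirement in the note above can be enforced in code rather than by hand. The sketch below is a minimal illustration; the helper name with_trailing_separator and the variable DATASET_DIR are hypothetical and do not come from the notebooks.

```python
import os


def with_trailing_separator(path: str) -> str:
    """Return path with a trailing separator, adding one only if missing."""
    # Joining with an empty component appends the separator when needed
    # and leaves a path that already ends in one unchanged.
    return os.path.join(path, "")


if __name__ == "__main__":
    # Hypothetical dataset location; replace with your own directory.
    DATASET_DIR = with_trailing_separator("/path_to_dir")
    print(DATASET_DIR)
```

Passing the result to the notebook's path variable avoids file names being glued onto the directory name when paths are built by string concatenation.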

To run the model on the datasets, navigate to the individual notebooks made for them in the explore directory.

Run the following command in the terminal to start the local server:

pipenv run jupyter notebook

Weights

The model weights are available in the weights directory. Torch Hub support for easier model use is being worked on.
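Until Torch Hub support is available, the files in the weights directory can be located with a short script. This is a sketch under assumptions: the helper find_checkpoints is hypothetical, and the .pt and .pth suffixes are a guess at how the checkpoints are named, not something stated in this README.

```python
from pathlib import Path


def find_checkpoints(weights_dir, suffixes=(".pt", ".pth")):
    """Return checkpoint files under weights_dir, sorted by path."""
    root = Path(weights_dir)
    if not root.is_dir():
        return []
    # The suffix list is an assumption about the checkpoint file names.
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes
    )


if __name__ == "__main__":
    for ckpt in find_checkpoints("weights"):
        print(ckpt)
        # With PyTorch installed, a checkpoint could then be read on CPU:
        #   state = torch.load(ckpt, map_location="cpu")
```

Whether a loaded file holds a full model or only a state dict depends on how the authors saved it, so inspect the loaded object before use.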

For any assistance or issues, kindly open an Issue in this repository.

Contributions

This repository is not accepting any contributors OUTSIDE the author list mentioned. For any issues related to the code, we request you to open an Issue.
