Mateusz-Dera/whisperspeech-webui

WhisperSpeech web UI

Web UI for WhisperSpeech (https://github.com/collabora/WhisperSpeech)

Info

Note: Version 2.x now allows voice generation via API.

Test platforms:

Name           Info
CPU            AMD Ryzen 7900X3D (iGPU disabled in BIOS)
GPU            AMD Radeon 7900XTX
RAM            64GB DDR5 6600MHz
Motherboard    ASRock B650E PG Riptide WiFi (BIOS 2.10)
OS             Ubuntu 24.04
Kernel         6.8.0-39-generic
ROCm           6.1.3

Name           Info
CPU            Intel Core i5-12500H
GPU            NVIDIA GeForce RTX 4050
RAM            16GB DDR4 3200MHz
Motherboard    GIGABYTE G5 MF (BIOS FB10)
OS             Ubuntu 24.04
Kernel         6.8.0-36-generic
NVIDIA Driver  535.183.01

Installation:

  1. Install Python 3.12

  2. Clone repository

  3. Enter the repository directory.

  4. Create and activate venv

  5. For ROCm set HSA_OVERRIDE_GFX_VERSION.

  • For the Radeon 7900XTX:
export HSA_OVERRIDE_GFX_VERSION=11.0.0
  6. Install ffmpeg:
sudo apt install ffmpeg
  7. Install requirements:
  • CPU (not recommended):
pip install -r requirements.txt
  • ROCm 5.7:
pip install -r requirements_rocm_5.7.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
  • ROCm 6.0:
pip install -r requirements_rocm_6.0.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
  • ROCm 6.1:
pip install -r requirements_rocm_6.1.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
  • CUDA 11.8:
pip install -r requirements_cuda_11.8.txt
  • CUDA 12.1:
pip install -r requirements_cuda_12.1.txt
  8. Run:
python webui.py
  • Use -h or --help to list the available options:
python webui.py -h
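
The per-backend choices in the install step can be summarized as a lookup table. The sketch below is not part of the repository. It only builds and prints the pip commands for a chosen backend, so nothing is installed when it runs. The backend labels ("cpu", "rocm6.1", ...) and the helper names install_commands and rocm_env are hypothetical. The file names follow the list above, with the spelling "requirements" assumed throughout, so check them against the files actually present in the repository. "python -m pip install" is used in place of "pip install"; inside an activated venv the two are equivalent.

```python
import os
import sys

# flash-attention commit pinned in the ROCm install steps above.
FLASH_ATTENTION = (
    "git+https://github.com/ROCmSoftwarePlatform/flash-attention.git"
    "@2554f490101742ccdc56620a938f847f61754be6"
)

# Hypothetical backend labels mapped to the requirements files listed above.
# Assumption: every file name is spelled "requirements...".
REQUIREMENTS = {
    "cpu": "requirements.txt",
    "rocm5.7": "requirements_rocm_5.7.txt",
    "rocm6.0": "requirements_rocm_6.0.txt",
    "rocm6.1": "requirements_rocm_6.1.txt",
    "cuda11.8": "requirements_cuda_11.8.txt",
    "cuda12.1": "requirements_cuda_12.1.txt",
}


def install_commands(backend, python=sys.executable):
    """Return the pip commands for one backend as argument lists."""
    if backend not in REQUIREMENTS:
        raise ValueError(
            f"unknown backend {backend!r}; choose from {sorted(REQUIREMENTS)}"
        )
    commands = [[python, "-m", "pip", "install", "-r", REQUIREMENTS[backend]]]
    # Only the ROCm variants add the pinned flash-attention build.
    if backend.startswith("rocm"):
        commands.append([python, "-m", "pip", "install", FLASH_ATTENTION])
    return commands


def rocm_env(gfx_version="11.0.0", base=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    11.0.0 is the value given above for the Radeon 7900XTX; other GPUs
    may need a different value.
    """
    env = dict(os.environ if base is None else base)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


# Dry run: print the commands for ROCm 6.1 without executing them.
for cmd in install_commands("rocm6.1", python="python"):
    print(" ".join(cmd))
```

To actually run the commands, each list could be passed to subprocess.run(cmd, check=True), with env=rocm_env() added when launching webui.py on a ROCm system.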

GUI available languages:

  • English
  • Polish