Web UI for WhisperSpeech (https://github.com/collabora/WhisperSpeech)
Note: Version 2.x now allows voice generation via API.
Name | Info |
---|---|
CPU | AMD Ryzen 7900X3D (iGPU disabled in BIOS) |
GPU | AMD Radeon 7900XTX |
RAM | 64GB DDR5 6600MHz |
Motherboard | ASRock B650E PG Riptide WiFi (BIOS 2.10) |
OS | Ubuntu 24.04 |
Kernel | 6.8.0-39-generic |
ROCm | 6.1.3 |

Name | Info |
---|---|
CPU | Intel Core i5-12500H |
GPU | NVIDIA GeForce RTX 4050 |
RAM | 16GB DDR4 3200MHz |
Motherboard | GIGABYTE G5 MF (BIOS FB10) |
OS | Ubuntu 24.04 |
Kernel | 6.8.0-36-generic |
NVIDIA Driver | 535.183.01 |
- Install Python 3.12
- Clone repository
- Change into the repository directory.
- Create and activate venv
- For ROCm, set HSA_OVERRIDE_GFX_VERSION.
- For the Radeon 7900XTX:
export HSA_OVERRIDE_GFX_VERSION=11.0.0
- Install ffmpeg:
sudo apt install ffmpeg
- Install requirements
- CPU (not recommended):
pip install -r requirements.txt
- ROCm 5.7:
pip install -r requirements_rocm_5.7.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
- ROCm 6.0:
pip install -r requirements_rocm_6.0.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
- ROCm 6.1:
pip install -r requirements_rocm_6.1.txt
pip install git+https://github.com/ROCmSoftwarePlatform/flash-attention.git@2554f490101742ccdc56620a938f847f61754be6
- CUDA 11.8:
pip install -r requirements_cuda_11.8.txt
- CUDA 12.1:
pip install -r requirements_cuda_12.1.txt
- Run:
python webui.py
- Show the available options with -h or --help:
python webui.py -h
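Before running the web UI, it can help to confirm that the prerequisites from the installation steps above are in place. The following is a minimal sketch, not part of this repository: the `preflight` function and its parameters are hypothetical names introduced for illustration. It only checks the Python version, the presence of ffmpeg on PATH, and (optionally, for ROCm) the HSA_OVERRIDE_GFX_VERSION variable.

```python
import os
import shutil
import sys


def preflight(rocm=False, env=None, which=shutil.which, version=None):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper, not shipped with the web UI. The env, which and
    version parameters exist only so the checks can be exercised in isolation.
    """
    env = os.environ if env is None else env
    version = tuple(sys.version_info[:2]) if version is None else version
    problems = []

    # The steps above install Python 3.12; other versions are untested here.
    if version != (3, 12):
        problems.append(f"Python 3.12 expected, found {version[0]}.{version[1]}")

    # ffmpeg is installed with apt in the steps above and must be on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Only checked when rocm=True; CPU and CUDA setups do not need it.
    if rocm and not env.get("HSA_OVERRIDE_GFX_VERSION"):
        problems.append("HSA_OVERRIDE_GFX_VERSION is not set")

    return problems


if __name__ == "__main__":
    # Pass rocm=True on an AMD GPU setup to include the ROCm check.
    for problem in preflight():
        print("WARNING:", problem)
```

Saved under any file name inside the activated venv, the script prints one warning per missing prerequisite and nothing when all checks pass.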