VOSK ASR for Unity

[中文|English]

This project implements ASR (Automatic Speech Recognition) inside Unity using the Vosk libraries. It builds on and improves the sample linked on the official website.

What is VOSK

Vosk is a speech recognition toolkit. Some advantages of Vosk include:

- Supports 20+ languages and dialects
- Works offline, even on lightweight devices
- Models are about 50 MB for mobile and about 2 GB for server use
- Allows manual configuration of keywords for better accuracy

Getting Started

- Download a language model from the Official List
- Extract the model into the project's Application.streamingAssetsPath folder (Assets/StreamingAssets in the Editor)
  - Larger models are more accurate but take longer to load
- (Optional) Install the Newtonsoft Json Unity Package
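
A quick way to confirm the model was extracted to the right place is to check for its folder at startup. The sketch below is only an illustration: the class name ModelFolderCheck and the default folder name are placeholders, not part of this project. It relies on Directory.Exists, which works in the Editor and on desktop builds; on Android, StreamingAssets is packed inside the APK, so this check would not be meaningful there.

```csharp
using System.IO;
using UnityEngine;

// Hypothetical helper, not part of this project: logs whether the model folder exists.
public class ModelFolderCheck : MonoBehaviour
{
    // Placeholder folder name; replace it with the folder you extracted.
    [SerializeField] private string modelName = "vosk-model-small-en-us-0.15";

    private void Awake()
    {
        // In the Editor this resolves to <project>/Assets/StreamingAssets/<modelName>.
        string modelPath = Path.Combine(Application.streamingAssetsPath, modelName);

        if (Directory.Exists(modelPath))
        {
            Debug.Log($"Found model folder: {modelPath}");
        }
        else
        {
            Debug.LogWarning($"Model folder not found: {modelPath}");
        }
    }
}
```

Attach the component to any GameObject in the scene and read the Console after entering Play mode.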

How to Use

- In the script that uses the ASR, add using Vosk.APIs;
- Call VoskASR.Init with the necessary arguments
- Subscribe to VoskASR.OnTranscriptionResult to process the results
  - Results are delivered as JSON, which is why Newtonsoft.Json is recommended for parsing them
- Refer to Demo.cs for examples
- Use LoudnessMeter to visualize the input volume
- Use ChineseUtil to convert between Simplified Chinese and Traditional Chinese
  - Chinese models are often trained on Simplified Chinese
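
Because the results arrive as JSON, a small parsing helper keeps the handling code tidy. The following is a minimal sketch, assuming the payload follows the shapes Vosk itself normally emits: {"text": "..."} for a single result, and {"alternatives": [{"confidence": ..., "text": "..."}]} when alternatives are requested. TranscriptionParser and ExtractTexts are hypothetical names, not part of this project; check the strings your build actually produces (Demo.cs is the reference) before relying on it.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json.Linq;

// Hypothetical helper, not part of this project.
public static class TranscriptionParser
{
    // Returns every non-empty transcription found in a Vosk-style JSON result.
    public static List<string> ExtractTexts(string json)
    {
        var texts = new List<string>();
        JObject root = JObject.Parse(json);

        // Assumed shape when alternatives are requested:
        // {"alternatives":[{"confidence":1.0,"text":"..."}]}
        if (root["alternatives"] is JArray alternatives)
        {
            foreach (JToken alternative in alternatives)
            {
                AddIfNotEmpty(texts, (string)alternative["text"]);
            }
        }
        // Assumed shape otherwise: {"text":"..."}
        else
        {
            AddIfNotEmpty(texts, (string)root["text"]);
        }

        return texts;
    }

    private static void AddIfNotEmpty(List<string> texts, string text)
    {
        if (!string.IsNullOrWhiteSpace(text))
        {
            texts.Add(text.Trim());
        }
    }
}
```

Returning a list for both shapes means the calling code does not need to branch on whether alternatives were requested; the first entry is the top candidate either way.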

Parameters

- caller: The MonoBehaviour on which Unity calls StartCoroutine
- modelName: The folder name of the model
- autoStart: Whether the ASR should start immediately after Init
- maxAlternatives: The number of alternative results the ASR generates
- microphoneIndex: The index of the microphone to use
- keyPhrases: Keywords or phrases configured manually for detection
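
The parameters above might be passed as in the sketch below. It uses named arguments so the call does not depend on parameter order, but several details are assumptions rather than the confirmed API: the argument types (in particular keyPhrases as a List<string>), the event handing each result to the handler as a JSON string, and the example model folder name. AsrBootstrap is a hypothetical class name. Demo.cs remains the authoritative reference.

```csharp
using System.Collections.Generic;
using UnityEngine;
using Vosk.APIs;

// Hypothetical example component, not part of this project.
public class AsrBootstrap : MonoBehaviour
{
    private void OnEnable()
    {
        // Assumption: the event passes each result as a JSON string.
        VoskASR.OnTranscriptionResult += HandleResult;
    }

    private void OnDisable()
    {
        // Unsubscribe so a destroyed object is never called back.
        VoskASR.OnTranscriptionResult -= HandleResult;
    }

    private void Start()
    {
        // Named arguments mirror the documented parameter names; types are assumed.
        VoskASR.Init(
            caller: this,                                       // runs the coroutines
            modelName: "vosk-model-small-en-us-0.15",           // example folder name
            autoStart: true,                                    // start right after Init
            maxAlternatives: 3,                                 // request up to 3 variants
            microphoneIndex: 0,                                 // first microphone
            keyPhrases: new List<string> { "start", "stop" });  // assumed List<string>
    }

    private void HandleResult(string json)
    {
        Debug.Log($"Vosk result: {json}");
    }
}
```

If the real signature differs, the compiler will flag the mismatched argument, which makes this a convenient starting point to adjust against Demo.cs.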
