Voice Assistant

Piotr Skowronek edited this page Jun 14, 2020 · 7 revisions

Concept

The idea was to use open-source voice recognition software that can run offline, and PocketSphinx looked very promising. The Raspberry Pi Zero CPU is not very powerful and, AFAIK, has only one core, so I decided to use a very simple acoustic model (HMM) and to generate a custom language model (LM) and dictionary with the online lmtool, covering only a small number of words.

Thanks to all that, voice recognition and PulseAudio use less than 20% of the RPi CPU (and 10% of its 512 MB of RAM). The recognition isn't perfect:

  • there are lots of false positives for the wake word
  • I'm physically incapable of saying 'forecast' in a way it could understand
    • but this may be due to the miserable quality of my USB sound card

An installation script is provided to automate the setup: it downloads the acoustic model and builds the LM and dictionary files.

Adjusting assistant

You can adjust the words and commands by editing the following files:

  • config.yaml
    • this is the main configuration of the assistant. It contains the paths to the acoustic model, dictionary, and LM, plus the definitions of the sentences and the commands they should execute. Multiple invocation words/sentences can be defined, and each can answer in a different voice (male, female, etc., thanks to the espeak command)
  • invocation.list - the list of invocation words/sentences *)
  • keyphrase.list - the list of sentences (different variants) *)

*) Both files are concatenated and sent to lmtool to generate the LM and dictionary. Make sure to write those sentences in UPPER case (don't ask me why, I dunno).
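
The concatenation step can be sketched in a few lines of Python. This is only an illustration, not the installation script's actual code: the function name build_corpus and the output file name corpus.txt are hypothetical, and skipping blank lines and forcing upper case are my own additions to guard against the issue mentioned above.

```python
from pathlib import Path


def build_corpus(list_paths, corpus_path):
    """Concatenate phrase lists into one upper-case corpus file.

    Hypothetical helper: blank lines are skipped, repeated whitespace is
    collapsed, and every phrase is upper-cased before it is written out.
    """
    phrases = []
    for path in list_paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            phrase = " ".join(line.split()).upper()
            if phrase:
                phrases.append(phrase)
    Path(corpus_path).write_text("\n".join(phrases) + "\n", encoding="utf-8")
    return phrases
```

Calling build_corpus(["invocation.list", "keyphrase.list"], "corpus.txt") would produce a single file that can then be submitted to lmtool.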

Writing scripts for commands

Take a look in this folder, pick one of the cmd_* scripts as an example, and write your own. Mind that commands are executed in blocking mode: the assistant waits until they finish, which lets the script decide when to hand control back to the assistant. Use & to send work into the background if you need to and know what you're doing.
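
To make the difference between blocking and background execution concrete, here is a minimal Python sketch. It does not reproduce how the assistant actually launches commands; run_blocking and run_in_background are hypothetical names used only to illustrate the two behaviours.

```python
import subprocess
import sys


def run_blocking(argv):
    """Run a command and wait until it exits; return its exit code."""
    return subprocess.run(argv).returncode


def run_in_background(argv):
    """Start a command and return at once, much like a trailing & in a shell."""
    return subprocess.Popen(argv)


# A stand-in for a slow command script (sleeps for one second).
SLOW_COMMAND = [sys.executable, "-c", "import time; time.sleep(1)"]
```

With run_blocking(SLOW_COMMAND) the caller is stuck for the full second, which is what the assistant does while a command script runs. With run_in_background(SLOW_COMMAND) the caller continues immediately and nothing waits for the command, so it is up to the script to avoid talking over the assistant.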

Hardware

You need a USB sound card that has a microphone input and an audio output. You can also try a tiny USB microphone with a separate USB audio output (this can be a good idea btw, see below). The problem is that the cheapest USB sound cards introduce a lot of noise, especially when the mic is on, and even upgrading them with capacitors won't help much (at least it didn't work for me: the noise is audible in the speaker and on recordings whenever the mic is on).

For output I used a tiny 0.1 W, 8 ohm speaker (MG15). Since the stereo output from the sound card is too weak to drive it, you need an amp module (like this one). Power it with 5 V from either USB or the RPi, ideally from a regulated and filtered supply.

Once it is connected to the RPi, open the sound preferences UI and adjust the volume and mic gain. The settings should persist across reboots.
