[micro_wake_word] Version 2 #7032

Merged on Jul 11, 2024 (24 commits)

@kahrendt kahrendt commented Jul 2, 2024

What does this implement/fix?

Adds support for microWakeWord version 2 models.

  • Adds ops to support more efficient and accurate 'MixedNet' models that can run on an ESP32
  • Adds support for running multiple models at once
  • Adds support for running a voice activity detection (VAD) model to reduce false activations from non-speech noises
  • Moves spectrogram feature generation to C code, using ESPMicroSpeechFeatures, for a large speed improvement
  • Loads and unloads models as mWW is started and stopped to reduce memory usage
  • Updates the model manifest to version 2 to support setting the spectrogram feature step size and the amount of memory to allocate for the tensor arena
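
As a rough illustration of how several wake word models and a VAD model could be combined, here is a minimal C++ sketch. It is not the component's code: `ModelState`, `detect_wake_word`, and the simple "probability above cutoff" rule are assumptions made for this example, and the actual detection logic in the component may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical per-model state (illustrative only, not the component's types).
struct ModelState {
  std::string wake_word;
  float probability;  // most recent model output, assumed to lie in [0, 1]
  float cutoff;       // detection threshold for this model
};

// Returns the wake word of the first model whose probability exceeds its
// cutoff. Returns an empty string when no model fired, or when the VAD
// probability does not exceed the VAD cutoff (no speech assumed present).
inline std::string detect_wake_word(const std::vector<ModelState> &models,
                                    float vad_probability, float vad_cutoff) {
  if (vad_probability <= vad_cutoff) {
    return "";  // VAD gate closed: ignore wake word model outputs
  }
  for (const auto &model : models) {
    if (model.probability > model.cutoff) {
      return model.wake_word;
    }
  }
  return "";
}
```

In this sketch the VAD acts purely as a gate: a high wake word probability is ignored whenever the VAD output is at or below its cutoff, which is one way non-speech noises could be kept from triggering an activation.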

The old microWakeWord models still work with this version of the component, but they cannot run simultaneously with the new models. This is a breaking change because the YAML configuration has changed to support listing multiple models.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Other

Related issue or feature (if applicable): not applicable

Pull request in esphome-docs with documentation (if applicable): esphome/esphome-docs#4015

Test Environment

  • ESP32
  • ESP32 IDF
  • ESP8266
  • RP2040
  • BK72xx
  • RTL87xx

Example entry for config.yaml:

Ensure you are running the ESPHome dev branch!

# Example config.yaml
external_components:
  - source:
      type: git
      url: https://github.com/kahrendt/esphome
      ref: mww-v2-external-library
    refresh: 0s
    components: [ micro_wake_word ]

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
  vad:
    model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/vad.json
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/okay_nabu.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/alexa.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_jarvis.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_mycroft.json

Checklist:

  • The code change is tested and works locally.
  • Tests have been added to verify that the new code works (under tests/ folder).

If user exposed functionality or configuration variables are added/changed:

codecov-commenter commented Jul 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 53.78%. Comparing base (4d8b5ed) to head (fde855b).
Report is 925 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff             @@
##              dev    #7032      +/-   ##
==========================================
+ Coverage   53.70%   53.78%   +0.07%     
==========================================
  Files          50       50              
  Lines        9408     9660     +252     
  Branches     1654     1704      +50     
==========================================
+ Hits         5053     5196     +143     
- Misses       4056     4140      +84     
- Partials      299      324      +25     


@jesserockz jesserockz left a comment

Here is my initial review. I have not looked at everything yet.

We have the tflite component loaded now in CI, so #ifndef CLANG_TIDY can be removed from the source files.

Review threads (all outdated and resolved):

  • esphome/components/micro_wake_word/__init__.py (4 threads)
  • esphome/components/micro_wake_word/micro_wake_word.cpp (1 thread)
  • esphome/components/micro_wake_word/streaming_model.cpp (2 threads)
  • esphome/components/micro_wake_word/streaming_model.h (1 thread)

@probot-esphome probot-esphome bot added the core label Jul 3, 2024
@kahrendt kahrendt marked this pull request as ready for review July 3, 2024 17:03
@kahrendt kahrendt requested a review from a team as a code owner July 3, 2024 17:03
probot-esphome bot commented Jul 3, 2024

Hey there @jesserockz, mind taking a look at this pull request as it has been labeled with an integration (micro_wake_word) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)

kahrendt commented Jul 3, 2024

> Here is my initial review. I have not looked at everything yet.
>
> We have the tflite component loaded now in CI, so #ifndef CLANG_TIDY can be removed from the source files.

Thanks for the initial review! I made those changes and fixed the things clang tidy was complaining about:

  • add the final specifier to the WakeWordModel and VADModel classes (I'm not 100% sure this is the best way to handle the virtual destructor issue)
  • use unique_ptr for the VAD model and the TFLite interpreter
  • use constexpr in the generate_features_for_window_ function
  • use emplace_back instead of push_back when adding WakeWordModels
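
The four changes listed above can be sketched roughly as follows. This is an illustrative example only, not the merged code: the `StreamingModel` base class, the `Detector` struct, the member names, and the value of `FEATURES_PER_WINDOW` are assumptions made for the sketch; only the class names `WakeWordModel` and `VADModel` come from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class. The destructor is protected and non-virtual, so a
// model can never be deleted through a base-class pointer.
class StreamingModel {
 public:
  virtual bool determine_detected() const = 0;

 protected:
  ~StreamingModel() = default;
};

// `final` guarantees nothing derives from this class, so destroying an object
// through its own type always runs the complete destructor.
class WakeWordModel final : public StreamingModel {
 public:
  WakeWordModel(std::string wake_word, float cutoff)
      : wake_word_(std::move(wake_word)), cutoff_(cutoff) {}
  bool determine_detected() const override { return probability_ > cutoff_; }
  void set_probability(float probability) { probability_ = probability; }
  const std::string &wake_word() const { return wake_word_; }

 private:
  std::string wake_word_;
  float cutoff_;
  float probability_{0.0f};
};

class VADModel final : public StreamingModel {
 public:
  explicit VADModel(float cutoff) : cutoff_(cutoff) {}
  bool determine_detected() const override { return probability_ > cutoff_; }
  void set_probability(float probability) { probability_ = probability; }

 private:
  float cutoff_;
  float probability_{0.0f};
};

// constexpr makes this a compile-time constant; the value is illustrative.
constexpr std::size_t FEATURES_PER_WINDOW = 40;
static_assert(FEATURES_PER_WINDOW > 0, "feature count must be positive");

// Hypothetical owner of the models.
struct Detector {
  std::vector<WakeWordModel> wake_word_models;
  std::unique_ptr<VADModel> vad_model;  // owned; released automatically

  void add_wake_word_model(const std::string &wake_word, float cutoff) {
    // emplace_back constructs the model in place from the arguments instead
    // of building a temporary and then copying or moving it into the vector.
    wake_word_models.emplace_back(wake_word, cutoff);
  }
  void add_vad_model(float cutoff) { vad_model = std::make_unique<VADModel>(cutoff); }
};
```

Whether `final` is the best answer to the virtual destructor warning remains the open question raised above; the alternative would be a public virtual destructor on the base class, at the cost of allowing deletion through base pointers.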

@kahrendt kahrendt marked this pull request as draft July 5, 2024 16:04
@kahrendt kahrendt marked this pull request as ready for review July 10, 2024 19:17
@jesserockz jesserockz left a comment

esphome bot commented Jul 10, 2024

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@esphome esphome bot marked this pull request as draft July 10, 2024 22:59
@kbx81 kbx81 marked this pull request as ready for review July 11, 2024 00:51
@esphome esphome bot requested a review from jesserockz July 11, 2024 00:51
@jesserockz jesserockz merged commit 2873c6b into esphome:dev Jul 11, 2024
34 checks passed
@jesserockz jesserockz mentioned this pull request Jul 11, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 14, 2024