WIP: Add support for multiple wakeword/vad models #6653

kahrendt · 2024-04-28T20:38:52Z

What does this implement/fix?

This is still a work in progress! I'd appreciate people testing this out and letting me know about any issues. It will be a breaking change, because the YAML syntax changes to handle multiple wake word models.

This PR adds several features/performance improvements:

- Multiple wake word models can run simultaneously
  - At most two of the current models can run simultaneously without issue
- Adds support for running a Voice Activity Detection (VAD) model to potentially reduce certain false accepts
  - The current VAD model can only run alongside one wake word model. If you try to run VAD and two wake word models all at once, accuracy will suffer greatly
- Several memory improvements
  - Models are loaded and unloaded as mWW starts and stops, which saves memory when it is not actively running
  - All buffers (excluding the ring buffer) are freed when not actively running
  - Ring buffer size is reduced (if it filled up before, there was no chance of ever recovering, so 0.5 s of audio was dropped each time)
  - Makes the tensor arena's default allocated memory smaller. The exact space allocated can be set in the codegen stage.
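The start/stop memory behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not the component's C++ code: `WakeWordRunner`, its attributes, and the buffer sizes are all hypothetical names and values chosen for the example.

```python
class WakeWordRunner:
    """Hypothetical stand-in for the component; all names are illustrative."""

    def __init__(self, model_names, ring_buffer_size):
        self.model_names = list(model_names)
        # The ring buffer is the one allocation that persists across start/stop.
        self.ring_buffer = bytearray(ring_buffer_size)
        self.models = None
        self.scratch_buffers = None

    def start(self):
        # Load models and allocate working buffers only when detection begins.
        # object() is a placeholder for a loaded model.
        self.models = {name: object() for name in self.model_names}
        # 1024 is an arbitrary size picked for the sketch.
        self.scratch_buffers = {"features": bytearray(1024)}

    def stop(self):
        # Release everything except the ring buffer.
        self.models = None
        self.scratch_buffers = None

    @property
    def running(self):
        return self.models is not None
```

The point of the sketch is simply that the expensive allocations are tied to the running state rather than to the component's lifetime.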

Todo:

- Update the manifest format
  - Add a field for the necessary tensor arena size (currently the values are hardcoded in `__init__.py`)
  - Add support for a VAD helper model along with its specific parameters
- Allow users to simply enable VAD instead of having to point to a specific manifest file (this also requires uploading the VAD model to the appropriate default repository)
- Possibly handle the VAD code better? I have added a new preprocessor directive so that the relevant code is compiled only when VAD is enabled, but I'm not sure if this is the best way to handle it. Warning: enabling or disabling a VAD model requires a full recompile when rebuilding, so if you have a slow computer, this may take a while!
- Update the documentation (see the example YAML at the end of this PR in the meantime)
- Fix any bugs people encounter in testing

Types of changes

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Other

Related issue or feature (if applicable): not applicable

Pull request in esphome-docs with documentation (if applicable): unfinished

Test Environment

Example entry for `config.yaml`:

# Example config.yaml for multiple models

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start: 
        wake_word: !lambda return wake_word; 
  models:
    - model: okay_nabu
      sliding_window_average_size: 5
    - model: hey_jarvis
      probability_cutoff: 0.75
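
As a rough illustration of what `sliding_window_average_size` and `probability_cutoff` control, here is a minimal Python sketch. It assumes a detection fires when the mean of the most recent per-frame probabilities exceeds the cutoff, and that the window must be full first; both are assumptions of this sketch, and `SlidingWindowDetector` is a hypothetical name, not part of the component.

```python
from collections import deque


class SlidingWindowDetector:
    """Hypothetical per-model detector; not the component's actual code."""

    def __init__(self, sliding_window_average_size, probability_cutoff):
        # Keep only the most recent N probabilities.
        self.window = deque(maxlen=sliding_window_average_size)
        self.probability_cutoff = probability_cutoff

    def update(self, probability):
        """Add one model output and report whether the average exceeds the cutoff."""
        self.window.append(probability)
        # Assumption: do not report a detection until the window is full.
        if len(self.window) < self.window.maxlen:
            return False
        return sum(self.window) / len(self.window) > self.probability_cutoff
```

With multiple models, each entry under `models` would get its own detector instance with its own window size and cutoff, so a larger window smooths out brief spikes while a higher cutoff demands more confidence.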

# Example config.yaml for VAD

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start: 
        wake_word: !lambda return wake_word; 
  vad_model: 
    model: https://github.com/kahrendt/microWakeWord/releases/download/model/vad_model.json
    sliding_window_average_size: 2
    threshold:
      upper: 0.95
      lower: 0.5
  models:
    - model: alexa
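
The PR does not spell out how the `upper` and `lower` thresholds are applied. One plausible reading is hysteresis: voice is considered active once the VAD probability rises to `upper`, and stays active until it falls to `lower`. The Python sketch below shows that reading only; `VadGate` is a hypothetical name and the behaviour is an assumption, not the component's confirmed logic.

```python
class VadGate:
    """Hypothetical hysteresis gate over VAD probabilities (assumed behaviour)."""

    def __init__(self, upper, lower):
        if lower > upper:
            raise ValueError("lower threshold must not exceed upper threshold")
        self.upper = upper
        self.lower = lower
        self.voice_active = False

    def update(self, probability):
        """Update the gate with one VAD probability and return the current state."""
        if probability >= self.upper:
            self.voice_active = True
        elif probability <= self.lower:
            self.voice_active = False
        # Between the two thresholds the previous state is kept.
        return self.voice_active
```

Under this reading, a wake word detection would only be accepted while the gate reports voice activity, which is how a VAD model could filter out some false accepts.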

Checklist:

The code change is tested and works locally.
Tests have been added to verify that the new code works (under tests/ folder).

If user exposed functionality or configuration variables are added/changed:

Documentation added/updated in esphome-docs.

codecov-commenter · 2024-04-28T20:39:53Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.05%. Comparing base (4d8b5ed) to head (a4886ac).
Report is 494 commits behind head on dev.

Additional details and impacted files

@@            Coverage Diff             @@
##              dev    #6653      +/-   ##
==========================================
+ Coverage   53.70%   54.05%   +0.34%     
==========================================
  Files          50       50              
  Lines        9408     9554     +146     
  Branches     1654     1687      +33     
==========================================
+ Hits         5053     5164     +111     
- Misses       4056     4066      +10     
- Partials      299      324      +25


add support for multiple wakeword/vad models

e83a587

probot-esphome bot added has-tests integration: micro_wake_word labels Apr 28, 2024

kahrendt mentioned this pull request Apr 28, 2024

Multiple wake words possible? kahrendt/microWakeWord#19

Closed

kahrendt added 3 commits April 29, 2024 11:41

move probability_cutoff_ to WakeWordModel

d1d4ab1

verify component is setup before starting detection

a4886ac

add ops to support mixednet architecture

7a0d86e
