@EverythingSmartHome
Last active April 22, 2024 19:17
ESP32 & ESPHome Voice Assistant
esphome:
  name: esp32-mic-speaker
  friendly_name: esp32-mic-speaker
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:

esp32:
  board: esp32dev
  framework:
    type: esp-idf
    version: recommended

# Enable logging
logger:

# Enable Home Assistant API
api:

ota:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Mic-Speaker"
    password: "9vYvAFzzPjuc"

i2s_audio:
  i2s_lrclk_pin: GPIO27
  i2s_bclk_pin: GPIO26

microphone:
  - platform: i2s_audio
    id: mic
    adc_type: external
    i2s_din_pin: GPIO13
    pdm: false

speaker:
  - platform: i2s_audio
    id: big_speaker
    dac_type: external
    i2s_dout_pin: GPIO25
    mode: mono

voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  speaker: big_speaker
  id: assist

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);

@BlackCube4

I am trying to get this to work with an ESP32 C3 Super Mini. Has anyone tried that before?

@Xornop

Xornop commented Feb 1, 2024

Sometimes when I boot the ESP, it suddenly prints a whole bunch of log lines within the same second, and the cycle repeats over and over. The code without the light works fine.
[14:28:18][E][voice_assistant:646]: Error: no_wake_word - No wake word detected
[14:28:18][D][voice_assistant:512]: Signaling stop...
[14:28:18][D][voice_assistant:412]: State changed from STARTING_PIPELINE to STOP_MICROPHONE
[14:28:18][D][voice_assistant:418]: Desired state set to IDLE
[14:28:18][D][voice_assistant:412]: State changed from STOP_MICROPHONE to IDLE
[14:28:18][D][voice_assistant:519]: Event Type: 2
[14:28:18][D][voice_assistant:609]: Assist Pipeline ended
[14:28:18][D][voice_assistant:412]: State changed from IDLE to START_PIPELINE
[14:28:18][D][voice_assistant:418]: Desired state set to START_MICROPHONE
[14:28:18][D][voice_assistant:512]: Signaling stop...
[14:28:18][D][voice_assistant:118]: microphone not running
[14:28:18][D][voice_assistant:200]: Requesting start...
[14:28:18][D][voice_assistant:412]: State changed from START_PIPELINE to STARTING_PIPELINE
[14:28:18][D][voice_assistant:519]: Event Type: 1
[14:28:18][D][voice_assistant:522]: Assist Pipeline running
[14:28:18][D][voice_assistant:118]: microphone not running
[14:28:18][D][voice_assistant:519]: Event Type: 9
[14:28:18][D][voice_assistant:118]: microphone not running
[14:28:18][D][voice_assistant:519]: Event Type: 0
This repeats over and over within the same second. Sometimes restarting fixes it, and sometimes it starts working on its own. How can I fix this?
I used the original code.
Sometimes it won't even start running the code after an unplug/replug. The speaker then sometimes starts crackling loudly, sometimes not. Very buggy.

@Xornop

Xornop commented Feb 1, 2024

@BlackCube4 I read somewhere that won't work. Did you make it work anyway?

@joshfedo

joshfedo commented Feb 2, 2024

@Xornop That logging loop is what you would expect to see. When a wake word is detected the logs will start showing the intent flow.

Unfortunately there are issues with ESPHome and I2S/voice assist. After getting everything working on the hardware side, I'm able to trigger one or two commands before the Assist pipeline just stops running.

[10:16:42][D][voice_assistant:519]: Event Type: 7
[10:16:42][D][voice_assistant:575]: Response: "The office lights have been turned on."
[10:16:42][VV][scheduler:032]: set_timeout(name='', timeout=0)
[10:16:42][VV][scheduler:226]: Running timeout '' with interval=0 last_execution=108935 (now=108948)
[10:16:42][VV][api.service:920]: on_voice_assistant_event_response: VoiceAssistantEventResponse { event_type: VOICE_ASSISTANT_TTS_END data: VoiceAssistantEventData { name: 'url' value: 'http://192.168.1.165:8123/api/tts_proxy/2a7ce8cec50032f5696e5e16702bb5339ec37ebe_en-us_03ed9f9845_cloud.wav' } }
[10:16:42][D][voice_assistant:519]: Event Type: 8
[10:16:42][D][voice_assistant:595]: Response URL: "http://192.168.1.165:8123/api/tts_proxy/2a7ce8cec50032f5696e5e16702bb5339ec37ebe_en-us_03ed9f9845_cloud.wav"
[10:16:42][VV][scheduler:032]: set_timeout(name='', timeout=0)
[10:16:42][D][voice_assistant:412]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[10:16:42][D][voice_assistant:418]: Desired state set to STREAMING_RESPONSE
[10:16:42][W][component:214]: Component api took a long time for an operation (0.07 s).
[10:16:42][W][component:215]: Components should block for at most 20-30ms.
[10:16:42][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:42][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:42][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:42][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:42][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][scheduler:032]: set_timeout(name='speaker-timeout', timeout=5000) [10:16:43][VV][api.service:920]: on_voice_assistant_event_response: VoiceAssistantEventResponse { event_type: 
VOICE_ASSISTANT_TTS_STREAM_END } [10:16:43][D][voice_assistant:519]: Event Type: 99 [10:16:43][D][voice_assistant:665]: TTS stream end [10:16:43][D][voice_assistant:283]: End of audio stream received [10:16:43][D][voice_assistant:412]: State changed from STREAMING_RESPONSE to RESPONSE_FINISHED [10:16:43][D][voice_assistant:418]: Desired state set to RESPONSE_FINISHED [10:17:04][VV][api.service:558]: on_ping_request: PingRequest {} [10:17:04][VV][api.service:043]: send_ping_response: PingResponse {} [10:17:08][VV][api.connection:132]: Sending keepalive PING... [10:17:08][VV][api.service:037]: send_ping_request: PingRequest {} [10:17:08][VV][api.service:567]: on_ping_response: PingResponse {} [10:17:33][VV][scheduler:226]: Running interval '' with interval=60000 last_execution=101421 (now=161421) [10:17:44][VV][api.service:558]: on_ping_request: PingRequest {} [10:17:44][VV][api.service:043]: send_ping_response: PingResponse {} [10:18:08][VV][api.service:558]: on_ping_request: PingRequest {} [10:18:08][VV][api.service:043]: send_ping_response: PingResponse {} [10:18:24][VV][api.service:558]: on_ping_request: PingRequest {} [10:18:24][VV][api.service:043]: send_ping_response: PingResponse {} [10:18:33][VV][scheduler:226]: Running interval '' with interval=60000 last_execution=161421 (now=221421)

The device will just keep pinging until it is rebooted. Once rebooted, I can run 1-2 commands and then I run into the same issue. Looking around on the forums, I'm seeing that others have the same issue.

@BlackCube4

@BlackCube4 I read somewhere that won't work. Did you make it work anyway?

I got it to work fine~ish

@Xornop

Xornop commented Feb 4, 2024

@Xornop That logging loop is what you would expect to see. When a wake word is detected the logs will start showing the intent flow.

Not within the same second, it completely freezes up the esp.

@reidprichard

reidprichard commented Feb 14, 2024

...I found out that my issue was actually network related. The firewall on my HA server (running HA container) was not allowing the UDP connection from the ESP32 devices. I have not been able to narrow down which port is used for the mic streaming, so right now I just have all UDP from the IP address of my ESP32 allowed in ufw.

@ibrummel Thank you!!! I've been banging my head against the desk for an hour trying to figure this out. I would have never in a million years thought of a firewall issue - I'm so used to ESPHome devices working without messing with the firewall and never would have considered that this uses UDP.

@Sn00kiT

Sn00kiT commented Feb 14, 2024

Has anyone managed to fix the crackling noise? As soon as I switch the wake word option on, the speaker starts to make this crackling noise. Changing the bitrate didn't help. I tried another ESP32 board/mic/amp, but the problem stays the same.

@rich33584

Has anyone managed to fix the crackling noise? As soon as I switch the wake word option on, the speaker starts to make this crackling noise. Changing the bitrate didn't help. I tried another ESP32 board/mic/amp, but the problem stays the same.

Change the dout pin on the speaker. I have mine on GPIO12.

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO26
  - id: i2s_out
    i2s_lrclk_pin: GPIO32
    i2s_bclk_pin: GPIO13

microphone:
  platform: i2s_audio 
  id: external_microphone 
  adc_type: external 
  i2s_audio_id: i2s_in
  i2s_din_pin: GPIO34
  pdm: false
  bits_per_sample: 32bit


speaker:
  platform: i2s_audio 
  id: external_speaker 
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin: GPIO12
  mode: mono 

@rich33584

Anybody know how to limit the max volume with the Max98357?

I've built 3 of these now:
https://youtube.com/shorts/-tBJXZdqqRw?si=Tfem5Rv8X2GZ0gWe
A couple of things I have learned:
Be careful with the microphone, the INMP441. If you get it too hot, you'll kill it. If you have a long delay between the command and the action, it's more than likely the microphone.
I seem to have better luck building and wiring the voice assistant before loading the final firmware onto it. So, get the basic firmware loaded, then build and wire the voice assistant. After that, load the final firmware.
If you are just seeing the basic firmware in the ESP32 interface, select "Clean Build Files" in ESPHome by clicking the 3 dots, and then upload the firmware.
Spamming too many commands too quickly will freeze the pipeline. In normal use, you won't be sending 10 commands a minute, so slow down in testing.
The YAML I'm using. Credit goes to this guy:
https://github.com/ubergeek10/My-ESP32-Assistant

substitutions:
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'
  voice_assist_muted_phase_id: '12'
esphome:
  name: bedroom-voice-assistant
  friendly_name: Bedroom Voice assistant

  on_boot:
    priority: 600
    then:
      - script.execute: control_led
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: control_led

esp32:
  board: esp32dev
  framework:
    type: esp-idf


# Enable logging
logger:

# Enable Home Assistant API
api:
  encryption:
    key: "snip"

ota:
  password: "snip"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Bedroom-Voice-Assistant"
    password: "snip"

esp_adf:
external_components:
  - source: github://pr#5230
    components:
    - esp_adf 
    refresh: 0s

captive_portal:

light:
  - platform: esp32_rmt_led_strip
    rgb_order: GRB
    pin: GPIO18
    num_leds: 3
    rmt_channel: 0
    chipset: WS2812
    name: "Status LED"
    id: led
    default_transition_length: 0s
    effects:
      - pulse:
          name: "extra_slow_pulse"
          transition_length: 800ms
          update_interval: 800ms
          min_brightness: 0%
          max_brightness: 30%
      - pulse:
          name: "slow_pulse"
          transition_length: 250ms
          update_interval: 250ms
          min_brightness: 50%
          max_brightness: 100%
      - pulse:
          name: "fast_pulse"
          transition_length: 100ms
          update_interval: 100ms
          min_brightness: 50%
          max_brightness: 100%

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO26
  - id: i2s_out
    i2s_lrclk_pin: GPIO32
    i2s_bclk_pin: GPIO13

microphone:
  platform: i2s_audio 
  id: external_microphone 
  adc_type: external 
  i2s_audio_id: i2s_in
  i2s_din_pin: GPIO34
  pdm: false
  bits_per_sample: 32bit


speaker:
  platform: i2s_audio 
  id: external_speaker 
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin: GPIO12
  mode: mono 

voice_assistant:
  id: va
  microphone: external_microphone 
  speaker: external_speaker
  use_wake_word: true
  noise_suppression_level: 1
  auto_gain: 31dBFS
  volume_multiplier: 2.5

  on_listening:
    - lambda: id(voice_assistant_phase) = ${voice_assist_listening_phase_id};
    - script.execute: control_led

  on_stt_vad_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_thinking_phase_id};
    - script.execute: control_led

  on_tts_stream_start:
    - lambda: id(voice_assistant_phase) = ${voice_assist_replying_phase_id};
    - script.execute: control_led

  on_tts_stream_end:
    - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
    - script.execute: control_led

  on_error: 
    - if:
        condition:
          lambda: return !id(init_in_progress);
        then:
          - lambda: id(voice_assistant_phase) = ${voice_assist_error_phase_id};
          - script.execute: control_led
          - delay: 1s
          - if:
              condition:
                switch.is_on: use_wake_word
              then:
                - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
              else:
                - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
          - script.execute: control_led

  on_client_connected: 
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
    - script.execute: control_led          

  on_client_disconnected: 
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - script.execute: control_led

switch:
  - platform: template
    name: Use Wake Word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    on_turn_on:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - if:
                condition:
                    not:
                      - voice_assistant.is_running
                then:
                  - voice_assistant.start_continuous
            - script.execute: control_led          
 
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: control_led          

globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
  
script:
  - id: control_led
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                    wifi.connected:
                then:
                  - if:
                      condition:
                          api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(led).turn_on().set_rgb(0, 0, 1).set_brightness(1.0).set_effect("none").perform();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(led).turn_on().set_rgb(0, 1, 0).set_effect("slow_pulse").perform();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(led).turn_on().set_rgb(0, 0, 1).set_brightness(1.0).set_effect("fast_pulse").perform();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(led).turn_on().set_rgb(1, 1, 1).set_brightness(.5).set_effect("none").perform();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(led).turn_off().perform();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(led).turn_on().perform();
                                break;
                              default:
                                id(led).turn_on().set_rgb(1, 0, 0).set_brightness(0.2).set_effect("none").perform();
                                break;
                            }
                      else:
                        - light.turn_off:
                            id: led
                else:
                  - light.turn_off:
                      id: led
          else:
            - light.turn_on:
                id: led
                blue: 50%
                red: 50%
                green: 50%
                effect: "fast_pulse"

DEV board:
https://www.amazon.com/gp/product/B09GK74F7N/ref=ppx_yo_dt_b_asin_title_o03_s00?ie=UTF8&psc=1
Using the INMP441 for the mic
Max98357 for the amp
LEDs:
https://www.amazon.com/gp/product/B09P8MH56K/ref=ppx_yo_dt_b_asin_title_o03_s00?ie=UTF8&th=1
Speaker assembly:
https://www.walmart.com/ip/onn-Small-Rugged-Speaker-with-Bluetooth-Wireless-Technology-Blue/883044562?athbdg=L1100&from=/search

The 3 I have built are running well and reliably. They don't like too much background noise, but some tweaking may help with that. I have an automation that mutes my TV on wake word detection to help.
I'm also running Home Assistant on a bare-metal machine with an i7 and 16 GB RAM.
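
For anyone who wants something similar, here is a rough sketch of muting a TV on wake word detection, done from the ESPHome side. It is not necessarily how the automation mentioned above is built. It assumes your ESPHome version provides the `on_wake_word_detected` trigger on `voice_assistant` and the `homeassistant.service` action, that the device is allowed to make service calls in its Home Assistant integration settings, and that `media_player.living_room_tv` is a placeholder for your own entity.

```yaml
voice_assistant:
  # ... keep your existing options (id, microphone, speaker, etc.) here ...
  on_wake_word_detected:
    # Ask Home Assistant to mute the TV as soon as the wake word is heard.
    - homeassistant.service:
        service: media_player.volume_mute
        data:
          entity_id: media_player.living_room_tv  # placeholder entity, use your own
          is_volume_muted: "true"
```

Unmuting again (for example from `on_end`) would follow the same pattern with `is_volume_muted: "false"`.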

@Sn00kiT

Sn00kiT commented Feb 16, 2024

Has anyone managed to fix the crackling noise? As soon as I switch the wake word option on, the speaker starts to make this crackling noise. Changing the bitrate didn't help. I tried another ESP32 board/mic/amp, but the problem stays the same.

Change the dout pin on the speaker. I have mine on GPIO12.

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO25
    i2s_bclk_pin: GPIO26
  - id: i2s_out
    i2s_lrclk_pin: GPIO32
    i2s_bclk_pin: GPIO13

microphone:
  platform: i2s_audio 
  id: external_microphone 
  adc_type: external 
  i2s_audio_id: i2s_in
  i2s_din_pin: GPIO34
  pdm: false
  bits_per_sample: 32bit


speaker:
  platform: i2s_audio 
  id: external_speaker 
  dac_type: external
  i2s_audio_id: i2s_out
  i2s_dout_pin: GPIO12
  mode: mono 

Thank you that did the trick and also thank you for the post after that (-: !!!

@joshfedo

@rich33584
That config seems to be working for me. I'm still hitting some issues with TTS getting cut short or the wake word getting picked up by STT. But all in all it's actually working better.

Here's a wire mapping for anyone who is going to try it out:
Mapping:

ESP32 (WROOM-32)   MAX98357A (Speaker)   I2S Microphone
GPIO33             DIN                   -
GPIO12             LRCLK                 -
GPIO13             BCLK                  -
GPIO34             -                     SD
GPIO25             -                     WS
GPIO26             -                     SCK
3.3V               VDD                   VDD
GND                GND                   GND
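
For reference, the mapping above could be written as ESPHome config roughly like this. This is only a sketch that transcribes the table into the two-bus `i2s_audio` layout used earlier in the thread; the ids (`i2s_in`, `i2s_out`, `external_microphone`, `external_speaker`) and the `bits_per_sample` setting are borrowed from that earlier config and are assumptions here, not something the table itself specifies.

```yaml
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO25   # microphone WS
    i2s_bclk_pin: GPIO26    # microphone SCK
  - id: i2s_out
    i2s_lrclk_pin: GPIO12   # MAX98357A LRCLK
    i2s_bclk_pin: GPIO13    # MAX98357A BCLK

microphone:
  - platform: i2s_audio
    id: external_microphone
    adc_type: external
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO34     # microphone SD
    pdm: false
    bits_per_sample: 32bit  # assumption, taken from the earlier config

speaker:
  - platform: i2s_audio
    id: external_speaker
    dac_type: external
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO33    # MAX98357A DIN
    mode: mono
```

Note that these pins differ from the config posted earlier, so adjust either the wiring or the YAML so the two agree.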

@rich33584

@rich33584 That config seems to be working for me. I'm still hitting some issues with TTS getting cut short or the wake word getting picked up by STT. But all in all it's actually working better.

I seem to have spells of hours where it works perfectly, and then hours where the TTS is broken and crappy.
Not sure if it's a Wi-Fi issue or if the ESP32 is just barely adequate to run these. I ordered some "better" ESP32 modules. I'll report back on them.
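
One way to check the Wi-Fi theory is to expose the signal strength to Home Assistant and compare it against the times when TTS breaks up. A minimal sketch, assuming the standard `wifi_signal` sensor platform; the sensor name is just a placeholder:

```yaml
sensor:
  - platform: wifi_signal
    name: "Voice Assistant WiFi Signal"  # placeholder name
    update_interval: 60s                 # reports RSSI in dBm once a minute
```

If the reading stays steady while the audio still breaks up, the network is less likely to be the culprit.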

@Sn00kiT

Sn00kiT commented Feb 27, 2024

Have you seen this?
https://beta.esphome.io/components/micro_wake_word.html

Has anyone had luck with an ESP32-S3? The device seems more capable of handling the workload.

In case someone needs a working config without crackling speakers, with correct formatting:

esphome:
  name: ha-mic-speaker01
  friendly_name: ha-mic-speaker01

esp32:
  board: esp32dev
  framework:
    type: esp-idf

# Enable logging
logger:

web_server:

# Enable Home Assistant API
api:
  encryption:
    key: "paste_your_key_here"

ota:
  password: "paste_your_key_here"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Ha-Mic-Speaker01"
    password: "paste_your_key_here"

captive_portal:

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26   #WS / LRC
    i2s_bclk_pin: GPIO25    #SCK /BCLK

microphone:
  - platform: i2s_audio
    adc_type: external
    pdm: false
    id: mic_i2s
    channel: right
    bits_per_sample: 32bit
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO14    #SD

speaker:
  - platform: i2s_audio
    id: my_speaker
    dac_type: external
    i2s_dout_pin: GPIO12   #DIN 
    mode: mono
    i2s_audio_id: i2s_in


voice_assistant:
  microphone: mic_i2s
  id: va
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 4.0
  use_wake_word: false
  speaker: my_speaker
  
  on_error: 
   - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - switch.turn_off: use_wake_word
          - switch.turn_on: use_wake_word      

  on_client_connected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.start_continuous:

  on_client_disconnected:
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - voice_assistant.stop:
  

binary_sensor:
  - platform: status
    name: API Connection
    id: api_connection
    filters:
      - delayed_on: 1s
    on_press:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.start_continuous:
    on_release:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - voice_assistant.stop:


switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(va).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(va).set_use_wake_word(false);

@jake8796

PS D:\Projects\LocalVoiceAssistant> esphome run voiceAssistantESP32S3.yaml
INFO ESPHome 2024.2.2
INFO Reading configuration voiceAssistantESP32S3.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
Failed config

media_player.i2s_audio: [source voiceAssistantESP32S3.yaml:74]

This feature is only available with frameworks ['arduino'].
platform: i2s_audio
name: esp_speaker
id: media_player_speaker
dac_type: external
i2s_audio_id: i2s_out
i2s_dout_pin: GPIO16
mode: mono

Any ideas why I might be getting this error?

@LumCrafter

INFO Resolving IP address of esp32-mic-speaker.local
ERROR Error resolving IP address of esp32-mic-speaker.local. Is it connected to WiFi?
ERROR (If this error persists, please set a static IP address: https://esphome.io/components/wifi.html#manual-ips)
ERROR Error resolving IP address: Error resolving address with mDNS: Did not respond. Maybe the device is offline., [Errno -5] No address associated with hostname

I keep getting this error. What do I do?
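
The error message itself suggests setting a static IP. A minimal sketch of what that could look like, assuming the `manual_ip` option of the `wifi` component; the addresses below are placeholders and need to match your own network:

```yaml
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.1.50   # placeholder, pick a free address on your network
    gateway: 192.168.1.1      # placeholder, usually your router
    subnet: 255.255.255.0
```

This only helps if the device is actually joining the Wi-Fi. If it never connects at all, check the credentials and the serial logs first.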

@Xornop

Xornop commented Mar 27, 2024

My setup has a screeching sound instead of a crackle when playing back a response. I'm using the setup posted above, but with different GPIO pins.

Any ideas?

@Djelle

Djelle commented Mar 29, 2024

I have discovered that many of the problems mentioned above are due to interference noise. I had a lot of crackling noise with 10 cm wires on my test setup, so I had to go and study on the net.

I2S is meant to be used between components on a PCB. The protocol has no error detection or correction, so any errors will result in some kind of static noise. The solution is to keep the I2S connections as short as possible. There is no room for the pins on the ESP32 in my box, so I cut those off, though I left the plastic part of the pin row. I clamped the MAX98357A board onto the back side of the ESP32 PCB, so I only needed 5 mm of wire. I placed the INMP441 on the side edge of the ESP32 so they form a T (because that works with the box I use). And now it works without any noise.

I don't know how long the wires can be without shielding. But if you need even longer wires, they must be individually shielded (only the I2S wires). Shields should be connected to ground on the ESP32 (so only connected at one end). I have no idea how long they can then be, but not especially long. Maybe 20-30 cm?

/Djelle

@Battman2013

Battman2013 commented Mar 29, 2024

PS D:\Projects\LocalVoiceAssistant> esphome run voiceAssistantESP32S3.yaml
INFO ESPHome 2024.2.2
INFO Reading configuration voiceAssistantESP32S3.yaml...
INFO Updating https://github.com/esphome/esphome.git@pull/5230/head
Failed config

media_player.i2s_audio: [source voiceAssistantESP32S3.yaml:74]

This feature is only available with frameworks ['arduino'].
platform: i2s_audio
name: esp_speaker
id: media_player_speaker
dac_type: external
i2s_audio_id: i2s_out
i2s_dout_pin: GPIO16
mode: mono

Any ideas why I might be getting this error?

As the error message says, the media_player is only available with the Arduino framework.

esp32:
  board: esp32dev # board type
  framework:
    type: arduino
@edwardtich1

edwardtich1 commented Apr 9, 2024

The sound is interrupted when playing through the speaker

here is my code

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO26
    i2s_bclk_pin: GPIO25

media_player:
  - platform: i2s_audio
    name: "Informer"
    id: media_player_speaker
    i2s_audio_id: i2s_in
    dac_type: external
    i2s_dout_pin: GPIO27
    mode: mono
    on_pause:
      - media_player.stop

here is my log

13:32:01 [W] [component:232] Component i2s_audio.media_player took a long time for an operation (167 ms).
13:32:01 [W] [component:233] Components should block for at most 30 ms.
13:32:01 [W] [component:233] Components should block for at most 30 ms.

please help!

@imonlinux

imonlinux commented Apr 17, 2024

I was doing some serious googling and ran across this GitHub repo: https://github.com/gnumpi/esphome_audio

Using a lot of what was posted in this thread and the work that gnumpi has done on media_player for esp-idf, I have put together this config running on an ESP-WROOM-32. I'm still testing (and I think I'm going to need an S3 board to make this reliable), but I did get it to work last night with a mic and speaker (4 ohm, 3 W) with great sound.

Next I am going to add a 16bit LED ring and see if I can get even better results on an S3 board.

substitutions:
  device_name: test-media-assistant
  friendly_name: Test Assistant
  device_description: "ESP WROOM 32"
  api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: main
    components: [ adf_pipeline, i2s_audio ]

esphome:
  name: ${device_name}
  comment: ${device_description}
  name_add_mac_suffix: false
  friendly_name: ${friendly_name}
  on_boot:
     - priority: -100
       then:
         - wait_until: api.connected
         - delay: 1s
         - if:
             condition:
               switch.is_on: use_wake_word
             then:
               - voice_assistant.start_continuous:

esp32:
  board: esp32dev
  framework:
    type: esp-idf

logger:

api:
  encryption:
    key: ${api_key}

ota:
  password: !secret ota_password

wifi:
  enable_rrm: true
  ssid: !secret wifi_ssid
  password: !secret wifi_password

i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO21 # INMP441 WS 
    i2s_bclk_pin: GPIO22  # INMP441 SCK 
  - id: i2s_out
    i2s_lrclk_pin: GPIO25 # PCM5102 LCK 
    i2s_bclk_pin: GPIO26  # PCM5102 BCK
    
adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO33 # PCM5102 DIN

  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO13 #INMP441 SD
    pdm: false
    channel: right
    sample_rate: 16000
    bits_per_sample: 32bit

microphone:
  - platform: adf_pipeline
    id: adf_microphone
    gain_log2: 3
    keep_pipeline_alive: true
    pipeline:
      - adf_i2s_in
      - self

media_player:
  - platform: adf_pipeline
    id: adf_media_player
    name: s3-dev_media_player
    keep_pipeline_alive: true
    internal: false
    pipeline:
      - self
      - adf_i2s_out

voice_assistant:
  microphone: adf_microphone
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 3.0
  media_player: adf_media_player
  id: assist

switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);

@Battman2013

Hi imonlinux, many thanks for sharing your config. Good to know that the media_player component can also be configured with the esp-idf framework. I have tried this with a WROOM-32 board using very short wires, but it produces noise that almost drowns out the speech.
Earlier I had some success with an S3 board, but using the Arduino framework. The output sound was clear, but the media_player function was not stable. And I could not figure out how to get the mic to work with the Assist pipeline. Physically the mic was OK (I tested it with a separate Arduino test program on the same board), but Home Assistant did not recognize it.
I plan to try your config on the S3 board, if possible, to see how it performs...