esphome:
  name: esp32-mic-speaker
  friendly_name: esp32-mic-speaker
  on_boot:
    - priority: -100
      then:
        - wait_until: api.connected
        - delay: 1s
        - if:
            condition:
              switch.is_on: use_wake_word
            then:
              - voice_assistant.start_continuous:
esp32:
  board: esp32dev
  framework:
    type: esp-idf
    version: recommended
# Enable logging
logger:
# Enable Home Assistant API
api:
ota:
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-Mic-Speaker"
    password: "9vYvAFzzPjuc"
i2s_audio:
  i2s_lrclk_pin: GPIO27
  i2s_bclk_pin: GPIO26
microphone:
  - platform: i2s_audio
    id: mic
    adc_type: external
    i2s_din_pin: GPIO13
    pdm: false
speaker:
  - platform: i2s_audio
    id: big_speaker
    dac_type: external
    i2s_dout_pin: GPIO25
    mode: mono
voice_assistant:
  microphone: mic
  use_wake_word: false
  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0
  speaker: big_speaker
  id: assist
switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(assist).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
    on_turn_off:
      - voice_assistant.stop
      - lambda: id(assist).set_use_wake_word(false);
l_r: "right" # INMP441 - L/R (GND = right / 3.3v = left)
Your description in the script contains an error. From the microphone's datasheet:
Left/Right Channel Select. When set low, the microphone outputs its signal in the left channel
of the I²S frame. When set high, the microphone outputs its signal in the right channel.
GND = LEFT
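To make the datasheet wording concrete, here is a minimal sketch of how the wiring maps onto the config. It uses the stock ESPHome `i2s_audio` microphone platform from the first config in this gist rather than the `adf_pipeline` component; the pin and the `mic` id are simply reused from that config, and the only point is that `channel` has to match how the L/R pin is wired.

```yaml
microphone:
  - platform: i2s_audio
    id: mic
    adc_type: external
    i2s_din_pin: GPIO13
    pdm: false
    # INMP441 L/R pin tied to GND  -> signal in the LEFT slot  -> channel: left
    # INMP441 L/R pin tied to 3.3V -> signal in the RIGHT slot -> channel: right
    channel: left
```

If the mic is silent with one setting, switching `channel` (or moving the L/R pin) is a cheap thing to test before suspecting the hardware.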
Hey indevor. Thanks for catching that mistake.
I have updated my previous post with this correction. That explains why I was fighting with my microphone on the XIAO ESP32 S3 board.
CONFIG_TCP_SND_BUF_DEFAULT: "65535"
CONFIG_TCP_WND_DEFAULT: "512000"
CONFIG_TCP_RECVMBOX_SIZE: "512"
Could you explain how these options work, or link to a primary source?
Hey indevor, those are pulled straight out of the example provided by gnumpi. I haven't tested to see if removing them affects the functionality of the device using gnumpi's work.
https://github.com/gnumpi/esphome_audio/blob/main/examples/esp32-s3-N16R8-adf.yaml
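For orientation, these three keys look like ESP-IDF lwIP TCP tuning options. The sketch below is an assumption-laden reading, not something tested in this thread: as far as I know, current ESP-IDF spells them with an `LWIP_` prefix and keeps the unprefixed names only as deprecated aliases, and the one-line descriptions are a paraphrase of the Kconfig help. Whether values this large are accepted without TCP window scaling enabled has not been checked here.

```yaml
esp32:
  framework:
    type: esp-idf
    sdkconfig_options:
      # Default TCP send buffer size, in bytes
      CONFIG_LWIP_TCP_SND_BUF_DEFAULT: "65535"
      # Default TCP receive window size, in bytes
      CONFIG_LWIP_TCP_WND_DEFAULT: "512000"
      # Size of the per-connection TCP receive mailbox, in packets
      CONFIG_LWIP_TCP_RECVMBOX_SIZE: "512"
```

The ESP-IDF Kconfig reference for the lwIP component is the primary source for the exact ranges and defaults.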
I have tried just about everything: ESP32, ESP32-S2 mini, XIAO ESP32-S3. All of them produce distorted, crackling sound, and on the XIAO the mic is not working at all; I don't know why.
Now my code looks like this:
substitutions:
  # Phases of the Voice Assistant
  # IDLE: The voice assistant is ready to be triggered by a wake-word
  voice_assist_idle_phase_id: '1'
  # LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
  voice_assist_listening_phase_id: '2'
  # THINKING: The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: '3'
  # REPLYING: The voice assistant is replying to the command
  voice_assist_replying_phase_id: '4'
  # NOT_READY: The voice assistant is not ready
  voice_assist_not_ready_phase_id: '10'
  # ERROR: The voice assistant encountered an error
  voice_assist_error_phase_id: '11'
  # MUTED: The voice assistant is muted and will not reply to a wake-word
  voice_assist_muted_phase_id: '12'
esphome:
  name: esp32-s3-voice-assistant
  friendly_name: ESP32-S3 Voice Assistant
  min_version: 2024.1.0
  platformio_options:
    build_flags: -DBOARD_HAS_PSRAM
    board_build.flash_mode: dio
    board_build.mcu: esp32s3
  on_boot:
    - priority: 600
      then:
        - light.turn_on:
            id: led_ring
            brightness: 70%
            effect: connecting
psram:
  mode: octal
  speed: 80MHz
external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: main
      #type: local
      #path: /Users/siekmann/Privat/Projects/espHome/esphome_audio/esphome/components
    components: [ adf_pipeline, i2s_audio ]
esp32:
  board: esp32-s3-devkitc-1
  variant: esp32s3
  flash_size: 8MB
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      CONFIG_TCPIP_RECVMBOX_SIZE: "512"
      CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      CONFIG_TCP_WND_DEFAULT: "512000"
      CONFIG_TCP_RECVMBOX_SIZE: "512"
logger:
# Enable Home Assistant API
api:
  encryption:
    key: DELETED
  on_client_connected:
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - delay: 1s
            - voice_assistant.start_continuous:
            - delay: 1s
            - voice_assistant.stop:
            - delay: 2s
            - voice_assistant.start_continuous:
            - script.execute: reset_led
  on_client_disconnected:
    then:
      - light.turn_on:
          id: led_ring
          blue: 0%
          red: 100%
          green: 100%
          brightness: 50%
          effect: connecting
ota:
  password: "c04fc98d47fab50835359a772849716b"
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  manual_ip:
    static_ip: 192.168.0.144
    gateway: 192.168.0.1
    subnet: 255.255.255.0
  fast_connect: true
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "Esp32-S3-Voice-Assistant"
    password: "494Uimj1IQc9"
captive_portal:
button:
  - platform: restart
    id: restart_btn
    name: "REBOOT"
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO7
    i2s_bclk_pin: GPIO8
  - id: i2s_out
    i2s_lrclk_pin: GPIO6
    i2s_bclk_pin: GPIO5
adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO4
  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO9
    pdm: false
    channel: left
    sample_rate: 16000
    bits_per_sample: 32bit
microphone:
  - platform: adf_pipeline
    id: adf_microphone
    gain_log2: 3
    keep_pipeline_alive: true
    pipeline:
      - adf_i2s_in
      - self
media_player:
  - platform: adf_pipeline
    id: adf_media_player
    name: adf_media_player
    keep_pipeline_alive: false
    internal: false
    pipeline:
      - self
      - adf_i2s_out
voice_assistant:
  id: voice_asst
  microphone: adf_microphone
  media_player: adf_media_player
  noise_suppression_level: 4
  auto_gain: 31dBFS
  volume_multiplier: 15
  use_wake_word: false
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 100%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 75%
        effect: pulse
  on_end:
    - delay: 100ms
    - wait_until:
        not:
          media_player.is_playing:
    - script.execute: reset_led
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }
script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 50%
                effect: none
          else:
            - light.turn_off: led_ring
switch:
  - platform: template
    name: Use wake word
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    entity_category: config
    on_turn_on:
      - lambda: id(voice_asst).set_use_wake_word(true);
      - if:
          condition:
            not:
              - voice_assistant.is_running
          then:
            - voice_assistant.start_continuous
      - script.execute: reset_led
    on_turn_off:
      - voice_assistant.stop
      - script.execute: reset_led
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "Light"
    pin: GPIO1
    num_leds: 8
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true
I don't know how to set this up properly right now :/
I am not sure whether the pinout of the ESP32 DevKit is the same as your board's, but I think it is. You should not use GPIO 6, 7 and 8.
https://randomnerdtutorials.com/esp32-pinout-reference-gpios/
This config is for the XIAO ESP32-S3.
Hey strusic,
I was having some of those same issues with the microphone until I enabled micro_wake_word. Below is my current testing yaml for a XIAO ESP32-S3 using duplex mode on a shared i2s_audio channel in order to cut down on the number of pins needed. This is working very well, especially in conjunction with Extended OpenAI Conversation. I have ChatGPT 4o configured to "embody" HAL 9000 and it has been very entertaining. I'm printing a HAL 9000 prop replica to hold the assistant hardware.
edit: removed OTA password....
edit2: fixed persistent "connecting" led effect after boot
edit3: fixed INMP441 L/R Channel error (thanks again indevor)
substitutions:
  device_name: "test-media-assistant-v2"
  friendly_name: "Test Media Assistant V2"
  device_description: "XIAO ESP32 S3"
  esp_board: "esp32-s3-devkitc-1"
  framework_type: "esp-idf"
  din: "GPIO1" # MAX98357A - DIN
  lrclk: "GPIO7" # MAX98357A - LRCLK / INMP441 - WS
  bclk: "GPIO8" # MAX98357A - BCLK / INMP441 - SCK
  sd: "GPIO2" # INMP441 - SD
  l_r: "right" # INMP441 - L/R (3.3v = right / GND = left)
  di: "GPIO9" # WS2812 - DI
  api_key: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  # Phases of the Voice Assistant
  # IDLE: The voice assistant is ready to be triggered by a wake-word
  voice_assist_idle_phase_id: '1'
  # LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
  voice_assist_listening_phase_id: '2'
  # THINKING: The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: '3'
  # REPLYING: The voice assistant is replying to the command
  voice_assist_replying_phase_id: '4'
  # NOT_READY: The voice assistant is not ready
  voice_assist_not_ready_phase_id: '10'
  # ERROR: The voice assistant encountered an error
  voice_assist_error_phase_id: '11'
  # MUTED: The voice assistant is muted and will not reply to a wake-word
  voice_assist_muted_phase_id: '12'
external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: main
      #type: local
      #path: /Users/siekmann/Privat/Projects/espHome/esphome_audio/esphome/components
    components: [ adf_pipeline, i2s_audio ]
esphome:
  name: ${device_name}
  comment: ${device_description}
  friendly_name: ${friendly_name}
  min_version: 2024.2.0
  platformio_options:
    build_flags: -DBOARD_HAS_PSRAM
    board_build.flash_mode: dio
    board_upload.maximum_size: 16777216
  on_boot:
    priority: 600
    then:
      # Run the script to refresh the LED status
      # If after 30 seconds, the device is still initializing (It did not yet connect to Home Assistant), turn off the init_in_progress variable and run the script to refresh the LED status
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
esp32:
  board: ${esp_board}
  variant: ESP32S3
  flash_size: 16MB
  framework:
    type: ${framework_type}
    version: recommended
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      CONFIG_TCPIP_RECVMBOX_SIZE: "512"
      CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      CONFIG_TCP_WND_DEFAULT: "512000"
      CONFIG_TCP_RECVMBOX_SIZE: "512"
logger:
globals:
  # Global initialisation variable. Initialized to true and set to false once everything is connected. Only used to have a smooth "plugging" experience
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  # Global variable tracking the phase of the voice assistant (defined above). Initialized to not_ready
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
psram:
  mode: octal
  speed: 80MHz
wifi:
  enable_rrm: true
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  fast_connect: true
ota:
  password: !secret ota_password
api:
  encryption:
    key: ${api_key}
i2s_audio:
  - id: i2s_dplx
    i2s_lrclk_pin: ${lrclk}
    i2s_bclk_pin: ${bclk}
    access_mode: duplex
adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_dplx
    i2s_dout_pin: ${din}
    sample_rate: 16000
    bits_per_sample: 32bit
    fixed_settings: true
  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_dplx
    i2s_din_pin: ${sd}
    pdm: false
    channel: ${l_r}
    sample_rate: 16000
    bits_per_sample: 32bit
    fixed_settings: true
microphone:
  - platform: adf_pipeline
    id: adf_microphone
    keep_pipeline_alive: true
    pipeline:
      - adf_i2s_in
      - self
media_player:
  - platform: adf_pipeline
    id: adf_media_player
    name: media_player
    keep_pipeline_alive: true
    internal: false
    pipeline:
      - self
      - resampler
      - adf_i2s_out
micro_wake_word:
  model: okay_nabu
  on_wake_word_detected:
    - media_player.stop:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 75%
        effect: pulse
    - voice_assistant.start:
voice_assistant:
  microphone: adf_microphone
  media_player: adf_media_player
  use_wake_word: false
  #vad_threshold: 3
  noise_suppression_level: 1
  auto_gain: 31dBFS
  volume_multiplier: 15.0
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - micro_wake_word.start:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: reset_led
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - voice_assistant.stop
    - micro_wake_word.stop
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 100%
        brightness: 50%
        effect: connecting
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 25%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 75%
        effect: pulse
  on_end:
    then:
      - light.turn_off:
          id: led_ring
      - voice_assistant.stop
      - wait_until:
          not:
            media_player.is_playing:
      - script.execute: reset_led
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - micro_wake_word.start:
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - micro_wake_word.start:
          - script.execute: reset_led
script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 25%
                effect: none
          else:
            - light.turn_off: led_ring
button:
  - platform: restart
    id: restart_btn
    name: "${friendly_name} REBOOT"
switch:
  - platform: template
    name: Enable Voice Assistant
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    icon: mdi:assistant
    # When the switch is turned on (on Home Assistant):
    # Start the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_on:
      - logger.log: "switch on"
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - logger.log: "condition 1"
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - voice_assistant.stop
            - delay: 1s
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - logger.log: "Starting MWW"
                  #- voice_assistant.start_continuous
                  - micro_wake_word.start:
            - script.execute: reset_led
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: template
    name: Pipeline
    id: pipeline_switch
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    on_turn_off:
      - media_player.stop
    on_turn_on:
      - media_player.play_media: "https://dl.espressif.com/dl/audio/ff-16b-2c-44100hz.mp3"
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Light"
    pin: ${di}
    num_leds: 16
    rmt_channel: 0
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 12
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 12
          add_led_interval: 100ms
          reverse: true
I tried to compile this code, but Home Assistant crashes when the build reaches this line:
Compiling .pioenvs/esp32-s3-voice-assistant/components/esp-tflite-micro/tensorflow/lite/micro/micro_allocation_info.o
I also tried compiling it on Windows, but after flashing, the device does not connect to HA.
Here are a few things that I do when I'm having issues flashing one of these:
- In ESPHome dashboard, click Clean Build Files and then recompile
- Make sure that your USB has enough power for the device (My laptop USB port is not sufficient and will result in a corrupt flash. I purchased a powered USB hub.)
- Use esptool to erase the ESP32 (esptool --chip esp32 erase_flash)
I compiled that exact config today using the following:
ESPHome = 2024.5.0
HA Core = 2024.5.3
Supervisor = 2024.05.1
HASSOS = 12.3
My ESP is well powered. The issue is that compiling this yaml crashes the whole HA OS. I am running an RPi 4B 4GB with HAOS on an SSD. I've now added 2GB of swap and an additional fan; maybe that helps. Compiling this yaml with micro_wake_word is crazy. I've never run into anything like that.
There are also a lot of warnings while compiling, like:
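If the crash comes from the host running out of memory during the parallel build (an assumption; nothing in this thread confirms the cause), one thing that may be worth trying alongside swap is limiting the number of simultaneous compile jobs. A minimal sketch using ESPHome's `compile_process_limit` option, with the device name taken from the build path above:

```yaml
esphome:
  name: esp32-s3-voice-assistant
  # Run one compiler process at a time to lower peak RAM use on a small host.
  # The build gets slower; the value 1 is only a starting point.
  compile_process_limit: 1
```

Compiling on a desktop machine and flashing the resulting binary is the other common way around a memory-constrained host.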
Compiling .pioenvs/esp32-s3-voice-assistant/components/esp-tflite-micro/tensorflow/lite/micro/kernels/mirror_pad.o
In file included from components/esp-tflite-micro/tensorflow/lite/micro/kernels/lstm_eval.cc:25:
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h: In lambda function:
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:151:34: warning: declaration of 'const tflite::ArithmeticParams& params' shadows a parameter [-Wshadow]
const uint8_t input2_val) {
^
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:126:56: note: shadowed declaration is here
inline void BroadcastMul6DSlow(const ArithmeticParams& params,
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h: In lambda function:
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:210:28: warning: declaration of 'const tflite::ArithmeticParams& params' shadows a parameter [-Wshadow]
const T input2_val) {
^
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:171:44: note: shadowed declaration is here
BroadcastMul6DSlow(const ArithmeticParams& params,
~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h: In lambda function:
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:250:46: warning: declaration of 'const tflite::ArithmeticParams& params' shadows a parameter [-Wshadow]
const std::complex<float> input2_val) {
^
components/esp-tflite-micro/tensorflow/lite/kernels/internal/reference/mul.h:221:56: note: shadowed declaration is here
inline void BroadcastMul6DSlow(const ArithmeticParams& params,
Is this normal?
100%
You will see all kinds of warnings while it is compiling. Good luck.
I successfully flashed this yaml, but there is a lot of crackling and distortion when playing anything on the speaker. I am using a MAX98357A as the amp and a DFRobot SEN0526 I2S MEMS microphone.
l_r: "right" # INMP441 - L/R (GND = right / 3.3v = left)
di: "GPIO9" # WS2812 - DI
Funny, you're wrong again. Maybe you're copying from the wrong source.
According to the document for this microphone:
Left/Right Channel Select. When set low, the microphone outputs its signal in the left channel
of the I²S frame. When set high, the microphone outputs its signal in the right channel.
GND = LEFT
INFO ESPHome 2024.5.0
INFO Reading configuration /config/esphome/test.yaml...
WARNING GPIO3 is a strapping PIN and should only be used for I/O with care.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
WARNING GPIO45 is a strapping PIN and should only be used for I/O with care.
Attaching external pullup/down resistors to strapping pins can cause unexpected failures.
See https://esphome.io/guides/faq.html#why-am-i-getting-a-warning-about-strapping-pins
INFO Generating C++ source...
INFO Updating https://github.com/espressif/esp-adf.git@v2.5
INFO Updating submodules (components/esp-adf-libs, components/esp-sr) for https://github.com/espressif/esp-adf.git@v2.5
Traceback (most recent call last):
File "/usr/local/bin/esphome", line 33, in <module>
sys.exit(load_entry_point('esphome', 'console_scripts', 'esphome')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/esphome/esphome/__main__.py", line 1065, in main
return run_esphome(sys.argv)
^^^^^^^^^^^^^^^^^^^^^
File "/esphome/esphome/__main__.py", line 1052, in run_esphome
rc = POST_CONFIG_ACTIONS[args.command](args, config)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/esphome/esphome/__main__.py", line 479, in command_run
exit_code = write_cpp(config)
^^^^^^^^^^^^^^^^^
File "/esphome/esphome/__main__.py", line 193, in write_cpp
return write_cpp_file()
^^^^^^^^^^^^^^^^
File "/esphome/esphome/__main__.py", line 211, in write_cpp_file
writer.write_cpp(code_s)
File "/esphome/esphome/writer.py", line 344, in write_cpp
copy_src_tree()
File "/esphome/esphome/writer.py", line 297, in copy_src_tree
copy_files()
File "/esphome/esphome/components/esp32/__init__.py", line 684, in copy_files
repo_dir, _ = git.clone_or_update(
^^^^^^^^^^^^^^^^^^^^
File "/esphome/esphome/git.py", line 111, in clone_or_update
run_git_command(
File "/esphome/esphome/git.py", line 31, in run_git_command
raise cv.Invalid(lines[-1][len("fatal: ") :])
voluptuous.error.Invalid: Unable to find current revision in submodule path 'components/esp-adf-libs'
Hey indevor,
I have so many versions of this config running on different ESP32 boards now that I must have pulled that old config back in at some point. The odd thing is that this specific config, running on the XIAO ESP32-S3 with an INMP441 with a grounded L/R pin and the channel set to right, is working perfectly. When I'm back home I will change it back to left and see if it continues to work. I was 100% wrong. I have the INMP441 L/R pin set high. That is why selecting the right channel works, but my note in substitutions was backwards again. Thanks again indevor!
I'm not sure what your last post was indicating. I have moved away from using GPIO3 and GPIO45 because of these warnings and am now using a single i2s_audio channel in duplex mode. I'm also not sure where that voluptuous.error is coming from on your run. I just reran this config after doing a Clean Build Files and it compiles as expected.
edit: typo
edit2: added completion of build
edit3: confirmed that INMP441 L/R channel is set high and not low so right channel is working and fixed in config above
INFO ESPHome 2024.5.0
INFO Reading configuration /config/esphome/test-media-assistant-v2.yaml...
INFO Generating C++ source...
INFO Updating https://github.com/espressif/esp-adf.git@v2.5
INFO Updating submodules (components/esp-adf-libs, components/esp-sr) for https://github.com/espressif/esp-adf.git@v2.5
INFO Updating https://github.com/espressif/esp-tflite-micro@None
INFO Compiling app...
Processing test-media-assistant-v2 (board: esp32-s3-devkitc-1; framework: espidf; platform: platformio/espressif32@5.4.0)
--------------------------------------------------------------------------------
Library Manager: Installing esphome/noise-c @ 0.1.4
INFO Installing esphome/noise-c @ 0.1.4
Unpacking [####################################] 100%
Library Manager: noise-c@0.1.4 has been installed!
INFO noise-c@0.1.4 has been installed!
Library Manager: Resolving dependencies...
INFO Resolving dependencies...
Library Manager: Installing esphome/libsodium @ 1.10018.1
INFO Installing esphome/libsodium @ 1.10018.1
Unpacking [####################################] 100%
Library Manager: libsodium@1.10018.1 has been installed!
INFO libsodium@1.10018.1 has been installed!
HARDWARE: ESP32S3 240MHz, 320KB RAM, 16MB Flash
- framework-espidf @ 3.40407.0 (4.4.7)
- tool-cmake @ 3.16.4
- tool-ninja @ 1.7.1
- toolchain-esp32ulp @ 2.35.0-20220830
- toolchain-riscv32-esp @ 8.4.0+2021r2-patch5
- toolchain-xtensa-esp32s3 @ 8.4.0+2021r2-patch5
Reading CMake configuration...
Generating assembly for certificate bundle...
Dependency Graph
|-- noise-c @ 0.1.4
Generating assembly for .pioenvs/test-media-assistant-v2/duer_profile.S
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_audio_element.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_audio_process.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_audio_sinks.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_audio_sources.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_pipeline.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/adf_pipeline_controller.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/media_player/adf_media_player.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/adf_pipeline/microphone/esp_adf_microphone.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/api_connection.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/api_frame_helper.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/api_pb2.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/api_pb2_service.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/api_server.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/list_entities.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/proto.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/subscribe_state.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/api/user_services.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/button/button.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/esp32/core.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/esp32/gpio.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/esp32/preferences.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/esp32_rmt_led_strip/led_strip.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/adf_pipeline/adf_i2s_in.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/adf_pipeline/adf_i2s_out.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/adf_pipeline/i2s_stream_mod.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/external_adc.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/external_dac.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/i2s_audio/i2s_audio.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/addressable_light.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/automation.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/esp_color_correction.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/esp_hsv_color.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/esp_range_view.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/light_call.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/light_json_schema.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/light_output.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/light/light_state.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger_esp32.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger_esp8266.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger_host.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger_libretiny.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/logger/logger_rp2040.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/md5/md5.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_component.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_esp32.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_esp8266.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_host.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_libretiny.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/mdns/mdns_rp2040.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/media_player/media_player.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/micro_wake_word/micro_wake_word.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/network/util.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/ota/ota_backend_arduino_esp32.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/ota/ota_backend_arduino_esp8266.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/ota/ota_backend_arduino_libretiny.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/ota/ota_backend_arduino_rp2040.o
Compiling .pioenvs/test-media-assistant-v2/src/esphome/components/ota/ota_backend_esp_idf.o
.
.
.
.
Linking .pioenvs/test-media-assistant-v2/firmware.elf
/data/cache/platformio/packages/toolchain-xtensa-esp32s3/bin/../lib/gcc/xtensa-esp32s3-elf/8.4.0/../../../../xtensa-esp32s3-elf/bin/ld: missing --end-group; added as last command line option
RAM: [= ] 12.0% (used 39348 bytes from 327680 bytes)
Flash: [== ] 17.8% (used 1444629 bytes from 8126464 bytes)
Building .pioenvs/test-media-assistant-v2/firmware.bin
Creating esp32s3 image...
Successfully created esp32s3 image.
esp32_create_combined_bin([".pioenvs/test-media-assistant-v2/firmware.bin"], [".pioenvs/test-media-assistant-v2/firmware.elf"])
Wrote 0x170c80 bytes to file /data/build/test-media-assistant-v2/.pioenvs/test-media-assistant-v2/firmware-factory.bin, ready to flash to offset 0x0
======================== [SUCCESS] Took 1629.69 seconds ========================
INFO Successfully compiled program.
Hey strusic,
I haven't used that mic board before, but I am using the same output amp. What kind of speaker are you using? I have tried a few and am having a lot of success with this one running at 50% volume:
https://www.amazon.com/dp/B01CHYIU26
When I get home, I will record a video of it working as a reference and link it here.
edit: included volume level
I am using the speaker from an old Bluetooth speaker. It sounded good when I was using only the I2S media player with an ESP32-S2 mini. I plan to fit all the electronics into that speaker's case.
You may want to take a look at gnumpi's config for Shared I2S-Port with Exclusive Access instead of duplex mode.
I kind of managed to get it to work, but it sounds like it's from hell.
20240518_135917.mp4
Wow. That's rough. If you changed to exclusive mode, did you include the resampler in the pipeline for the media_player?
media_player:
  - platform: adf_pipeline
    id: adf_media_player
    name: s3-dev_media_player
    internal: false
    keep_pipeline_alive: true
    pipeline:
      - self
      - resampler
      - adf_i2s_out
This is mine in duplex mode.
20240518_092752.1.mp4
Hey, based on the config above I have been cooking this:
substitutions:
  device_name: "test-01"
  friendly_name: "test-01"
  device_description: "esp32s3"
  esp_board: "esp32-s3-devkitc-1"
  # Phases of the Voice Assistant
  # IDLE: The voice assistant is ready to be triggered by a wake-word
  voice_assist_idle_phase_id: '1'
  # LISTENING: The voice assistant is ready to listen to a voice command (after being triggered by the wake word)
  voice_assist_listening_phase_id: '2'
  # THINKING: The voice assistant is currently processing the command
  voice_assist_thinking_phase_id: '3'
  # REPLYING: The voice assistant is replying to the command
  voice_assist_replying_phase_id: '4'
  # NOT_READY: The voice assistant is not ready
  voice_assist_not_ready_phase_id: '10'
  # ERROR: The voice assistant encountered an error
  voice_assist_error_phase_id: '11'
  # MUTED: The voice assistant is muted and will not reply to a wake-word
  voice_assist_muted_phase_id: '12'
external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: main
      #type: local
      #path: /Users/siekmann/Privat/Projects/espHome/esphome_audio/esphome/components
    components: [ adf_pipeline, i2s_audio ]
esphome:
  name: ${device_name}
  comment: ${device_description}
  friendly_name: ${friendly_name}
  min_version: 2024.2.0
  platformio_options:
    build_flags: -DBOARD_HAS_PSRAM
    board_build.flash_mode: dio
    board_upload.maximum_size: 16777216
  on_boot:
    priority: 600
    then:
      # Run the script to refresh the LED status
      # If after 30 seconds, the device is still initializing (It did not yet connect to Home Assistant), turn off the init_in_progress variable and run the script to refresh the LED status
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
esp32:
  board: ${esp_board}
  variant: ESP32S3
  flash_size: 16MB
  framework:
    type: esp-idf
    sdkconfig_options:
      # need to set a s3 compatible board for the adf-sdk to compile
      # board specific code is not used though
      CONFIG_ESP32_S3_BOX_BOARD: "y"
      CONFIG_ESP32_WIFI_STATIC_RX_BUFFER_NUM: "16"
      CONFIG_ESP32_WIFI_DYNAMIC_RX_BUFFER_NUM: "512"
      CONFIG_TCPIP_RECVMBOX_SIZE: "512"
      CONFIG_TCP_SND_BUF_DEFAULT: "65535"
      CONFIG_TCP_WND_DEFAULT: "512000"
      CONFIG_TCP_RECVMBOX_SIZE: "512"
logger:
globals:
  # Global initialisation variable. Initialized to true and set to false once everything is connected. Only used to have a smooth "plugging" experience
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  # Global variable tracking the phase of the voice assistant (defined above). Initialized to not_ready
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
psram:
  mode: octal
  speed: 80MHz
wifi:
  enable_rrm: true
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  fast_connect: true
ota:
  password: !secret ota_password
api:
  encryption:
    key: !secret api_key
i2s_audio:
  - id: i2s_in
    i2s_lrclk_pin: GPIO7
    i2s_bclk_pin: GPIO16
  - id: i2s_out
    i2s_lrclk_pin: GPIO8
    i2s_bclk_pin: GPIO18
adf_pipeline:
  - platform: i2s_audio
    type: audio_out
    id: adf_i2s_out
    i2s_audio_id: i2s_out
    i2s_dout_pin: GPIO17
  - platform: i2s_audio
    type: audio_in
    id: adf_i2s_in
    i2s_audio_id: i2s_in
    i2s_din_pin: GPIO15
    pdm: false
    channel: right
    sample_rate: 16000
    bits_per_sample: 32bit
microphone:
  - platform: adf_pipeline
    id: adf_microphone
    keep_pipeline_alive: true
    pipeline:
      - adf_i2s_in
      - self
media_player:
  - platform: adf_pipeline
    id: adf_media_player
    name: media_player
    keep_pipeline_alive: true
    internal: false
    pipeline:
      - self
      - resampler
      - adf_i2s_out
micro_wake_word:
  model: okay_nabu
  on_wake_word_detected:
    - media_player.stop:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 75%
        effect: pulse
    - voice_assistant.start:
voice_assistant:
  microphone: adf_microphone
  media_player: adf_media_player
  use_wake_word: false
  #vad_threshold: 3
  #noise_suppression_level: 1
  auto_gain: 31dBFS
  #volume_multiplier: 15.0
  on_client_connected:
    - lambda: id(init_in_progress) = false;
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - micro_wake_word.start:
          - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
          - script.execute: reset_led
        else:
          - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
  on_client_disconnected:
    - lambda: id(voice_assistant_phase) = ${voice_assist_not_ready_phase_id};
    - voice_assistant.stop
    - micro_wake_word.stop
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 100%
        brightness: 50%
        effect: connecting
  on_listening:
    - light.turn_on:
        id: led_ring
        blue: 100%
        red: 0%
        green: 0%
        brightness: 25%
        effect: wakeword
  on_tts_start:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 0%
        green: 100%
        brightness: 75%
        effect: pulse
  on_end:
    then:
      - light.turn_off:
          id: led_ring
      - voice_assistant.stop
      - wait_until:
          not:
            media_player.is_playing:
      - script.execute: reset_led
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - micro_wake_word.start:
  on_error:
    - light.turn_on:
        id: led_ring
        blue: 0%
        red: 100%
        green: 0%
        brightness: 100%
        effect: none
    - delay: 1s
    - script.execute: reset_led
    - script.wait: reset_led
    - lambda: |-
        if (code == "wake-provider-missing" || code == "wake-engine-missing") {
          id(use_wake_word).turn_off();
        }
    - if:
        condition:
          switch.is_on: use_wake_word
        then:
          - micro_wake_word.start:
          - script.execute: reset_led
script:
  - id: reset_led
    then:
      - if:
          condition:
            switch.is_on: use_wake_word
          then:
            - light.turn_on:
                id: led_ring
                blue: 100%
                red: 0%
                green: 0%
                brightness: 25%
                effect: none
          else:
            - light.turn_off: led_ring
button:
  - platform: restart
    id: restart_btn
    name: "${friendly_name} REBOOT"
switch:
  - platform: template
    name: Enable Voice Assistant
    id: use_wake_word
    optimistic: true
    restore_mode: RESTORE_DEFAULT_ON
    icon: mdi:assistant
    # When the switch is turned on (on Home Assistant):
    # Start the voice assistant component
    # Set the correct phase and run the script to refresh the LED status
    on_turn_on:
      - logger.log: "switch on"
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - logger.log: "condition 1"
            - lambda: id(voice_assistant_phase) = ${voice_assist_idle_phase_id};
            - voice_assistant.stop
            - delay: 1s
            - if:
                condition:
                  not:
                    - voice_assistant.is_running
                then:
                  - logger.log: "Starting MWW"
                  #- voice_assistant.start_continuous
                  - micro_wake_word.start:
            - script.execute: reset_led
    on_turn_off:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - voice_assistant.stop
            - micro_wake_word.stop
            - lambda: id(voice_assistant_phase) = ${voice_assist_muted_phase_id};
            - script.execute: reset_led
  - platform: template
    name: Pipeline
    id: pipeline_switch
    optimistic: true
    restore_mode: RESTORE_DEFAULT_OFF
    on_turn_off:
      - media_player.stop
light:
  - platform: esp32_rmt_led_strip
    id: led_ring
    name: "${friendly_name} Front LED"
    pin: GPIO05
    num_leds: 1
    rmt_channel: 1
    rgb_order: GRB
    chipset: ws2812
    default_transition_length: 0s
    effects:
      - pulse:
          name: "Pulse"
          transition_length: 0.5s
          update_interval: 0.5s
      - addressable_twinkle:
          name: "Working"
          twinkle_probability: 5%
          progress_interval: 4ms
      - addressable_color_wipe:
          name: "Wakeword"
          colors:
            - red: 0%
              green: 50%
              blue: 0%
              num_leds: 12
          add_led_interval: 20ms
          reverse: false
      - addressable_color_wipe:
          name: "Connecting"
          colors:
            - red: 60%
              green: 60%
              blue: 60%
              num_leds: 1
            - red: 60%
              green: 60%
              blue: 0%
              num_leds: 1
          add_led_interval: 100ms
          reverse: true
This works extremely well, but just once ;)
The second time, it detects the wake word, starts blinking green, does not process the request and keeps blinking.
LOG (after trying twice)
INFO ESPHome 2024.5.4
[21:08:48][I][app:100]: ESPHome version 2024.5.4 compiled on May 30 2024, 21:05:15
[21:08:48][C][wifi:580]: WiFi:
[21:08:48][C][wifi:408]: Local MAC: 3C:84:27:CC:42:0C
[21:08:48][C][wifi:413]: SSID: [redacted]
[21:08:48][C][wifi:416]: IP Address: 192.168.207.246
[21:08:48][C][wifi:420]: BSSID: [redacted]
[21:08:48][C][wifi:421]: Hostname: 'test-01'
[21:08:48][C][wifi:423]: Signal strength: -60 dB ▂▄▆█
[21:08:48][C][wifi:427]: Channel: 11
[21:08:48][C][wifi:428]: Subnet: 255.255.255.0
[21:08:48][C][wifi:429]: Gateway: 192.168.207.1
[21:08:48][C][wifi:430]: DNS1: 192.168.207.130
[21:08:48][C][wifi:431]: DNS2: 0.0.0.0
[21:08:48][C][wifi:433]: BTM: disabled
[21:08:48][C][wifi:434]: RRM: enabled
[21:08:48][C][logger:185]: Logger:
[21:08:48][C][logger:186]: Level: DEBUG
[21:08:48][C][logger:188]: Log Baud Rate: 115200
[21:08:48][C][logger:189]: Hardware UART: USB_SERIAL_JTAG
[21:08:48][C][esp32_rmt_led_strip:175]: ESP32 RMT LED Strip:
[21:08:48][C][esp32_rmt_led_strip:176]: Pin: 5
[21:08:48][C][esp32_rmt_led_strip:177]: Channel: 1
[21:08:48][C][esp32_rmt_led_strip:202]: RGB Order: GRB
[21:08:48][C][esp32_rmt_led_strip:203]: Max refresh rate: 0
[21:08:48][C][esp32_rmt_led_strip:204]: Number of LEDs: 1
[21:08:48][C][light:103]: Light 'test-01 Front LED'
[21:08:48][C][light:105]: Default Transition Length: 0.0s
[21:08:48][C][light:106]: Gamma Correct: 2.80
[21:08:48][C][template.switch:068]: Template Switch 'Pipeline'
[21:08:48][C][template.switch:091]: Restore Mode: restore defaults to OFF
[21:08:48][C][template.switch:057]: Optimistic: YES
[21:08:48][C][template.switch:068]: Template Switch 'Enable Voice Assistant'
[21:08:48][C][template.switch:070]: Icon: 'mdi:assistant'
[21:08:48][C][template.switch:091]: Restore Mode: restore defaults to ON
[21:08:48][C][template.switch:057]: Optimistic: YES
[21:08:48][C][psram:020]: PSRAM:
[21:08:48][C][psram:021]: Available: YES
[21:08:48][C][psram:024]: Size: 8191 KB
[21:08:48][C][i2s_audio:028]: I2SController:
[21:08:48][C][i2s_audio:029]: AccessMode: exclusive
[21:08:48][C][i2s_audio:030]: Port: 0
[21:08:48][C][i2s_audio:032]: Reader registered.
[21:08:48][C][i2s_audio:028]: I2SController:
[21:08:48][C][i2s_audio:029]: AccessMode: exclusive
[21:08:48][C][i2s_audio:030]: Port: 1
[21:08:48][C][i2s_audio:035]: Writer registered.
[21:08:48][C][i2s_audio:138]: I2S-Writer (Initial-CFG):
[21:08:48][C][i2s_audio:140]: sample-rate: 16000 bits_per_sample: 32
[21:08:48][C][i2s_audio:141]: channel_fmt: 0 channels: 2
[21:08:48][C][i2s_audio:142]: use_apll: no, use_pdm: no
[21:08:48][C][i2s_audio:135]: I2S-Reader (Initial-CFG):
[21:08:48][C][i2s_audio:140]: sample-rate: 16000 bits_per_sample: 32
[21:08:48][C][i2s_audio:141]: channel_fmt: 3 channels: 1
[21:08:48][C][i2s_audio:142]: use_apll: no, use_pdm: no
[21:08:48][C][restart.button:017]: Restart Button 'test-01 REBOOT'
[21:08:48][C][mdns:115]: mDNS:
[21:08:48][C][mdns:116]: Hostname: test-01
[21:08:48][C][ota:096]: Over-The-Air Updates:
[21:08:48][C][ota:097]: Address: test-01.local:3232
[21:08:48][C][ota:100]: Using Password.
[21:08:48][C][ota:103]: OTA version: 2.
[21:08:48][C][api:139]: API Server:
[21:08:48][C][api:140]: Address: test-01.local:6053
[21:08:48][C][api:142]: Using noise encryption: YES
[21:08:48][C][micro_wake_word:057]: microWakeWord:
[21:08:48][C][micro_wake_word:058]: Wake Word: okay nabu
[21:08:48][C][micro_wake_word:059]: Probability cutoff: 0.500
[21:08:48][C][micro_wake_word:060]: Sliding window size: 10
[21:08:48][C][esp_adf_pipeline.microphone:020]: ADF-Microphone
[21:08:48][C][adf_media_player:016]: ESP-ADF-MediaPlayer:
[21:08:48][C][adf_media_player:018]: Number of ASPComponents: 3
[21:08:51][D][api:102]: Accepted 192.168.207.101
[21:08:51][D][api.connection:1321]: Home Assistant 2024.5.5 (192.168.207.101): Connected successfully
[21:08:51][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[21:08:51][D][light:036]: 'test-01 Front LED' Setting:
[21:08:51][D][light:051]: Brightness: 25%
[21:08:51][D][light:059]: Red: 0%, Green: 0%, Blue: 100%
[21:08:51][D][micro_wake_word:115]: Starting Microphone
[21:08:51][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[21:08:51][D][esp-idf:000]: I (10976) I2S: DMA Malloc info, datalen=blocksize=1024, dma_buf_count=4
[21:08:52][D][i2s_audio:072]: Installing driver : yes
[21:08:52][D][esp_adf_pipeline:358]: pipeline tag 0, i2s_in
[21:08:52][D][esp_adf_pipeline:358]: pipeline tag 1, pcm_reader
[21:08:52][D][esp-idf:000]: I (10988) AUDIO_PIPELINE: link el->rb, el:0x3d8178b0, tag:i2s_in, rb:0x3d817b8c
[21:08:52][D][esp_adf_pipeline:370]: Setting up event listener.
[21:08:52][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[21:08:52][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:08:52][D][esp_audio_sinks:053]: Set bitdepth to 32
[21:08:52][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[21:08:52][D][esp-idf:000]: I (11005) AUDIO_ELEMENT: [i2s_in-0x3d8178b0] Element task created
[21:08:52][D][esp-idf:000]: I (11007) AUDIO_ELEMENT: [pcm_reader-0x3d817a44] Element task created
[21:08:52][D][esp-idf:000]: I (11009) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8451851 Bytes, Inter:163104 Bytes, Dram:163104 Bytes
[21:08:52][D][esp-idf:000][i2s_in]: I (11012) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1
[21:08:52][D][esp-idf:000]: I (11015) AUDIO_PIPELINE: Pipeline started
[21:08:52][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[21:08:52][D][esp_adf_pipeline:131]: Check element [i2s_in] status, 3
[21:08:52][D][esp_adf_pipeline:131]: Check element [pcm_reader] status, 3
[21:08:52][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[21:08:52][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[21:08:52][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[21:08:52][I][esp_adf_pipeline:214]: [ i2s_in ] status: 12
[21:09:14][D][micro_wake_word:362]: Wake word sliding average probability is 0.522 and most recent probability is 1.000
[21:09:14][D][micro_wake_word:128]: Wake Word Detected
[21:09:14][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[21:09:14][D][micro_wake_word:134]: Stopping Microphone
[21:09:14][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[21:09:14][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:09:14][D][esp-idf:000][i2s_in]: W (33041) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:14][D][esp-idf:000][i2s_in]: W (33044) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:14][D][esp-idf:000][i2s_in]: W (33047) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:14][D][esp-idf:000][i2s_in]: W (33050) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:14][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[21:09:14][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[21:09:14][D][media_player:061]: 'media_player' - Setting
[21:09:14][D][media_player:065]: Command: STOP
[21:09:14][D][esp_adf_pipeline:085]: Called 'stop' while in UNINITIALIZED state.
[21:09:14][D][light:036]: 'test-01 Front LED' Setting:
[21:09:14][D][light:051]: Brightness: 75%
[21:09:14][D][light:059]: Red: 0%, Green: 100%, Blue: 0%
[21:09:14][D][light:109]: Effect: 'Pulse'
[21:09:14][D][voice_assistant:502]: State changed from IDLE to START_MICROPHONE
[21:09:14][D][voice_assistant:508]: Desired state set to START_PIPELINE
[21:09:14][D][voice_assistant:220]: Starting Microphone
[21:09:14][D][ring_buffer:024]: Created ring buffer with size 32768
[21:09:14][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[21:09:14][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[21:09:14][D][voice_assistant:502]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:09:14][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[21:09:14][D][esp-idf:000]: I (33111) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8422123 Bytes, Inter:168276 Bytes, Dram:168276 Bytes
[21:09:14][D][esp-idf:000][i2s_in]: I (33115) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1
[21:09:14][D][esp-idf:000]: I (33117) AUDIO_PIPELINE: Pipeline started
[21:09:14][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 14
[21:09:14][I][esp_adf_pipeline:214]: [ i2s_in ] status: 14
[21:09:14][I][esp_adf_pipeline:214]: [ i2s_in ] status: 12
[21:09:14][D][esp_adf_pipeline:131]: Check element [i2s_in] status, 3
[21:09:14][D][esp_adf_pipeline:131]: Check element [pcm_reader] status, 3
[21:09:14][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[21:09:14][D][voice_assistant:502]: State changed from STARTING_MICROPHONE to START_PIPELINE
[21:09:14][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[21:09:14][D][voice_assistant:274]: Requesting start...
[21:09:14][D][voice_assistant:502]: State changed from START_PIPELINE to STARTING_PIPELINE
[21:09:14][D][voice_assistant:523]: Client started, streaming microphone
[21:09:14][D][voice_assistant:502]: State changed from STARTING_PIPELINE to STREAMING_MICROPHONE
[21:09:14][D][voice_assistant:508]: Desired state set to STREAMING_MICROPHONE
[21:09:14][D][voice_assistant:625]: Event Type: 1
[21:09:14][D][voice_assistant:628]: Assist Pipeline running
[21:09:14][D][voice_assistant:625]: Event Type: 3
[21:09:14][D][voice_assistant:639]: STT started
[21:09:14][D][light:036]: 'test-01 Front LED' Setting:
[21:09:14][D][light:051]: Brightness: 25%
[21:09:14][D][light:059]: Red: 0%, Green: 0%, Blue: 100%
[21:09:14][D][light:109]: Effect: 'Wakeword'
[21:09:14][D][voice_assistant:625]: Event Type: 11
[21:09:14][D][voice_assistant:779]: Starting STT by VAD
[21:09:16][D][voice_assistant:625]: Event Type: 12
[21:09:16][D][voice_assistant:783]: STT by VAD end
[21:09:16][D][voice_assistant:502]: State changed from STREAMING_MICROPHONE to STOP_MICROPHONE
[21:09:16][D][voice_assistant:508]: Desired state set to AWAITING_RESPONSE
[21:09:16][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[21:09:16][D][voice_assistant:502]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:09:16][D][esp-idf:000][i2s_in]: W (35138) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:16][D][esp-idf:000][i2s_in]: W (35141) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:16][D][esp-idf:000][i2s_in]: W (35146) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:16][D][esp-idf:000][i2s_in]: W (35150) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:16][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[21:09:16][D][voice_assistant:502]: State changed from STOPPING_MICROPHONE to AWAITING_RESPONSE
[21:09:16][D][voice_assistant:625]: Event Type: 4
[21:09:16][D][voice_assistant:653]: Speech recognised as: "Zet licht in het kantoor uit?"
[21:09:16][D][voice_assistant:625]: Event Type: 5
[21:09:16][D][voice_assistant:658]: Intent started
[21:09:19][D][voice_assistant:625]: Event Type: 6
[21:09:19][D][voice_assistant:625]: Event Type: 7
[21:09:19][D][voice_assistant:681]: Response: "Het licht in het kantoor is uitgeschakeld."
[21:09:19][D][light:036]: 'test-01 Front LED' Setting:
[21:09:19][D][light:051]: Brightness: 75%
[21:09:19][D][light:059]: Red: 0%, Green: 100%, Blue: 0%
[21:09:19][D][light:109]: Effect: 'Pulse'
[21:09:19][D][voice_assistant:625]: Event Type: 8
[21:09:19][D][voice_assistant:701]: Response URL: "http://192.168.207.101:8123/api/tts_proxy/80ccb868b538746ae77149736cbb3987d8494075_nl-nl_f57ed9f7cb_tts.home_assistant_cloud.mp3"
[21:09:19][D][voice_assistant:502]: State changed from AWAITING_RESPONSE to STREAMING_RESPONSE
[21:09:19][D][voice_assistant:508]: Desired state set to STREAMING_RESPONSE
[21:09:19][D][media_player:061]: 'media_player' - Setting
[21:09:19][D][media_player:068]: Media URL: http://192.168.207.101:8123/api/tts_proxy/80ccb868b538746ae77149736cbb3987d8494075_nl-nl_f57ed9f7cb_tts.home_assistant_cloud.mp3
[21:09:19][D][media_player:074]: Announcement: yes
[21:09:19][D][adf_media_player:030]: Got control call in state 1
[21:09:19][D][esp_adf_pipeline:050]: Starting request, current state UNINITIALIZED
[21:09:19][D][esp-idf:000]: I (38335) MP3_DECODER: MP3 init
[21:09:19][D][esp-idf:000]: I (38339) I2S: DMA Malloc info, datalen=blocksize=2048, dma_buf_count=4
[21:09:19][D][i2s_audio:072]: Installing driver : yes
[21:09:19][D][esp_adf_pipeline:358]: pipeline tag 0, http
[21:09:19][D][esp_adf_pipeline:358]: pipeline tag 1, decoder
[21:09:19][D][esp_adf_pipeline:358]: pipeline tag 2, resampler
[21:09:19][D][esp_adf_pipeline:358]: pipeline tag 3, i2s_out
[21:09:19][D][esp-idf:000]: I (38356) AUDIO_PIPELINE: link el->rb, el:0x3d820ed8, tag:http, rb:0x3d8215a8
[21:09:19][D][esp-idf:000]: I (38358) AUDIO_PIPELINE: link el->rb, el:0x3d8210dc, tag:decoder, rb:0x3d8225e8
[21:09:19][D][esp-idf:000]: I (38361) AUDIO_PIPELINE: link el->rb, el:0x3d821278, tag:resampler, rb:0x3d823628
[21:09:19][D][esp_adf_pipeline:370]: Setting up event listener.
[21:09:19][D][esp_adf_pipeline:302]: State changed from UNINITIALIZED to PREPARING
[21:09:19][I][adf_media_player:135]: got new pipeline state: 1
[21:09:19][D][adf_i2s_out:127]: Set final i2s settings: 16000
[21:09:19][D][esp_audio_processors:079]: New settings: SRC: rate: 16000, ch: 2 DST: rate: 16000, ch: 2
[21:09:19][W][component:237]: Component voice_assistant took a long time for an operation (55 ms).
[21:09:19][W][component:238]: Components should block for at most 30 ms.
[21:09:19][D][voice_assistant:625]: Event Type: 2
[21:09:19][D][voice_assistant:715]: Assist Pipeline ended
[21:09:19][D][esp-idf:000]: I (38392) AUDIO_THREAD: The http task allocate stack on external memory
[21:09:19][D][esp-idf:000]: I (38394) AUDIO_ELEMENT: [http-0x3d820ed8] Element task created
[21:09:19][D][esp-idf:000]: I (38398) AUDIO_THREAD: The decoder task allocate stack on external memory
[21:09:19][D][esp-idf:000]: I (38401) AUDIO_ELEMENT: [decoder-0x3d8210dc] Element task created
[21:09:19][D][esp-idf:000][http]: I (38404) AUDIO_ELEMENT: [http] AEL_MSG_CMD_RESUME,state:1
[21:09:19][D][esp-idf:000][decoder]: I (38407) AUDIO_ELEMENT: [decoder] AEL_MSG_CMD_RESUME,state:1
[21:09:19][D][esp_aud:000]d]:
ERROR Fatal error: protocol.data_received() call failed.
protocol: <aioesphomeapi._frame_helper.noise.APINoiseFrameHelper object at 0x7fa4dc080670>
transport: <_SelectorSocketTransport fd=6 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
File "/usr/lib/python3.11/asyncio/selector_events.py", line 1009, in _read_ready__data_received
self._protocol.data_received(data)
File "aioesphomeapi/_frame_helper/noise.py", line 136, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 163, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper.data_received
File "aioesphomeapi/_frame_helper/noise.py", line 319, in aioesphomeapi._frame_helper.noise.APINoiseFrameHelper._handle_frame
File "/usr/local/lib/python3.11/dist-packages/noise/state.py", line 74, in decrypt_with_ad
plaintext = self.cipher.decrypt(self.k, self.n, ad, ciphertext)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/noise/backends/default/ciphers.py", line 13, in decrypt
return self.cipher.decrypt(nonce=self.format_nonce(n), data=ciphertext, associated_data=ad)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "src/chacha20poly1305_reuseable/__init__.py", line 127, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/__init__.py", line 147, in chacha20poly1305_reuseable.ChaCha20Poly1305Reusable.decrypt
File "src/chacha20poly1305_reuseable/__init__.py", line 263, in chacha20poly1305_reuseable._decrypt_with_fixed_nonce_len
File "src/chacha20poly1305_reuseable/__init__.py", line 273, in chacha20poly1305_reuseable._decrypt_data
cryptography.exceptions.InvalidTag
WARNING test-01 @ 192.168.207.246: Connection error occurred: test-01 @ 192.168.207.246: Invalid encryption key: received_name=test-01
INFO Processing unexpected disconnect from ESPHome API for test-01 @ 192.168.207.246
WARNING Disconnected from API
INFO Successfully connected to test-01 @ 192.168.207.246 in 0.004s
INFO Successful handshake with test-01 @ 192.168.207.246 in 0.105s
[21:09:19][I][esp_adf_pipeline:214]: [ i2s_out ] status: 12
[21:09:19][D][esp_adf_pipeline:131]: Check element [http] status, 2
[21:09:19][I][esp_adf_pipeline:214]: [ resampler ] status: 12
[21:09:19][D][esp_adf_pipeline:131]: Check element [http] status, 2
[21:09:19][D][esp-idf:000][http]: I (38612) HTTP_CLIENT: Body received in fetch header state, 0x3fcc511b, 1841
[21:09:19][D][esp-idf:000][http]: I (38617) HTTP_STREAM: total_bytes=23199
[21:09:19][I][esp_adf_pipeline:214]: [ http ] status: 12
[21:09:19][D][esp_adf_pipeline:131]: Check element [http] status, 3
[21:09:19][D][esp_adf_pipeline:131]: Check element [decoder] status, 2
[21:09:19][I][esp_adf_pipeline:214]: [ decoder ] status: 12
[21:09:19][D][esp_adf_pipeline:131]: Check element [http] status, 3
[21:09:19][D][esp_adf_pipeline:131]: Check element [decoder] status, 3
[21:09:19][D][esp_adf_pipeline:131]: Check element [resampler] status, 3
[21:09:19][D][esp_adf_pipeline:131]: Check element [i2s_out] status, 3
[21:09:19][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[21:09:19][I][adf_media_player:135]: got new pipeline state: 3
[21:09:19][D][adf_i2s_out:127]: Set final i2s settings: 24000
[21:09:19][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 24000, ch: 1
[21:09:19][I][HTTPStreamReader:129]: [ * ] Receive music info from mp3 decoder, sample_rates=24000, bits=16, ch=1
[21:09:19][D][adf_i2s_out:127]: Set final i2s settings: 24000
[21:09:19][D][esp_audio_processors:079]: New settings: SRC: rate: 24000, ch: 1 DST: rate: 24000, ch: 1
[21:09:22][D][esp-idf:000][http]: W (41118) HTTP_STREAM: No more data,errno:0, total_bytes:23199, rlen = 0
[21:09:22][D][esp-idf:000][http]: I (41123) AUDIO_ELEMENT: IN-[http] AEL_IO_DONE,0
[21:09:22][I][esp_adf_pipeline:214]: [ http ] status: 15
[21:09:22][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[21:09:22][I][adf_media_player:135]: got new pipeline state: 4
[21:09:22][D][esp-idf:000][decoder]: I (41799) AUDIO_ELEMENT: IN-[decoder] AEL_IO_DONE,-2
[21:09:23][D][esp-idf:000][decoder]: I (42161) MP3_DECODER: Closed
[21:09:23][D][esp-idf:000][resampler]: I (42267) AUDIO_ELEMENT: IN-[resampler] AEL_IO_DONE,-2
[21:09:23][D][esp-idf:000][i2s_out]: I (42330) AUDIO_ELEMENT: IN-[i2s_out] AEL_IO_DONE,-2
[21:09:23][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[21:09:23][I][adf_media_player:135]: got new pipeline state: 5
[21:09:23][D][light:036]: 'test-01 Front LED' Setting:
[21:09:23][D][light:047]: State: ON
[21:09:23][D][light:051]: Brightness: 25%
[21:09:23][D][light:059]: Red: 0%, Green: 0%, Blue: 100%
[21:09:23][D][micro_wake_word:177]: State changed from IDLE to START_MICROPHONE
[21:09:23][D][micro_wake_word:115]: Starting Microphone
[21:09:23][D][esp_adf_pipeline:050]: Starting request, current state STOPPED
[21:09:23][D][esp_adf_pipeline:302]: State changed from STOPPED to PREPARING
[21:09:23][D][micro_wake_word:177]: State changed from START_MICROPHONE to STARTING_MICROPHONE
[21:09:23][D][esp_adf_pipeline:302]: State changed from PREPARING to STARTING
[21:09:23][D][esp-idf:000]: I (42537) AUDIO_PIPELINE: Func:audio_pipeline_run, Line:359, MEM Total:8376627 Bytes, Inter:152808 Bytes, Dram:152808 Bytes
[21:09:23][D][esp-idf:000][i2s_in]: I (42539) AUDIO_ELEMENT: [i2s_in] AEL_MSG_CMD_RESUME,state:1
[21:09:23][D][esp-idf:000]: I (42543) AUDIO_PIPELINE: Pipeline started
[21:09:23][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 14
[21:09:23][I][esp_adf_pipeline:214]: [ i2s_in ] status: 14
[21:09:23][I][esp_adf_pipeline:214]: [ i2s_in ] status: 12
[21:09:23][D][esp_adf_pipeline:131]: Check element [i2s_in] status, 3
[21:09:23][D][esp_adf_pipeline:131]: Check element [pcm_reader] status, 3
[21:09:23][D][esp_adf_pipeline:302]: State changed from STARTING to RUNNING
[21:09:23][D][micro_wake_word:177]: State changed from STARTING_MICROPHONE to DETECTING_WAKE_WORD
[21:09:23][I][esp_adf_pipeline:214]: [ pcm_reader ] status: 12
[21:09:25][D][micro_wake_word:362]: Wake word sliding average probability is 0.585 and most recent probability is 1.000
[21:09:25][D][micro_wake_word:128]: Wake Word Detected
[21:09:25][D][micro_wake_word:177]: State changed from DETECTING_WAKE_WORD to STOP_MICROPHONE
[21:09:25][D][micro_wake_word:134]: Stopping Microphone
[21:09:25][D][esp_adf_pipeline:302]: State changed from RUNNING to STOPPING
[21:09:25][D][micro_wake_word:177]: State changed from STOP_MICROPHONE to STOPPING_MICROPHONE
[21:09:25][D][esp-idf:000][i2s_in]: W (44496) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:25][D][esp-idf:000][i2s_in]: W (44499) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:25][D][esp-idf:000][i2s_in]: W (44503) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:25][D][esp-idf:000][i2s_in]: W (44506) AUDIO_ELEMENT: OUT-[i2s_in] AEL_IO_ABORT
[21:09:25][D][esp_adf_pipeline:302]: State changed from STOPPING to STOPPED
[21:09:25][D][micro_wake_word:177]: State changed from STOPPING_MICROPHONE to IDLE
[21:09:25][D][media_player:061]: 'media_player' - Setting
[21:09:25][D][media_player:065]: Command: STOP
[21:09:25][D][esp_adf_pipeline:085]: Called 'stop' while in STOPPED state.
[21:09:25][D][light:036]: 'test-01 Front LED' Setting:
[21:09:25][D][light:051]: Brightness: 75%
[21:09:25][D][light:059]: Red: 0%, Green: 100%, Blue: 0%
[21:09:25][D][light:109]: Effect: 'Pulse'
Does anyone know what's going on here?
Try this:
external_components:
  - source:
      type: git
      url: https://github.com/gnumpi/esphome_audio
      ref: dev-next
    components: [ adf_pipeline, i2s_audio ]
    refresh: 0s
That seems to have solved the issue. Thank you!
Regarding high-pitched sounds during playback: clicks are typically caused by signal integrity issues, since these are relatively high-speed signals. That is especially likely if the distortion stops when you touch the high-speed lines with your hand or move things around. You should be able to solve such issues by using better-quality Dupont wires with higher-load spring contacts, soldering the Dupont wires without their plastic housings, twisting the high-speed lines together with GND, or even wrapping grounded tinfoil around the signal wires. Another thing I noticed: the INMP441 can pick up a lot of noise if the PCB touches a surface that is vibrating. A good solution is probably to use some polyester fiber, like the filling in pillows, to make sure the MEMS microphone does not touch any hard surfaces in your casing.
Longer pauses, say over 300 ms, are most likely caused by buffering issues, which is another story altogether.
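To put rough numbers on "high speed" and "300 ms", here is a small back-of-the-envelope sketch in Python. The stream parameters (24000 Hz, 16 bit, mono) are taken from the playback log earlier in the thread. The two-slot I2S frame and 16-bit slot width are assumptions: the actual driver may use wider slots, which would make the bit clock higher. The function names are made up for illustration.

```python
# Rough I2S numbers. Stream parameters come from the playback log in this thread;
# the frame layout (2 slots of 16 bits) is an assumption, not read from the driver.

SAMPLE_RATE_HZ = 24_000   # "sample_rates=24000" in the decoder log
BITS_PER_SAMPLE = 16      # "bits=16"
CHANNELS = 1              # "ch=1"
SLOTS_PER_FRAME = 2       # assumption: standard I2S frame with a left and a right slot


def bclk_hz(sample_rate_hz, bits_per_slot, slots_per_frame=SLOTS_PER_FRAME):
    """Bit clock = frames per second * bits clocked per frame."""
    return sample_rate_hz * bits_per_slot * slots_per_frame


def bytes_for_duration(duration_ms, sample_rate_hz, bits_per_sample, channels):
    """PCM bytes needed to hold duration_ms milliseconds of audio."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * (bits_per_sample // 8) * channels


bclk = bclk_hz(SAMPLE_RATE_HZ, BITS_PER_SAMPLE)
gap_bytes = bytes_for_duration(300, SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)

print(f"BCLK is roughly {bclk / 1000:.0f} kHz")
print(f"300 ms of audio is {gap_bytes} bytes of PCM")
```

Under these assumptions the bit clock is in the high hundreds of kHz, which is fast enough for loose jumper wires to matter, and a 300 ms dropout corresponds to several kilobytes of missing PCM data, which points at buffering rather than wiring.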
Hey Battman2013, I have been hacking away at a config for the ESP32 S3 N16R8. I purchased some from Amz and the following config is working pretty well. I was troubleshooting the MAX98357A only working on the 3.3v pin when I ran across this thread:
https://forum.arduino.cc/t/chinese-esp32-s3-5v-pin-warning/1192758
Once I bridged the IN-OUT pads, the 5V pin started outputting 5V. Also, I'm only using the onboard LED for the light. I'm saving my 16-bit rings for boards that don't have one, like the XIAO ESP32 S3. Still troubleshooting the mic on that one.
Next up is local wake word.
edited to correct INMP441 channel reference in substitutions