Freepbx Voicemail Transcription Script: Google Speech API
#!/bin/bash
# sendmail-gcloud
#
# Installation instructions
# Copy the content of this file to /usr/sbin/sendmail-gcloud
#
# Google Account
# ---------------
# Create a Google Cloud account if you don't have one yet. Free trial is available at https://console.cloud.google.com/freetrial
# Within console.cloud.google.com search for Cloud Speech-to-Text API and enable it
#
# From the Linux command line on the FreePBX machine
# -------------------------------------------
# Follow steps 1 and 2 of the instructions on Google Cloud https://cloud.google.com/sdk/docs/downloads-yum
# Run the following commands on FreePBX;
# cd /usr/sbin/
# chown asterisk:asterisk sendmail-gcloud
# chmod 744 sendmail-gcloud
# chmod 777 /usr/bin/dos2unix
#
# Verify that you have the following (by simply running the command) and if not use yum install;
# jq
# sox
# flac
# lame (used below for the MP3 conversion)
# dos2unix -V
# Ensure dos2unix is executable by the asterisk user (chmod 777 /usr/bin/dos2unix)
#
# Connect FreePBX to Google Cloud
# su asterisk
# (if this fails with "This account is currently not available", use: su -s /bin/bash asterisk)
# gcloud auth login
# The CLI prints a URL. Open it in your browser; Google returns a verification code. Paste that code at the CLI prompt.
#
# Open FreePBX web interface
# Go to Settings > Voicemail Admin > Settings > Email Config
# Change Mail Command to: /usr/sbin/sendmail-gcloud
# Submit and apply changes
#
# Original source created by N. Bernaerts: https://github.com/NicolasBernaerts/debian-scripts/tree/master/asterisk
# modified per: https://jrklein.com/2015/08/17/asterisk-voicemail-transcription-via-ibm-bluemix-speech-to-text-api/
# modified per: https://gist.github.com/lgaetz/2cd9c54fb1714e0d509f5f8215b3f5e6
# current version: https://gist.github.com/tony722/7c6d86be2e74fa10a1f344a4c2b093ea
#
# Notes: This is a script modified from the original to work with FreePBX so that email notifications sent from
# Asterisk voicemail contain a speech to text transcription provided by Google Cloud Speech API
#
# License: There are no explicit license terms on the original script or on the blog post with modifications
# I'm assuming GNU/GPL2+ unless notified otherwise by copyright holder(s)
#
# Version History:
# 2020-08-27 Add fix by chrisduncansn
# Minor edit in instruction wording
# 2020-05-27 Add instructions from sr10952
# Add export fix by levishores
# 2019-02-27 Initial commit by tony722
# set PATH
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
# save the current directory
pushd .
# create a temporary directory and cd to it
TMPDIR=$(mktemp -d)
cd "$TMPDIR"
# dump the stream to a temporary file
cat >> stream.org
# get the boundary
BOUNDARY=$(grep "boundary=" stream.org | cut -d'"' -f 2)
# if mail has no boundaries, assume no attachment
if [ "$BOUNDARY" = "" ]
then
# send the original stream
mv stream.org stream.new
else
# cut the original stream into parts
# stream.part - header before the boundary
# stream.part1 - header after the boundary
# stream.part2 - body of the message
# stream.part3 - attachment in base64 (WAV file)
# stream.part4 - footer of the message
awk '/'$BOUNDARY'/{i++}{print > "stream.part"i}' stream.org
# cut the attachment into parts
# stream.part3.wav.head - header of attachment (first 6 lines)
# stream.part3.wav.base64 - wav file of attachment (encoded base64)
sed '7,$d' stream.part3 > stream.part3.wav.head
sed '1,6d' stream.part3 > stream.part3.wav.base64
# convert the base64 file to a wav file
dos2unix -o stream.part3.wav.base64
base64 -di stream.part3.wav.base64 > stream.part3.wav
# convert the wav file to FLAC
sox -G stream.part3.wav --channels=1 --bits=16 --rate=8000 stream.part3.flac
# convert to MP3
sox stream.part3.wav stream.part3-pcm.wav
lame -m m -b 24 stream.part3-pcm.wav stream.part3.mp3
base64 stream.part3.mp3 > stream.part3.mp3.base64
# create mp3 mail part
sed 's/x-[wW][aA][vV]/mpeg/g' stream.part3.wav.head | sed 's/.[wW][aA][vV]/.mp3/g' > stream.part3.new
dos2unix -o stream.part3.new
unix2dos -o stream.part3.mp3.base64
cat stream.part3.mp3.base64 >> stream.part3.new
# save voicemail in tmp folder in case of trouble
# TMPMP3=$(mktemp -u /tmp/msg_XXXXXXXX.mp3)
# cp "stream.part3.mp3" "$TMPMP3"
export CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud
RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`
FILTERED=`echo "$RESULT" | jq -r '.results[].alternatives[].transcript'`
# generate first part of mail body, converting it to LF only
mv stream.part stream.new
cat stream.part1 >> stream.new
sed '$d' < stream.part2 >> stream.new
# beginning of transcription section
echo "" >> stream.new
echo "--- Google transcription result ---" >> stream.new
# append result of transcription
if [ -z "$FILTERED" ]
then
echo "(Google was unable to recognize any speech in audio data.)" >> stream.new
else
echo "$FILTERED" >> stream.new
fi
# end of message body
tail -1 stream.part2 >> stream.new
# add converted attachment
cat stream.part3.new >> stream.new
# append end of mail body, converting it to LF only
echo "" >> stream.tmp
echo "" >> stream.tmp
cat stream.part4 >> stream.tmp
dos2unix -o stream.tmp
cat stream.tmp >> stream.new
fi
# send the mail thru sendmail
cat stream.new | sendmail -t
# go back to original directory
popd
# remove all temporary files and temporary directory
rm -Rf "$TMPDIR"

@mantisae mantisae commented Nov 9, 2019

I have been trying to make this work for the last 4 hours and getting nowhere. I know Google Cloud is set up correctly: if I change TMPDIR to a fixed location so the script writes its files there, I can run "gcloud ml speech recognize stream.part3.flac --language-code='en-US'" in that directory as the asterisk user and get the JSON back. However, the command does not produce a result when it runs as part of the script.

Owner Author

@tony722 tony722 commented Nov 10, 2019

This is code that I'm using on a production FreePBX server (just diffed it to be sure). Keep trying, checking logs, messages, permissions, and whatever else. Hopefully you can get it. 😄 Good luck!

@mantisae mantisae commented Nov 10, 2019

I forgot to post back. This morning I decided to try something stupid and ran "reboot -h now". I left a voicemail after the system restarted and it was transcribed properly.

@pete1019 pete1019 commented Nov 15, 2019

Thanks a lot for your great work.
Could you please make this work with VitalPBX?
This works well with IBM Watson (example):
http://incrediblepbx.com/sendmailibm-vitalpbx.tar.gz
Doc here: https://nerdvittles.com/?p=25697

But I would love this to work with Google Speech to Text.

I tried to copy/paste your script, but I only get:
--- Google transcription result ---
(Google was unable to recognize any speech in audio data.)

@tony722 tony722 commented Nov 15, 2019

That's a great idea @pete1019. However I'm not running VitalPBX so I don't have a way to do this. I maintain a single FreePBX system, and just put this out there in case it helps anyone else. :-) I will say that I get this message from Google occasionally. However if you get it every time there's likely a problem.

Feel free to fork this and make any changes needed. Good luck!

@pete1019 pete1019 commented Nov 15, 2019

I think it is just an owner / rights problem.
Everything works, but nothing happens when the script reaches this line:
RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

If I run this command as root, everything is fine and I get results.
How should I set rights and ownership in CentOS, since you said:
"gcloud utility must be authenticated. Before doing this, 'su asterisk' so authentication happens in the correct user account"
I did this as root because I did not know any better.
If I do su asterisk it says: This account is currently not available. Maybe because the asterisk user has nologin? Solution?
THANK YOU!

@tony722 tony722 commented Nov 15, 2019

I've never looked at VitalPBX at all, so I don't know what user accounts it runs under. I'd suggest taking this script to the guys on the VitalPBX forums to see if they can help. I'm really unable to help at all! 😄 Good luck!

@pete1019 pete1019 commented Nov 18, 2019

"gcloud utility must be authenticated. Before doing this, 'su asterisk' so authentication happens in the correct user account"
This does the trick:
su -s /bin/bash asterisk

@voltagecontrolled voltagecontrolled commented Dec 17, 2019

I've been using this script for some time now (with FreePBX 14), and was previously using the IBM Bluemix version with no issues. However lately I've noticed that every several weeks I'll start getting the "Google was unable to recognize any speech in audio data." result. But if I pipe a raw message through the script as the asterisk user it works perfectly fine. It seems the issue only occurs when the script is called by FreePBX. Only a reboot seems to solve it. Dug through the mail/asterisk/gcloud logs and I'm not seeing anything useful. Any ideas?

@tony722 tony722 commented Dec 17, 2019

Might be good to ask on the FreePBX forums. I've seen that kind of issue randomly, but it's always the odd message here or there and never required a reboot. I'm not even pretending to be an expert on all this. But this script worked for me so I posted it here. 😄 Please post back a fix here if you find one! :-)

@levishores levishores commented May 22, 2020

FYI, we found that the periodic failure of this script was caused by the script attempting to read the gcloud config from /root/.config/gcloud. It seems gcloud uses the parent process's HOME environment variable. I added
export CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud
above
RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`
which corrected the issue for us without a reboot.

Time will tell if this keeps up.
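
To illustrate why the export helps, here is a minimal sketch. It assumes a simplified model of the lookup (use CLOUDSDK_CONFIG when set, otherwise fall back to $HOME/.config/gcloud); `effective_gcloud_config` is a hypothetical helper written for this example and is not part of gcloud.

```shell
#!/bin/bash
# Hypothetical helper: a simplified model (an assumption, not gcloud's code)
# of which config directory ends up being used.
effective_gcloud_config() {
    printf '%s\n' "${CLOUDSDK_CONFIG:-$HOME/.config/gcloud}"
}

# Simulate the script being launched by a parent process whose HOME is /root
# and with no override set: the root directory is picked.
( unset CLOUDSDK_CONFIG; HOME=/root; effective_gcloud_config )

# Same inherited HOME, but with the override from the comment above:
# the asterisk directory is picked regardless of HOME.
( CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud; HOME=/root; effective_gcloud_config )
```

If the asterisk user's home directory is somewhere other than /home/asterisk on a given system, the exported path would need to be adjusted to match.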

@voltagecontrolled voltagecontrolled commented May 22, 2020

@levishores Makes total sense, good find! I've implemented your fix and it worked.

@levishores levishores commented May 22, 2020

Also, for posterity, you can debug the gcloud command by adding this line:

# Insert this line for debugging - it writes console errors to the file 'error' in the tmp dir
gcloud ml speech recognize stream.part3.flac --language-code='en-US' 1>result 2>error
# Put it near this line in the script (don't edit this line):
RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

# Comment out this line at the bottom to keep the TMP directory for analysis after the script runs
rm -Rf $TMPDIR

Hope this makes sense...
The concept is:

  1. Comment out the line at the bottom that deletes the temp directory. This lets you go into /tmp/tmp.[varies] to analyze the results.
  2. Add a second gcloud command with '1>result 2>error' at the end so that console errors go to the 'error' file. This is the critical part of tracking down why the script isn't executing properly when called by Asterisk: errors don't feed into the variable.
  3. Once changed, leave a VM, then navigate to /tmp/tmp.[uid] and investigate the error file.
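
As a variation on the steps above, the error text can also be captured in the same call that fills the variable, which avoids sending the audio twice. This is only a sketch: `fake_recognize` is a hypothetical stand-in for the gcloud command so the example runs without Google Cloud, and its output is made up.

```shell
#!/bin/bash
# Hypothetical stand-in for the gcloud call; the messages are invented.
fake_recognize() {
    echo '{"results": []}'                   # normal output on stdout
    echo 'ERROR: no valid credentials' >&2   # diagnostics on stderr
}

workdir=$(mktemp -d)

# stdout is stored in the variable; stderr is written to a file that can be
# inspected later, as long as the directory is not deleted.
RESULT=$(fake_recognize 2>"$workdir/error")

echo "stdout captured: $RESULT"
echo "stderr captured: $(cat "$workdir/error")"
```

In the real script the redirection would be added to the existing RESULT line, and the rm at the bottom would still need to be commented out for the error file to survive.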

@sr10952 sr10952 commented May 26, 2020

Thanks a lot for your code. I am in the process of setting it up, and along the way I have tried to create a helpful user guide for less technical people to follow. After much troubleshooting, my FreePBX is talking to Google. However, the email received says "Google was unable to recognize any speech in audio data."

Looking through the "result" file in the log, I do see results being sent back from Google. Below are two different "results" I received from two separate voicemails. What is my next step here?

{ "results": [ { "alternatives": [ { "confidence": 0.818743, "transcript": "hi, this is just a test message to see if" } ] }, { "alternatives": [ { "confidence": 0.6924237, "transcript": " Google being able to transcribe this message." } ] } ] }

Thinking the issue is with two levels of confidence within the result, I left another message, which was also not passed down in the email body.

{ "results": [ { "alternatives": [ { "confidence": 0.73869956, "transcript": "Hi, this message is for test how Google will be able to transcribe this voicemail and send it back to me." } ] } ] }
Edit: OK, figured it out.

@tony722 tony722 commented May 27, 2020

It looks like a change Google made may have broken this script. Do you have a fix that I could incorporate here?

@sr10952 sr10952 commented May 27, 2020

Tony, I would love to be able to help you with this.
The issue I had is not something that would "break" the script. Are you not getting anything back from Google? The issue in my previous comment was caused by a troubleshooting change I made; perhaps that is the case for you as well. Basically, on line 89 of your current code above, I had changed the line based on @levishores's comment so that it read
RESULT= gcloud ml speech recognize stream.part3.flac --language-code='en-US' 1>result 2>error
This broke everything. I changed it back to the original,
gcloud ml speech recognize stream.part3.flac --language-code='en-US'
and now it seems to work perfectly.

@sr10952 sr10952 commented May 27, 2020

Steps needed to get FreePBX to work with Google Cloud Speech to Text for voicemail transcripts

Google Account

  1. Create a Google Cloud account if you don't have one yet. A free trial is available at https://console.cloud.google.com/freetrial
  2. Within console.cloud.google.com search for Cloud Speech-to-Text API and enable it

Within FreePBX

  1. Follow steps 1 and 2 of the instructions on Google Cloud https://cloud.google.com/sdk/docs/downloads-yum
  2. Copy the content of the code on this page to /usr/sbin/sendmail-gcloud
  3. Run the following commands on FreePBX;
    cd /usr/sbin/
    chown asterisk:asterisk sendmail-gcloud
    chmod 744 sendmail-gcloud
    chmod 777 /usr/bin/dos2unix
  4. Verify that you have the following (by simply running the command) and if not use yum install;
    jq
    sox
    flac

Connect FreePBX to Google Cloud

  1. Within FreePBX run the following;
    su asterisk
    gcloud auth login
    CLI will provide you a url. Copy that and paste it into your browser. Google will give you a verification code to copy. Paste it into the cli waiting for a verification code.

Have FreePBX inject this code
Within FreePBX go to Settings > Voicemail Admin > Settings > Email Config, and add “/usr/sbin/sendmail-gcloud” (without the quotes) into the Mail Command. Submit and apply changes

Hopefully this gets it to work.

@levishores levishores commented May 27, 2020

I have updated my comment. Github screwed with the ticks in the code, so yeah, if you just blindly copied and pasted
RESULT= gcloud ml speech recognize stream.part3.flac --language-code='en-US' 1>result 2>error
you were gonna break the script. (I'll also add, this isn't even the line I suggested you add - you're setting the RESULT variable equal to the line that logs the output to two different files. Not even sure what bash is going to do there, but I'll bet it's not gonna work.)

Did you read that full comment? The concept is using it for debugging because the gcloud command doesn't execute properly when it's executed by Asterisk as part of the script. You can't see the error, but if you add an additional gcloud command and log the error text to a file, then keep the script from deleting the TMP file, you can look at the error. Root cause was that it was trying to access the gcloud config inside the root home folder, not asterisk's home.

The fix was simply to add
export CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud
anywhere above the gcloud command.
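
On the question of what bash does with the different forms of that line, here is a small sketch. It uses `echo` and `sh -c` as stand-ins for gcloud (an assumption made so it runs anywhere); the variable names `form1` and `form2` are only for illustration.

```shell
#!/bin/bash
# 1) "VAR= command": VAR is set to the empty string only in the environment
#    of that one command; the command's output is not stored anywhere.
unset RESULT
RESULT= echo "transcript" >/dev/null
form1="${RESULT:-<unset>}"
echo "after 'RESULT= cmd'    : $form1"

# 2) "VAR=word command": the word is assigned for the child process only,
#    and the NEXT word is what actually gets executed.
form2=$(RESULT=gcloud sh -c 'echo "$RESULT"')
echo "child saw RESULT       : $form2"

# 3) Command substitution: the command's stdout is stored in the variable.
RESULT=$(echo "transcript")
echo "after 'RESULT=\$(cmd)' : $RESULT"
```

Only the third form (or the equivalent backtick form used in the script) leaves the transcription JSON in RESULT for the jq step that follows.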

@sr10952 sr10952 commented May 27, 2020

I have actually been able to debug it just by stopping the deletion of the temp files, so I could see roughly at which step it stops, although that made it much more difficult to troubleshoot. The guide I wrote above is what actually worked for me in production (latest version of FreePBX and gcloud).

@tony722 tony722 commented May 27, 2020

@sr10952. I've incorporated your instructions into the original gist as comments so they can be seen too. Thanks!

@levishores, I've added the export line to the gist too. Thanks for your contrib. :-)

@Acpek23 Acpek23 commented Aug 5, 2020

Hi there, I hope you guys can help. I tried to implement this script but I'm having issues, and I don't really know what's wrong.
I tried to debug, and this is what I get in my temp directory when I read stream.new:
Subject: FreePBX Voicemail Notification
Message-ID: Asterisk-10-1085661504-5555-2888@domain
X-Asterisk-CallerID: 1243
X-Asterisk-CallerIDName: Alejandro Cardenas
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

I was also getting the template script, which I removed in order to do the testing.
I can see that my project is connected (apparently):

You are now logged in as [myemailaccount].
Your current project is [None]. You can change this setting by running:
$ gcloud config set project PROJECT_ID
[asterisk@pbx sbin]$ gcloud config set project my project id
Updated property [core/project].
[asterisk@pbx sbin]$

What else can I do to test this? Any suggestions?
Thanks in advance

FYI: since I do not have a commercial license (or pay for any license), I use Postfix to send SMTP emails through Google; that part works great.

@chrisduncansn chrisduncansn commented Aug 27, 2020

I've been using this script since late 2019 (thank you @tony722) and in recent months it's started to return Google was unable to recognize any speech in audio data. So I added the latest changes provided by @sr10952 and @levishores. Still not working. So, I debugged line by line and determined I was getting a transcription back from Google. The failure was here:

if [ -z "$FILTERED" ]

The extra space before the closing bracket was triggering the if statement. The odd thing is this extra space has always been in @tony722's script. Removing it fixed it though.

@Acpek23 give that a try. If it doesn't work, see if Google is returning a transcription: login as the asterisk user and run

RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`
echo $RESULT

@chrisduncansn chrisduncansn commented Aug 28, 2020

Thanks for updating the script @tony722!

@tony722 tony722 commented Aug 28, 2020

Thanks @chrisduncansn for finding that! 👍

@Acpek23 Acpek23 commented Sep 23, 2020

I've been using this script since late 2019 (thank you @tony722) and in recent months it's started to return Google was unable to recognize any speech in audio data. So I added the latest changes provided by @sr10952 and @levishores. Still not working. So, I debugged line by line and determined I was getting a transcription back from Google. The failure was here:

if [ -z "$FILTERED" ]

The extra space before the closing bracket was triggering the if statement. The odd thing is this extra space has always been in @tony722's script. Removing it fixed it though.

@Acpek23 give that a try. If it doesn't work, see if Google is returning a transcription: login as the asterisk user and run

RESULT=gcloud ml speech recognize stream.part3.flac --language-code='en-US'
echo $RESULT

Unfortunately I'm still unable to use this; I'm getting nothing.
I changed if [ -z "$FILTERED" ] -- same issue.

I also tried to save the .mp3 to a temp file to see if the conversion succeeded; nothing.

# save voicemail in tmp folder in case of trouble

    TMPMP3=$(mktemp -u /tmp/msg_XXXXXXXX.mp3)
    cp "stream.part3.mp3" "$TMPMP3"

I also tried this:
[asterisk@pbx sbin]$ RESULT=gcloud ml speech recognize stream.part3.flac --language-code='en-US'
bash: ml: command not found

I ran: gcloud ml speech recognize stream.part3.flac --language-code='en-US' 1>result 2>error
The result file is created, but it is empty.

Any other suggestion?
Thanks again!!

@chrisduncansn chrisduncansn commented Sep 23, 2020

[asterisk@pbx sbin]$ RESULT=gcloud ml speech recognize stream.part3.flac --language-code='en-US'
bash: ml: command not found

You're missing a space between =gcloud

@Acpek23 Acpek23 commented Sep 23, 2020

Now I'm getting this error: ERROR: (gcloud.ml.speech.recognize) Invalid audio source [stream.part3.flac]. The source must either be a local path or a Google Cloud Storage URL (such as gs://bucket/object).

Any suggestion?

@Acpek23 Acpek23 commented Sep 23, 2020

RESULT=gcloud ml speech recognize stream.part3.flac --language-code='en-US'

You're right, and now I'm getting this:
[asterisk@pbx sbin]$ RESULT= gcloud ml speech recognize stream.part3.flac --language-code='en-US'
ERROR: (gcloud.ml.speech.recognize) Invalid audio source [stream.part3.flac]. The source must either be a local path or a Google Cloud Storage URL (such as gs://bucket/object).
[asterisk@pbx sbin]$

@chrisduncansn chrisduncansn commented Sep 23, 2020

# Comment out this line at the bottom to keep the TMP directory for analysis after the script runs
rm -Rf $TMPDIR

Run the script again, then cd into the temp directory and re-run

RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`
echo $RESULT

@Acpek23 Acpek23 commented Sep 23, 2020

same result
[asterisk@pbx tmp]$ cd tmp.t6QQWwfhbN/
[asterisk@pbx tmp.t6QQWwfhbN]$ ls
stream.new
[asterisk@pbx tmp.t6QQWwfhbN]$ RESULT= gcloud ml speech recognize stream.part3.flac --language-code='en-US'
ERROR: (gcloud.ml.speech.recognize) Invalid audio source [stream.part3.flac]. The source must either be a local path or a Google Cloud Storage URL (such as gs://bucket/object).
[asterisk@pbx tmp.t6QQWwfhbN]$ echo $RESULT

[asterisk@pbx tmp.t6QQWwfhbN]$

In stream.new I can see the "normal message", which is the one currently being sent.
Or do I need to disable something in the PBX configuration?
I have this config: Mail Command : /usr/sbin/sendmail-gcloud

Alejandro Cardenas,

There is a new voicemail in mailbox ext:

From:	"Gustavo Martinez" <ext>
Message duration:	0:19 seconds
Date:	Wednesday, September 23, 2020 at 05:24:35 PM

Dial *98 to access your voicemail by phone.
Go to url to check your voicemail with a web browser.
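
A temp directory that contains only stream.new is what the script leaves behind when it finds no MIME boundary and therefore assumes the message has no attachment. The sketch below reuses the script's own grep/cut pipeline to show that check; `get_boundary` is a hypothetical wrapper name, and both sample headers are made up for illustration.

```shell
#!/bin/bash
# Same extraction the script uses: take the text between the first pair of
# double quotes on the line that contains boundary=
get_boundary() {
    grep "boundary=" | cut -d'"' -f 2
}

# Made-up sample headers (not taken from a real message)
plain='Content-Type: text/plain; charset=UTF-8'
multipart='Content-Type: multipart/mixed; boundary="----voicemail_123"'

for msg in "$plain" "$multipart"; do
    b=$(printf '%s\n' "$msg" | get_boundary)
    if [ -z "$b" ]; then
        echo "no boundary: message is passed through untouched"
    else
        echo "boundary found: $b"
    fi
done
```

A text/plain header like the one in stream.new above takes the first branch, so no stream.part3.flac is ever created. That would be consistent with the voicemail email being sent without the audio attached, though this has not been verified on that system.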

@skippy1976 skippy1976 commented Oct 14, 2020

You may wish to consider using the phone_call model. This will improve the transcription.

@pete1019 pete1019 commented Oct 14, 2020

You may wish to consider using the phone_call model. This will improve the transcription.

Could you please be more specific? Example what to change? Thanks

@kevinrossen kevinrossen commented Oct 28, 2020

You may wish to consider using the phone_call model. This will improve the transcription.

Could you please be more specific? Example what to change? Thanks

Looks like skippy1976 is referring to the speech model options available in Speech-to-Text. But I don't see an option to set a model using gcloud from the terminal.
Here's model documentation
Here's the gcloud documentation

@kevinrossen kevinrossen commented Oct 28, 2020

Looks like there's a 60 second limit for the transcriptions using "gcloud ml speech recognize". But there would be no limit to the length using "gcloud ml speech recognize-long-running". I know the length of the message is stored somewhere as that ends up in the body of the email. Anyone have any ideas on how to modify this to use an "if then" option for longer voicemails?
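
One possible shape for such an "if then" is sketched below. It assumes the duration is read from the audio file itself (for example with `soxi -D`, which ships with SoX) rather than from the email body; `pick_recognize_command` is a hypothetical helper name, and fixed sample durations are used so the example runs without an audio file.

```shell
#!/bin/bash
# Hypothetical helper: pick a gcloud subcommand from a duration in seconds.
# The 60-second threshold comes from the limit described above.
pick_recognize_command() {
    # awk is used so fractional durations such as 59.48 compare correctly
    if awk -v d="$1" 'BEGIN { exit !(d < 60) }'; then
        echo "recognize"
    else
        echo "recognize-long-running"
    fi
}

# In the script the duration could come from the converted file, e.g.:
#   DURATION=$(soxi -D stream.part3.flac)
# Fixed sample values are used here instead.
for DURATION in 19.2 59.9 75; do
    echo "$DURATION s -> gcloud ml speech $(pick_recognize_command "$DURATION")"
done
```

This only chooses the subcommand. Longer audio may also need to be uploaded to Cloud Storage before long-running recognition accepts it; that step is not shown and has not been tested here.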

@kevinrossen kevinrossen commented Oct 28, 2020

I found a couple options that I like while digging into the documentation. The options I like are on the alpha channel, so there's a good chance they won't work long-term, but I'm okay with that on my setup. Here's what I changed:

ORIGINAL: RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

NEW: RESULT=`gcloud alpha ml speech recognize stream.part3.flac --language-code='en-US' --interaction-type='voicemail' --include-word-time-offsets --filter-profanity --enable-automatic-punctuation`

@CadillacRick CadillacRick commented Oct 31, 2020

Hey guys, new to this topic. I'm trying to get this to work on my Asterisk box and followed all the steps as indicated. I didn't get any errors along the way, but I don't seem to get any results. The voicemail still answers and records the file, and I still get the audio file by email, but at the bottom I see

--Google transcription result --
(Google was unable to recognize any speech in audio data.)

Also noticed that I can't play the MP3 file attached with the email.... says it's unsupported or corrupt.

Did some more testing... when I leave a voicemail... and I go into the /tmp/tmp.xxxxxxx folder... I can run the command manually

RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

and with echo $RESULT I get the transcription like so...

{  "results": [  {  "alternatives": [  {  "confidence": 0.7456564, "transcript": "This is yet another test of the voicemail system Richard testing Richard testing."  } ] } ] }

But still unable to get it in the email from Asterisk....

In the error file in the /tmp/tmp.xxxxxx I see this --

ERROR: (gcloud.ml.speech.recognize) Your current active account [xxxxxxxxxxx@gmail.com] does not have any valid credentials
Please run:

$ gcloud auth login

to obtain new credentials.

For service account, please activate it first:

$ gcloud auth activate-service-account ACCOUNT

Which is weird because the command runs manually....

Thanks!

Richard

@CadillacRick CadillacRick commented Oct 31, 2020

Ok, well it turns out that this line --- export CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud

was a problem for my setup... Now I get the transcription....

But!!!!! the audio file is still a problem. the mp3 file doesn't work... can't listen to it..

@tony722 Any ideas ??? Anyone ???

@kevinrossen have the MP3 attachments been working for you ?

@dcat127 dcat127 commented Jan 26, 2021

This script fails for any voicemail longer than 1 minute, with the following error:
ERROR: (gcloud.ml.speech.recognize) INVALID_ARGUMENT: Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.

I have fixed it by replacing
sox -G stream.part3.wav --channels=1 --bits=16 --rate=8000 stream.part3.flac

with
sox -G stream.part3.wav --channels=1 --bits=16 --rate=8000 stream.part3.flac trim 0 59

this does not "fix" the issue of too long voicemails, but it changes it so it only transcribes the first 59 seconds, which in my case is good enough.

@chrisduncansn chrisduncansn commented Feb 9, 2021

I found a couple options that I like while digging into the documentation. The options I like are on the alpha channel, so there's a good chance they won't work long-term, but I'm okay with that on my setup. Here's what I changed:

ORIGINAL: RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

NEW: RESULT=`gcloud alpha ml speech recognize stream.part3.flac --language-code='en-US' --interaction-type='voicemail' --include-word-time-offsets --filter-profanity --enable-automatic-punctuation`

@kevinrossen how are those alpha options working for you? I checked and it looks like they are still in Alpha status, which is a bummer.
