Freepbx Voicemail Transcription Script: Google Speech API
# sendmail-gcloud
# Installation instructions
# Copy the content of this file to /usr/sbin/sendmail-gcloud
# Google Account
# ---------------
# Create a Google Cloud account if you don't have one yet. Free trial is available at
# Within search for Cloud Speech-to-Text API and enable it
# From the Linux command line on the FreePBX machine
# -------------------------------------------
# Follow steps 1 and 2 of the instructions on Google Cloud
# Run the following commands on FreePBX:
# cd /usr/sbin/
# chown asterisk:asterisk sendmail-gcloud
# chmod 744 sendmail-gcloud
# chmod 755 /usr/bin/dos2unix
# Verify that the following are installed (by running each command); if any is missing, install it with yum:
# jq
# sox
# flac
# lame
# dos2unix -V
# Ensure dos2unix is executable by the asterisk user (chmod 755 /usr/bin/dos2unix)
# Connect FreePBX to Google Cloud
# su asterisk
# gcloud auth login
# The CLI will print a URL. Open it in your browser; Google will give you a verification code. Paste that code into the CLI prompt that is waiting for it.
# Open FreePBX web interface
# Go to Settings > Voicemail Admin > Settings > Email Config
# Change Mail Command to: /usr/sbin/sendmail-gcloud
# Submit and apply changes
# Original source created by N. Bernaerts:
# modified per:
# modified per:
# current version:
# Notes: This is a script modified from the original to work with FreePBX so that email notifications sent from
# Asterisk voicemail contain a speech to text transcription provided by Google Cloud Speech API
# License: There are no explicit license terms on the original script or on the blog post with modifications
# I'm assuming GNU/GPL2+ unless notified otherwise by copyright holder(s)
# Version History:
# 2021-05-06 Add fix by dcat127: trim flac file to 59 seconds
# 2020-08-27 Add fix by chrisduncansn
# Minor edit in instruction wording
# 2020-05-27 Add instructions from sr10952
# Add export fix by levishores
# 2019-02-27 Initial commit by tony722
# set PATH
# save the current directory
pushd .
# create a temporary directory and cd to it
TMPDIR=$(mktemp -d)
cd "$TMPDIR"
# dump the stream to a temporary file
cat >>
# get the boundary
BOUNDARY=$(grep "boundary=" | cut -d'"' -f 2)
# if mail has no boundaries, assume no attachment
if [ "$BOUNDARY" = "" ]
# send the original stream
# cut the original stream into parts
# stream.part - header before the boundary
# stream.part1 - header after the boundary
# stream.part2 - body of the message
# stream.part3 - attachment in base64 (WAV file)
# stream.part4 - footer of the message
awk '/'$BOUNDARY'/{i++}{print > "stream.part"i}'
# cut the attachment into parts
# stream.part3.wav.head - header of attachment
# stream.part3.wav.base64 - wav file of attachment (encoded base64)
sed '7,$d' stream.part3 > stream.part3.wav.head
sed '1,6d' stream.part3 > stream.part3.wav.base64
# convert the base64 file to a wav file
dos2unix -o stream.part3.wav.base64
base64 -di stream.part3.wav.base64 > stream.part3.wav
# convert the wav file to FLAC
sox -G stream.part3.wav --channels=1 --bits=16 --rate=8000 stream.part3.flac trim 0 59
# convert to MP3
sox stream.part3.wav stream.part3-pcm.wav
lame -m m -b 24 stream.part3-pcm.wav stream.part3.mp3
base64 stream.part3.mp3 > stream.part3.mp3.base64
# create mp3 mail part
sed 's/x-[wW][aA][vV]/mpeg/g' stream.part3.wav.head | sed 's/.[wW][aA][vV]/.mp3/g' >
dos2unix -o
unix2dos -o stream.part3.mp3.base64
cat stream.part3.mp3.base64 >>
# save voicemail in tmp folder in case of trouble
# TMPMP3=$(mktemp -u /tmp/msg_XXXXXXXX.mp3)
# cp "stream.part3.mp3" "$TMPMP3"
export CLOUDSDK_CONFIG=/home/asterisk/.config/gcloud
RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`
FILTERED=`echo "$RESULT" | jq -r '.results[].alternatives[].transcript'`
# generate first part of mail body, converting it to LF only
mv stream.part
cat stream.part1 >>
sed '$d' < stream.part2 >>
# beginning of transcription section
echo "" >>
echo "--- Google transcription result ---" >>
# append result of transcription
if [ -z "$FILTERED" ]
then
echo "(Google was unable to recognize any speech in audio data.)" >>
else
echo "$FILTERED" >>
fi
# end of message body
tail -1 stream.part2 >>
# add converted attachment
cat >>
# append end of mail body, converting it to LF only
echo "" >> stream.tmp
echo "" >> stream.tmp
cat stream.part4 >> stream.tmp
dos2unix -o stream.tmp
cat stream.tmp >>
# send the mail thru sendmail
cat | sendmail -t
# go back to original directory
popd
# remove all temporary files and temporary directory
rm -Rf "$TMPDIR"
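
The boundary extraction and awk split described in the script comments can be tried in isolation. The following is a minimal sketch, not the gist's exact code: the function names `get_boundary` and `split_on_boundary`, the message file name, and the output prefix are all placeholders for this example, and it swaps the script's regex match for a literal `index()` match so that special characters in a boundary string are not interpreted as a pattern.

```shell
# Sketch only: helper names and file names are hypothetical.

# Print the first quoted boundary="..." value found in the message file.
get_boundary() {
    grep -m 1 'boundary=' "$1" | cut -d'"' -f 2
}

# Split message $1 into files <prefix>, <prefix>1, <prefix>2, ...
# The counter advances on every line that contains the boundary string $2,
# so the header line that declares the boundary starts <prefix>1.
split_on_boundary() {
    awk -v b="$2" -v p="$3" 'index($0, b) { i++ } { print > (p i) }' "$1"
}

# Example use inside a scratch directory:
#   boundary=$(get_boundary message.txt)
#   split_on_boundary message.txt "$boundary" stream.part
```

With a typical voicemail mail this yields the same five pieces the script comments list: the header before the boundary declaration, the header after it, the body, the base64 attachment, and the footer.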

chrisduncansn commented Feb 9, 2021

I found a couple options that I like while digging into the documentation. The options I like are on the alpha channel, so there's a good chance they won't work long-term, but I'm okay with that on my setup. Here's what I changed:

ORIGINAL: RESULT=`gcloud ml speech recognize stream.part3.flac --language-code='en-US'`

NEW: RESULT=`gcloud alpha ml speech recognize stream.part3.flac --language-code='en-US' --interaction-type='voicemail' --include-word-time-offsets --filter-profanity --enable-automatic-punctuation`

@kevinrossen how are those alpha options working for you? I checked and it looks like they are still in Alpha status, which is a bummer.
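
Since the alpha flags may stop working at some point, one option is to fall back to the plain command when the alpha call fails. This is a rough sketch under that assumption: `transcribe_with_fallback` is a hypothetical helper name, the flags are the ones quoted above, and it only detects failure through the exit status of `gcloud`.

```shell
# Sketch only: transcribe_with_fallback is a hypothetical helper.
# Try the alpha-channel flags first; if that invocation exits non-zero
# (for example because a flag was removed), use the plain command instead.
transcribe_with_fallback() {
    flac=$1
    if out=$(gcloud alpha ml speech recognize "$flac" --language-code='en-US' \
            --interaction-type='voicemail' --include-word-time-offsets \
            --filter-profanity --enable-automatic-punctuation 2>/dev/null)
    then
        printf '%s\n' "$out"
    else
        gcloud ml speech recognize "$flac" --language-code='en-US'
    fi
}

# In the script, the RESULT line would then become:
#   RESULT=$(transcribe_with_fallback stream.part3.flac)
```

The alpha output is captured before it is printed, so a call that fails partway does not leave a partial response in front of the fallback result.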

msc1 commented May 6, 2021

was a problem for my setup... Now I get the transcription....

@CadillacRick did you run it as asterisk or root?
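
To rule out a user mismatch, a quick check like the one below may help. It is only a sketch: `check_gcloud_context` is a hypothetical helper, and the user name and directory in the usage line are the ones the script and its instructions already use (`asterisk` and the exported `CLOUDSDK_CONFIG` path).

```shell
# Sketch only: check_gcloud_context is a hypothetical helper.
# Returns 0 if the current user matches $1 and can read directory $2,
# 1 on a user mismatch, 2 if the directory is not readable.
check_gcloud_context() {
    expected_user=$1
    config_dir=$2
    current_user=$(id -un)
    if [ "$current_user" != "$expected_user" ]; then
        echo "running as $current_user, expected $expected_user" >&2
        return 1
    fi
    if [ ! -r "$config_dir" ]; then
        echo "cannot read $config_dir" >&2
        return 2
    fi
    return 0
}

# Example:
#   check_gcloud_context asterisk /home/asterisk/.config/gcloud
```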

rr10 commented Jan 13, 2022

Thanks @tony722 @chrisduncansn. I preferred to change the API call so that it supports multiple languages and punctuation. Now I no longer have to worry about the caller's language (at least up to three additional languages).
I have noticed that if the message starts with a word in a different language from the rest, there can be problems. It might be interesting to call the API once per language, in succession.
RESULT=`gcloud alpha ml speech recognize-long-running stream.part3.flac --language-code='ro-RO' --additional-language-codes='it-IT','en-US' --enable-automatic-punctuation --interaction-type=voicemail`
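
The "languages in succession" idea could look roughly like this. It is a sketch under stated assumptions, not a tested replacement: `transcribe_one` wraps the gist's own `gcloud ml speech recognize` and `jq` pipeline, `first_transcript` is a hypothetical helper, and "first non-empty transcript wins" is just one possible selection rule.

```shell
# Sketch only: both helper names are hypothetical.

# Transcribe file $1 with a single language code $2, printing plain text.
transcribe_one() {
    gcloud ml speech recognize "$1" --language-code="$2" \
        | jq -r '.results[].alternatives[].transcript'
}

# Try each language code in order and print the first non-empty transcript.
# Returns 1 if no language produced any text.
first_transcript() {
    flac=$1
    shift
    for lang in "$@"; do
        text=$(transcribe_one "$flac" "$lang" 2>/dev/null)
        if [ -n "$text" ]; then
            printf '%s\n' "$text"
            return 0
        fi
    done
    return 1
}

# Example:
#   FILTERED=$(first_transcript stream.part3.flac ro-RO it-IT en-US)
```

Note that a recognizer given the wrong language can still return some text, so the first non-empty result is not necessarily the best one; comparing the confidence values in the JSON would be a possible refinement.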