#!/usr/bin/env bash
# By Psycho
# Shell script to handle different TTS and online / offline connectivity
# This bash script can be set as a custom TTS for snips but also called directly from your skills,
# a great way to give more than one personality to your assistant
# Original script: https://gist.github.com/Psychokiller1888/cf10af3220b5cd6d9c92c709c6af92c2
####### COMMON #######
#------------------------------------
# Set your cache path
cache="/home/pi/snipsSuperTTS/cache"
#------------------------------------
# - Install mpg123
# - The cache path is created by the script itself at first run! If you already have the cache directories, make sure to set their owner to "_snips" => sudo chown _snips /home/pi/snipsSuperTTS/cache
# - Edit /etc/snips.toml
# - Set "customtts" as snips-tts provider
# - Add as customtts: command = ["/home/pi/snipsSuperTTS/snipsSuperTTS.sh", "%%OUTPUT_FILE%%", "google", "%%LANG%%", "US", "Wavenet-C", "FEMALE", "%%TEXT%%", "22050"]
# Change "US" to another country code, "GB" for example for a British voice
# You can change "Wavenet-C" to another voice of your choice. https://cloud.google.com/text-to-speech/docs/voices / https://docs.aws.amazon.com/polly/latest/dg/voicelist.html
# Offline voice options are: picotts or mycroft
# Set "FEMALE" to the voice gender you want. Note that this only applies to Google voices
# You can change the sample rate, the last argument, to your needs
# Restart snips: systemctl restart snips-*
# If you want a fully customized experience, you can easily enable both google and amazon and use a different one depending on what you want. Don't be shy, there's no limit
# Note that, depending on the voice you choose, not every parameter has an effect. Those are marked with "--" in the provided examples
####### MycroftAI - Mimic #######
# https://github.com/MycroftAI/mimic
# This one takes pretty long to install, but hey, the quality compared to pico is worth it!
# Set the following option to true to use Mycroft instead of pico. Don't forget to set the path too
#
# Example: command = ["/home/pi/snipsSuperTTS/snipsSuperTTS.sh", "%%OUTPUT_FILE%%", "mycroft", "%%LANG%%", "--", "slt_hts", "--", "%%TEXT%%", "22050"]
#
# Available voices: aew ahw aup awb axb bdl clb eey fem gka jmk ksp ljm rms rxr slt slt_hts
# Or you can use an external voice on http.
#
# sudo apt-get install gcc make pkg-config automake libtool libasound2-dev
# git clone https://github.com/MycroftAI/mimic.git
# cd mimic
# ./dependencies.sh --prefix="/usr/local"
# ./autogen.sh
# ./configure --prefix="/usr/local"
# make
# sudo /sbin/ldconfig
# make check
useMycroft=false
mycroftPath="/home/pi/mimic"
####### GOOGLE #######
# Install Google SDK: https://cloud.google.com/text-to-speech/docs/quickstart-protocol
# Follow point 6. to initialize the sdk after creating your service account
# Get your api key from the console https://console.developers.google.com
# Uncomment the following and set it accordingly:
#
#googleWavenetAPIKey=""
#
# Example: command = ["/home/pi/snipsSuperTTS/snipsSuperTTS.sh", "%%OUTPUT_FILE%%", "google", "%%LANG%%", "US", "Wavenet-C", "FEMALE", "%%TEXT%%", "22050"]
####### AMAZON #######
# Install Amazon sdk
# curl "https://s3.amazonaws.com/aws-cli/awscli-bundle.zip" -o "awscli-bundle.zip"
# unzip awscli-bundle.zip
# ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws
# Uncomment the following lines and set them accordingly:
#
#export AWS_ACCESS_KEY_ID=""
#export AWS_SECRET_ACCESS_KEY=""
#export AWS_DEFAULT_REGION="eu-central-1"
#awscli='/usr/local/bin/aws'
#
# Example: command = ["/home/pi/snipsSuperTTS/snipsSuperTTS.sh", "%%OUTPUT_FILE%%", "amazon", "%%LANG%%", "US", "Joanna", "--", "%%TEXT%%", "22050"]
#
####################################################################################################
outfile="$1"
service="$2"
lang="$3"
country="$4"
voice="$5"
gender="$6"
text="$7"
sampleRate="$8"
if [ "$service" = "mycroft" ] || [ "$service" = "picotts" ]; then
    status="offline"
else
    echo -e "GET http://google.com HTTP/1.0\n\n" | nc google.com 80 > /dev/null 2>&1
    if [ $? -eq 0 ]; then
        status="online"
    else
        status="offline"
    fi
    # Alternative for people having problems reaching google.com this way. Comment the above and uncomment the following
    #wget -q --tries=1 --timeout=1 --spider http://google.com
    #if [[ $? -eq 0 ]]; then
    #    status="online"
    #else
    #    status="offline"
    #fi
fi
function picotts() {
    # Map the snips language to a locale pico2wave understands, defaulting to en-US
    case "$lang" in
        *en*)
            lang="en-US";;
        *de*)
            lang="de-DE";;
        *es*)
            lang="es-ES";;
        *fr*)
            lang="fr-FR";;
        *it*)
            lang="it-IT";;
        *)
            lang="en-US";;
    esac
    # Strip SSML tags, pico does not understand them
    text=$(sed 's/<[^>]*>//g' <<< "$text")
    pico2wave -w "$outfile" -l "$lang" "$text"
}
function mycroft() {
    # Strip SSML tags, mimic does not understand them
    text=$(sed 's/<[^>]*>//g' <<< "$text")
    # Call mimic by its absolute path; a leading "." would make the path relative to the current directory
    "$mycroftPath/mimic" -t "$text" -o "$outfile" -voice "$mycroftPath""/voices/cmu_us_""$voice"".flitevox"
}
if [ "$service" = "google" ]; then
    cache="$cache/google/"
    mkdir -p "$cache"
    text=${text//\'/\\\'}
    languageCode="$lang"-"$country"
    googleVoice="$languageCode"-"$voice"
    md5string="$text"_"$googleVoice"_"$sampleRate"_"$lang"
    hash="$(echo -n "$md5string" | md5sum | sed 's/ .*$//')"
    cachefile="$cache""$hash".wav
    downloadFile="/tmp/""$hash"
    if [[ -f "$cachefile" ]]; then
        cp "$cachefile" "$outfile"
    else
        if [ "$status" != "online" ]; then
            if [[ "$useMycroft" = true ]]; then
                mycroft
            else
                picotts
            fi
        else
            # [[ ]] is required here: [ ] does not do pattern matching
            if [[ "$text" != *"<speak>"* ]]; then
                text="<speak>""$text""</speak>"
            fi
            curl -H "Content-Type: application/json; charset=utf-8" \
            --data "{
                'input':{
                    'ssml':'$text'
                },
                'voice':{
                    'languageCode':'$languageCode',
                    'name':'$googleVoice',
                    'ssmlGender':'$gender'
                },
                'audioConfig':{
                    'audioEncoding':'MP3',
                    'sampleRateHertz':'$sampleRate'
                }
            }" "https://texttospeech.googleapis.com/v1/text:synthesize?key=$googleWavenetAPIKey" > "$downloadFile"
            # Reduce the JSON response to its base64 payload, decode it, then convert the MP3 to WAV
            sed -i 's/audioContent//' "$downloadFile" && \
            tr -d '\n ":{}' < "$downloadFile" > "$downloadFile".tmp && \
            base64 "$downloadFile".tmp --decode > "$downloadFile".mp3
            mpg123 --quiet --wav "$cachefile" "$downloadFile".mp3
            rm "$downloadFile" && \
            rm "$downloadFile".tmp && \
            rm "$downloadFile".mp3
            cp "$cachefile" "$outfile"
        fi
    fi
elif [ "$service" = "amazon" ]; then
    cache="$cache/amazon/"
    mkdir -p "$cache"
    amazonVoice=$voice
    md5string="$text""_""$amazonVoice"_"$sampleRate"_"$lang"
    hash="$(echo -n "$md5string" | md5sum | sed 's/ .*$//')"
    cachefile="$cache""$hash".mp3
    if [ -f "$cachefile" ]; then
        mpg123 -q -w "$outfile" "$cachefile"
    else
        if [ "$status" != "online" ]; then
            if [[ "$useMycroft" = true ]]; then
                mycroft
            else
                picotts
            fi
        else
            # [[ ]] is required here: [ ] does not do pattern matching
            if [[ "$text" != *"<speak>"* ]]; then
                text="<speak>""$text""</speak>"
            fi
            "$awscli" polly synthesize-speech \
                --output-format mp3 \
                --voice-id "$voice" \
                --sample-rate "$sampleRate" \
                --text-type ssml \
                --text "$text" \
                "$cachefile"
            mpg123 -q -w "$outfile" "$cachefile"
        fi
    fi
elif [ "$service" = "picotts" ]; then
    picotts
elif [ "$service" = "mycroft" ]; then
    mycroft
else
    if [[ "$useMycroft" = true ]]; then
        mycroft
    else
        picotts
    fi
fi
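The caching used by both online branches comes down to hashing the text together with the voice, sample rate and language, then using the digest as the file name. A minimal standalone sketch of that step, assuming `md5sum` is available; the function name `cache_key` is made up for illustration and is not part of the gist:

```shell
#!/usr/bin/env bash
# cache_key is a hypothetical helper, not part of the original script.
# It mirrors the md5string/hash lines: text_voice_sampleRate_lang -> md5 hex digest.
cache_key() {
    local text="$1" voice="$2" sampleRate="$3" lang="$4"
    printf '%s' "${text}_${voice}_${sampleRate}_${lang}" | md5sum | sed 's/ .*$//'
}

# The same inputs always give the same 32-character hex name, so a repeated
# sentence is served from the cache instead of calling the API again.
key="$(cache_key "Hello world" "en-US-Wavenet-C" "22050" "en")"
echo "$key.wav"
```

Changing any one of the four inputs, for instance the voice, yields a different file name, which is why switching voices does not replay old audio.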
@Psychokiller1888
You could also add the account-free google-translate TTS API call to cover all bases. I created the following bash script for my simple needs, but feel free to integrate it into your super script. Enjoy.
#!/bin/bash
# Usage: script outfile language text
if [ "$1" = "-h" ]
then
    echo "Usage: $0 outfile language text"
    exit 0
fi
cache_dir="/var/cache/google-translate-tts"
tmpfile=$(mktemp /tmp/google-translate-tts.XXXXXXXX)
# The cache file name is the md5 of the text
filename_md5=$(echo -n "$3" | md5sum)
filename="$cache_dir/${filename_md5%% *}.wav"
echo "$tmpfile"
echo "$filename"
if [ ! -d "$cache_dir" ]
then
    echo "$cache_dir does not exist. Please create with needed permissions."
fi
if [ -s "$filename" ]
then
    echo "Using cached audio file $filename"
else
    echo "Caching new audio file $filename"
    wget -q -U Mozilla -O "$tmpfile" "http://translate.google.com/translate_tts?ie=UTF-8&client=tw-ob&tl=$2&q=$3"
    # The download is an MP3; convert it to WAV for the cache
    mpg123 --quiet --wav "$filename" "$tmpfile"
    rm -f "$tmpfile"
fi
cp "$filename" "$1"
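One thing to watch in the script above: the text goes into the query string as-is, so spaces, `&` or accented characters can break or truncate the request. A possible pure-bash percent-encoder, sketched under the assumption that the script runs under bash; the name `urlencode` is illustrative and not part of either script:

```shell
#!/usr/bin/env bash
# urlencode is a hypothetical helper, not part of the script above.
# It percent-encodes every byte except the RFC 3986 unreserved characters.
urlencode() {
    local LC_ALL=C          # work on bytes rather than multibyte characters
    local s="$1" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_-]) out+="$c" ;;
            *) out+="$(printf '%%%02X' "'$c")" ;;
        esac
    done
    printf '%s\n' "$out"
}

urlencode "hello world & more"   # prints hello%20world%20%26%20more
```

The encoded value could then replace the raw text in the URL, for example `q=$(urlencode "$3")`.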
The gcloud auth application-default print-access-token call is terribly slow on my Raspi2B+ and is also intended to be used for debugging only. Would you consider switching to OAuth or other possible auth mechanisms?
To speed up Google TTS, just create an API key in Google cloud console and add "?key=XXX" to the Google URL while removing the Authorization-Header. Way faster than generating access tokens at each call.
@hokascha Thx! I updated to use the api key!
Thank you also! Will add it. I have a few others coming also, such as IBM
How would you go about changing Google TTS mp3 to LINEAR16?
'audioEncoding':'mp3',
to 'audioEncoding':'LINEAR16',
When I do this the cache file is created but playback is a short 'glitch/chirp' sound.
Google TTS API states that a .wav header is included when requesting LINEAR16.
}" "https://texttospeech.googleapis.com/v1/text:synthesize?key="$googleWavenetAPIKey > "$downloadFile"
sed -i 's/audioContent//' "$downloadFile" && \
tr -d '\n ":{}' < "$downloadFile" > "$downloadFile".tmp && \
base64 "$downloadFile".tmp --decode > "$downloadFile".wav
Is the audio file itself OK? I mean, if you open it with Audacity for example, and adjust the rates and so on, is it a normal sound file?
@Psychokiller1888 Thanks for the reply.
My mistake: I was passing the decoded output through mpg123.
mpg123 --quiet --wav "$cachefile" "$downloadFile".wav
This worked for me for LINEAR16 with aplay.
'audioConfig':{
'audioEncoding':'LINEAR16',
'sampleRateHertz':'$sampleRate',
'effectsProfileId': ['large-home-entertainment-class-device']
}
}" "https://texttospeech.googleapis.com/v1/text:synthesize?key="$googleWavenetAPIKey > "$downloadFile"
sed -i 's/audioContent//' "$downloadFile" && \
tr -d '\n ":{}' < "$downloadFile" > "$downloadFile".tmp && \
base64 "$downloadFile".tmp --decode > "$downloadFile".wav
aplay "$downloadFile".wav
cp "$downloadFile".wav "$cachefile"
rm "$downloadFile" && \
rm "$downloadFile".tmp && \
rm "$downloadFile".wav
cp "$cachefile" "$outfile"
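To summarise the exchange above: with MP3 the decoded file has to go through mpg123 to become a WAV, whereas with LINEAR16 the decoded bytes already are a WAV file and should not be passed through mpg123. The extraction step is the same in both cases. A small sketch of it, using the same sed/tr/base64 chain as the script; `decode_audio_content` is a made-up name, and the response is faked locally (assumed shape: a JSON object with a single base64 `audioContent` field) so that no API call is needed:

```shell
#!/usr/bin/env bash
# decode_audio_content is a hypothetical helper, not part of the gist.
# $1: file holding the JSON response, $2: file to write the decoded audio to.
decode_audio_content() {
    sed 's/audioContent//' "$1" | tr -d '\n ":{}' | base64 --decode > "$2"
}

# Fake response standing in for what the API returns.
workdir="$(mktemp -d)"
printf '{\n  "audioContent": "%s"\n}\n' "$(printf 'RIFFdemo' | base64)" > "$workdir/response.json"

decode_audio_content "$workdir/response.json" "$workdir/out.wav"
head -c 4 "$workdir/out.wav"   # a WAV file starts with the bytes RIFF
echo
```

With a real LINEAR16 response, the decoded file can be copied straight to the cache and to the output file.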
Hey, I'm trying to set up snips super TTS with Google. It works if I execute the script from the command line with Google, but not if I set it up via /etc/snips.toml. With Mycroft it works perfectly. I don't have a clue what to do to get it to work. Any suggestions?
Did you edit the script to put in your credentials? Also check /var/log/syslog for any errors returned.
Yes I edited it and put my credentials in the scripts. And if I execute the script from the command line it works with Google so the API key is correct. But if I try to call the script from snips it doesn't work with Google but it works with Mycroft so the path and everything should be also correct.
So it's most likely a permission issue. Check your syslog to see if it fails accessing the cache directory
I thought I already changed the owner of the cache directory but it seems like I forgot about that... Now it works perfectly! Thanks for the fast help
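The fix in this exchange was the `chown` mentioned in the script header. A quick way to check for the same problem before digging through the syslog is to test whether the cache directory exists and is writable. A rough sketch; `check_cache_dir` is an illustrative name, not part of the gist:

```shell
#!/usr/bin/env bash
# check_cache_dir is a hypothetical helper, not part of the gist.
# Returns 0 if the directory exists and the current user can write to it.
check_cache_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable by $(id -un): $dir"
        return 1
    fi
    echo "ok: $dir"
}

# Path taken from the script header; adjust to your own cache path.
check_cache_dir "/home/pi/snipsSuperTTS/cache" || true
```

The result only reflects the user running the check. Assuming the service user is `_snips`, as the script header suggests, saving the sketch to a file and running it with `sudo -u _snips bash` would test the permissions that matter.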
Hi Psycho, the first time I installed this a few weeks ago everything worked fine. Now that I've started with a fresh copy of Stretch and Snips, I've tried setting this up again but get the following error when trying to install mimic:
pi@snips:~ $ cd mimic
pi@snips:~/mimic $ ./dependencies.sh --prefix="/usr/local"
/usr/bin/pkg-config
Have PCRE2? [no]
--2019-08-16 00:39:41-- ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-10.23.zip
=> ‘pcre2-10.23.zip’
Resolving ftp.csx.cam.ac.uk (ftp.csx.cam.ac.uk)... 131.111.8.115
Connecting to ftp.csx.cam.ac.uk (ftp.csx.cam.ac.uk)|131.111.8.115|:21... failed: Connection timed out.
Retrying.
--2019-08-16 00:41:58-- ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-10.23.zip
(try: 2) => ‘pcre2-10.23.zip’
Connecting to ftp.csx.cam.ac.uk (ftp.csx.cam.ac.uk)|131.111.8.115|:21... failed: Connection timed out.
Retrying.
It's been that way for three days now. I also can't ping ftp.csx.cam.ac.uk, so I assume the site is down?
Is there an alternative for this part of the install process?
Cheers
FYI.
I edited the wget line in the dependencies.sh file to be
wget "https://ftp.pcre.org/pub/pcre/pcre2-10.23.zip"
and that seemed to be a good alternative so far.
PCRE2 installation succeeded
@Psychokiller1888
I created a fork of your script with some additional functionality (mainly more wavenet params and translate-tts). Feel free to copy code back to your gist.
Hi,
I'm having problems getting this running. I installed the AWS CLI bundle successfully.
I added this to snips.toml after uncommenting customtts provider:
customtts = { command = ["/home/pi/snipsSuperTTS/snipsSuperTTS.sh", "%%OUTPUT_FILE%%", "amazon", "%%LANG%%", "DE", "Vicky", "--", "%%TEXT%%", "22050"] }
I created an additional IAM account in AWS, giving it the Polly access profile. The keys I got there I put here:
export AWS_ACCESS_KEY_ID="myID"
export AWS_SECRET_ACCESS_KEY="mykey"
export AWS_DEFAULT_REGION="eu-central-1"
awscli='/usr/local/bin/aws'
When I now give a command to Snips, everything works fine but I hear no voice and sam watch says:
[11:20:28] [AudioServer] was asked to play a wav of 0.0 kB with id 'a5365dfa-27d1-4a91-b383-506beb1c4482' on site default
Of course there is no voice when it is asked to play 0kb ... the question is how I find out why the file did not get created?
I connected to the AWS console and can see that the key was used today...so it must have connected to AWS successfully.
I also ran aws configure and used polly through aws cli which created me a mp3 successfully. But with Snips still no success.
Okay, I found the problem in the syslog. The girls name is Vicki, not Vicky. She doesn't like being called Vicky... :)
I think it's more of a permission problem. You can check the syslog when it tries to answer: tail -f /var/log/syslog
Yeah, that's how I found out that I had misspelled the name of the voice. Now it's working :)
Installed mimic first, then realised it can't speak german :/
Yeah, it's sad mimic only speaks english
Anyways, thanks for your work! Polly is quite nice. However a local solution that speaks german would be great.
Hi @Psychokiller1888,
thanks for the script, it's been very useful. I have it working with Polly.
But now, I would like to test Mycroft's Mimic as default offline service (instead of pico).
When I follow your instructions (the same as Mycroft's) I get an error. I asked them directly, but do you have any clue what this is related to?
Thanks!
MycroftAI/mimic1#186
Hi @Psychokiller1888,
thank you very much for that awesome script!
Unfortunately, I can't get it running.
Here is what I did so far:
I installed mpg123 on my Raspberry Pi.
I edited the snips.toml file in /etc/: I uncommented the "provider: customtts" line and the "customtts = { command ..."-line with the command for Google's Wavenet TTS.
I copied the snipsSuperTTS.sh file to /home/pi
In the google console (after I was forced to enter my credit card number -.- ) I activated "Cloud Text-to-Speech API" and then I created an API-Key
Then I edited the snipsSuperTTS file: uncommented the line "googleWavenetAPIKey="MY API-KEY CONSISTING OF A LOT OF LETTERS AND NUMBERS"" and entered the API key I created in the console.cloud.google.
As advised in the Google Paragraph in snipsSuperTTS.sh-file, I installed Google SDK on my RaspberryPi, following the Linux-instructions here in Point 6 (https://cloud.google.com/text-to-speech/docs/quickstart-protocol)
Then, I tried to run the snipsSuperTTS.sh-file trying the command "sh snipsSuperTTS.sh" resulting in the error: "snipsSuperTTS.sh: 104: snipsSuperTTS.sh: Syntax error: "(" unexpected"
I don't know how to solve this. The cache path "/home/pi/snipsSuperTTS/cache" was not created
Any clues are very welcome. Thank you very much!
Hi! As the error states, there's a syntax error on line 104 of your shell script
Thank you for the quick reply!
I thought so, but I did not change anything in that line.
Lines 104-107 are:
"function mycroft() {
text=$(sed 's/<[^>]*>//g' <<< "$text")
."$mycroftPath/mimic" -t "$text" -o "$outfile" -voice "$mycroftPath""/voices/cmu_us_""$voice"".flitevox"
}"
@Pittermaennchen check the lines before 103. Might have lost a } ?
hm. I can't find the source of the error. :(
The lines before are:
outfile="$1"
service="$2"
lang="$3"
country="$4"
voice="$5"
gender="$6"
text="$7"
sampleRate="$8"
if [ "$service" = "mycroft" ] || [ "$service" = "picotts" ]; then
status="offline"
else
echo -e "GET http://google.com HTTP/1.0\n\n" | nc google.com 80 > /dev/null 2>&1
if [ $? -eq 0 ]; then
status="online"
else
status="offline"
fi
# Alternative for some people having problem pinging google.com. Comment the above and uncomment the following
#wget -q --tries=1 --timeout=1 --spider http://google.com
#if [[ $? -eq 0 ]]; then
# status="online"
#else
# status="offline"
#fi
fi
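The thread does not record how this was resolved, but one likely cause is worth noting: the script was started with `sh snipsSuperTTS.sh`. On Raspbian, `sh` is normally dash, which does not accept the `function name() {` syntax and reports this kind of `"(" unexpected` error, while the script's shebang asks for bash. Running `bash snipsSuperTTS.sh`, or making the file executable and calling it directly, avoids that. A small sketch of a check that also parses under plain `sh`; `is_bash` is an illustrative name, not part of the gist:

```shell
#!/bin/sh
# is_bash is a hypothetical helper, not part of the gist.
# BASH_VERSION is set by bash and normally unset in dash and other sh variants.
is_bash() {
    [ -n "${BASH_VERSION:-}" ]
}

if is_bash; then
    echo "running under bash"
else
    echo "not bash: start the script with 'bash script.sh' or './script.sh'" >&2
fi
```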
Thanks for this great script. I've only tried it with Mycroft for now, but I did have to add the executable in line 125:
from:
."$mycroftPath" -t "$text" -o "$outfile" -voice "$mycroftPath""/voices/cmu_us_""$voice"".flitevox"
to:
."$mycroftPath/mimic" -t "$text" -o "$outfile" -voice "$mycroftPath""/voices/cmu_us_""$voice"".flitevox"