Base URL: https://www.google.com/speech-api/v1/recognize
It accepts POST requests with a voice file encoded in FLAC format, plus query parameters for control.
client
The name of the client you're connecting from. For spoofing purposes, let's use chromium.
lang
Speech language, for example ar-QA for Qatari Arabic, or en-US for U.S. English.
maxresults
Maximum number of results to return for an utterance.
body
Should contain the FLAC-encoded voice data as raw binary.
Content-Type
Should be audio/x-flac; rate=16000; which states both the MIME type and the sample rate of the FLAC file.
User-Agent
Can be the client's user agent string. For spoofing purposes, we'll use Chrome's.
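As a rough sketch of how the parameters and headers above fit together, the following Python 3 snippet assembles the request URL and header dictionary without sending anything. The function name `build_request_parts` and its defaults are hypothetical, chosen only for illustration, and the short `Mozilla/5.0` user agent is an assumption standing in for a full Chrome string.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.google.com/speech-api/v1/recognize"


def build_request_parts(lang="en-US", maxresults=10, rate=16000,
                        client="chromium", user_agent="Mozilla/5.0"):
    """Return (url, headers) for a recognize request.

    Hypothetical helper: the parameter names mirror the list above,
    but the defaults are assumptions, not documented values.
    """
    # Query parameters control the recognizer.
    query = urlencode({"client": client, "lang": lang, "maxresults": maxresults})
    url = f"{BASE_URL}?{query}"
    # Headers describe the audio in the POST body and identify the client.
    headers = {
        "Content-Type": f"audio/x-flac; rate={rate};",
        "User-Agent": user_agent,
    }
    return url, headers


url, headers = build_request_parts(lang="ar-QA")
print(url)
print(headers)
```

The FLAC bytes themselves would go in the POST body, so they are deliberately left out of this helper.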
These examples assume you have a voice file encoded in FLAC called alsalam-alikum.flac.
# Install SoX, then convert the file to 16 kHz mono
sudo aptitude install sox
sox input.flac input_fixed.flac rate 16k channels 1
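Since the rate in the Content-Type header should match the file, it can help to check what a FLAC file actually contains after conversion. The sketch below reads the sample rate, channel count, and bit depth from the STREAMINFO block at the start of a FLAC stream, following the layout in the FLAC format specification. The function name `flac_stream_info` is hypothetical, and the sketch assumes the file starts directly with the `fLaC` marker (no ID3 tag or other data prepended).

```python
def flac_stream_info(data: bytes):
    """Return (sample_rate, channels, bits_per_sample) from FLAC header bytes.

    Hypothetical helper; expects at least the first 26 bytes of the file.
    """
    if len(data) < 26:
        raise ValueError("need at least 26 bytes of FLAC data")
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # Byte 4: top bit is the last-block flag, low 7 bits are the block type.
    if data[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # STREAMINFO data starts at byte 8. After 10 bytes of block and frame
    # sizes, 8 bytes pack: 20-bit sample rate, 3-bit (channels - 1),
    # 5-bit (bits per sample - 1), 36-bit total sample count.
    packed = int.from_bytes(data[18:26], "big")
    sample_rate = packed >> 44
    channels = ((packed >> 41) & 0x7) + 1
    bits_per_sample = ((packed >> 36) & 0x1F) + 1
    return sample_rate, channels, bits_per_sample
```

For a file on disk, something like `flac_stream_info(open('input_fixed.flac', 'rb').read(26))` would be the expected call; after the sox conversion above, the first two values should come back as 16000 and 1.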
This saves the JSON response to a file called recognized.json:
wget --post-file='alsalam-alikum.flac' \
--user-agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header='Content-Type: audio/x-flac; rate=16000;' \
-O 'recognized.json' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'
curl -X POST \
--data-binary @alsalam-alikum.flac \
--user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header 'Content-Type: audio/x-flac; rate=16000;' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'
Quoted from https://gist.github.com/alotaiba/1730160/#comment-841611:
$ cat speech.py
import urllib2

url = "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US"
audio = open('rainspain.flac','rb').read()
headers={'Content-Type': 'audio/x-flac; rate=16000', 'User-Agent':'Mozilla/5.0'}
request = urllib2.Request(url, data=audio, headers=headers)
response = urllib2.urlopen(request)
print response.read()

$ python speech.py
{"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1","hypotheses":[{"utterance":"the rain in Spain falls mainly on the plains","confidence":0.8385102}]}
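The JSON response shown above can be picked apart with a few lines of Python 3. This is a minimal sketch that only relies on the fields visible in that sample (`status`, `hypotheses`, `utterance`, `confidence`). Treating `status` 0 as success is an assumption based on that single response, and `best_hypothesis` is a hypothetical helper name.

```python
import json


def best_hypothesis(response_text):
    """Return (utterance, confidence) for the top hypothesis, or None.

    Assumption: status 0 means success, as in the sample response.
    """
    data = json.loads(response_text)
    hypotheses = data.get("hypotheses", [])
    if data.get("status") != 0 or not hypotheses:
        return None
    # Pick the entry with the highest confidence; entries that carry
    # no confidence value are ranked as 0.0.
    best = max(hypotheses, key=lambda h: h.get("confidence", 0.0))
    return best["utterance"], best.get("confidence")


sample = (
    '{"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1",'
    '"hypotheses":[{"utterance":"the rain in Spain falls mainly on the plains",'
    '"confidence":0.8385102}]}'
)
print(best_hypothesis(sample))
```

Returning `None` rather than raising keeps the caller's handling simple when nothing was recognized.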