ompugao/google_speech2text.md

## google_speech2text.md

      
Google Speech To Text API

Base URL: https://www.google.com/speech-api/v1/recognize

It accepts POST requests whose body is a voice file encoded in FLAC format, with query parameters controlling the recognition.
Query Parameters

client

The name of the client you're connecting from. For spoofing purposes, let's use chromium.
lang

Speech language, for example, ar-QA for Qatari Arabic, or en-US for U.S. English
maxresults

Maximum number of results to return for the utterance
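
As a minimal sketch of how the three query parameters above combine with the base URL, the snippet below assembles the request URL with Python 3's standard library. The function name `build_recognize_url` and its defaults are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlencode

# Base URL as given at the top of this document.
BASE_URL = "https://www.google.com/speech-api/v1/recognize"


def build_recognize_url(lang="en-US", maxresults=1, client="chromium"):
    """Return the recognize URL with client, lang and maxresults set.

    The function name and default values are hypothetical; only the
    parameter names come from the list above.
    """
    query = urlencode({"client": client, "lang": lang, "maxresults": maxresults})
    return f"{BASE_URL}?{query}"


print(build_recognize_url(lang="ar-QA", maxresults=10))
```

With `lang="ar-QA"` and `maxresults=10` this yields the same URL used in the wget and curl examples in this document.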
POST

body

Should contain the raw binary contents of the FLAC-encoded voice file
HTTP Header

Content-Type

Should be audio/x-flac; rate=16000; which gives the MIME type and the sample rate of the FLAC file
User-Agent

Can be the client's user agent string. For spoofing purposes, we'll use Chrome's.
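
The body and headers described above can be put together roughly as follows. This is a sketch using Python 3's `urllib.request`; the helper name `build_request` is an assumption, and the bytes passed in the demo are a placeholder rather than real FLAC audio. The request is only constructed here, not sent.

```python
import urllib.request

# URL and user agent copied from the examples in this document.
URL = ("https://www.google.com/speech-api/v1/recognize"
       "?client=chromium&lang=ar-QA&maxresults=10")
CHROME_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) "
             "AppleWebKit/535.7 (KHTML, like Gecko) "
             "Chrome/16.0.912.77 Safari/535.7")


def build_request(flac_bytes, rate=16000, url=URL, user_agent=CHROME_UA):
    """Build (but do not send) a POST request carrying FLAC audio.

    Hypothetical helper: `rate` should match the sample rate of the
    FLAC data, since it is declared in the Content-Type header.
    """
    headers = {
        "Content-Type": f"audio/x-flac; rate={rate};",
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, data=flac_bytes, headers=headers,
                                  method="POST")


# Placeholder bytes for illustration only; a real call would pass
# open("alsalam-alikum.flac", "rb").read() instead.
req = build_request(b"placeholder, not real FLAC data")
print(req.get_method(), req.get_header("Content-type"))
```

To actually send it, the resulting object can be passed to `urllib.request.urlopen(req)` and the response body read from the returned object.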
Examples

These examples assume you have a voice file encoded in FLAC called alsalam-alikum.flac.
create flac file from wav

sudo aptitude install sox
sox input.wav input_fixed.flac rate 16k channels 1
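
If the conversion needs to happen from a script, the sox call above could be wrapped along these lines. This is only a sketch: it assumes `sox` is installed and on the PATH, and the helper names `sox_command` and `convert_to_flac` are made up for illustration.

```python
import subprocess


def sox_command(src, dst, rate="16k", channels=1):
    """Return the sox argument list for resampling `src` into `dst`.

    Mirrors the command line above: 16 kHz sample rate, one channel.
    """
    return ["sox", src, dst, "rate", rate, "channels", str(channels)]


def convert_to_flac(src, dst):
    """Run sox; raises CalledProcessError if sox exits with an error."""
    subprocess.run(sox_command(src, dst), check=True)


# Only prints the command; call convert_to_flac() to actually run sox.
print(sox_command("input.wav", "alsalam-alikum.flac"))
```

The 16k rate here corresponds to the rate=16000 value declared in the Content-Type header.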
wget

This saves the JSON response in a file called recognized.json:
wget --post-file='alsalam-alikum.flac' \
--user-agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header='Content-Type: audio/x-flac; rate=16000;' \
-O 'recognized.json' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'

curl

curl -X POST \
--data-binary @alsalam-alikum.flac \
--user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header 'Content-Type: audio/x-flac; rate=16000;' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'

python


Quoted from https://gist.github.com/alotaiba/1730160/#comment-841611:
  $ cat speech.py
  import urllib2
  url = "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US"
  audio = open('rainspain.flac','rb').read()
  headers={'Content-Type': 'audio/x-flac; rate=16000', 'User-Agent':'Mozilla/5.0'}
  request = urllib2.Request(url, data=audio, headers=headers)
  response = urllib2.urlopen(request)
  print response.read()

  $ python speech.py
  {"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1","hypotheses":[{"utterance":"the rain in Spain falls mainly on the plains","confidence":0.8385102}]}
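
A response shaped like the one above can be unpacked with the `json` module. The sketch below picks the hypothesis with the highest confidence; the helper name `best_utterance` is an assumption, as is treating a missing `confidence` field as 0.0, since only the single sample response above is documented here.

```python
import json

# Sample response copied from the output above.
SAMPLE = ('{"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1",'
          '"hypotheses":[{"utterance":"the rain in Spain falls mainly '
          'on the plains","confidence":0.8385102}]}')


def best_utterance(response_text):
    """Return (utterance, confidence) for the top hypothesis, or None.

    Hypothetical helper. Assumption: a hypothesis without a
    "confidence" field ranks as 0.0.
    """
    payload = json.loads(response_text)
    hypotheses = payload.get("hypotheses", [])
    if not hypotheses:
        return None
    best = max(hypotheses, key=lambda h: h.get("confidence", 0.0))
    return best["utterance"], best.get("confidence")


print(best_utterance(SAMPLE))
```

Returning `None` for an empty `hypotheses` list lets the caller distinguish "nothing recognized" from a successful result without inspecting the raw JSON.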