
Google Speech To Text API


Base URL: https://www.google.com/speech-api/v1/recognize
It accepts POST requests whose body is a voice file encoded in FLAC format, with query parameters controlling the recognition.

Query Parameters

client
The name of the client you're connecting from. For spoofing purposes, we'll use chromium.

lang
Speech language, for example, ar-QA for Qatari Arabic, or en-US for U.S. English

maxresults
Maximum number of results to return for the utterance

POST

body
Should contain the voice recording as FLAC-encoded binary data

HTTP Header

Content-Type
Should be audio/x-flac; rate=16000; that is, the MIME type followed by the sample rate of the FLAC file

User-Agent
Can be the client's user agent string. For spoofing purposes, we'll use Chrome's.
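
As a rough sketch of how the pieces above fit together, the following Python 3 snippet assembles the request without sending it. The helper name `build_request` and its default values are assumptions made for illustration; only the URL, the query parameter names, and the two headers come from the description above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://www.google.com/speech-api/v1/recognize"

def build_request(flac_bytes, lang="en-US", max_results=10, rate=16000):
    """Assemble (but do not send) a POST request for the endpoint above.

    build_request is a hypothetical helper, not part of any Google library.
    """
    query = urlencode({"client": "chromium", "lang": lang, "maxresults": max_results})
    headers = {
        # The rate must be the real sample rate of the FLAC data.
        "Content-Type": "audio/x-flac; rate=%d" % rate,
        "User-Agent": "Mozilla/5.0",
    }
    # Attaching a body makes urllib issue a POST instead of a GET.
    return Request(BASE_URL + "?" + query, data=flac_bytes, headers=headers)

req = build_request(b"fLaC", lang="ar-QA")
print(req.get_method())
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out here so the sketch runs without network access.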

Examples

These examples assume you have a voice file encoded in FLAC called alsalam-alikum.flac.

wget

This will save JSON response in a file called recognized.json

wget --post-file='alsalam-alikum.flac' \
--user-agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header='Content-Type: audio/x-flac; rate=16000;' \
-O 'recognized.json' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'

curl

curl -X POST \
--data-binary @alsalam-alikum.flac \
--user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header 'Content-Type: audio/x-flac; rate=16000;' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=ar-QA&maxresults=10'

API might have changed

<HTML>
<HEAD>
<TITLE>Recognition failed: NO_MATCH</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<H1>Recognition failed: NO_MATCH</H1>
<H2>Error 400</H2>
</BODY>
</HTML>
<HTML>
<HEAD>
<TITLE>HTTP method GET is not supported by this URL</TITLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<H1>HTTP method GET is not supported by this URL</H1>
<H2>Error 405</H2>
</BODY>
</HTML>
wget -q -U "Mozilla/5.0" --post-file /message.flac --header="Content-Type: audio/x-flac; rate=16000" -O - "http://www.google.com/speech-api/v1/recognize?lang=pl-pl&client=chromium"

This should work.

In Python:

$ cat speech.py
import urllib.request

url = "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US"
# Read the FLAC audio as raw bytes; it becomes the POST body.
with open('rainspain.flac', 'rb') as f:
    audio = f.read()
headers = {'Content-Type': 'audio/x-flac; rate=16000', 'User-Agent': 'Mozilla/5.0'}
# Supplying data makes urllib send a POST request.
request = urllib.request.Request(url, data=audio, headers=headers)
response = urllib.request.urlopen(request)
print(response.read().decode('utf-8'))

$ python speech.py
{"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1","hypotheses":[{"utterance":"the rain in Spain falls mainly on the plains","confidence":0.8385102}]}
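
For anyone who wants the transcript rather than the raw JSON, here is a minimal Python 3 sketch that pulls the top hypothesis out of a response shaped like the one above. The function name `best_hypothesis` is made up for illustration, and treating a non-zero `status` as a failure is an assumption based only on this one sample.

```python
import json

def best_hypothesis(body):
    """Return (utterance, confidence) for the first hypothesis, or None.

    Assumes the response shape shown in the sample output above.
    """
    data = json.loads(body)
    hypotheses = data.get("hypotheses") or []
    # Assumption: status 0 means success, as in the sample response.
    if data.get("status") != 0 or not hypotheses:
        return None
    top = hypotheses[0]
    return top.get("utterance"), top.get("confidence")

sample = ('{"status":0,"id":"57d2d1a7e7f1fa12d200026dde946c34-1",'
          '"hypotheses":[{"utterance":"the rain in Spain falls mainly on the plains",'
          '"confidence":0.8385102}]}')
print(best_hypothesis(sample))
```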

Google does not return anything, no result at all. Does anybody have an idea why?

Can anyone tell me how to implement this using JavaScript (Ajax POST)?

Note that the API only accepts audio files of 15 seconds or less.
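
A quick way to check a recording against that limit before converting and uploading it, sketched in Python 3 with the standard `wave` module. It assumes the source audio is an uncompressed WAV file; the function names and the `MAX_SECONDS` constant are illustrative.

```python
import wave

MAX_SECONDS = 15  # limit reported in the comment above

def wav_duration_seconds(path):
    """Duration of a WAV file: frame count divided by frames per second."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / float(w.getframerate())

def fits_limit(path, limit=MAX_SECONDS):
    """True when the recording is short enough to send in one request."""
    return wav_duration_seconds(path) <= limit
```

For example, `fits_limit("speech.wav")` would tell you whether the file needs to be split first.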

Use Java or JSP:

<%@ page import="java.io.BufferedInputStream"%>
<%@ page import="java.io.BufferedOutputStream"%>
<%@ page import="java.io.ByteArrayOutputStream"%>
<%@ page import="java.io.File"%>
<%@ page import="java.io.*"%>
<%@ page import="java.io.FileOutputStream"%>
<%@ page import="java.net.HttpURLConnection"%>
<%@ page import="java.net.URL"%>
<%@ page import="java.net.URLEncoder"%>
<%@ page import="java.net.Proxy"%>
<%@ page import="java.net.InetSocketAddress"%>

<%@ page import="javaFlacEncoder.FLACEncoder"%>
<%@ page import="javaFlacEncoder.FLACFileOutputStream"%>
<%@ page import="javaFlacEncoder.FLAC_FileEncoder"%>
<%@ page import="javaFlacEncoder.StreamConfiguration"%>

<%@ page import="org.apache.http.client.HttpClient"%>
<%@ page import="org.apache.http.client.methods.HttpPost"%>
<%@ page import="org.apache.http.entity.mime.MultipartEntity"%>
<%@ page import="org.apache.http.entity.mime.content.ContentBody"%>
<%@ page import="org.apache.http.entity.mime.content.FileBody"%>
<%@ page import="org.apache.commons.httpclient.params.HttpClientParams"%>
<%@ page import="org.apache.http.impl.client.DefaultHttpClient"%>
<%@ page import="org.apache.http.params.CoreProtocolPNames"%>
<%@ page import="org.apache.http.HttpVersion"%>
<%@ page import="org.apache.http.HttpResponse"%>
<%@ page import="org.apache.commons.httpclient.methods.PostMethod"%>
<%@ page import="org.apache.http.conn.params.ConnRoutePNames"%>
<%@ page import="org.apache.http.HttpHost"%>

<%@ page import="javax.sound.sampled.AudioFormat"%>
<%@ page import="javax.sound.sampled.AudioInputStream"%>
<%@ page import="javax.sound.sampled.AudioSystem"%>
<%@ page import="java.nio.ByteBuffer"%>
<%@ page import="java.nio.ByteOrder"%>

<%@ page import="java.net.URLConnection"%>

<%@ page import="java.util.Date"%>

<% String SPEECH_TEXT_TO_SERVICE = "https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE8a3e55ab37113780ca65f1e02fa1ad11";
//"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE23b0826200ff4f2adcaa9e1d9ac4180a";

      // Use the "urls" request parameter when present; otherwise fall back to the default recording URL.
      String strUrl = (request.getParameter("urls") != null) ? request.getParameter("urls") : SPEECH_TEXT_TO_SERVICE;
      //SPEECH_TEXT_TO_SERVICE;
      //
      String USER_AGENT =  "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) " + "Gecko/20100101 Firefox/11.0";

      URL url = new URL(strUrl);          
      Proxy proxy =new Proxy(Proxy.Type.HTTP, new InetSocketAddress("192.168.53.29", 3128));
      HttpURLConnection connection=(HttpURLConnection) url.openConnection(proxy);
      connection.setRequestMethod("GET");
      connection.addRequestProperty("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.70 Safari/537.36");

      out.println(new Date().toString().substring(11,20).replaceAll("IST 2013 ","")+" =====> Starting downloading <br>");
      connection.connect();
      for (int i=0;connection.getHeaderField(i) != null;i++)
      { 
         //out.println(connection.getContentType());
      }
    Long time = connection.getDate();
    InputStream inputStream = connection.getInputStream();
      File input = new File(time+"_input.wav");

      FileOutputStream f = new FileOutputStream(input);
    OutputStream outputStream = f;
      int read = 0;
    byte[] bytes = new byte[1024];
    while ((read = inputStream.read(bytes)) != -1) 
    {
        outputStream.write(bytes, 0, read);
    }
    outputStream.close();
    AudioInputStream audioInputStream1 = AudioSystem.getAudioInputStream(input);
    AudioFormat audioFormat1 = audioInputStream1.getFormat();
    int sampleRate =(int) audioFormat1.getSampleRate();

    out.println(new Date().toString().substring(11,20)+" =====> Create file <br>");
    File output = new File(time+"output.flac");          
      out.println(new Date().toString().substring(11,20)+" =====> FLAC Convert Start <br>");

        StreamConfiguration streamConfiguration = new StreamConfiguration();
    streamConfiguration.setSampleRate(sampleRate);
    streamConfiguration.setBitsPerSample(16);
    streamConfiguration.setChannelCount(1);

    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(input);
    AudioFormat format = audioInputStream.getFormat();

    int frameSize = format.getFrameSize();

    FLACEncoder flacEncoder = new FLACEncoder();
    FLACFileOutputStream flacOutputStream = new FLACFileOutputStream(output);

    flacEncoder.setStreamConfiguration(streamConfiguration);
    flacEncoder.setOutputStream(flacOutputStream);

    flacEncoder.openFLACStream();

    int[] sampleData = new int[(int) audioInputStream.getFrameLength()];
    byte[] samplesIn = new byte[frameSize];

      int x = 0;

        while (audioInputStream.read(samplesIn, 0, frameSize) != -1) {
            if (frameSize != 1) 
            {
                ByteBuffer bb = ByteBuffer.wrap(samplesIn);
                bb.order(ByteOrder.LITTLE_ENDIAN);
                short shortVal = bb.getShort();
                sampleData[x] = shortVal;
            } else {
                sampleData[x] = samplesIn[0];
            }

            x++;
        }

    flacEncoder.addSamples(sampleData, x);
    flacEncoder.encodeSamples(x, false);
    flacEncoder.encodeSamples(flacEncoder.samplesAvailableToEncode(), true);

    audioInputStream.close();
    flacOutputStream.close();
    out.println(new Date().toString().substring(11,20)+" =====> Convert End <br>");
    out.println(new Date().toString().substring(11,20)+" =====> Send google api Start <br>");

    StringBuilder sb = new StringBuilder("https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US&maxresults=10");

// sb.append("&lang=en");
sb.append("&pfilter=0");
// sb.append("&maxresults=2");
URL url1 = new URL(sb.toString());
URLConnection urlConn = url1.openConnection(proxy);
urlConn.setDoOutput(true);
urlConn.setUseCaches(false);
// The declared rate must match the sample rate the FLAC file was encoded with.
urlConn.setRequestProperty("Content-Type", "audio/x-flac; rate=" + sampleRate);
OutputStream outputStream1 = urlConn.getOutputStream();
FileInputStream fileInputStream = new FileInputStream(output);
byte[] buffer = new byte[1024];
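int bytesRead;
// Write only the bytes actually read; the final chunk is usually shorter than the buffer.
while ((bytesRead = fileInputStream.read(buffer)) != -1)
{
outputStream1.write(buffer, 0, bytesRead);
}
outputStream1.flush();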
fileInputStream.close();
outputStream1.close();
BufferedReader br = new BufferedReader(new InputStreamReader(urlConn.getInputStream()));
String response2 = br.readLine();
br.close();

    input.delete();
    output.delete();
      out.println(new Date().toString().substring(11,20)+" =====> receive google api result <br>");
      String resp[] = response2.split("utterance");
      for(int k=0;k<resp.length;k++)
      { resp[k]=resp[k].replaceAll("\":\"","").replace("\"},{\"","").replace("\"}]}","");
        if (k==1)
        {
            String temp[] =resp[k].split("\",\"confidence\":");
            out.println("<br> Confidence :<b> "+temp[1].replace("},{\"","")+"</b>");
            out.println("<br>"+k+" . "+temp[0]+"<br>");
        }
        else if(k==resp.length-1)
        {
            out.println(k+" . "+resp[k]+"<br>");
        }
        else if(k!=0)
        {
            out.println(k+" . "+resp[k]+"<br>");
        }

    }
    //out.println(response2+"<br>");
      out.println(sampleRate+"<br>");

%>

<%@ page import="java.io.File"%>
<%@ page import="java.io.InputStream"%>
<%@ page import="java.io.OutputStream"%>
<%@ page import="java.io.BufferedReader"%>
<%@ page import="java.io.InputStreamReader"%>
<%@ page import="java.io.FileInputStream"%>
<%@ page import="java.io.FileOutputStream"%>
<%@ page import="java.net.HttpURLConnection"%>

<%@ page import="java.net.URL"%>
<%@ page import="java.net.URLConnection"%>
<%@ page import="java.net.URLEncoder"%>
<%@ page import="java.net.Proxy"%>
<%@ page import="java.net.InetSocketAddress"%>

<%@ page import="javaFlacEncoder.FLACEncoder"%>
<%@ page import="javaFlacEncoder.FLACFileOutputStream"%>
<%@ page import="javaFlacEncoder.FLAC_FileEncoder"%>
<%@ page import="javaFlacEncoder.StreamConfiguration"%>

<%@ page import="javax.sound.sampled.AudioFormat"%>
<%@ page import="javax.sound.sampled.AudioInputStream"%>
<%@ page import="javax.sound.sampled.AudioSystem"%>

<%@ page import="java.nio.ByteBuffer"%>
<%@ page import="java.nio.ByteOrder"%>

<%@ page import="java.util.Date"%>

<%

      String SPEECH_TEXT_TO_SERVICE = "https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE8a3e55ab37113780ca65f1e02fa1ad11";
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE23b0826200ff4f2adcaa9e1d9ac4180a";
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/REc17df0c2d73720fbc55cf6c28f584fca";
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE8a3e55ab37113780ca65f1e02fa1ad11";
      //"http://rajesh-zu273:8080/audiotest/speach.wav";
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE45b081bfe0f00a3cddac4c27512d0159";
      //
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE5c752448b7498b68125e33cdf9c971dd";
      //"https://api.twilio.com/2010-04-01/Accounts/AC9d34d1192faffb0989f0227e5768d24d/Recordings/RE26ecb20f15953f6177fda582f9a32aa4";
      //"https://api.twilio.com/2010-04-01/Accounts/ACda006002ebc844b68606b05be0df5e4c/Recordings/REfda7271ce5acc220258dcc1f48d210c2";
      //"http://www5.online-convert.com/download-file/c700fea6cbfc16e7b2172c7f89602a62/converted-b2773e34.wav";
      //"http://www5.online-convert.com/download-file/353123ab67c2ec6659f3722230df2713/converted-fa81c0f8.wav";
      //"http://www19.online-convert.com/download-file/7466a3c51f7c5875e01335a3d5212b7f/converted-6b28c038.wav";
      // Use the "url" request parameter when present; otherwise fall back to the default recording URL.
      String strUrl = (request.getParameter("url") != null) ? request.getParameter("url") : SPEECH_TEXT_TO_SERVICE;

      String USER_AGENT =  "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) " + "Gecko/20100101 Firefox/11.0";

      URL url = new URL(strUrl);          
      Proxy proxy =new Proxy(Proxy.Type.HTTP, new InetSocketAddress("192.168.53.29", 3128));
      HttpURLConnection connection=(HttpURLConnection) url.openConnection(proxy);
      connection.setRequestMethod("GET");
      connection.addRequestProperty("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.70 Safari/537.36");

      out.println(new Date().toString().substring(11,20).replaceAll("IST 2013 ","")+" =====> Starting downloading <br>");
      connection.connect();

    Long time = connection.getDate();
    InputStream inputStream = connection.getInputStream();
      File input = new File(time+"_input.wav");

      FileOutputStream f = new FileOutputStream(input);
    OutputStream outputStream = f;
      int read = 0;
    byte[] bytes = new byte[1024];
    while ((read = inputStream.read(bytes)) != -1) 
    {
        outputStream.write(bytes, 0, read);
    }
    outputStream.close();



    AudioInputStream audioInputStream1 = AudioSystem.getAudioInputStream(input);
    AudioFormat audioFormat1 = audioInputStream1.getFormat();
    int sampleRate =(int) audioFormat1.getSampleRate();

    out.println(new Date().toString().substring(11,20)+" =====> Create file <br>");
    File output = new File(time+"output.flac");  


    //FLAC Convert Start

      out.println(new Date().toString().substring(11,20)+" =====> FLAC Convert Start <br>");

    StreamConfiguration streamConfiguration = new StreamConfiguration();
    streamConfiguration.setSampleRate(sampleRate);
    streamConfiguration.setBitsPerSample(16);
    streamConfiguration.setChannelCount(1);

    AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(input);
    AudioFormat format = audioInputStream.getFormat();

    int frameSize = format.getFrameSize();

    FLACEncoder flacEncoder = new FLACEncoder();
    FLACFileOutputStream flacOutputStream = new FLACFileOutputStream(output);

    flacEncoder.setStreamConfiguration(streamConfiguration);
    flacEncoder.setOutputStream(flacOutputStream);

    flacEncoder.openFLACStream();

    int[] sampleData = new int[(int) audioInputStream.getFrameLength()];
    byte[] samplesIn = new byte[frameSize];

      int x = 0;
    while (audioInputStream.read(samplesIn, 0, frameSize) != -1) 
    {
            if (frameSize != 1) 
            {
                ByteBuffer bb = ByteBuffer.wrap(samplesIn);
                bb.order(ByteOrder.LITTLE_ENDIAN);
                short shortVal = bb.getShort();
                sampleData[x] = shortVal;
            }
            else
            {
                sampleData[x] = samplesIn[0];
            }
            x++;
        }

    flacEncoder.addSamples(sampleData, x);
    flacEncoder.encodeSamples(x, false);
    flacEncoder.encodeSamples(flacEncoder.samplesAvailableToEncode(), true);

    audioInputStream.close();
    flacOutputStream.close();

    //FLAC Convert End
    out.println(new Date().toString().substring(11,20)+" =====> Convert End <br>");

    out.println(new Date().toString().substring(11,20)+" =====> Send google api Start <br>");


    StringBuilder sb = new StringBuilder("https://www.google.com/speech-api/v1/recognize?client=chromium&lang=en-US&maxresults=10");

// sb.append("&lang=en");
sb.append("&pfilter=0");
// sb.append("&maxresults=2");
URL url1 = new URL(sb.toString());
URLConnection urlConn = url1.openConnection(proxy);
urlConn.setDoOutput(true);
urlConn.setUseCaches(false);
// The declared rate must match the sample rate the FLAC file was encoded with.
urlConn.setRequestProperty("Content-Type", "audio/x-flac; rate=" + sampleRate);
OutputStream outputStream1 = urlConn.getOutputStream();
FileInputStream fileInputStream = new FileInputStream(output);
byte[] buffer = new byte[1024];
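int bytesRead;
// Write only the bytes actually read; the final chunk is usually shorter than the buffer.
while ((bytesRead = fileInputStream.read(buffer)) != -1)
{
outputStream1.write(buffer, 0, bytesRead);
}
outputStream1.flush();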
fileInputStream.close();
outputStream1.close();
BufferedReader br = new BufferedReader(new InputStreamReader(urlConn.getInputStream()));
String response2 = br.readLine();
br.close();

    input.delete();
    output.delete();
      out.println(new Date().toString().substring(11,20)+" =====> receive google api result <br>");
      String resp[] = response2.split("utterance");
      for(int k=0;k<resp.length;k++)
      { resp[k]=resp[k].replaceAll("\":\"","").replace("\"},{\"","").replace("\"}]}","");
        if (k==1)
        {
            String temp[] =resp[k].split("\",\"confidence\":");
            out.println("<br> Confidence :<b> "+temp[1].replace("},{\"","")+"</b>");
            out.println("<br>"+k+" . "+temp[0]+"<br>");
        }
        else if(k==resp.length-1)
        {
            out.println(k+" . "+resp[k]+"<br>");
        }
        else if(k!=0)
        {
            out.println(k+" . "+resp[k]+"<br>");
        }

    }
    //out.println(response2+"<br>");
      out.println(sampleRate+"<br>");

%>

Flux3000 mentioned above that "Note that the API only accepts audio files of 15 seconds or less." Is there any way to make it longer? I would like to convert longer audio files to text like 2 or 3 minutes. Has anyone tried it yet?

Just tested from Python, with @ewoudenberg's wonderfully simple script.
It still works, and works perfectly when the audio is clear enough.

I guess @kumachan2013 already found a solution. The first workaround I can think of is to split the audio at every pause (commas and periods), provided there's an easy way to detect a pause in your audio, that is. ^_^u
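
One simple way to detect pauses is an amplitude threshold: treat a long enough run of quiet samples as a pause and cut there. The Python 3 sketch below works on a list of 16-bit mono samples. The function name, the threshold of 500, and the 0.3 second minimum pause are all arbitrary assumptions that would need tuning for real recordings, and each resulting segment would still have to be FLAC-encoded and kept under the length limit.

```python
def split_on_silence(samples, rate, threshold=500, min_pause=0.3):
    """Return (start, end) index pairs of voiced segments, end exclusive.

    A pause is a run of at least min_pause seconds in which every
    sample has an absolute value below threshold.
    """
    min_run = int(min_pause * rate)
    segments = []
    start = None  # index where the current voiced segment began
    quiet = 0     # length of the current run of quiet samples
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_run:
                # Close the segment at the first quiet sample of the pause.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-segment; drop any trailing quiet samples.
        segments.append((start, len(samples) - quiet))
    return segments
```

Background noise will defeat a fixed threshold, so this is only a starting point.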

I have tried the command below:

curl -v -X POST \
--data-binary @speech.flac \
--user-agent 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7' \
--header 'Content-Type: audio/x-flac; rate=8000;' \
'https://www.google.com/speech-api/v1/recognize?client=chromium&lang=zh-CN&maxresults=10' 

It never works; it always returns the following:

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 500 (Server Error)!!1</title>
  <style>
   ...
  </style>
  <a href=//www.google.com/><img src=//www.google.com/images/errors/logo_sm.gif alt=Google></a>
  <p><b>500.</b> <ins>That’s an error.</ins>
  <p>The server encountered an error and could not complete your request.<p>If the problem persists, please <A HREF="http://www.google.com/support/">report</A> your problem and mention this error message and the query that caused it.  <ins>That’s all we know.</ins>

Does anyone have the same problem?

@zjh1943, I encountered the same error when I set an incorrect sample rate. I found this notice:
** Your sample rate must match your file - if it doesn’t, you’ll either get nothing returned, or you’ll get a really bad transcription. **
at http://mikepultz.com/2013/07/google-speech-api-full-duplex-php-version/, so check your sample rate.
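
To avoid guessing the rate, it can be read from the FLAC file itself. The Python 3 sketch below relies on the FLAC stream layout: the `fLaC` marker, a 4-byte metadata block header, then the STREAMINFO block, in which the 20-bit sample rate follows 10 bytes of block-size and frame-size fields. The function names are made up for illustration.

```python
def flac_sample_rate(header):
    """Read the sample rate from the first bytes of a FLAC stream."""
    if len(header) < 21 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # Low 7 bits of byte 4 are the block type; 0 means STREAMINFO.
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # Bytes 18-20 hold the 20-bit sample rate in their top 20 bits.
    b = header[18:21]
    return (b[0] << 12) | (b[1] << 4) | (b[2] >> 4)

def content_type_for(path):
    """Build a Content-Type header value matching the file's real rate."""
    with open(path, "rb") as f:
        return "audio/x-flac; rate=%d" % flac_sample_rate(f.read(42))
```

With this, `content_type_for("speech.flac")` yields a header value that matches the file instead of a hard-coded 8000 or 16000.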

Google has a second version of their API available. I've posted some research and examples in a separate repository: https://github.com/gillesdemey/google-speech-v2

I have tried both the first and second versions of the API. They are too slow. I don't know how the speech demo is so fast (https://www.google.com/intl/en/chrome/demos/speech.html).
Even spoofing doesn't seem to work.
Has anyone got it to work at close to the speed of the Chrome demo?

I'm seeing a rate limit of 50 per day at https://console.developers.google.com/ for the speech-api/v2 mentioned by @gillesdemey. There does not appear to be any way to increase the limit or to pay for a higher one.

Any advice? I have an app ready for beta that makes extensive use of the V2 speech-api.

another award winning piece from Google
