@germanattanasio
Last active November 22, 2021 17:21
curl commands to use the Speech to Text service
#!/bin/sh
# This script calls the IBM Watson Speech to Text service using a session.
USERNAME="<SERVICE_USERNAME>"
PASSWORD="<SERVICE_PASSWORD>"
SESSION_ID="<SESSION_ID>" # you will get this after running (1)
# 1. Create a session:
curl -X POST -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD -d "{}" "https://stream.watsonplatform.net/speech-to-text/api/v1/sessions"
# This returns a session URL. Note that the client must support cookies for sessions to work.
# 2. GET as follows to fetch the interim transcription results:
curl -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD \
"https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/observer_result?interim_results=true"
# This request will wait until the audio is submitted, and then it will return interim results in a timely manner.
# 3. POST the audio to the session recognize URL, similar to the requests above.
# Here the audio can be sent in realtime, as it is being recorded from the system microphone.
curl -X POST -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD \
"https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/recognize?continuous=true" --header "Content-Type: audio/flac" --header "Transfer-Encoding: chunked" --data-binary @pcm0003.flac
# At this point you can continue submitting requests and observing interim results.
# 4. When you are done, close the session:
curl -X DELETE -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD \
"https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID"
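
The script leaves SESSION_ID to be filled in by hand after step 1. As a minimal sketch, assuming the session URL returned by step 1 ends in the session ID (which the URL shapes in steps 2 to 4 suggest but the script does not confirm), the ID could be derived with plain POSIX parameter expansion. The helper name `session_id_from_url` and the example ID `abc123` are hypothetical and not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): derive SESSION_ID from
# the session URL returned by step 1.
# Assumption: the URL ends in the session ID, e.g. .../sessions/<id>
session_id_from_url() {
    url=${1%/}                    # drop one trailing slash, if present
    printf '%s\n' "${url##*/}"    # keep everything after the last slash
}

# Example with a made-up session URL:
SESSION_URL="https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/abc123"
SESSION_ID=$(session_id_from_url "$SESSION_URL")
echo "$SESSION_ID"
```

If the service returns the ID in a different place, for example as a field in a JSON body, the extraction step would need to change accordingly.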
@CreativeMaladjustment

The comment on line 19 says that audio from a live mic can be sent, but the code seems to call for a .flac file. Is there an arecord option that writes to a file that this monitors, or can you show me how to send live audio from the mic in the example above?

I changed lines 21 and 22 to something like:
arecord -D sysdefault:CARD=Device | curl -X POST -b cookies.txt -c cookies.txt -u $USERNAME:$PASSWORD \
"https://stream.watsonplatform.net/speech-to-text/api/v1/sessions/$SESSION_ID/recognize?continuous=true" --header "Content-Type: audio/flac" --header "Transfer-Encoding: chunked" --data-binary @-

It seems to be waiting for stdin, but it just keeps waiting and nothing is returned. I suppose I am not ending the transmission, so I never get a response back. Is this on the right track? Can I use cURL to send live mic audio and get a constant stream of text back, or do we have to send something, get something back, and then send more to get more back?

Thank you in advance.

--jd
