
Speech to Text Using IBM Watson from Command Line

These are mostly personal notes for my own reference. If they help anyone else, that is a bonus.

This assumes that:

  • You have created an account on Bluemix
  • You have obtained the user-id and password (I will call them your_userid, your_password)
  • You have a speech file called speech.wav
  • You want the transcript written to the file transcribed_speech.txt
  • You have installed curl on your machine
  • You are not behind a proxy

Then, from the command line, run this (long) curl command:


curl -u your_userid:your_password -X POST --header "Content-Type: audio/wav" --header "Transfer-Encoding: chunked" --data-binary @speech.wav "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true" > transcribed_speech.txt

This should write the Watson Speech to Text engine's best transcription of the audio in speech.wav to the file transcribed_speech.txt.
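The service replies with JSON, so transcribed_speech.txt holds the transcript wrapped in JSON fields rather than bare text. Below is a minimal sketch for pulling out just the text. It assumes the response is pretty-printed with each "transcript": "..." field on its own line (check your own output first; this is an assumption about the layout, not something guaranteed). The function name extract_transcript is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the text of every "transcript" field found in a
# saved response. Assumes each field sits on its own line, for example:
#   "transcript": "hello world "
extract_transcript() {
    sed -n 's/.*"transcript": *"\([^"]*\)".*/\1/p' "$1"
}

# Usage:
#   extract_transcript transcribed_speech.txt > transcript_only.txt
```

This is plain sed pattern matching, not a JSON parser, so a transcript containing an escaped double quote would be cut short; a real JSON tool is the safer choice for anything beyond quick checks.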

If you are behind a proxy (here proxy.com on port 8080) with p_user as the user-id and p_passwd as the password, then the following should do the job!


curl --proxy http://p_user:p_passwd@proxy.com:8080 -u your_userid:your_password -X POST --header "Content-Type: audio/wav" --header "Transfer-Encoding: chunked" --data-binary @speech.wav "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true" > transcribed_speech.txt
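If you switch between networks, the two commands can be folded into one small shell function. This is only a sketch: WATSON_PROXY is a hypothetical environment variable invented for this example (neither curl nor Watson defines it), and watson_transcribe is likewise a made-up name. The curl options are the same ones used above.

```shell
#!/bin/sh
# Sketch: one function covering both the direct and the proxy case.
# WATSON_PROXY is a hypothetical variable; set it to something like
# http://p_user:p_passwd@proxy.com:8080 when you are behind a proxy.
watson_transcribe() {
    wav_file=$1
    out_file=$2
    url='https://stream.watsonplatform.net/speech-to-text/api/v1/recognize?continuous=true'

    # Reuse "$@" as the curl argument list so each option stays one word.
    set -- -u "your_userid:your_password" -X POST \
        --header "Content-Type: audio/wav" \
        --header "Transfer-Encoding: chunked" \
        --data-binary "@$wav_file"

    # Put --proxy in front only when a proxy has been configured.
    if [ -n "${WATSON_PROXY:-}" ]; then
        set -- --proxy "$WATSON_PROXY" "$@"
    fi

    curl "$@" "$url" > "$out_file"
}

# Usage:
#   watson_transcribe speech.wav transcribed_speech.txt
#   WATSON_PROXY=http://p_user:p_passwd@proxy.com:8080 watson_transcribe speech.wav transcribed_speech.txt
```

Keeping the URL in single quotes stops the shell from treating the ? in the query string as a wildcard.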

Happy speech to text-ing :-) 
Sample very rudimentary perl script

Also: See how to similarly use Google ASR 
