Google Speech API – Full Duplex PHP Version

So this is a follow-up to my post from a while ago about how to use the Google Speech Recognition API built into Google Chrome.

Since my last post, Chrome has had some significant upgrades to this feature, specifically around the length of audio you can pass to the API. The old version would only let you pass very short clips (a few seconds at most), but the new API is a full-duplex streaming API. What this means is that it uses two HTTP connections: a POST request to upload the content as a “live” chunked stream, and a second GET request to read the results. That makes much more sense for longer audio samples, or for streaming audio.
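To make the two-connection idea a bit more concrete, here is a minimal sketch of how the paired requests could be described. The base URL, the `up`/`down` paths, the `pair` parameter, the headers and the function name are all assumptions made up for illustration, not confirmed details of Google's service. The only point is that both requests carry the same random token, so the server can match the upload to the result stream.

```php
<?php
// Sketch only: describes the two halves of a full-duplex exchange.
// $baseUrl, the "up"/"down" paths, the "pair" parameter and the headers
// are assumptions for illustration, not confirmed details of the service.
function build_duplex_requests(string $baseUrl, string $apiKey, string $lang = 'en-US', int $rate = 8000): array
{
    // One random token shared by both connections so they can be matched up.
    $pair = bin2hex(random_bytes(8));

    return [
        // POST: streams the audio up as a chunked request body.
        'up' => [
            'method'  => 'POST',
            'url'     => $baseUrl . '/up?' . http_build_query(['key' => $apiKey, 'pair' => $pair, 'lang' => $lang]),
            'headers' => ['Content-Type: audio/x-flac; rate=' . $rate, 'Transfer-Encoding: chunked'],
        ],
        // GET: kept open to read results while the upload is still running.
        'down' => [
            'method'  => 'GET',
            'url'     => $baseUrl . '/down?' . http_build_query(['pair' => $pair]),
            'headers' => [],
        ],
    ];
}

// Placeholder host; prints the method and URL of each half.
$req = build_duplex_requests('https://speech.example.invalid', 'put your API key here');
echo $req['up']['method'], ' ', $req['up']['url'], PHP_EOL;
echo $req['down']['method'], ' ', $req['down']['url'], PHP_EOL;
```

In a real client the two requests would have to run at the same time (for example with PHP's `curl_multi_*` functions), since the GET side is meant to deliver results while the POST side is still uploading.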

I created a simple PHP class to access this API; while this likely won’t make sense for anybody that wants to do a real-time stream, it should satisfy most cases where people just want to send “longer” audio clips.

Before you can use this PHP class, you must get a developer API key from Google. The class does not include one, and I cannot give you one. They’re free and easy to get: just go to the Google APIs site and sign up for one.

Then download the class below, and start with a simple example:

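<?php
require 'google_speech.php';

// Create the client with your own developer API key.
$s = new cgoogle_speech('put your API key here');

// '@' means "read this file"; then the language tag and the sample rate in Hz.
$output = $s->process('@test.flac', 'en-US', 8000);

print_r($output);
?>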

Audio can be passed as a filename (by prefixing the file name with an ‘@’ sign), or as raw FLAC content. The second argument is an IETF language tag. I’ve only been able to test with English and French, but I assume others work. It defaults to ‘en-US’. The third argument is the sample rate in Hz; it defaults to 8000.
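As a rough illustration of the two ways to pass audio, the sketch below shows how a first argument like this could be resolved. `resolve_audio()` and `looks_like_flac()` are hypothetical helpers written for this example, not functions from the class, and the class may well handle this differently internally.

```php
<?php
// Sketch only: hypothetical helpers, not part of google_speech.php.

// Returns raw audio bytes: reads the file when the argument starts with '@',
// otherwise treats the argument itself as the audio content.
function resolve_audio(string $audio): string
{
    if ($audio !== '' && $audio[0] === '@') {
        $path = substr($audio, 1);
        if (!is_readable($path)) {
            throw new RuntimeException("Cannot read audio file: $path");
        }
        return file_get_contents($path);
    }
    return $audio;
}

// A native FLAC stream begins with the 4-byte marker "fLaC".
function looks_like_flac(string $data): bool
{
    return strncmp($data, 'fLaC', 4) === 0;
}

// Demo with a throwaway file that contains only a FLAC marker.
$tmp = tempnam(sys_get_temp_dir(), 'flac');
file_put_contents($tmp, 'fLaC' . "\x00\x00");
var_dump(looks_like_flac(resolve_audio('@' . $tmp))); // file name form
var_dump(looks_like_flac(resolve_audio('not audio'))); // raw content form
unlink($tmp);
```

A quick marker check like this before uploading can catch the common mistake of sending a WAV or MP3 file by accident.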

** Your sample rate must match your file. If it doesn’t, you’ll either get nothing returned, or you’ll get a really bad transcription. **
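One way to avoid a mismatch is to read the sample rate out of the FLAC file itself rather than typing it in by hand. The sketch below is a hypothetical helper (not part of the class) that assumes the data starts with the “fLaC” marker followed directly by the STREAMINFO metadata block, which is how the FLAC format lays out a stream. The sample rate sits in the top 20 bits of the three bytes at offset 18.

```php
<?php
// Sketch only: flac_sample_rate() is a hypothetical helper, not part of the class.
// Returns the sample rate in Hz, or null if the data does not look like FLAC.
function flac_sample_rate(string $data): ?int
{
    // 4-byte marker + 4-byte block header + 10 bytes of block/frame sizes,
    // then 3 bytes whose top 20 bits hold the sample rate.
    if (strlen($data) < 21 || strncmp($data, 'fLaC', 4) !== 0) {
        return null;
    }
    // Low 7 bits of the block header's first byte are the block type; 0 is STREAMINFO.
    if ((ord($data[4]) & 0x7F) !== 0) {
        return null;
    }
    $b = unpack('C3', substr($data, 18, 3)); // keys 1..3
    return ($b[1] << 12) | ($b[2] << 4) | ($b[3] >> 4);
}

// Synthetic header describing an 8000 Hz stream, for demonstration only;
// in real use you would pass the contents of your .flac file.
$header = 'fLaC' . "\x80\x00\x00\x22" . str_repeat("\x00", 10) . "\x01\xF4\x00";
echo flac_sample_rate($header), PHP_EOL; // prints 8000
```

The value returned here could then be passed as the third argument to `process()` instead of a hard-coded number.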

The output is returned as an array, and should look something like this:

Array
(
    [0] => Array
        (
            [alternative] => Array
                (
                    [0] => Array
                        (
                            [transcript] => my CPU is a neural net processor a learning computer
                            [confidence] => 0.74177068
                        )
                    [1] => Array
                        (
                            [transcript] => my CPU is the neuron that process of learning
                        )
                    [2] => Array
                        (
                            [transcript] => my CPU is the neural net processor a learning
                        )
                    [3] => Array
                        (
                            [transcript] => my CPU is the neuron that process a balloon
                        )
                    [4] => Array
                        (
                            [transcript] => my CPU is the neural net processor a living
                        )
                )
            [final] => 1
        )
)
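If all you want is the text, you can pull the top transcript out of that structure. The helper below is a hypothetical sketch based only on the array layout shown above. It assumes the first alternative is the preferred one, which is a guess from the sample (it is the only entry that carries a confidence value).

```php
<?php
// Sketch only: best_transcript() is a hypothetical helper, not part of the class.
// Returns the first alternative of the first result flagged "final", or null.
function best_transcript(array $output): ?string
{
    foreach ($output as $result) {
        // Skip anything not marked final or without alternatives.
        if (empty($result['final']) || empty($result['alternative'])) {
            continue;
        }
        // Assumption: alternatives are listed best-first.
        return $result['alternative'][0]['transcript'] ?? null;
    }
    return null;
}

// Trimmed-down copy of the sample output above.
$output = [
    [
        'alternative' => [
            ['transcript' => 'my CPU is a neural net processor a learning computer', 'confidence' => 0.74177068],
            ['transcript' => 'my CPU is the neuron that process of learning'],
        ],
        'final' => 1,
    ],
];

echo best_transcript($output), PHP_EOL;
```

Returning null instead of an empty string makes it easy to tell “nothing recognized” apart from a real (if short) transcript.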

Get the PHP class here: http://mikepultz.com/uploads/google_speech.php.zip