don't_panic
personal and professional blog of mike pultz, technology specialist and serial entrepreneur.

23Mar/1172

Accessing Google Speech API / Chrome 11


Just yesterday, Google pushed version 11 of their Chrome browser into beta, and along with it, one really interesting new feature: support for the HTML5 speech input API. This means that you'll be able to talk to your computer, and Chrome will be able to interpret it. This feature has been available for a while on Android devices, so many of you will already be used to it and will welcome the new feature.

If you're running Chrome version 11, you can test out the new speech capabilities by going to their simple test page on the html5rocks.com site:

http://slides.html5rocks.com/#speech-input

Genius! But how does it work? I started digging around in the Chromium source code to find out whether the speech recognition is implemented as a library built into Chrome, or whether it sends the audio back to Google to process. I know I've seen the Sphinx libraries in the Android build, but I was sure the latter was the case: the speech recognition was really good, and that's really hard to do without really good language models, which aren't something you'd be able to build into a browser.

I found the files I was looking for in the Chromium source repo:

http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/

It looks like the audio is collected from the mic and then passed via an HTTPS POST to a Google web service, which responds with a JSON object containing the results. Looking through their audio encoder code, the audio can be either FLAC or Speex, but the Speex looks like some sort of specially modified version. I'm not sure what it is, but it just didn't look quite right.

If that's the case, there should be no reason why I can't just POST something to it myself?

The URL listed in speech_recognition_request.cc is:

https://www.google.com/speech-api/v1/recognize

So, a quick few lines of Perl (or PHP, or just use wget on the command line):

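#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;

my $url = "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=en-US";

# The FLAC file to send is the first command line argument
my $file = shift or die "usage: $0 audio.flac\n";

# Read the whole file in binary mode so the audio bytes are not altered
open(my $fh, "<", $file) or die "Cannot open $file: $!\n";
binmode($fh);
my $audio = do { local $/; <$fh> };
close($fh);

my $ua = LWP::UserAgent->new;

# POST the raw audio; the rate must match the sample rate of the recording
my $response = $ua->post($url, Content_Type => "audio/x-flac; rate=16000", Content => $audio);
if ($response->is_success)
{
    print $response->content;
}
else
{
    die "Request failed: " . $response->status_line . "\n";
}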

This quick Perl script uses LWP::UserAgent to POST the binary audio from my audio clip. I recorded a quick WAV file and then converted it to FLAC on the command line (see SoX for more info).
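The conversion step can be scripted as well. This is only a rough sketch: it assumes SoX is installed as "sox" on the PATH, that a 16 kHz mono file is what you want (chosen to match the rate=16000 in the Content_Type above), and sox_flac_command is a made-up helper name, not part of SoX.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the SoX argument list that converts a WAV file to mono FLAC.
# sox_flac_command is a hypothetical helper name, not part of SoX.
sub sox_flac_command {
    my ($wav, $flac, $rate) = @_;
    $rate //= 16000;    # assumed to match "rate=16000" in the Content_Type
    # In SoX, format options placed before a file name apply to that file,
    # so -r and -c here describe the output FLAC.
    return ("sox", $wav, "-r", $rate, "-c", 1, $flac);
}

# Run the conversion only when input and output names are given.
if (@ARGV == 2) {
    my @cmd = sox_flac_command(@ARGV);
    system(@cmd) == 0 or die "sox failed: $?\n";
}
```

Saved as, say, wav2flac (a made-up name), it would be run as ./wav2flac i_like_pickles.wav i_like_pickles.flac before handing the FLAC file to the speech script.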

To run it, just do:

[root@prague mike]# ./speech i_like_pickles.flac

The response is pretty straightforward JSON:

{
    "status": 0,
    "id": "b3447b5d98c5653e0067f35b32c0a8ca-1",
    "hypotheses": [
        {
            "utterance": "i like pickles",
            "confidence": 0.9012539
        },
        {
            "utterance": "i like pickle"
        }
    ]
}
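Pulling the recognized text out of that response takes only a few more lines. A minimal sketch, assuming the first entry in "hypotheses" is the preferred one (it is the only one carrying a confidence in the sample above, but that ordering is my guess, not something I found documented); best_hypothesis is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;    # ships with Perl since 5.14

# Return the utterance and confidence of the first hypothesis, or an
# empty list when the response contains no hypotheses.
# best_hypothesis is a hypothetical helper name.
sub best_hypothesis {
    my ($json) = @_;
    my $data = decode_json($json);
    my $hyps = $data->{hypotheses} || [];
    return unless @$hyps;
    return ($hyps->[0]{utterance}, $hyps->[0]{confidence});
}

# Sample response copied from the output shown above.
my $sample = <<'JSON';
{
    "status": 0,
    "id": "b3447b5d98c5653e0067f35b32c0a8ca-1",
    "hypotheses": [
        { "utterance": "i like pickles", "confidence": 0.9012539 },
        { "utterance": "i like pickle" }
    ]
}
JSON

my ($text, $confidence) = best_hypothesis($sample);
print "$text ($confidence)\n" if defined $text;
```

In the speech script, the same helper could be fed $response->content instead of the hard-coded sample.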

I'm not sure if Google intends this to be a public, usable web service API, but it works, and it has all sorts of possibilities!

2Dec/100

Net_DNS2 Version 1.0.1 Released

The new version of Net_DNS2 is now available from the PEAR website.

This release fixes a small bug in the size calculation for TCP packets, and adds support for the WKS resource record.

Get it here.

12Sep/101

Mr.DNS Network Tools v1.9

This Mr.DNS release includes support for the AAAA (IPv6) record, the LOC record (for storing geo-location information in DNS), and all the DNSSEC resource records (DNSKEY, RRSIG, NSEC, DS, NSEC3 and NSEC3PARAM).

This release is also the first to use our new Net_DNS2 PEAR module. This module is significantly faster than the Net_DNS module, uses new PHP5 constructs and exceptions, and includes many more RRs.

Net_DNS2 is available for download from the Google Code page, and soon from the PEAR site and command line PEAR installer.

2Aug/101

Mr.DNS Network Tools v1.7

I've released a new version of the Mr.DNS Network Tools website.

This release has just one new feature, "Website Neighbors", which gives you a comprehensive list of all the websites hosted on the same IP address as the IP address or hostname provided.

Hosting hundreds, if not thousands, of sites on a single IP address is fairly common practice for shared web hosting providers. There's absolutely nothing wrong with it, but this gives you a good idea of how many sites, and which ones, are hosted on the same system as your website.

This list is never 100% complete, but it will give you a fairly accurate estimate.

16Jul/100

Mr.DNS Network Tools v1.6

I've released a new version of the Mr.DNS Network Tools website.

New features include:

SPF Parsing and Validation:

DNS SRV Records:

HTTP Header Parsing for any HTTP/HTTPS URL:

and many other small changes and fixes.