Tag Archives: PHP

Google Speech API – Full Duplex PHP Version

So this is a follow up to my post a while ago, talking about how to use the Google Speech Recognition API built in to Google Chrome.

Since my last post, Chrome has had some significant upgrades to this feature, specifically around the length of audio you can pass to the API. The old version would only let you pass very short clips (only a few seconds), but the new API is a full-duplex streaming API. This means it actually uses two HTTP connections: one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results. That makes much more sense for longer audio samples, or for streaming audio.
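If the “chunked” upload is plain HTTP/1.1 chunked transfer encoding (that is my assumption here, it isn't something the API documents for you), the framing on the POST side is simple: each piece of audio is sent as its length in hex, a CRLF, the data, and another CRLF, with a zero-length chunk at the end. A minimal sketch of that framing; the function names are made up for illustration and are not part of the class:

```php
<?php
// Frame one piece of audio as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF.
function http_chunk(string $data): string
{
    return dechex(strlen($data)) . "\r\n" . $data . "\r\n";
}

// The zero-length chunk that terminates a chunked body.
function http_last_chunk(): string
{
    return "0\r\n\r\n";
}

// Split a buffer into fixed-size pieces and frame each one in turn.
function http_chunked_body(string $audio, int $chunk_size = 4096): string
{
    $body = '';

    if ($audio !== '') {
        foreach (str_split($audio, $chunk_size) as $piece) {
            $body .= http_chunk($piece);
        }
    }

    return $body . http_last_chunk();
}
```

In a real streaming client you would write each chunk to the socket as the audio becomes available, rather than building the whole body up front.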

I created a simple PHP class to access this API; while this likely won’t make sense for anyone who wants to do a real-time stream, it should satisfy most cases where people just want to send “longer” audio clips.

Before you can use this PHP class, you must get a developer API key from Google. The class does not include one, and I cannot give you one. They’re free and easy to get: just go to the Google APIs site and sign up for one.

Then download the class below, and start with a simple example:

<?php
require 'google_speech.php';

$s = new cgoogle_speech('put your API key here'); 

$output = $s->process('@test.flac', 'en-US', 8000);      

print_r($output);
?>

Audio can be passed as a filename (by prefixing the file name with an ‘@’ sign), or by passing in raw FLAC content. The second argument is an IETF language tag; I’ve only been able to test with English and French, but I assume others work. It defaults to ‘en-US’. The third argument is the sample rate, which defaults to 8000.
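To make the ‘@’ convention concrete, here is roughly how an argument like that can be resolved into raw bytes. This is only a sketch: resolve_audio() is a hypothetical helper written for this example, not the code inside google_speech.php.

```php
<?php
// Hypothetical helper (not taken from google_speech.php): turn the first
// argument into raw audio bytes. A leading '@' means "read this file";
// anything else is treated as the audio content itself.
function resolve_audio(string $audio): string
{
    if (strncmp($audio, '@', 1) !== 0) {
        return $audio;
    }

    $filename = substr($audio, 1);
    if ($filename === '' || !is_readable($filename)) {
        throw new RuntimeException('unable to read audio file: ' . $filename);
    }

    $content = file_get_contents($filename);
    if ($content === false) {
        throw new RuntimeException('unable to read audio file: ' . $filename);
    }

    return $content;
}
```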

** Your sample rate must match your file- if it doesn’t, you’ll either get nothing returned, or you’ll get a really bad transcription. **
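If you're not sure what the sample rate of a file is, you can read it out of the FLAC header instead of guessing. In the FLAC format, the stream starts with the marker "fLaC", followed by a 4-byte metadata block header and the STREAMINFO block; the sample rate is a 20-bit value starting at byte 18. The helper below is a sketch (flac_sample_rate() is my own name, not part of the class), and it only looks at the header, it doesn't validate the rest of the file.

```php
<?php
// Read the sample rate (in Hz) from the STREAMINFO block of a FLAC stream.
// Layout: "fLaC" (4 bytes), block header (4 bytes), then STREAMINFO:
// block sizes (2 + 2 bytes), frame sizes (3 + 3 bytes), sample rate (20 bits).
function flac_sample_rate(string $flac): int
{
    if (strlen($flac) < 21 || substr($flac, 0, 4) !== 'fLaC') {
        throw new InvalidArgumentException('not a FLAC stream');
    }

    // the low 7 bits of the block header's first byte are the block type;
    // the first block must be STREAMINFO, which is type 0
    if ((ord($flac[4]) & 0x7F) !== 0) {
        throw new InvalidArgumentException('first metadata block is not STREAMINFO');
    }

    // 20 bits: all of byte 18, all of byte 19, and the top 4 bits of byte 20
    return (ord($flac[18]) << 12) | (ord($flac[19]) << 4) | (ord($flac[20]) >> 4);
}
```

You could then pass flac_sample_rate(file_get_contents('test.flac')) as the third argument to process(), instead of hard-coding 8000.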

The output will return as an array, and should look something like this:

Array
(
    [0] => Array
        (
            [alternative] => Array
                (
                    [0] => Array
                        (
                            [transcript] => my CPU is a neural net processor a learning computer
                            [confidence] => 0.74177068
                        )
                    [1] => Array
                        (
                            [transcript] => my CPU is the neuron that process of learning
                        )
                    [2] => Array
                        (
                            [transcript] => my CPU is the neural net processor a learning
                        )
                    [3] => Array
                        (
                            [transcript] => my CPU is the neuron that process a balloon
                        )
                    [4] => Array
                        (
                            [transcript] => my CPU is the neural net processor a living
                        )
                )
            [final] => 1
        )
)
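To pull just the most likely transcript out of that array, something like this should work. It's a minimal sketch: best_transcripts() is a hypothetical helper (not part of the class), and it assumes the first alternative is the preferred one, which matches the sample above, where only the first entry carries a confidence value.

```php
<?php
// Collect the first alternative of every final result, using the array
// layout shown in the sample output above.
function best_transcripts(array $output): array
{
    $lines = array();

    foreach ($output as $result) {
        // skip results that are not final or have no alternatives
        if (empty($result['final']) || empty($result['alternative'][0]['transcript'])) {
            continue;
        }

        $best = $result['alternative'][0];

        $lines[] = array(
            'transcript' => $best['transcript'],
            'confidence' => isset($best['confidence']) ? (float)$best['confidence'] : null
        );
    }

    return $lines;
}
```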

Get the PHP class here: http://mikepultz.com/uploads/google_speech.php.zip

Mining Twitter API v1.1 Streams from PHP – with OAuth

This is a quick update to my post about a year ago, with details on how to mine Twitter streams in real-time using PHP. This new code includes updates for the v1.1 API, including authentication using OAuth.

The first thing you need to do is sign in to the Twitter developer portal with your Twitter account here: https://dev.twitter.com/user/login

Once you’ve logged in, click on your profile icon in the top right-hand corner, select “My applications”, and create a new application if you don’t already have one.

Select the option to create the access token as well, as the requests need to be signed by a Twitter account.

The Code

ctwitter_stream.php

class ctwitter_stream
{
    private $m_oauth_consumer_key;
    private $m_oauth_consumer_secret;
    private $m_oauth_token;
    private $m_oauth_token_secret;

    private $m_oauth_nonce;
    private $m_oauth_signature;
    private $m_oauth_signature_method = 'HMAC-SHA1';
    private $m_oauth_timestamp;
    private $m_oauth_version = '1.0';

    public function __construct()
    {
        //
        // set a time limit to unlimited
        //
        set_time_limit(0);
    }

    //
    // set the login details
    //
    public function login($_consumer_key, $_consumer_secret, $_token, $_token_secret)
    {
        $this->m_oauth_consumer_key     = $_consumer_key;
        $this->m_oauth_consumer_secret  = $_consumer_secret;
        $this->m_oauth_token            = $_token;
        $this->m_oauth_token_secret     = $_token_secret;

        //
        // generate a nonce; we're just using a random md5() hash here.
        //
        $this->m_oauth_nonce = md5(mt_rand());

        return true;
    }

    //
    // process a tweet object from the stream
    //
    private function process_tweet(array $_data)
    {
        print_r($_data);

        return true;
    }

    //
    // the main stream manager
    //
    public function start(array $_keywords)
    {
        while(1)
        {
            $fp = fsockopen("ssl://stream.twitter.com", 443, $errno, $errstr, 30);
            if (!$fp)
            {
                echo "ERROR: Twitter Stream Error: failed to open socket";
            } else
            {
                //
                // build the data and store it so we can get a length
                //
                $data = 'track=' . rawurlencode(implode(',', $_keywords));

                //
                // store the current timestamp
                //
                $this->m_oauth_timestamp = time();

                //
                // generate the base string based on all the data
                //
                $base_string = 'POST&' . 
                    rawurlencode('https://stream.twitter.com/1.1/statuses/filter.json') . '&' .
                    rawurlencode('oauth_consumer_key=' . $this->m_oauth_consumer_key . '&' .
                        'oauth_nonce=' . $this->m_oauth_nonce . '&' .
                        'oauth_signature_method=' . $this->m_oauth_signature_method . '&' . 
                        'oauth_timestamp=' . $this->m_oauth_timestamp . '&' .
                        'oauth_token=' . $this->m_oauth_token . '&' .
                        'oauth_version=' . $this->m_oauth_version . '&' .
                        $data);

                //
                // generate the secret key to use to hash
                //
                $secret = rawurlencode($this->m_oauth_consumer_secret) . '&' . 
                    rawurlencode($this->m_oauth_token_secret);

                //
                // generate the signature using HMAC-SHA1
                //
                // hash_hmac() requires PHP >= 5.1.2 or PECL hash >= 1.1
                //
                $raw_hash = hash_hmac('sha1', $base_string, $secret, true);

                //
                // base64 then urlencode the raw hash
                //
                $this->m_oauth_signature = rawurlencode(base64_encode($raw_hash));

                //
                // build the OAuth Authorization header
                //
                $oauth = 'OAuth oauth_consumer_key="' . $this->m_oauth_consumer_key . '", ' .
                        'oauth_nonce="' . $this->m_oauth_nonce . '", ' .
                        'oauth_signature="' . $this->m_oauth_signature . '", ' .
                        'oauth_signature_method="' . $this->m_oauth_signature_method . '", ' .
                        'oauth_timestamp="' . $this->m_oauth_timestamp . '", ' .
                        'oauth_token="' . $this->m_oauth_token . '", ' .
                        'oauth_version="' . $this->m_oauth_version . '"';

                //
                // build the request
                //
                $request  = "POST /1.1/statuses/filter.json HTTP/1.1\r\n";
                $request .= "Host: stream.twitter.com\r\n";
                $request .= "Authorization: " . $oauth . "\r\n";
                $request .= "Content-Length: " . strlen($data) . "\r\n";
                $request .= "Content-Type: application/x-www-form-urlencoded\r\n\r\n";
                $request .= $data;

                //
                // write the request
                //
                fwrite($fp, $request);

                //
                // set it to non-blocking
                //
                stream_set_blocking($fp, 0);

                while(!feof($fp))
                {
                    $read   = array($fp);
                    $write  = null;
                    $except = null;

                    //
                    // select, waiting up to 10 minutes for a tweet; if we don't get one, then
                    // then reconnect, because it's possible something went wrong.
                    //
                    $res = stream_select($read, $write, $except, 600, 0);
                    if ( ($res == false) || ($res == 0) )
                    {
                        break;
                    }

                    //
                    // read the JSON object from the socket
                    //
                    $json = fgets($fp);

                    //
                    // look for a HTTP response code
                    //
                    if (strncmp($json, 'HTTP/1.1', 8) == 0)
                    {
                        $json = trim($json);
                        if ($json != 'HTTP/1.1 200 OK')
                        {
                            echo 'ERROR: ' . $json . "\n";
                            return false;
                        }
                    }

                    //
                    // if there is some data, then process it
                    //
                    if ( ($json !== false) && (strlen($json) > 0) )
                    {
                        //
                        // decode the socket to a PHP array
                        //
                        $data = json_decode($json, true);
                        if ($data)
                        {
                            //
                            // process it
                            //
                            $this->process_tweet($data);
                        }
                    }
                }
            }

            //
            // only close the socket if it was actually opened
            //
            if ($fp)
            {
                fclose($fp);
            }
            sleep(10);
        }

        return;
    }
};

The “process_tweet()” method will be called for each matching tweet- just modify that method to process the tweet however you want (load it into a database, print it to screen, email it, etc). The keyword matching isn’t perfect- if you search for a string of words, it won’t necessarily match the words in that exact order, but you can check that yourself from the process_tweet() method.
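For example, if you only want tweets that contain a phrase with the words in that exact order, a check like this could be called from process_tweet(). It's a sketch: tweet_has_phrase() is a name I made up, it relies on the tweet's text field, and the comparison is a simple case-insensitive byte match, so it won't fold case for non-ASCII characters.

```php
<?php
// Return true if the tweet's text contains the phrase with the words in
// that exact order (ASCII case-insensitive, runs of whitespace collapsed).
function tweet_has_phrase(array $tweet, string $phrase): bool
{
    if (!isset($tweet['text']) || !is_string($tweet['text']) || $phrase === '') {
        return false;
    }

    // collapse runs of whitespace so "net   processor" still matches
    $text = preg_replace('/\s+/', ' ', $tweet['text']);
    if ($text === null) {
        return false;
    }

    return stripos($text, $phrase) !== false;
}
```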

Then create a simple PHP application to run the collector:

require 'ctwitter_stream.php';

$t = new ctwitter_stream();

$t->login('consumer_key', 'consumer secret', 'access token', 'access secret');

$t->start(array('facebook', 'fbook', 'fb'));

You’ll need to provide the Consumer Key, Consumer Secret, Access Token, and the Access Secret, all of which are available from the Details section of your Application.

This new class uses the PHP hash_hmac() function for OAuth, which is available only in PHP 5.1.2 and up, or with the PECL hash extension 1.1 and up.
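The class builds the signature base string by hand, with the parameters already in alphabetical order. If you want to sign requests with other parameters, a more general version would percent-encode every name and value and sort them first, as OAuth 1.0 (RFC 5849) describes. Here is a sketch of that; the two function names are mine, and it doesn't handle repeated parameter names.

```php
<?php
// Build the OAuth 1.0 signature base string: METHOD & encoded URL & encoded,
// sorted "name=value" pairs. $params holds both the oauth_* values and the
// request parameters (e.g. track).
function oauth_base_string(string $method, string $url, array $params): string
{
    $encoded = array();
    foreach ($params as $name => $value) {
        $encoded[rawurlencode((string)$name)] = rawurlencode((string)$value);
    }

    // sort by encoded parameter name
    ksort($encoded, SORT_STRING);

    $pairs = array();
    foreach ($encoded as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }

    return strtoupper($method) . '&' . rawurlencode($url) . '&' .
        rawurlencode(implode('&', $pairs));
}

// Sign the base string with HMAC-SHA1; the key is the two encoded secrets
// joined with '&'. The result still needs rawurlencode() before it goes
// into the Authorization header.
function oauth_sign(string $base_string, string $consumer_secret, string $token_secret): string
{
    $key = rawurlencode($consumer_secret) . '&' . rawurlencode($token_secret);

    return base64_encode(hash_hmac('sha1', $base_string, $key, true));
}
```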

You can also download the file here: http://mikepultz.com/uploads/ctwitter_stream.php.zip

Net_DNS2 Version 1.3.0 – More DNSSEC Features

This release includes many new DNSSEC changes, including a new, simple “dnssec” flag that tells the server to send all the DNSSEC-related resource records for the given zone, and to set the AD flag to indicate whether the data is authentic. This is analogous to the “+dnssec” option of the command-line dig command.

Setting “dnssec” to true makes Net_DNS2 automatically add an OPT record to the additional section of the request, with the DO bit set to 1, indicating that we would like the DNSSEC information related to the given zone.

$resolver = new Net_DNS2_Resolver(array('nameservers' => array('8.8.8.8')));

$resolver->dnssec = true;

$result = $resolver->query('org', 'SOA', 'IN');

print_r($result);

Produces:

Net_DNS2_Packet_Response Object
(
    [answer_from] => 8.8.8.8
    [answer_socket_type] => 2
    [header] => Net_DNS2_Header Object
        (
            [id] => 31102
            [qr] => 1
            [opcode] => 0
            [aa] => 0
            [tc] => 0
            [rd] => 1
            [ra] => 1
            [z] => 0
            [ad] => 1
            [cd] => 0
            [rcode] => 0
            [qdcount] => 1
            [ancount] => 2
            [nscount] => 0
            [arcount] => 1
        )

    [question] => Array
        (
            [0] => Net_DNS2_Question Object
                (
                    [qname] => org
                    [qtype] => SOA
                    [qclass] => IN
                )

        )

    [answer] => Array
        (
            [0] => Net_DNS2_RR_SOA Object
                (
                    [mname] => a0.org.afilias-nst.info
                    [rname] => noc.afilias-nst.info
                    [serial] => 2010472684
                    [refresh] => 1800
                    [retry] => 900
                    [expire] => 604800
                    [minimum] => 86400
                    [name] => org
                    [type] => SOA
                    [class] => IN
                    [ttl] => 886
                    [rdlength] => 51
                )

            [1] => Net_DNS2_RR_RRSIG Object
                (
                    [typecovered] => SOA
                    [algorithm] => 7
                    [labels] => 1
                    [origttl] => 900
                    [sigexp] => 20130429014033
                    [sigincep] => 20130408004033
                    [keytag] => 31380
                    [signname] => org
                    [signature] => KBWEIC7BTypmbMTPU2KjCkPDbN1tV29ShWqa2zoGb4uQcRDBgYhz2ajpOaaJPrK+YY2E7BavLI+kulhJn9r/5kjXlOHQG/34B+OFlQwTTwHIRqtSmBu1qJorJSrSObQGVjZt4hteNVF6rfbS2u1m/Rh43eaoVCHfhJaeyr+MzLA=
                    [name] => org
                    [type] => RRSIG
                    [class] => IN
                    [ttl] => 886
                    [rdlength] => 151
                )

        )

    [authority] => Array
        (
        )

    [additional] => Array
        (
            [0] => Net_DNS2_RR_OPT Object
                (
                    [option_code] => 
                    [option_length] => 0
                    [option_data] => 
                    [extended_rcode] => 0
                    [version] => 0
                    [do] => 1
                    [z] => 0
                    [name] => 
                    [type] => OPT
                    [class] => 512
                    [ttl] => 32768
                    [rdlength] => 0
                    [rdata] => 
                )

        )
)

You can see that the response includes the original OPT RR in the additional section, with the DO bit set to 1. The header section also includes the AD bit set to 1, indicating that the server considers the data authentic.
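If you'd rather check those bits in code than read through print_r() output, a small helper along these lines would do it. It's a sketch: dnssec_summary() is a hypothetical function, and it only relies on the property names visible in the dump above (header->ad, the type of each record, typecovered, and do), so it works on any object with that shape.

```php
<?php
// Summarize the DNSSEC-related parts of a response object, using only the
// properties shown in the print_r() output above.
function dnssec_summary($response): array
{
    // which record types came back with an RRSIG covering them
    $signed_types = array();
    foreach ($response->answer as $rr) {
        if ($rr->type === 'RRSIG') {
            $signed_types[] = $rr->typecovered;
        }
    }

    // was an OPT RR with the DO bit set echoed back in the additional section
    $do = false;
    foreach ($response->additional as $rr) {
        if ($rr->type === 'OPT' && (int)$rr->do === 1) {
            $do = true;
        }
    }

    return array(
        'authentic'    => ((int)$response->header->ad === 1),
        'do'           => $do,
        'signed_types' => $signed_types
    );
}
```

Keep in mind that the AD bit only tells you what the server claims; nothing is validated locally.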

I’ve also included the ability to adjust the AD flag when making a query (to indicate to the server that we’d like the value of the AD bit, without having to set the DO bit in the OPT RR; see RFC 6840 section 5.7), and to adjust the CD flag (telling the server that the client will perform its own signature validation).

Net_DNS2 does not validate the DNSSEC signatures itself, but it does provide all the data from DNS needed so that users can. Future versions of Net_DNS2 may provide support for this.

See the change log page for a full list of changes in this release.

You can install Net_DNS2 version 1.3.0 directly from PEAR, using the command line PEAR installer:

pear install Net_DNS2

Or download it directly from the Google Code page here.

Net_DNS2 Version 1.2.5 Released

I’ve released version 1.2.5 of the PEAR Net_DNS2 library- you can install it now through the command line PEAR installer:

pear install Net_DNS2

Or download it directly from the Google Code page here.

This release includes some important fixes to the way I was calculating the offset values when building the DNS packets. Here is the full list of changes for this release:

  • changed the socket_connect() code to start off non-blocking, and call select() after connect() so a timeout on an invalid server works properly
  • added the new TLSA RR – RFC 6698
  • fixed the socket defines again; apparently the values of the SOCK defines are different under Solaris
  • changed the Net_DNS2_Updater::update() so you can pass a reference to a variable that will be populated with the response object
  • moved the lines that add the response server/type to after the is_null() check- it should have been there to begin with.
  • fixed a whole bunch of cases where I wasn’t incrementing the offset values properly
  • added support to set the RD (recursion desired) bit when making a request

How To Mine Twitter Streams from PHP in Real Time

UPDATE: I’ve written a new post with an example on how to connect to the v1.1 Twitter API, using OAuth – here.

Need to mine Twitter for tweets related to certain keywords?

No problem-

Twitter provides a pretty simple streaming interface to the onslaught of tweets it receives, letting you specify whatever keywords you want to search for, in a real-time “live” way.

To do this, I created a simple PHP class that can run in the background, collecting tweets for certain keywords:

ctwitter_stream.php

class ctwitter_stream
{
    private $m_username;
    private $m_password;

    public function __construct()
    {
        //
        // set a time limit to unlimited
        //
        set_time_limit(0);
    }

    //
    // set the login details
    //
    public function login($_username, $_password)
    {
        $this->m_username = $_username;
        $this->m_password = $_password;
    }

    //
    // process a tweet object from the stream
    //
    private function process_tweet(array $_data)
    {
        print_r($_data);

        return true;
    }

    //
    // the main stream manager
    //
    public function start(array $_keywords)
    {
        while(1)
        {
            $fp = fsockopen("ssl://stream.twitter.com", 443, $errno, $errstr, 30);
            if (!$fp)
            {
                echo "ERROR: Twitter Stream Error: failed to open socket";
            } else
            {
                //
                // build the request
                //
                $request  = "GET /1/statuses/filter.json?track=";
                $request .= urlencode(implode(',', $_keywords)) . " HTTP/1.1\r\n";
                $request .= "Host: stream.twitter.com\r\n";
                $request .= "Authorization: Basic ";
                $request .= base64_encode($this->m_username . ':' . $this->m_password);
                $request .= "\r\n\r\n";

                //
                // write the request
                //
                fwrite($fp, $request);

                //
                // set it to non-blocking
                //
                stream_set_blocking($fp, 0);

                while(!feof($fp))
                {
                    $read   = array($fp);
                    $write  = null;
                    $except = null;

                    //
                    // select, waiting up to 10 minutes for a tweet; if we don't get one, then
                    // then reconnect, because it's possible something went wrong.
                    //
                    $res = stream_select($read, $write, $except, 600, 0);
                    if ( ($res == false) || ($res == 0) )
                    {
                        break;
                    }

                    //
                    // read the JSON object from the socket
                    //
                    $json = fgets($fp);
                    if ( ($json !== false) && (strlen($json) > 0) )
                    {
                        //
                        // decode the socket to a PHP array
                        //
                        $data = json_decode($json, true);
                        if ($data)
                        {
                            //
                            // process it
                            //
                            $this->process_tweet($data);
                        }
                    }
                }
            }

            //
            // only close the socket if it was actually opened
            //
            if ($fp)
            {
                fclose($fp);
            }
            sleep(10);
        }

        return;
    }
};

The “process_tweet()” method will be called for each matching tweet- just modify that method to process the tweet however you want (load it into a database, print it to screen, email it, etc). The keyword matching isn’t perfect- if you search for a string of words, it won’t necessarily match the words in that exact order, but you can check that yourself from the process_tweet() method.

Then create a simple PHP application to run the collector:

require 'ctwitter_stream.php';

$t = new ctwitter_stream();

$t->login('your twitter username', 'your twitter password');

$t->start(array('facebook', 'fbook', 'fb'));

Just provide your Twitter account username/password, and then an array of keywords/strings to search for.

Since this application runs continuously in the background, it’s obviously not meant to be run via a web request, but meant to be run from the command line of your Unix or Windows box.

According to the Twitter documentation, the default access level allows up to 400 keywords, so you can track all sorts of things at the same time. If you need more details about the Twitter streaming API, it’s available here.
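If your keyword list is built from user input or a database, it's worth cleaning it up and checking it against that limit before connecting. A minimal sketch, assuming the 400 keyword limit mentioned above; build_track() is a hypothetical helper, and it doesn't try to handle keywords that contain commas themselves.

```php
<?php
// Trim, de-duplicate and join a keyword list into the comma-separated
// "track" value, refusing lists that are over the limit.
function build_track(array $keywords, int $limit = 400): string
{
    $clean = array();

    foreach ($keywords as $keyword) {
        $keyword = trim((string)$keyword);
        if ($keyword !== '') {
            $clean[] = $keyword;
        }
    }

    // drop exact duplicates, keeping the first occurrence of each
    $clean = array_values(array_unique($clean));

    if (count($clean) > $limit) {
        throw new LengthException('too many keywords: ' . count($clean) . ' (limit ' . $limit . ')');
    }

    return implode(',', $clean);
}
```

The result still needs to be URL-encoded before it goes into the request, the same way the class does it.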

This class uses the HTTPS PHP stream– so you’ll need the OpenSSL extension enabled for it to work.
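You can check both of those requirements (command line, SSL support) up front, before starting the collector. This is just a sketch: stream_preflight() is a made-up helper, and the two arguments only exist so the checks can be tried with fake values; call it with no arguments to test the running PHP.

```php
<?php
// Return a list of problems that would stop the collector from working:
// not running from the command line, or no "ssl" stream transport.
function stream_preflight(?string $sapi = null, ?array $transports = null): array
{
    $sapi       = ($sapi === null) ? PHP_SAPI : $sapi;
    $transports = ($transports === null) ? stream_get_transports() : $transports;

    $problems = array();

    if ($sapi !== 'cli') {
        $problems[] = 'run this from the command line, not via a web request (SAPI is ' . $sapi . ')';
    }

    // the ssl:// transport is only registered when OpenSSL support is enabled
    if (!in_array('ssl', $transports, true)) {
        $problems[] = 'the ssl:// transport is missing; enable the OpenSSL extension';
    }

    return $problems;
}
```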