Perl Text-to-Speech using Cepstral voices (libswift)

I’ve released two new Perl modules:

Speech::Swift – a Perl interface to the Cepstral text-to-speech engine, Swift.

and

Speech::Swift::Simple – a simplified interface to Speech::Swift

This code requires the libswift shared library, which is included with every voice downloaded from Cepstral.

There are two releases because the Speech::Swift module exports all (well, almost all) of the underlying functions of the libswift.so library, while Speech::Swift::Simple offers a simplified interface that generates speech in just a few function calls.

For example:

#!/usr/bin/perl

use Speech::Swift::Simple;

#
# create a new Speech::Swift::Simple with one-channel audio and 16-bit encoding.
#
my $s = Speech::Swift::Simple->new(
         channels => 1,
         encoding => Speech::Swift::AUDIO_ENCODING_PCM16
);

#
# set the voice to use by name
#
$s->set_voice("Allison");

#
# synthesize the text, and return it as a Speech::Swift::Simple::Wav object
#
my $wav = $s->generate("My name is allison");

#
# write the wav object to a file.
#
$wav->write("test.wav");

Or use the Speech::Swift library directly, for a more low-level interface.
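Returning to the Speech::Swift::Simple example above, the same calls can be wrapped in a small helper when several prompts are needed at once. This is only a sketch: render_prompts is a hypothetical helper, the file names are made up, and it assumes that generate() returns a false value when synthesis fails, which the module may or may not do. It relies on nothing beyond the generate() and write() methods already shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Render each prompt to its own WAV file.
# $tts is any object with a generate($text) method whose result has a
# write($path) method, as Speech::Swift::Simple does in the example above.
# $prompts maps output file names to the text to speak.
# Returns the list of files written, sorted by file name.
sub render_prompts {
    my ($tts, $prompts) = @_;
    my @written;
    for my $file (sort keys %$prompts) {
        my $wav = $tts->generate($prompts->{$file});
        # Assumption: generate() returns a false value on failure.
        unless ($wav) {
            warn "Could not synthesize '$file', skipping\n";
            next;
        }
        $wav->write($file);
        push @written, $file;
    }
    return @written;
}

# Usage, with the Cepstral modules installed (file names are hypothetical):
#
#   use Speech::Swift::Simple;
#   my $s = Speech::Swift::Simple->new(
#       channels => 1,
#       encoding => Speech::Swift::AUDIO_ENCODING_PCM16,
#   );
#   $s->set_voice("Allison");
#   render_prompts($s, {
#       'welcome.wav' => 'Welcome to the support line.',
#       'goodbye.wav' => 'Thank you for calling.',
#   });
```

Passing the synthesizer in as an argument keeps the helper independent of how the object was configured, so the voice and encoding are still set exactly as in the example above.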

The audio output is always a WAV file; you can use one of the many audio modules available from CPAN, such as Audio::GSM or Audio::MPEG, to re-encode the audio as needed.
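Before handing a generated file to another encoder, it can be useful to confirm that it really has the channel count and sample size that were requested. The following is a minimal sketch using only core Perl; wav_info is a hypothetical helper, and it assumes the canonical WAV layout in which the "fmt " chunk immediately follows the RIFF header, which may not hold for every file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse the first 36 bytes of a WAV file held in a byte string.
# Returns a hash reference describing the audio, or nothing if the data
# does not start with a RIFF/WAVE header followed directly by a "fmt " chunk.
sub wav_info {
    my ($bytes) = @_;
    return if length($bytes) < 36;

    # a4 = 4-byte tag (kept verbatim), V = 32-bit little-endian,
    # v = 16-bit little-endian
    my ($riff, $riff_size, $wave, $fmt, $fmt_size,
        $format, $channels, $rate, $byte_rate, $block_align, $bits)
        = unpack('a4 V a4 a4 V v v V V v v', $bytes);

    return unless $riff eq 'RIFF' && $wave eq 'WAVE' && $fmt eq 'fmt ';

    return {
        format      => $format,      # 1 means uncompressed PCM
        channels    => $channels,
        sample_rate => $rate,
        bits        => $bits,
    };
}

# Usage: pass the path of a generated file, e.g. test.wav, on the command line.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!\n";
    my $header = '';
    read $fh, $header, 36;
    close $fh;
    my $info = wav_info($header)
        or die "$ARGV[0] does not look like a WAV file\n";
    printf "%d channel(s), %d-bit, %d Hz\n",
        @{$info}{qw(channels bits sample_rate)};
}
```

For the file written by the example above, the reported values should show one channel and 16 bits per sample if the requested settings were applied.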

Both Perl modules are now available for download from CPAN.

First Release of PHP Swift TTS Extension

I’m happy to announce the first release of the Swift Text-To-Speech PHP extension. Swift is the free TTS engine provided with any Cepstral TTS voice. Many Asterisk fans will recognize the Cepstral Allison voice as the default voice for Asterisk installations.

The extension will only work on systems supported by the Swift engine, and has only been tested (so far) on Linux (CentOS).

The extension generates audio from the text provided, and the audio can be exported in several different formats, including:

  • PCM (RAW audio)
  • u-law / a-law (logarithmically encoded RAW audio)
  • WAV (RAW audio)
  • GSM (when compiled with the libgsm library)
  • MP3 (when compiled with the libmp3lame library)

A simple example of how to use the extension:

<?php

//
// create the new TTS object
//
$tts = new SwiftTTS();

//
// set a voice to use for generation
//
$tts->setVoice("Allison");

//
// generate text, and return a stream for the audio
//
$s = $tts->generate("hello my name is allison", SwiftTTS::FORMAT_WAV);
if ($s !== false)
{
        //
        // write the stream contents to a file
        //
        file_put_contents("audio.wav", $s);
}

For more details, and to download the current version, see the Google Code page.