I’m not sure how I managed to miss this, but I just happened to run across a ‘new’ (well, newish), albeit still unofficial, offering of Google today: text-to-speech. You can see what few details there are on this Techcrunch post. Basically, it just boils down to this though – you send your phrase to be spoken in a GET request like so ‘http://translate.google.com/translate_tts?tl=en&q=hello%20world’, and in return Google gives you an .mp3 formatted sound file of your phrase being spoken in some non-threatening female robot voice.
Of course, loading .mp3 files into Flash is a piece of cake, so integration with the Flash platform is nothing at all. You can even try your hand at the integration against a mock API first. The one thing that may get you is that Flash doesn’t like to load .mp3s across domains. A very basic server-side proxy script will get you around that easily enough though. Here’s a quick example.
This will take a String phrase and generate the URL to the Google translator for you:
    package {

        public class GTextToSpeech {

            private var _language:String;

            public function GTextToSpeech(language:String = "en") {
                _language = language;
            }

            /**
             * Use this to get the URL of the mp3 containing the spoken words of the 'phrase' parameter
             * @param phrase
             * @return String URL to Google Text to Speech engine
             */
            public function say(phrase:String):String {
                if (phrase.length > 100)
                    throw new Error("Google currently only supports phrases less than 100 characters in length.");

                var qs:String = "tl=" + _language + "&q=";
                // encodeURIComponent (rather than encodeURI) also escapes
                // '&', '=', '+' and '#', which would otherwise cut the query short.
                qs += encodeURIComponent(phrase);

                return "http://translate.google.com/translate_tts?" + qs;
            }
        }
    }
And here’s an example of a bare minimum proxy in php:
    <?php
    $url = (isset($_POST['url']) && $_POST['url']) ? $_POST['url'] : 'undefined';
    if ($url != 'undefined') echo file_get_contents($url);
    ?>
And put it together (using the Bit-101 MinimalComps):
    package {

        import com.bit101.components.HBox;
        import com.bit101.components.InputText;
        import com.bit101.components.PushButton;
        import com.bit101.components.Style;
        import flash.display.Sprite;
        import flash.events.Event;
        import flash.events.MouseEvent;
        import flash.media.Sound;
        import flash.media.SoundChannel;
        import flash.net.URLRequest;
        import flash.net.URLRequestMethod;
        import flash.net.URLVariables;

        /**
         * quick test of Google text-to-speech tool
         * @author Devon O.
         */
        [SWF(width='350', height='80', backgroundColor='#FFFFFF', frameRate='12')]
        public class Main extends Sprite {

            private var _phrase:InputText;
            private var _goButton:PushButton;
            private var _speech:Sound = new Sound();
            private var _channel:SoundChannel;
            private var _tts:GTextToSpeech = new GTextToSpeech();

            public function Main():void {
                if (stage) init();
                else addEventListener(Event.ADDED_TO_STAGE, init);
            }

            private function init(e:Event = null):void {
                removeEventListener(Event.ADDED_TO_STAGE, init);
                // entry point
                initUI();
            }

            private function initUI():void {
                Style.embedFonts = true;
                Style.BUTTON_FACE = 0x000000;

                var container:HBox = new HBox(this, 25, 25);

                _phrase = new InputText(container);
                _phrase.width = 200;
                _phrase.maxChars = 100;

                _goButton = new PushButton(container, 0, 0, "Speak", goHandler);
            }

            private function goHandler(event:MouseEvent):void {
                if (_phrase.text != "") {
                    _goButton.enabled = false;

                    var req:URLRequest = new URLRequest("proxy.php");
                    req.method = URLRequestMethod.POST;

                    var vars:URLVariables = new URLVariables();
                    vars.url = _tts.say(_phrase.text);
                    req.data = vars;

                    _speech.load(req);
                    _channel = _speech.play();
                    _channel.addEventListener(Event.SOUND_COMPLETE, soundDoneHandler);
                } else {
                    _phrase.text = "Type something here, dummy.";
                }
            }

            private function soundDoneHandler(event:Event):void {
                _goButton.enabled = true;
            }
        }
    }
And that will give you this (just type something in the box, and hit the ‘Speak’ button. Of course, you’ll need your sound on):
According to Techcrunch, the API is currently limited to phrases of 100 characters, but I haven’t actually tested that to check its validity. Even if that’s the case, it’s still quite a fun little toy to play with and can add a bit of dimension and accessibility to Flash apps to come.
Awesome! It looks like it does work for other languages. Here’s French: http://translate.google.com/translate_tts?tl=fr&q=je+ne+comprends+pas
Going to have to have a play around with this…
Thanks Paul. It is pretty damn fun.
One thing I just noticed – it seems to max out at less than 100 chars though – so if anyone has problems with the example above where the button locks up and nothing plays, try refreshing the page and using a shorter phrase.
Hi there
I posted a simple AS3 interface to Google’s textToSpeech API about 3 weeks ago (8 May 2010), you can read more about it and download at: http://peteshand.net/blog/index.php/actionscript-text-to-speech/
I’ve also added the ability to convert strings longer than 100 characters by splitting the string and making multiple requests. Multilingual support is next on the to-do list.
Hey Pete. I am definitely slow on the draw. Splitting the phrases up into smaller requests is a great idea!
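For anyone curious what that splitting might look like, here is a minimal sketch. It is not Pete’s implementation; the class name PhraseSplitter and its split() method are made up for illustration. It assumes the limit applies to the raw character count and breaks on spaces so that words stay intact where possible.

```actionscript
package {

    /**
     * Hypothetical helper (not part of any library mentioned here):
     * breaks a phrase into chunks of at most maxLength characters,
     * splitting on spaces where possible.
     */
    public class PhraseSplitter {

        public static function split(phrase:String, maxLength:int = 100):Array {
            var chunks:Array = [];
            var current:String = "";
            var words:Array = phrase.split(" ");

            for (var i:int = 0; i < words.length; i++) {
                var word:String = words[i];
                if (word.length == 0) continue; // empty entries come from repeated spaces

                // A single word longer than the limit has to be cut mid-word.
                while (word.length > maxLength) {
                    if (current.length > 0) {
                        chunks.push(current);
                        current = "";
                    }
                    chunks.push(word.substr(0, maxLength));
                    word = word.substr(maxLength);
                }

                if (current.length == 0) {
                    current = word;
                } else if (current.length + 1 + word.length <= maxLength) {
                    current += " " + word; // still fits, including the joining space
                } else {
                    chunks.push(current);  // chunk is full, start a new one
                    current = word;
                }
            }

            if (current.length > 0) chunks.push(current);
            return chunks;
        }
    }
}
```

Each chunk could then be passed to say() in turn, starting the next request when the previous sound fires SOUND_COMPLETE. Since the real limit seems to be somewhat below 100, passing a smaller maxLength is probably the safer choice.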
Awesome idea :)
Anyway, it should max out at the browser’s/vendor’s limit for the length of a request URL (IE supports 2083), other browsers support more (4k for Opera)
Oops... that’s 2083 and 4k characters
Very neat. I will definitely be giving this a try in ActionScript.
Also, you can reduce that proxy to one single line of code. ;)
if ($_POST['url']) echo file_get_contents($_POST['url']);
REMEMBER, use file_get_contents() with care. If anyone decides to pass in the name of a PHP file (only the file name, such as “index.php” without the actual domain) it will return the PHP code for that page, which might leak sensitive information such as database passwords etc. Use with care!
Actually, that proxy is unnecessary in this case: simply *playing* a sound from another domain via the Sound class doesn’t require cross-domain authorization.
Forgot to mention the change:
_speech.load(new URLRequest(_tts.say(_phrase.text)));
Hi
Is there a way to also save the mp3 on your own server for later use?
hey David. Good question. I suppose once you get the url to the mp3 file, you could play around with a FileReference instance to download it locally. Saving it to a different web server could be a whole other ball of fish though. May be easier to convert it to wav and save it to your own server that way.
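A rough sketch of the FileReference idea, for saving to the user’s own machine. The class name SpeechDownloader and the default file name are invented for illustration, and this has not been tested against the Google URL; depending on the security sandbox, the request may need to go through a proxy rather than straight to Google.

```actionscript
package {

    import flash.events.Event;
    import flash.events.IOErrorEvent;
    import flash.net.FileReference;
    import flash.net.URLRequest;

    /**
     * Hypothetical helper: opens a save dialog for the mp3 at a given URL.
     */
    public class SpeechDownloader {

        // Held as a member so the FileReference is not garbage
        // collected while the download is still in progress.
        private var _file:FileReference;

        /**
         * Flash Player 10+ only allows download() in response to a
         * user action, so call this from a click handler.
         */
        public function save(mp3Url:String, fileName:String = "speech.mp3"):void {
            _file = new FileReference();
            _file.addEventListener(Event.COMPLETE, completeHandler);
            _file.addEventListener(IOErrorEvent.IO_ERROR, errorHandler);
            _file.download(new URLRequest(mp3Url), fileName);
        }

        private function completeHandler(event:Event):void {
            trace("Saved " + _file.name);
        }

        private function errorHandler(event:IOErrorEvent):void {
            trace("Download failed: " + event.text);
        }
    }
}
```

From a button handler this would be something like new SpeechDownloader().save(_tts.say(_phrase.text)), keeping a reference to the SpeechDownloader instance for the duration of the download.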
Hi Devon
I found an easy way here
http://masnun.com/blog/2009/12/14/googles-text-to-speech-api-a-php-wrapper-class/
so all you would have to do is pass the text from flex to this php file..simple.
the proxy will not allow for the correct pronunciation of special characters, i.e. “é”. so far i’ve tried the cURL and fopen functions, and now “echo file_get_contents($url);”. what is it that’s screwing up the special chars!?
I tried to compile the same code and run it locally in Adobe AIR 2.6. I got the stream error: “Error #2044: Unhandled IOErrorEvent:. text=Error #2032: Stream Error”. Can anybody point me in the right direction to solve it? I thought AIR was less strict about security checks.
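Error #2044 only says that an IOErrorEvent was dispatched with nothing listening for it. The sketch below does not explain why the stream fails in AIR; it just shows one way to catch the event so the app can report the failure instead of throwing. The class name SpeechPlayer and its handlers are made up for illustration.

```actionscript
package {

    import flash.events.Event;
    import flash.events.IOErrorEvent;
    import flash.media.Sound;
    import flash.media.SoundChannel;
    import flash.net.URLRequest;

    /**
     * Hypothetical helper: plays an mp3 request and reports load errors
     * rather than leaving the IOErrorEvent unhandled.
     */
    public class SpeechPlayer {

        private var _sound:Sound;
        private var _channel:SoundChannel;

        public function play(request:URLRequest):void {
            // A Sound object can load only one file, so use a fresh one per request.
            _sound = new Sound();
            _sound.addEventListener(IOErrorEvent.IO_ERROR, ioErrorHandler);
            _sound.load(request);

            // play() returns null when no sound channel is available.
            _channel = _sound.play();
            if (_channel != null) {
                _channel.addEventListener(Event.SOUND_COMPLETE, completeHandler);
            }
        }

        private function ioErrorHandler(event:IOErrorEvent):void {
            // event.text carries the underlying message, e.g. "Error #2032: Stream Error".
            trace("Could not load speech: " + event.text);
        }

        private function completeHandler(event:Event):void {
            trace("Finished speaking.");
        }
    }
}
```

With the listener in place, the message in event.text (and the URL being requested) is the first thing to look at when tracking down the real cause.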
Hi Pete.
Found this great text-to-speech of yours when I was still an AS2 freak, so I couldn’t test it. Now I am an AS3 freak (and loving it), but I can’t get the downloaded code to work. I get the following errors:
\psUtils\examples\TextToSpeech\main.as, Line 15 1046: Type was not found or was not a compile-time constant: TextToSpeech.
\psUtils\examples\TextToSpeech\main.as, Line 23 1180: Call to a possibly undefined method TextToSpeech.
\psUtils\examples\TextToSpeech\main.as, Line 24 1120: Access of undefined property Language.
\psUtils\examples\TextToSpeech\main.as, Line 6 1172: Definition ps.GText2Speech:TextToSpeech could not be found.
\psUtils\examples\TextToSpeech\main.as, Line 7 1172: Definition ps.GTranslate:Language could not be found.
Sorry, but it’s quite a lot. Any pointers on how I can overcome these? Maybe it’s obvious to you!
Thanks
Paul
Does anyone have a fix for the AIR 2.6 stream error ?
Paul
Killer……any idea why it doesn’t work more than once without refreshing the page? Or am I the only one having this problem?
the reason you have to refresh the page to hear more than one word is because of a coding error: each sound instance can load, at most, one sound file.
to remedy replace:
private var _speech:Sound = new Sound();
with:
private var _speech:Sound;
and replace:
_speech.load(req);
with:
_speech = new Sound();
_speech.load(req);
If you are looking for an offline solution to use TTS with AS3 and AIR for mobile, take a look at my ANE :
http://fabricemontfort.com/product/ezspeech-ane-air-native-extension/
Hope it helps.