Text-to-Speech Audio

The <audio type="tts"> tag creates an audio layer whose content is rendered to speech by the Amazon Polly text-to-speech service.

Example (Plain Text)

<movie width="640" height="360" framerate="30000/1001">
  <scene duration="audio">
    <audio type="tts" voice="Salli">
        Hello Impossible Software!
    </audio>
  </scene>
</movie>

Example (SSML)

<movie width="640" height="360" framerate="30000/1001">
  <scene duration="audio">
    <audio type="tts" voice="Joanna">
        <speak>
            This is a demo. The following words are
            <prosody volume="x-loud">
                quite a bit louder than the rest of this passage.
            </prosody>
            <prosody rate="x-slow">
                This text is spoken very slowly.
            </prosody>
            <prosody pitch="+15%">
                This is spoken with a higher pitch,
            </prosody>
            or
            <prosody pitch="-10%">
                is a lower pitch better?
            </prosody>
        </speak>
    </audio>
  </scene>
</movie>

SYNTAX

<audio
    type="tts"
    voice="name of Polly voice"
    duration="time | variable"
    offset="time"
    volume="number"
    condition="variable"
    >
    TEXT TO SPEAK | variable | resource | <speak>SSML</speak>
</audio>

ATTRIBUTES

voice
The name of an Amazon Polly voice, for example "Salli" or "Joanna". See the list of voices.
duration
How long this layer is rendered, given in frame or time format. Default is the natural duration of the spoken audio. Can be a variable.
offset
The time from the beginning of the scene at which this layer begins to render. Default is "0".
volume
The volume of this audio layer. Range is 0 to 1, default is "1".
condition
A variable definition that determines whether the layer is rendered. By default, the layer is always rendered.
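
Example (Offset and Volume)

The following is a minimal sketch of how the offset and volume attributes described above might be combined. It is not taken from the reference itself. The two spoken sentences and the attribute values are illustrative, and reading offset="3" as three seconds is an assumption about the time format that should be checked against the time-format documentation.

<movie width="640" height="360" framerate="30000/1001">
  <scene duration="audio">
    <!-- First layer: starts at the beginning of the scene at full volume (both defaults) -->
    <audio type="tts" voice="Salli">
        Hello Impossible Software!
    </audio>
    <!-- Second layer: assumed to start 3 seconds into the scene, at half volume -->
    <audio type="tts" voice="Joanna" offset="3" volume="0.5">
        This sentence is quieter and starts later.
    </audio>
  </scene>
</movie>

Because duration is omitted, each layer plays for the natural duration of its spoken audio.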

© 2017, Impossible Software, or its affiliates. All rights reserved.