Text to Speech Action (Beta) | Actions Developer Guide

This action is available in API version 66.0 and later.

Supported REST HTTP Methods

URI: /services/data/v66.0/actions/standard/textToSpeech

Formats: JSON, XML

HTTP Methods: POST

Authentication: Authorization: Bearer token
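
As a minimal sketch of how the URI, method, and authentication details above fit together, the following Python snippet prepares a POST request with the standard library. The instance URL and access token are placeholder values, and the helper name build_tts_request is hypothetical; neither comes from this guide.

```python
import json
import urllib.request

# Path taken from the URI line above.
ACTION_PATH = "/services/data/v66.0/actions/standard/textToSpeech"


def build_tts_request(instance_url: str, access_token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the Text to Speech action."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_PATH,
        data=body,
        method="POST",
        headers={
            # Bearer token authentication, as listed above.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder values for illustration only (assumptions, not real credentials).
request = build_tts_request(
    "https://example.my.salesforce.com",
    "ACCESS_TOKEN",
    {"inputs": [{"inputText": "Hello! How are you?"}]},
)
```

Passing the prepared request to urllib.request.urlopen would send it. The sketch stops short of that so it can be inspected without network access.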

Inputs

Input	Type	Description
inputText	string	Required. The text to convert to voice.
voiceSpeed	string	Optional. Specifies the speed at which the generated speech is delivered. This parameter increases or decreases the playback speed of the spoken audio output.
voiceStability	string	Optional. Specifies the stability of the generated speech output. This parameter controls the consistency and variation in speech delivery. Higher values produce more uniform speech, while lower values result in greater expressive variation.
voiceId	string	Optional. Specifies the identifier of the voice used to generate spoken audio. This parameter controls the tone and characteristics of the generated speech output. To retrieve available voice IDs, send a GET request to the Text to Speech REST endpoint.
fileOutput	boolean	Optional. Specifies whether the response returns an audio file output instead of Base64-encoded audio. The default is false.
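
The inputs in the table can be assembled into a request body along these lines. This is a sketch only: the helper name build_inputs is hypothetical, and the choice to omit unset optional inputs rather than send empty values is an assumption, not documented behavior.

```python
def build_inputs(input_text, voice_speed=None, voice_stability=None,
                 voice_id=None, file_output=None) -> dict:
    """Build a request body, including only the optional inputs that were set."""
    if not input_text:
        # inputText is the only required input in the table above.
        raise ValueError("inputText is required")

    entry = {"inputText": input_text}
    optional = {
        "voiceSpeed": voice_speed,          # string, per the table
        "voiceStability": voice_stability,  # string, per the table
        "voiceId": voice_id,                # string
        "fileOutput": file_output,          # boolean
    }
    for name, value in optional.items():
        if value is not None:
            entry[name] = value
    return {"inputs": [entry]}


payload = build_inputs("Hello! How are you?", voice_speed="1", file_output=True)
```

Leaving fileOutput unset relies on the documented default of false.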

Outputs

Output	Type	Description
convertedAudio		The generated audio output returned in Base64-encoded format based on the provided input text and voice settings.
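
Because convertedAudio is Base64-encoded, a client has to decode it before saving or playing the audio. The sketch below assumes the value is a standard Base64 string. The helper name and the stand-in bytes are illustrative and do not represent real audio.

```python
import base64


def decode_converted_audio(converted_audio: str) -> bytes:
    """Decode the Base64 text from convertedAudio into raw audio bytes."""
    return base64.b64decode(converted_audio)


# Stand-in data for illustration; a real response would carry encoded audio.
fake_audio = b"\x00\x01\x02 not real audio"
encoded = base64.b64encode(fake_audio).decode("ascii")
decoded = decode_converted_audio(encoded)
```

The decoded bytes can then be written to disk, for example with pathlib.Path("speech.bin").write_bytes(decoded). The file name is hypothetical, and the appropriate extension depends on the audio format the action returns.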

Usage

Sample Input

This sample converts text input to audio using the Text to Speech action. Because fileOutput is set to true, it requests an audio file output instead of Base64-encoded audio.

{
  "inputs": [
    {
      "inputText": "Hello! How are you?",
      "voiceSpeed": "1",
      "voiceStability": "0.5",
      "voiceId": "Jbte7ht1CqapnZvc4KpK",
      "fileOutput": true
    }
  ]
}

If fileOutput is set to false or not specified, the response returns Base64-encoded audio output.

Sample Output

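This sample response corresponds to a request that sets fileOutput to true. When fileOutput is false or not specified, the response instead returns the generated spoken audio as Base64-encoded audio data.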

{
  "outputs": [
    {
      "audioFile": "<audio file output>",
      "contentType": "audio/mpeg"
    }
  ]
}

The response returns generated spoken audio as a file output when fileOutput is set to true.
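
A client that handles both output modes might branch on which field is present. The sketch below assumes the response has the shape of the sample output above, and that convertedAudio appears in the same "outputs" entry when fileOutput is false. This guide does not show that second shape, so treat it as an assumption. The helper name describe_output is hypothetical.

```python
def describe_output(response: dict) -> dict:
    """Summarize the first output entry as either file or Base64 audio."""
    first = response["outputs"][0]
    if "audioFile" in first:
        # Shape shown in the sample output (fileOutput set to true).
        return {
            "kind": "file",
            "audio": first["audioFile"],
            "content_type": first.get("contentType"),
        }
    if "convertedAudio" in first:
        # Assumed shape when fileOutput is false or not specified.
        return {"kind": "base64", "audio": first["convertedAudio"], "content_type": None}
    raise KeyError("no audioFile or convertedAudio in the first output entry")


sample = {"outputs": [{"audioFile": "<audio file output>", "contentType": "audio/mpeg"}]}
summary = describe_output(sample)
```

Raising an error for an unrecognized shape, rather than returning an empty result, makes a mismatch with the actual response format visible early.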