Text to Speech Action (Beta) | Actions Developer Guide

This action is available in API version 66.0 and later.

Supported REST HTTP Methods

URI: /services/data/v66.0/actions/standard/textToSpeech
Formats: JSON, XML
HTTP Methods: POST
Authentication: Authorization: Bearer token
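
As a rough illustration of the REST details above, the following Python sketch builds (without sending) a POST request to the action URI using only the standard library. The instance URL and access token are placeholders, and the helper name build_request is hypothetical; only the URI path, the POST method, the JSON format, and the Bearer authorization header come from this page.

```python
import json
import urllib.request

# Placeholder values (assumptions): substitute your org's instance URL and a valid access token.
INSTANCE_URL = "https://example.my.salesforce.com"
ACCESS_TOKEN = "<access token>"

# URI documented above for API version 66.0.
ACTION_PATH = "/services/data/v66.0/actions/standard/textToSpeech"


def build_request(instance_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build, but do not send, a JSON POST request for the Text to Speech action."""
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    INSTANCE_URL,
    ACCESS_TOKEN,
    {"inputs": [{"inputText": "Hello! How are you?"}]},
)
# To actually send the request, pass it to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.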

Inputs

Input	Type	Description
inputText	string	Required. The text to convert to voice.
voiceSpeed	string	Optional. Specifies the speed at which the generated speech is delivered. This parameter increases or decreases the playback speed of the spoken audio output.
voiceStability	string	Optional. Specifies the stability of the generated speech output. This parameter controls the consistency and variation in speech delivery. Higher values produce more uniform speech, while lower values result in greater expressive variation.
voiceId	string	Optional. Specifies the identifier of the voice used to generate spoken audio. This parameter controls the tone and characteristics of the generated speech output. To retrieve available voice IDs, send a GET request to the Text to Speech REST endpoint.
fileOutput	boolean	Optional. Specifies whether the response returns an audio file output instead of Base64-encoded audio. The default is false.
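
The inputs above can be assembled into a request body along these lines. This is a minimal sketch, not an official client: the function name build_payload and its parameter names are hypothetical, and the sketch assumes that optional inputs can simply be omitted from the body when they are not set. The JSON keys and the "inputs" array wrapper follow the sample input on this page.

```python
from typing import Optional


def build_payload(
    input_text: str,
    voice_speed: Optional[str] = None,
    voice_stability: Optional[str] = None,
    voice_id: Optional[str] = None,
    file_output: Optional[bool] = None,
) -> dict:
    """Return a request body containing inputText plus any optional inputs that were set."""
    if not input_text:
        # inputText is the only required input.
        raise ValueError("inputText is required")

    item = {"inputText": input_text}
    optional = {
        "voiceSpeed": voice_speed,
        "voiceStability": voice_stability,
        "voiceId": voice_id,
        "fileOutput": file_output,
    }
    # Assumption: unset optional inputs are left out so the action applies its defaults.
    item.update({key: value for key, value in optional.items() if value is not None})
    return {"inputs": [item]}
```

For example, build_payload("Hello! How are you?", voice_speed="1", file_output=True) produces a body with three keys in its single inputs entry.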

Outputs

Output	Description
convertedAudio	The generated audio output, returned in Base64-encoded format, based on the provided input text and voice settings.
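
Because convertedAudio is Base64-encoded, a client has to decode it before playing or saving the audio. The sketch below shows only that decoding step, using a stand-in string rather than a real response; the helper name decode_converted_audio is hypothetical, and how the value is extracted from the response body is left out because the exact response envelope is not described here.

```python
import base64


def decode_converted_audio(converted_audio: str) -> bytes:
    """Decode the Base64 text of convertedAudio into raw audio bytes."""
    # validate=True rejects characters outside the Base64 alphabet instead of ignoring them.
    return base64.b64decode(converted_audio, validate=True)


# Stand-in value for illustration only; a real response carries actual audio data.
sample = base64.b64encode(b"ID3 fake audio bytes").decode("ascii")
audio_bytes = decode_converted_audio(sample)

# The decoded bytes could then be written to disk, for example:
# with open("speech.mp3", "wb") as handle:
#     handle.write(audio_bytes)
```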

Usage

Sample Input

This sample converts text input to audio using the Text to Speech action. Because fileOutput is set to true, the response returns an audio file output rather than Base64-encoded audio.

{
  "inputs": [
    {
      "inputText": "Hello! How are you?",
      "voiceSpeed": "1",
      "voiceStability": "0.5",
      "voiceId": "Jbte7ht1CqapnZvc4KpK",
      "fileOutput": true
    }
  ]
}

If fileOutput is set to false or not specified, the response returns Base64-encoded audio output.

Sample Output

The response returns generated spoken audio as Base64-encoded audio data.

{
  "outputs": [
    {
      "audioFile": "<audio file output>",
      "contentType": "audio/mpeg"
    }
  ]
}

The response returns generated spoken audio as a file output when fileOutput is set to true.