Speech to Text Action
Converts speech in an audio file to text using AI-powered speech
recognition.
This action is available in API version 66.0 and later.
Supported REST HTTP Methods
URI: /services/data/v66.0/actions/standard/speechToText
Formats: JSON, XML
HTTP Methods: POST
Authentication: Authorization: Bearer token
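As an illustration of the resource details above, here is a minimal Python sketch that assembles the POST request using only the standard library. The helper name `build_speech_to_text_request`, the instance URL, and the token value are hypothetical placeholders, not part of the Salesforce API. The request is built but not sent.

```python
import json
import urllib.request

API_VERSION = "v66.0"
ACTION_PATH = f"/services/data/{API_VERSION}/actions/standard/speechToText"


def build_speech_to_text_request(instance_url, access_token, content_document_id):
    """Build (but do not send) a POST request for the speechToText action.

    instance_url and access_token are supplied by the caller; how they are
    obtained is outside the scope of this sketch.
    """
    body = {"inputs": [{"contentDocumentId": content_document_id}]}
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_speech_to_text_request(
        "https://example.my.salesforce.com",  # hypothetical instance URL
        "PLACEHOLDER_TOKEN",                  # hypothetical access token
        "069xx000004WhFoAAK",
    )
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without credentials.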
Inputs
| Input | Type | Description |
|---|---|---|
| contentDocumentId | string | Required. The ID of the audio file stored in Salesforce Files. This value is the contentDocumentId from the ContentDocument object that represents the file to be transcribed. |
| transcriptionModel | string | Optional. The transcription model used to convert speech to text. Valid values are whisper-v3-turbo and elevenlabs-scribe-v2. If you don't specify a value, whisper-v3-turbo is used. |
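The input rules in the table can be checked on the client before a request is made. The following sketch is one possible way to do that; the function name `build_inputs` and the choice to raise `ValueError` are assumptions made for illustration, and the server remains the authority on what is accepted.

```python
# Valid values as listed in the Inputs table.
VALID_MODELS = ("whisper-v3-turbo", "elevenlabs-scribe-v2")


def build_inputs(content_document_id, transcription_model=None):
    """Return a request body dict for the action, validating both inputs.

    transcription_model is omitted from the body when None, so the
    documented default (whisper-v3-turbo) applies.
    """
    if not content_document_id:
        raise ValueError("contentDocumentId is required")
    entry = {"contentDocumentId": content_document_id}
    if transcription_model is not None:
        if transcription_model not in VALID_MODELS:
            raise ValueError(
                f"transcriptionModel must be one of {VALID_MODELS}, "
                f"got {transcription_model!r}"
            )
        entry["transcriptionModel"] = transcription_model
    return {"inputs": [entry]}


if __name__ == "__main__":
    import json

    print(json.dumps(build_inputs("069xx000004WhFoAAK", "elevenlabs-scribe-v2"), indent=2))
```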
Outputs
| Output | Type | Description |
|---|---|---|
| convertedText | string | The transcript of the audio file, returned as plain text in the detected language. |
Usage
Sample Input
This sample uses elevenlabs-scribe-v2. You can also set transcriptionModel to whisper-v3-turbo, which is the default if the value isn't specified.
{
  "inputs": [
    {
      "contentDocumentId": "069xx000004WhFoAAK",
      "transcriptionModel": "elevenlabs-scribe-v2"
    }
  ]
}
Sample Output
{
  "outputs": [
    {
      "convertedText": "Thank you for contacting support. How can I help you today?"
    }
  ]
}
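To read the transcript out of a response, a client could parse the JSON and collect each `convertedText` value. This sketch assumes the response body has exactly the shape shown in the sample output; a live response may carry additional fields or a different envelope, so treat the function name `extract_transcripts` and the parsing logic as illustrative only.

```python
import json


def extract_transcripts(response_text):
    """Return the convertedText value from each entry under "outputs".

    Entries without a convertedText key are skipped rather than treated
    as errors; that choice is an assumption of this sketch.
    """
    payload = json.loads(response_text)
    return [
        item["convertedText"]
        for item in payload.get("outputs", [])
        if "convertedText" in item
    ]


if __name__ == "__main__":
    sample = """
    {
      "outputs": [
        {
          "convertedText": "Thank you for contacting support. How can I help you today?"
        }
      ]
    }
    """
    for transcript in extract_transcripts(sample):
        print(transcript)
```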