Convert Base64 Speech to Text Action
Converts Base64-encoded audio captured from a microphone to text using AI-powered
speech recognition.
This action is available in API version 66.0 and later.
Supported REST HTTP Methods
URI: /services/data/v66.0/actions/standard/voiceToText
Formats: JSON, XML
HTTP Methods: POST
Authentication: Authorization: Bearer token
Inputs
| Input | Type | Description |
|---|---|---|
| voiceContent | string | Required. Base64-encoded audio content captured from a microphone. |
| transcriptionModel | string | Optional. The transcription model used to convert speech to text. Valid values are whisper-v3-turbo and elevenlabs-scribe-v2. If you don't specify a value, whisper-v3-turbo is used. |
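As a rough illustration of how these inputs fit together, the following Python sketch builds a request body from raw audio bytes. The helper name `build_payload` and the `VALID_MODELS` set are hypothetical names introduced for this example. The client-side validation is an assumption about what a caller might want, not documented behavior of the action.

```python
import base64
from typing import Optional

# Model names taken from the transcriptionModel description above.
VALID_MODELS = {"whisper-v3-turbo", "elevenlabs-scribe-v2"}


def build_payload(audio_bytes: bytes, transcription_model: Optional[str] = None) -> dict:
    """Build the JSON-serializable request body (hypothetical helper)."""
    if not audio_bytes:
        # voiceContent is required, so reject empty audio before sending.
        raise ValueError("voiceContent is required")

    entry = {"voiceContent": base64.b64encode(audio_bytes).decode("ascii")}

    if transcription_model is not None:
        if transcription_model not in VALID_MODELS:
            raise ValueError(f"Unsupported transcriptionModel: {transcription_model}")
        entry["transcriptionModel"] = transcription_model
    # When transcription_model is omitted, the key is left out entirely,
    # so the documented default (whisper-v3-turbo) applies.

    return {"inputs": [entry]}
```

Leaving `transcriptionModel` out of the body, rather than sending an empty value, relies on the documented default instead of guessing how the service treats blanks.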
Outputs
| Output | Type | Description |
|---|---|---|
| convertedText | string | The transcribed text generated from the provided audio input. |
Usage
Sample Input
This sample converts Base64-encoded microphone audio to text using the Convert Base64 Speech to Text action. It sets transcriptionModel to elevenlabs-scribe-v2. You can also use whisper-v3-turbo, which is the default if the value isn't specified.
{
  "inputs": [
    {
      "voiceContent": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
      "transcriptionModel": "elevenlabs-scribe-v2"
    }
  ]
}
Sample Output
This sample shows the transcribed text returned by the Convert Base64 Speech to Text action for the provided Base64-encoded audio input.
{
  "outputs": [
    {
      "convertedText": "Please create a support case for my login issue."
    }
  ]
}
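The request and response shown in this section could be handled from Python roughly as follows. This is a minimal sketch using only the standard library. `build_request` and `extract_text` are hypothetical helpers, `instance_url` and `access_token` are placeholders you would supply yourself, and the response parsing assumes the body has the shape shown in Sample Output.

```python
import json
import urllib.request
from typing import List

# URI from the "Supported REST HTTP Methods" section.
API_PATH = "/services/data/v66.0/actions/standard/voiceToText"


def build_request(instance_url: str, access_token: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for the action."""
    return urllib.request.Request(
        url=instance_url.rstrip("/") + API_PATH,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def extract_text(response_body: str) -> List[str]:
    """Collect convertedText values, assuming the Sample Output shape."""
    data = json.loads(response_body)
    return [item["convertedText"] for item in data.get("outputs", [])]


# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_request(url, token, payload)) as resp:
#       texts = extract_text(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the pieces easy to check without network access or live credentials.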