Convert Base64 Speech to Text Action

Converts Base64-encoded audio captured from a microphone to text using AI-powered speech recognition.

This action is available in API version 66.0 and later.

Supported REST HTTP Methods

URI: /services/data/v66.0/actions/standard/voiceToText

Formats: JSON, XML

HTTP Methods: POST

Authentication: Authorization: Bearer token
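As an illustration of the details above, the following minimal Python sketch assembles a POST request for this action without sending it. The build_request helper, the instance URL, and the access token are hypothetical placeholders that don't come from this reference, and the Content-Type header assumes the JSON format is used.

```python
import json
import urllib.request

# URI of the action, as listed in this reference.
ACTION_URI = "/services/data/v66.0/actions/standard/voiceToText"


def build_request(instance_url, access_token, payload):
    """Build (but do not send) a POST request for the action.

    instance_url and access_token are placeholders supplied by the caller;
    this reference does not define them.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_URI,
        data=body,
        method="POST",
        headers={
            # Bearer token authentication, as listed above.
            "Authorization": f"Bearer {access_token}",
            # Assumption: the request body is sent as JSON.
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen. Error handling is omitted because this reference doesn't describe error responses.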

Inputs

Input: voiceContent
Type: string
Description: Required. Base64-encoded audio content captured from a microphone.

Input: transcriptionModel
Type: string
Description: Optional. The transcription model used to convert speech to text. Valid values are whisper-v3-turbo and elevenlabs-scribe-v2. If you don't specify a value, whisper-v3-turbo is used.

Outputs

Output: convertedText
Type: string
Description: The transcribed text generated from the provided audio input.

Usage

Sample Input

This sample converts Base64-encoded microphone audio to text using the Convert Base64 Speech to Text action. It sets transcriptionModel to elevenlabs-scribe-v2. You can also use whisper-v3-turbo, which is the default if the value isn't specified.

{
  "inputs": [
    {
      "voiceContent": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
      "transcriptionModel": "elevenlabs-scribe-v2"
    }
  ]
}
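A request body with the shape of this sample could be produced from raw audio bytes roughly as follows. This is a minimal sketch: the build_payload helper and its client-side validation are illustrative assumptions, not part of the action itself.

```python
import base64

# Valid transcriptionModel values, as listed under Inputs.
VALID_MODELS = {"whisper-v3-turbo", "elevenlabs-scribe-v2"}


def build_payload(audio_bytes, transcription_model=None):
    """Return a request body dict for the given raw audio bytes.

    Hypothetical helper: Base64-encodes the audio into voiceContent and
    adds transcriptionModel only when the caller provides one.
    """
    if not audio_bytes:
        raise ValueError("voiceContent is required; audio_bytes is empty")
    entry = {"voiceContent": base64.b64encode(audio_bytes).decode("ascii")}
    if transcription_model is not None:
        if transcription_model not in VALID_MODELS:
            raise ValueError(f"Unsupported model: {transcription_model}")
        entry["transcriptionModel"] = transcription_model
    return {"inputs": [entry]}
```

Leaving transcription_model unset omits the field from the body, in which case the documented default, whisper-v3-turbo, applies.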

Sample Output

This sample shows the transcribed text that the Convert Base64 Speech to Text action returns for the provided Base64-encoded audio input.

{
  "outputs": [
    {
      "convertedText": "Please create a support case for my login issue."
    }
  ]
}
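A client might read the transcribed text out of a response like this sample along the following lines. This is a sketch that assumes the response body has exactly the shape shown in the sample; the extract_transcripts helper is hypothetical, and error responses aren't covered.

```python
import json


def extract_transcripts(response_body):
    """Return the convertedText value of each entry in "outputs".

    Assumes response_body is a JSON string shaped like the sample output.
    Entries without a convertedText key are skipped.
    """
    data = json.loads(response_body)
    return [
        item["convertedText"]
        for item in data.get("outputs", [])
        if "convertedText" in item
    ]
```

The function returns a list because the sample wraps its result in an "outputs" array.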