Convert Base64 Speech to Text Action | Actions Developer Guide

This action is available in API version 66.0 and later.

Supported REST HTTP Methods

URI: /services/data/v66.0/actions/standard/voiceToText

Formats: JSON, XML

HTTP Methods: POST

Authentication: Authorization: Bearer token
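As a rough sketch of how the URI, method, and authentication header above fit together, the following Python builds a POST request without sending it. The instance URL, the access token placeholder, and the helper name `build_action_request` are assumptions made for illustration; they are not part of this reference.

```python
import json
import urllib.request

# Resource path taken from the URI listed above.
ACTION_PATH = "/services/data/v66.0/actions/standard/voiceToText"


def build_action_request(instance_url, access_token, body):
    """Build (but do not send) a JSON POST request for the action.

    instance_url and access_token are supplied by the caller; how they are
    obtained is outside the scope of this sketch.
    """
    return urllib.request.Request(
        url=instance_url.rstrip("/") + ACTION_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values for illustration only.
request = build_action_request(
    "https://example.my.salesforce.com",
    "<ACCESS_TOKEN>",
    {"inputs": [{"voiceContent": "<BASE64_AUDIO>"}]},
)
```

Passing the request to `urllib.request.urlopen(request)` would send it; that step is left out here so the sketch runs without network access or credentials.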

Inputs

Input	Type	Description
voiceContent	string	Required. Base64-encoded audio content captured from a microphone.
transcriptionModel	string	Optional. The transcription model used to convert speech to text. Valid values are `whisper-v3-turbo` and `elevenlabs-scribe-v2`. If you don't specify a value, `whisper-v3-turbo` is used.

Outputs

Output	Type	Description
convertedText	string	The transcribed text generated from the provided audio input.

Usage

Sample Input

This sample converts Base64-encoded microphone audio to text using the Convert Base64 Speech to Text action. It sets `transcriptionModel` to `elevenlabs-scribe-v2`. You can also use `whisper-v3-turbo`, which is the default when `transcriptionModel` is omitted.

{
  "inputs": [
    {
      "voiceContent": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
      "transcriptionModel": "elevenlabs-scribe-v2"
    }
  ]
}
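A request body with the shape shown in the sample input could be assembled from raw audio bytes along these lines. This is a minimal sketch: the helper name `build_request_body` and the client-side validation of the model name are assumptions, not documented behavior of the action.

```python
import base64

# Valid values as listed in the Inputs table.
VALID_MODELS = {"whisper-v3-turbo", "elevenlabs-scribe-v2"}


def build_request_body(audio_bytes, transcription_model=None):
    """Return a dict matching the sample input shape.

    audio_bytes is the raw captured audio. transcription_model is optional;
    when it is None the key is omitted so the documented default applies.
    """
    if not audio_bytes:
        raise ValueError("voiceContent is required and cannot be empty")

    entry = {"voiceContent": base64.b64encode(audio_bytes).decode("ascii")}

    if transcription_model is not None:
        # Client-side check (an assumption of this sketch, not an API rule).
        if transcription_model not in VALID_MODELS:
            raise ValueError(f"Unsupported transcriptionModel: {transcription_model}")
        entry["transcriptionModel"] = transcription_model

    return {"inputs": [entry]}


# Placeholder bytes stand in for real microphone audio.
body = build_request_body(b"RIFF", "elevenlabs-scribe-v2")
```

Omitting the second argument leaves `transcriptionModel` out of the body entirely, which relies on the default described in the Inputs table.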

Sample Output

This sample shows the transcribed text returned by the Convert Base64 Speech to Text action for the provided Base64-encoded audio input.

{
  "outputs": [
    {
      "convertedText": "Please create a support case for my login issue."
    }
  ]
}
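Assuming the response body has the shape shown in the sample output, the transcribed text could be read out as follows. The helper name `extract_converted_text` is hypothetical, and the sketch does not attempt to handle error responses, which this page does not describe.

```python
import json


def extract_converted_text(response_text):
    """Return every convertedText value found under the top-level "outputs" list."""
    payload = json.loads(response_text)
    return [
        item["convertedText"]
        for item in payload.get("outputs", [])
        if "convertedText" in item
    ]


# Response text copied from the sample output.
sample_response = """
{
  "outputs": [
    {
      "convertedText": "Please create a support case for my login issue."
    }
  ]
}
"""

transcripts = extract_converted_text(sample_response)
```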