Supported Models for Models API

This feature is a Beta Service. Customer may opt to try such Beta Service in its sole discretion. Any use of the Beta Service is subject to the applicable Beta Services Terms provided at Agreements and Terms.

The Models API supports large language models (LLMs) from multiple providers, such as Azure OpenAI and OpenAI.

New model announcements and model deprecation announcements are published monthly in the Einstein Platform release notes.

Model deprecation is the process where a model provider gradually phases out a model (usually in favor of a new and improved model). The process starts with an announcement outlining when the model will no longer be accessible or supported. The deprecation announcement usually contains a specific shutdown date. Deprecated models are still available to use until the shutdown date.

After the shutdown date, you will not be able to use that model in your application and requests to that model will be rerouted to a replacement model. We recommend that you start migrating your application away from a model as soon as its deprecation is announced. During migration, update and test each part of your application with the replacement model that we recommend.

The following models are deprecated:

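| Deprecated Model               | Recommended Replacement | Deprecated Date | Shutdown Date |
|--------------------------------|-------------------------|-----------------|---------------|
| Azure OpenAI GPT 3.5 Turbo 16k | OpenAI GPT 3.5 Turbo    | 2023-11-06      | 2024-10-01    |
| OpenAI GPT 3.5 Turbo 16k       | OpenAI GPT 3.5 Turbo    | 2023-11-06      | 2024-09-13    |
| OpenAI GPT 4 32k               | To be determined        | 2024-06-06      | 2025-06-06    |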
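
As a minimal sketch of how an application might track these dates, the following Python snippet copies the deprecation and shutdown dates from the table and reports where a model is in its lifecycle. The function name and the status labels are hypothetical, and treating the shutdown date itself as the last day of availability is an assumption; the text only says models are available "until" that date.

```python
from datetime import date

# (deprecated date, shutdown date) copied from the deprecation table.
DEPRECATIONS = {
    "Azure OpenAI GPT 3.5 Turbo 16k": (date(2023, 11, 6), date(2024, 10, 1)),
    "OpenAI GPT 3.5 Turbo 16k": (date(2023, 11, 6), date(2024, 9, 13)),
    "OpenAI GPT 4 32k": (date(2024, 6, 6), date(2025, 6, 6)),
}


def model_status(model: str, today: date) -> str:
    """Return 'active', 'deprecated', or 'shut down' (hypothetical labels)."""
    if model not in DEPRECATIONS:
        return "active"  # not listed in the deprecation table
    deprecated_on, shutdown_on = DEPRECATIONS[model]
    if today > shutdown_on:
        # Assumption: the model is still usable on the shutdown date itself.
        return "shut down"
    if today >= deprecated_on:
        return "deprecated"
    return "active"


if __name__ == "__main__":
    for name in DEPRECATIONS:
        print(name, "->", model_status(name, date.today()))
```

A check like this could run in a test suite so that a deprecated model surfaces as a warning well before its shutdown date.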

To access an LLM with the Models API, you must know its API name.

Most endpoints for the Models API require the model’s API name in the URL path. For example, to use OpenAI’s GPT 3.5 Turbo model with the /generations endpoint, the URL path contains the API name sfdc_ai__DefaultOpenAIGPT35Turbo.
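
As a rough illustration, this Python sketch assembles such a URL from an API name. The base URL and the /models/ path segment are assumptions made for the example, not values stated on this page, and the function name is hypothetical; confirm the actual host, version, and path in the Models API reference.

```python
from urllib.parse import quote

# Assumption: illustrative base URL only; check the API reference for the real one.
BASE_URL = "https://api.salesforce.com/einstein/platform/v1"


def generations_url(api_name: str, base_url: str = BASE_URL) -> str:
    """Build a /generations URL with the model's API name in the path."""
    # Percent-encode the name so it is always safe as a single path segment.
    segment = quote(api_name, safe="")
    # Assumption: the API name sits between /models/ and /generations.
    return f"{base_url.rstrip('/')}/models/{segment}/generations"


if __name__ == "__main__":
    print(generations_url("sfdc_ai__DefaultOpenAIGPT35Turbo"))
```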

The API name is also required in the modelName property when making a Models API request using Apex. For example, to use OpenAI’s GPT 3.5 Turbo model, set modelName to sfdc_ai__DefaultOpenAIGPT35Turbo.

The API name is a string made up of substrings:

  • Namespace: sfdc_ai
  • Separator: __
  • Configuration name: Default
  • Provider name: OpenAI
  • Model name: GPT35Turbo
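
The composition above can be sketched in Python as plain string concatenation. The helper names are hypothetical; the constants are the substrings listed above, and the results can be compared against the API names shown in Einstein Studio.

```python
NAMESPACE = "sfdc_ai"
SEPARATOR = "__"


def build_api_name(provider: str, model: str, configuration: str = "Default") -> str:
    """Concatenate namespace, separator, configuration, provider, and model."""
    return f"{NAMESPACE}{SEPARATOR}{configuration}{provider}{model}"


def strip_namespace(api_name: str) -> str:
    """Return the part after 'sfdc_ai__', or raise if the prefix is missing."""
    namespace, separator, rest = api_name.partition(SEPARATOR)
    if namespace != NAMESPACE or not separator or not rest:
        raise ValueError(f"not a standard API name: {api_name!r}")
    return rest


if __name__ == "__main__":
    name = build_api_name("OpenAI", "GPT35Turbo")
    print(name)                   # sfdc_ai__DefaultOpenAIGPT35Turbo
    print(strip_namespace(name))  # DefaultOpenAIGPT35Turbo
```

Always confirm a composed name against Einstein Studio before using it, because not every provider and model combination is available.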

To look up the API name in Einstein Studio:

  1. Go to the Models page.
  2. Click the Generative tab.
  3. Click the name of a configured model.
  4. The API name is shown in the configured model details.

[Image: Einstein Studio API Name]

This table lists the API names for all the standard configuration models in Einstein Studio, plus one API-only model.

| Model                          | API Name                                        | Notes           |
|--------------------------------|-------------------------------------------------|-----------------|
| Azure OpenAI Ada 002           | sfdc_ai__DefaultAzureOpenAITextEmbeddingAda_002 | Embeddings only |
| Azure OpenAI GPT 3.5 Turbo     | sfdc_ai__DefaultAzureOpenAIGPT35Turbo           |                 |
| Azure OpenAI GPT 3.5 Turbo 16k | sfdc_ai__DefaultAzureOpenAIGPT35Turbo_16k       | Deprecated      |
| Azure OpenAI GPT 4 Turbo       | sfdc_ai__DefaultAzureOpenAIGPT4Turbo            | Not supported   |
| OpenAI Ada 002                 | sfdc_ai__DefaultOpenAITextEmbeddingAda_002      | Embeddings only |
| OpenAI GPT 3.5 Turbo           | sfdc_ai__DefaultOpenAIGPT35Turbo                |                 |
| OpenAI GPT 3.5 Turbo 16k       | sfdc_ai__DefaultOpenAIGPT35Turbo_16k            | Deprecated      |
| OpenAI GPT 4                   | sfdc_ai__DefaultOpenAIGPT4                      |                 |
| OpenAI GPT 4 32k               | sfdc_ai__DefaultOpenAIGPT4_32k                  | Deprecated      |
| OpenAI GPT 4o (Omni)           | sfdc_ai__DefaultOpenAIGPT4Omni                  | API only        |
| OpenAI GPT 4 Turbo             | sfdc_ai__DefaultOpenAIGPT4Turbo                 |                 |

The sfdc_ai__DefaultOpenAIGPT4Omni model is provided on a temporary basis for API use only. It can’t be configured in Einstein Studio or used in Prompt Builder.

The Models API doesn’t support OpenAI’s snapshot model names, such as gpt-3.5-turbo-0613. Always test your prompts to make sure that they perform as expected with new model versions.

Geographical routing is only available in Salesforce orgs where Einstein Generative AI was enabled after June 13, 2024.

You can choose an API name that automatically routes your request to a regional LLM provider based on the country associated with your org. Geographical routing offers greater control over data residency, and using nearby infrastructure minimizes latency.

This table lists the API names for the models that support geographical routing. (We sometimes call these models “geo-aware” or “providerless.”)

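| Model             | API Name                       | Notes      |
|-------------------|--------------------------------|------------|
| GPT 3.5 Turbo     | sfdc_ai__DefaultGPT35Turbo     |            |
| GPT 3.5 Turbo 16K | sfdc_ai__DefaultGPT35Turbo_16k | Deprecated |
| GPT 4o (Omni)     | sfdc_ai__DefaultGPT4Omni       | Latest     |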

Geographical routing also means that requests can be routed to an older version of the model than you expect. To determine exactly which model versions are available in each region, review the following table from Azure OpenAI: Model summary table and region availability.

For most countries, the regional provider is Azure OpenAI, and the model is hosted in one of its Azure availability zones.

For Brazil, Canada, the United States, and all other countries where geographical routing is not yet supported, the request is routed to OpenAI in the United States instead of Azure OpenAI.

The Trust Layer also has separate data residency regions for:

  • Data masking and toxicity detection models
  • Audit Trail data stored in Data Cloud

This table describes the location of Trust Layer data and Azure availability zones for all models that support geographical routing.

| Country        | Trust Layer               | Azure Zone     | Azure Fallback |
|----------------|---------------------------|----------------|----------------|
| Australia      | Australia                 | Australia East | East US 2      |
| Brazil         | United States and Brazil* | Not supported  | Not supported  |
| Canada         | United States             | Not supported  | Not supported  |
| France         | Germany                   | France Central | East US 2      |
| India          | India                     | South India    | East US 2      |
| Italy          | Germany                   | France Central | East US 2      |
| Japan          | Japan                     | Japan East     | East US 2      |
| Germany        | Germany                   | France Central | East US 2      |
| Spain          | Germany                   | France Central | East US 2      |
| Sweden         | Germany                   | France Central | East US 2      |
| Switzerland    | Germany                   | France Central | East US 2      |
| United Kingdom | Germany                   | France Central | East US 2      |
| United States  | United States             | Not supported  | Not supported  |
| All others     | United States             | Not supported  | Not supported  |

*For Brazil, data masking models and toxicity detection models are hosted in the United States and Audit Trail data is hosted in Brazil.
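
For quick reference in code, the table can be expressed as a lookup. This Python sketch is illustrative only: the type and function names are hypothetical, "Not supported" is represented as None, and the provider rule simply restates the routing described above (Azure OpenAI where a zone is listed, otherwise OpenAI in the United States). It does not call or reproduce any platform routing logic.

```python
from typing import NamedTuple, Optional


class Residency(NamedTuple):
    trust_layer: str
    azure_zone: Optional[str]      # None means "Not supported"
    azure_fallback: Optional[str]  # None means "Not supported"


# Rows that repeat in the table.
_FRANCE_CENTRAL = Residency("Germany", "France Central", "East US 2")
_US_ONLY = Residency("United States", None, None)

RESIDENCY = {
    "Australia": Residency("Australia", "Australia East", "East US 2"),
    # Masking and toxicity models in the United States, Audit Trail in Brazil.
    "Brazil": Residency("United States and Brazil", None, None),
    "Canada": _US_ONLY,
    "France": _FRANCE_CENTRAL,
    "India": Residency("India", "South India", "East US 2"),
    "Italy": _FRANCE_CENTRAL,
    "Japan": Residency("Japan", "Japan East", "East US 2"),
    "Germany": _FRANCE_CENTRAL,
    "Spain": _FRANCE_CENTRAL,
    "Sweden": _FRANCE_CENTRAL,
    "Switzerland": _FRANCE_CENTRAL,
    "United Kingdom": _FRANCE_CENTRAL,
    "United States": _US_ONLY,
}


def residency_for(country: str) -> Residency:
    """Look up a country; unlisted countries use the 'All others' row."""
    return RESIDENCY.get(country, _US_ONLY)


def regional_provider(country: str) -> str:
    """Azure OpenAI where a zone is listed, otherwise OpenAI in the United States."""
    if residency_for(country).azure_zone:
        return "Azure OpenAI"
    return "OpenAI (United States)"


if __name__ == "__main__":
    for country in ("Japan", "Italy", "Canada", "Mexico"):
        print(country, residency_for(country), regional_provider(country))
```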

For most tasks, choose a model, such as GPT 3.5 Turbo, that offers a balance of many criteria.

To choose the right model for your application, consider these criteria.

Capabilities: What can the model do? Advanced models can perform a wider variety of tasks (usually at the expense of higher costs, slower speeds, or both). The ability to follow complex instructions is a key indicator of model capabilities.

Cost: How much does the model cost to serve and use? For details on usage and billing, see Einstein Usage.

Quality: How well does the model respond? The quality of model responses can be hard to measure quantitatively, but a good place to start is the LMSYS Chatbot Arena.

Speed: How long does it take the model to complete a task? Speed includes measures of latency and throughput.

For benchmarks and evaluations of LLMs and embedding models, see these resources.

The following models are available to use with the Models API.

The context window determines how many input and output tokens the model can process in a single request. The context window includes system messages, prompts, and responses.

The latest versions of GPT 3.5 Turbo and GPT 4 Turbo have a hard limit of 4,096 tokens on output, despite their extended context window for input. All models currently have a maximum context window of 32,768 tokens to ensure compatibility with Trust Layer features.

| Providers            | Model             | Good For                       | Context Size  |
|----------------------|-------------------|--------------------------------|---------------|
| Azure OpenAI, OpenAI | Ada 002           | Retrieval-augmented generation | 8,191 tokens  |
| Azure OpenAI, OpenAI | GPT 3.5 Turbo     | Most tasks (balanced)          | 16,385 tokens |
| Azure OpenAI, OpenAI | GPT 3.5 Turbo 16k | Deprecated                     | 16,385 tokens |
| Azure OpenAI, OpenAI | GPT 4             | Deprecated                     | 8,192 tokens  |
| Azure OpenAI, OpenAI | GPT 4 32k         | Deprecated                     | 32,768 tokens |
| Azure OpenAI, OpenAI | GPT 4o (Omni)     | Advanced tasks (latest model)  | 32,768 tokens |
| Azure OpenAI, OpenAI | GPT 4 Turbo       | Advanced tasks (older model)   | 32,768 tokens |
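
To make the arithmetic concrete, here is a small Python sketch that estimates how many output tokens remain for a request, using the context sizes from the table and the 4,096-token output limit noted for GPT 3.5 Turbo and GPT 4 Turbo. The function name is hypothetical, and the input token count is assumed to come from whatever tokenizer you use; the snippet doesn't count tokens itself.

```python
# Context window per model, in tokens, copied from the table.
CONTEXT_WINDOW = {
    "Ada 002": 8_191,
    "GPT 3.5 Turbo": 16_385,
    "GPT 3.5 Turbo 16k": 16_385,
    "GPT 4": 8_192,
    "GPT 4 32k": 32_768,
    "GPT 4o (Omni)": 32_768,
    "GPT 4 Turbo": 32_768,
}

# Hard output limits mentioned in the text for these two models.
OUTPUT_LIMIT = {"GPT 3.5 Turbo": 4_096, "GPT 4 Turbo": 4_096}


def max_output_tokens(model: str, input_tokens: int) -> int:
    """Tokens left for the response after system messages and prompts."""
    remaining = CONTEXT_WINDOW[model] - input_tokens
    if remaining <= 0:
        raise ValueError(f"input of {input_tokens} tokens fills the context window")
    # Apply the model's output limit if it has one.
    return min(remaining, OUTPUT_LIMIT.get(model, remaining))


if __name__ == "__main__":
    print(max_output_tokens("GPT 3.5 Turbo", 1_000))   # capped by the output limit
    print(max_output_tokens("GPT 3.5 Turbo", 15_000))  # capped by the context window
```

With 1,000 input tokens, GPT 3.5 Turbo is limited by its 4,096-token output cap; with 15,000 input tokens, only 1,385 tokens of the context window remain.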

Salesforce has partnered with several LLM providers to offer you a wide range of models to choose from. Learn more about each provider and what they have to offer.

The Azure OpenAI service, offered by Microsoft, enables Salesforce to provide models developed by OpenAI with additional enterprise features that aren’t yet offered by OpenAI themselves. Features include:

  • Regional availability outside of the United States
  • Certified compliance with HIPAA, ISO 27001, SOC 1, SOC 2 (type 1 and 2), and SOC 3
  • More formal processes and procedures for access control, data management, and security testing

To learn more about a particular model, see Azure OpenAI’s models overview.

OpenAI is one of the best-known AI labs due to the popularity of their ChatGPT product. Their GPT 4 series of models is focused on advanced capabilities, while the GPT 3.5 series is optimized for speed.

To learn more about a particular model, see OpenAI’s models overview.