Supported Models for Models API
This feature is a Beta Service. Customer may opt to try such Beta Service in its sole discretion. Any use of the Beta Service is subject to the applicable Beta Services Terms provided at Agreements and Terms.
The Models API supports large language models (LLMs) from multiple providers, such as Azure OpenAI and OpenAI.
New model announcements and model deprecation announcements are published monthly in the Einstein Platform release notes.
Model deprecation is the process where a model provider gradually phases out a model (usually in favor of a new and improved model). The process starts with an announcement outlining when the model will no longer be accessible or supported. The deprecation announcement usually contains a specific shutdown date. Deprecated models are still available to use until the shutdown date.
After the shutdown date, the model is no longer available to your application, and requests to it are rerouted to a replacement model. We recommend that you start migrating your application away from a model as soon as its deprecation is announced. During migration, update and test each part of your application with the recommended replacement model.
The following models are deprecated:
Deprecated Model | Recommended Replacement | Deprecated Date | Shutdown Date |
---|---|---|---|
Azure OpenAI GPT 3.5 Turbo 16k | Azure OpenAI GPT 3.5 Turbo | 2023-11-06 | 2024-10-01 |
OpenAI GPT 3.5 Turbo 16k | OpenAI GPT 3.5 Turbo | 2023-11-06 | 2024-09-13 |
OpenAI GPT 4 32k | To be determined | 2024-06-06 | 2025-06-06 |
To access an LLM with the Models API, you must know its API name.
Most endpoints for the Models API require the model’s API name in the URL path. For example, to use OpenAI’s GPT 3.5 Turbo model with the `/generations` endpoint, the URL looks like this:
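A sketch of the request URL, assuming the standard Models API host and base path (an assumption; verify against the Models REST API Reference):

```
POST https://api.salesforce.com/einstein/platform/v1/models/sfdc_ai__DefaultOpenAIGPT35Turbo/generations
```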
The API name is also required in the `modelName` property when making a Models API request using Apex. For example:
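A minimal Apex sketch. The `aiplatform.ModelsAPI` class names and request shape below are assumptions, so verify them against the Models API Apex reference before use:

```apex
// Assumed class names from the aiplatform namespace; verify in the
// Models API Apex reference.
aiplatform.ModelsAPI.createGenerations_Request request =
    new aiplatform.ModelsAPI.createGenerations_Request();

// The model's API name goes in the modelName property.
request.modelName = 'sfdc_ai__DefaultOpenAIGPT35Turbo';

// Attach the prompt body and send the request.
aiplatform.ModelsAPI_GenerationRequest body = new aiplatform.ModelsAPI_GenerationRequest();
body.prompt = 'Summarize this account in one sentence.';
request.body = body;

aiplatform.ModelsAPI api = new aiplatform.ModelsAPI();
aiplatform.ModelsAPI.createGenerations_Response response = api.createGenerations(request);
System.debug(response.Code200.generation.generatedText);
```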
The API name is a string made up of these substrings:
- Namespace: `sfdc_ai`
- Separator: `__`
- Configuration name: `Default`
- Provider name: `OpenAI`
- Model name: `GPT35Turbo`
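Concatenating those substrings yields the full API name. A quick Apex sketch, using the example values from this page:

```apex
// Assemble the API name from its parts: namespace, separator,
// configuration name, provider name, and model name.
String apiName = 'sfdc_ai' + '__' + 'Default' + 'OpenAI' + 'GPT35Turbo';
System.assertEquals('sfdc_ai__DefaultOpenAIGPT35Turbo', apiName);
```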
To look up the API name in Einstein Studio:
- Go to the Models page.
- Click the Generative tab.
- Click the name of a configured model.
- The API name is shown in the configured model details.
This table lists the API names for all the standard configuration models in Einstein Studio, plus one API-only model.
Model | API Name | Notes |
---|---|---|
Azure OpenAI Ada 002 | sfdc_ai__DefaultAzureOpenAITextEmbeddingAda_002 | Embeddings only |
Azure OpenAI GPT 3.5 Turbo | sfdc_ai__DefaultAzureOpenAIGPT35Turbo | |
Azure OpenAI GPT 3.5 Turbo 16k | sfdc_ai__DefaultAzureOpenAIGPT35Turbo_16k | Deprecated |
Azure OpenAI GPT 4 Turbo | sfdc_ai__DefaultAzureOpenAIGPT4Turbo | Not supported |
OpenAI Ada 002 | sfdc_ai__DefaultOpenAITextEmbeddingAda_002 | Embeddings only |
OpenAI GPT 3.5 Turbo | sfdc_ai__DefaultOpenAIGPT35Turbo | |
OpenAI GPT 3.5 Turbo 16k | sfdc_ai__DefaultOpenAIGPT35Turbo_16k | Deprecated |
OpenAI GPT 4 | sfdc_ai__DefaultOpenAIGPT4 | |
OpenAI GPT 4 32k | sfdc_ai__DefaultOpenAIGPT4_32k | Deprecated |
OpenAI GPT 4o (Omni) | sfdc_ai__DefaultOpenAIGPT4Omni | API only |
OpenAI GPT 4 Turbo | sfdc_ai__DefaultOpenAIGPT4Turbo | |
The `sfdc_ai__DefaultOpenAIGPT4Omni` model is provided on a temporary basis for API use only. It can’t be configured in Einstein Studio or used in Prompt Builder.
The Models API doesn’t support OpenAI’s snapshot model names, such as `gpt-3.5-turbo-0613`. Always test your prompts to make sure that they perform as expected with new model versions.
Geographical routing is only available in Salesforce orgs where Einstein Generative AI was enabled after June 13, 2024.
You can choose an API name that automatically routes your request to a regional LLM provider based on the country associated with your org. Geographical routing offers greater control over data residency, and using nearby infrastructure minimizes latency.
This table lists the API names for the models that support geographical routing. (We sometimes call these models “geo-aware” or “providerless.”)
Model | API Name | Notes |
---|---|---|
GPT 3.5 Turbo | sfdc_ai__DefaultGPT35Turbo | |
GPT 3.5 Turbo 16K | sfdc_ai__DefaultGPT35Turbo_16k | Deprecated |
GPT 4o (Omni) | sfdc_ai__DefaultGPT4Omni | Latest |
Geographical routing also means that requests can be routed to an older version of the model than you expect. To determine exactly which model versions are available in each region, review the following table from Azure OpenAI: Model summary table and region availability.
For most countries, the regional provider is Azure OpenAI, and requests are served from one of its Azure availability zones.
For Brazil, Canada, the United States, and all other countries where geographical routing is not yet supported, the request is routed to OpenAI in the United States instead of Azure OpenAI.
The Trust Layer also has separate data residency regions for:
- Data masking and toxicity detection models
- Audit Trail data stored in Data Cloud
This table describes the location of Trust Layer data and Azure availability zones for all models that support geographical routing.
Country | Trust Layer | Azure Zone | Azure Fallback |
---|---|---|---|
Australia | Australia | Australia East | East US 2 |
Brazil | United States and Brazil* | Not supported | Not supported |
Canada | United States | Not supported | Not supported |
France | Germany | France Central | East US 2 |
India | India | South India | East US 2 |
Italy | Germany | France Central | East US 2 |
Japan | Japan | Japan East | East US 2 |
Germany | Germany | France Central | East US 2 |
Spain | Germany | France Central | East US 2 |
Sweden | Germany | France Central | East US 2 |
Switzerland | Germany | France Central | East US 2 |
United Kingdom | Germany | France Central | East US 2 |
United States | United States | Not supported | Not supported |
All others | United States | Not supported | Not supported |
*For Brazil, data masking models and toxicity detection models are hosted in the United States and Audit Trail data is hosted in Brazil.
For most tasks, choose a model that balances many criteria, like GPT 3.5 Turbo.
To choose the right model for your application, consider these criteria.
- Capabilities: What can the model do? Advanced models can perform a wider variety of tasks, usually at the expense of higher costs, slower speeds, or both. The ability to follow complex instructions is a key indicator of model capabilities.
- Cost: How much does the model cost to serve and use? For details on usage and billing, see Einstein Usage.
- Quality: How well does the model respond? The quality of model responses can be hard to measure quantitatively, but a good place to start is the LMSYS Chatbot Arena.
- Speed: How long does it take the model to complete a task? Speed includes measures of latency and throughput.
For benchmarks and evaluations of LLMs and embedding models, see these resources.
- Artificial Analysis: Aggregated data on LLM performance.
- LLM Benchmark for CRM: Evaluation of LLMs for Sales and Service use cases. Provided by Salesforce AI Research.
- LMSYS Chatbot Arena: Human scoring of LLMs based on blind testing. Anyone can participate!
- MTEB Leaderboard: Benchmarks for embedding models from Huggingface.
- SEAL Leaderboard: Evaluations of LLMs using private datasets from Scale AI.
The following models are available to use with the Models API.
The context window determines how many input and output tokens the model can process in a single request. The context window includes system messages, prompts, and responses.
The latest versions of GPT 3.5 Turbo and GPT 4 Turbo have a hard limit of 4,096 output tokens, despite their extended context window for input. The context window is currently capped at 32,768 tokens for all models to ensure compatibility with Trust Layer features.
Providers | Model | Good For | Context Size |
---|---|---|---|
Azure OpenAI, OpenAI | Ada 002 | Retrieval-augmented generation | 8,191 tokens |
Azure OpenAI, OpenAI | GPT 3.5 Turbo | Most tasks (balanced) | 16,385 tokens |
Azure OpenAI, OpenAI | GPT 3.5 Turbo 16k | Deprecated | 16,385 tokens |
Azure OpenAI, OpenAI | GPT 4 | Deprecated | 8,192 tokens |
Azure OpenAI, OpenAI | GPT 4 32k | Deprecated | 32,768 tokens |
Azure OpenAI, OpenAI | GPT 4o (Omni) | Advanced tasks (latest model) | 32,768 tokens |
Azure OpenAI, OpenAI | GPT 4 Turbo | Advanced tasks (older model) | 32,768 tokens |
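Because input and output tokens share the context window, it helps to budget them explicitly. An Apex sketch using the 32,768-token window and 4,096-token output cap noted above (the prompt size is an illustrative assumption):

```apex
// Context-window budgeting sketch (illustrative numbers).
Integer contextWindow = 32768;  // max tokens per request (input + output)
Integer outputCap = 4096;       // hard output limit for GPT 3.5 Turbo / GPT 4 Turbo
Integer promptTokens = 30000;   // assumed size of system messages + prompt

// The response can use whatever remains, up to the output cap.
Integer maxResponseTokens = Math.min(contextWindow - promptTokens, outputCap);
System.assertEquals(2768, maxResponseTokens);
```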
Salesforce has partnered with several LLM providers to offer you a wide range of models to choose from. Learn more about each provider and what they have to offer.
The Azure OpenAI service, offered by Microsoft, enables Salesforce to provide models developed by OpenAI with additional enterprise features that aren’t yet offered by OpenAI themselves. Features include:
- Regional availability outside of the United States
- Certified compliance with HIPAA, ISO27001, SOC 1, SOC 2 (type 1 and 2), and SOC 3
- More formal processes and procedures for access control, data management, and security testing
To learn more about a particular model, see Azure OpenAI’s models overview.
OpenAI is one of the best-known AI labs due to the popularity of their ChatGPT product. Their GPT 4 series of models is focused on advanced capabilities, while the GPT 3.5 series is optimized for speed.
To learn more about a particular model, see OpenAI’s models overview.
- Models API Developer Guide: Access Models API with REST
- Models API Developer Guide: Access Models API with Apex
- Models API Developer Guide: Rate Limits for Models API
- Models REST API Reference (Beta)