Imagine being able to create custom chatbots or generative AI services within your Salesforce org, effortlessly integrating with multiple large language models (LLMs) using the REST API or Apex. This is now possible with Model Builder and the Models API.
In this blog post, we will explore Model Builder and the Models API, showcase how to configure models with Model Builder, and demonstrate how to use the Models API’s core generative AI capabilities.
What is Model Builder?
Model Builder is a tool for working with LLMs in Salesforce. When you activate generative AI features in Setup, you’ll see pre-configured models hosted by Salesforce (managed models) and default configured models from external providers like Azure and OpenAI.
You can also connect custom external models using Bring Your Own LLM (BYOLLM) technology, which supports OpenAI, Azure OpenAI, Google Vertex AI, and Amazon Bedrock. Additionally, the BYOLLM Open Connector allows you to connect to any LLM, including custom-built models.
Configuring and connecting models with Model Builder
In Model Builder, you can test and configure various settings for each model, such as temperature, frequency penalty, and presence penalty. These settings will be applied whenever the model is invoked from the platform.
Once your models are configured and ready to be used, you can invoke them from the platform. One way to invoke them is through prompt templates. When you create a prompt template in Prompt Builder, you assign a model to the template. That’s the model that the template will use every time it is invoked. Watch this video for more information on templates.
Introducing the Models API
Prompt templates aren’t the only way to interact with your custom models. Since the Summer ’24 release, models can be invoked directly via Apex or REST through the Models API (currently in beta). When invoking models through the Models API, there’s no prompt template involved. You craft your prompts directly in code.
Note that prompt templates can also be invoked from code using the Connect API, but generally, the Models API and prompt templates are suitable for different use cases. Prompt templates are a low-code tool that helps admins combine generative AI and CRM data to solve CRM-centric use cases. The Models API is an extensible and flexible set of tools to solve custom AI use cases in Salesforce and beyond.
LLMs and the Einstein Trust Layer
Something important to keep in mind is that every interaction between the Salesforce Platform and a model goes through the Einstein Trust Layer. This is true for all entry points through which models are invoked, including prompt templates and the Models API, and for all models. This means that every time you use generative AI in Salesforce, you get all the security benefits that the trust layer provides, along with auditing and feedback-logging capabilities. Watch this video to learn more about the trust layer.
The benefits of a unified interface
One more benefit of the Models API is that you can swap the model being used behind the scenes without having to change your code. The APIs act as interfaces that decouple the calling code from the model. This makes writing apps much easier than if you had to write specific code to call OpenAI or Azure APIs directly.
The endpoints available to work with models on the Connect API and the Models API (see Connect REST API Developer Guide and Models REST API Reference) have equivalent Apex classes available on the platform (see Connect API Apex Reference and Models API Apex Reference). You can use these classes in your Apex business logic and Lightning web components to build generative AI-powered apps. And because the APIs can also be reached from outside Salesforce, you can create custom external apps that invoke models through the Einstein Trust Layer. Cool, right?
The Models API at a glance
The Models API is a comprehensive suite of services built to support generative AI applications in Salesforce and elsewhere. It includes:
- Generations API
- Chat Generations API
- Feedback API
- Embeddings API
Let’s take a tour of each to understand how and when we might want to use them.
Generations API
You can use the Generations API to generate text in response to a single-turn interaction that doesn’t require context from previous responses or CRM data. Which generative AI use cases are single-turn interactions? Many, if not most of them. Examples of single-turn interactions include summarizing notes from a transcript, generating SOQL queries using natural language, and translating text to different languages.
Below, we use the Generations API to write a custom SOQL query. This is exactly the kind of use case that works best with the Generations API because the entire context (the SOQL query) can be contained within a single interaction and doesn’t require Salesforce data.
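Here’s a minimal Apex sketch of that call, following the pattern of the aiplatform namespace wrapper classes in the Models API Apex Reference. The model name is a placeholder for one of your configured models, and the exact class and property names may shift while the API is in beta, so verify them against your org.

```apex
// Minimal sketch of a single-turn generation call (Models API, beta).
// Class/property names follow the aiplatform wrapper pattern documented in
// the Models API Apex Reference; the model name is a placeholder.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

// Build the request against a configured model
aiplatform.ModelsAPI.createGenerations_Request request = new aiplatform.ModelsAPI.createGenerations_Request();
request.modelName = 'sfdc_ai__DefaultGPT4Omni';

aiplatform.ModelsAPI_GenerationRequest body = new aiplatform.ModelsAPI_GenerationRequest();
body.prompt = 'Write a SOQL query that returns the 10 most recently created '
    + 'Opportunities with an Amount greater than 50,000.';
request.body = body;

try {
    // Invoke the model and read back the generated text
    aiplatform.ModelsAPI.createGenerations_Response response = modelsAPI.createGenerations(request);
    System.debug(response.Code200.generation.generatedText);
} catch (aiplatform.ModelsAPI.createGenerations_ResponseException e) {
    System.debug('Generation failed with response code ' + e.responseCode);
}
```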
You may be asking yourself, “Can’t I generate text with a prompt template?” The answer is yes! However, prompt templates were designed with the admin persona in mind. They allow admins to test, change, version, and ground prompts easily with clicks, not code. Meanwhile, the Models API empowers developers to build prompts dynamically in code to support more complex use cases.
Chat Generations API
The Chat Generations API is your go-to solution for any use case that involves multi-turn interactions, whether that’s a general-use chatbot for brainstorming or a specialized virtual assistant. If the use case requires long-running context or multiple interactions, then use the Chat Generations API.
Creating an Apex service for a chatbot is straightforward. The chat begins with a user’s prompt, which is sent to the Chat Generations API as a chat message request. That request contains a message list holding the user’s prompts. For each subsequent turn, append the user’s new prompt and the assistant’s previous response as ChatMessageRequest entries to the list. Voila! You have the backend for a generative AI chatbot.
In the code below, we create a conversation for a chatbot using the Chat Generations API.
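The snippet below is a hedged sketch of that backend. It uses the same aiplatform wrapper pattern as the previous example; the message classes and the response shape (Code200.generationDetails.generations) are assumptions modeled on the Models API reference, and the model name is again a placeholder.

```apex
// Sketch of a multi-turn chat backend (Models API, beta).
// Wrapper class names and the response shape are assumptions based on the
// Models API Apex Reference; verify them in your org.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

aiplatform.ModelsAPI.createChatGenerations_Request request = new aiplatform.ModelsAPI.createChatGenerations_Request();
request.modelName = 'sfdc_ai__DefaultGPT4Omni'; // placeholder configured model

aiplatform.ModelsAPI_ChatGenerationsRequest body = new aiplatform.ModelsAPI_ChatGenerationsRequest();
body.messages = new List<aiplatform.ModelsAPI_ChatMessageRequest>();
request.body = body;

// Turn 1: the user's opening prompt
aiplatform.ModelsAPI_ChatMessageRequest userMessage = new aiplatform.ModelsAPI_ChatMessageRequest();
userMessage.role = 'user';
userMessage.content = 'Help me brainstorm names for a customer loyalty program.';
body.messages.add(userMessage);

aiplatform.ModelsAPI.createChatGenerations_Response response = modelsAPI.createChatGenerations(request);
String assistantReply = response.Code200.generationDetails.generations[0].content;

// Turn 2: append the assistant's reply and the user's follow-up, then call again
aiplatform.ModelsAPI_ChatMessageRequest assistantMessage = new aiplatform.ModelsAPI_ChatMessageRequest();
assistantMessage.role = 'assistant';
assistantMessage.content = assistantReply;
body.messages.add(assistantMessage);

aiplatform.ModelsAPI_ChatMessageRequest followUp = new aiplatform.ModelsAPI_ChatMessageRequest();
followUp.role = 'user';
followUp.content = 'I like the third one. Can you suggest a few variations?';
body.messages.add(followUp);

response = modelsAPI.createChatGenerations(request);
```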
Keep in mind that the Chat Generations API isn’t limited to chats. Consider the Chat Generations API whenever the use case calls for some back-and-forth, brainstorming, or open-ended dialogue between the assistant and the user.
Feedback API
Gathering feedback from users is crucial to understanding and improving the user experience. A tweak to a system prompt or a different model selection can have a meaningful impact on your generations. Use the Feedback API to power feedback UIs wherever you deploy the Generations and Chat Generations APIs.
The Feedback API expects at least three inputs: an identifier for the feedback, an identifier for the generation, and the user’s feedback. The API includes fields to store the user’s overall sentiment (feedback) and comments (feedbackText). All data from the Feedback API is stored in Salesforce Data Cloud. This same API powers the feedback frameworks for other generative AI features like Prompt Builder and Agentforce.
Below, we build the backend to collect feedback from our generations using the Feedback API.
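The sketch below wraps that in a small, hypothetical Apex service. The submitFeedback wrapper classes and the 'GOOD'/'BAD' sentiment values are assumptions modeled on the Feedback API documentation; confirm the exact names in the Models API Apex Reference before relying on this.

```apex
// Sketch of a feedback backend (Models API, beta).
// Wrapper class names and the 'GOOD'/'BAD' sentiment values are assumptions
// based on the Feedback API docs; verify against the Models API Apex Reference.
public with sharing class GenerationFeedbackService {

    public static void submitFeedback(String generationId, Boolean isPositive, String comments) {
        aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

        aiplatform.ModelsAPI.submitFeedback_Request request = new aiplatform.ModelsAPI.submitFeedback_Request();
        aiplatform.ModelsAPI_FeedbackRequest body = new aiplatform.ModelsAPI_FeedbackRequest();

        // The three required inputs: a feedback identifier, the generation
        // identifier, and the user's overall sentiment
        body.id = 'feedback-' + String.valueOf(Datetime.now().getTime());
        body.generationId = generationId;
        body.feedback = isPositive ? 'GOOD' : 'BAD';
        body.feedbackText = comments; // optional free-form comments
        request.body = body;

        modelsAPI.submitFeedback(request);
    }
}
```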
In addition to collecting feedback, you can also collect app-specific feedback using the appFeedback and appFeedbackText fields to separate sentiments about the generation and the app.
Embeddings API
Embeddings are numerical representations of a chunk of text, an image, or a video. In Salesforce, these embeddings are a list of doubles.
Think of an embedding as coordinates on a very complicated map. Text with similar meaning sits closer together on the map, even when it uses different words, because embeddings capture meaning rather than exact wording. Using embeddings, you can programmatically determine whether two or more chunks are related, which can power a wide range of use cases, including semantic search engines and retrieval-augmented generation for prompt engineering.
With the Embeddings API, you can create embeddings of multiple chunks of text in a single API call. From there, you can store the embeddings for later or create a quick ranking or comparison between them.
In our code example below, we create embeddings of three chunks of text with a single call to the Embeddings API.
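Here’s a hedged sketch of that call. The embedding model name is a placeholder for a configured embedding model in your org, and the response shape is an assumption modeled on the Embeddings API documentation.

```apex
// Sketch of an Embeddings API call for three chunks of text (Models API, beta).
// The model name is a placeholder and the response shape (Code200.embeddings)
// is an assumption based on the Embeddings API docs; verify in your org.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

aiplatform.ModelsAPI.createEmbeddings_Request request = new aiplatform.ModelsAPI.createEmbeddings_Request();
request.modelName = 'sfdc_ai__DefaultOpenAITextEmbeddingAda_002'; // placeholder embedding model

aiplatform.ModelsAPI_EmbeddingRequest body = new aiplatform.ModelsAPI_EmbeddingRequest();
body.input = new List<String>{
    'Our return policy lasts 30 days.',
    'Refunds are issued within one month of purchase.',
    'Our headquarters are in San Francisco.'
};
request.body = body;

aiplatform.ModelsAPI.createEmbeddings_Response response = modelsAPI.createEmbeddings(request);

// One embedding (a list of doubles) comes back per input chunk
for (aiplatform.ModelsAPI_EmbeddingResponse chunkEmbedding : response.Code200.embeddings) {
    System.debug('Dimensions: ' + chunkEmbedding.embedding.size());
}
```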
Note that an embedding model always returns an embedding of the same length, irrespective of how much text it receives, so you can compare any two embeddings directly. For example, OpenAI’s text-embedding-ada-002 model always returns an embedding with 1,536 values (or dimensions), no matter how long or short the chunk of text is.
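Because the vectors always have the same length, you can score how related two chunks are with standard vector math. Here’s a small, self-contained Apex helper that computes cosine similarity; it’s plain arithmetic and doesn’t depend on any Models API classes. Values close to 1 mean the underlying chunks of text are closely related.

```apex
// Cosine similarity between two embeddings of equal length.
public static Double cosineSimilarity(List<Double> a, List<Double> b) {
    Double dotProduct = 0;
    Double normA = 0;
    Double normB = 0;
    for (Integer i = 0; i < a.size(); i++) {
        dotProduct += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}
```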
Conclusion
Model Builder and the Models API are exciting and flexible solutions for extending Salesforce’s generative AI capabilities on the platform and beyond. The Models API’s unified interface simplifies the process of integrating a growing list of generative AI capabilities into custom applications to solve a variety of use cases. You can use the Generations API for single-turn interactions, the Chat Generations API for multi-turn conversations, the Feedback API for gathering user feedback, and the Embeddings API for semantic similarity tasks.
By using Model Builder and the Models API, you have the opportunity to be at the forefront of building custom AI applications while benefiting from the security of Salesforce’s Einstein Trust Layer. Whether you’re building chatbots, generating content, analyzing sentiment, or powering semantic search, these tools empower you to bring your ideas to life in Salesforce and beyond.
Ready to learn more? Check out the Get Started With the Models API Trailhead module.
Resources
- Trailhead: Get Started with the Models API
- Guide: Models API Developer Guide (Beta)
- Documentation: Create, Connect, and Activate Models
- Video: Inside the Einstein Trust Layer
About the authors
Charles Watkins is a Lead Developer Advocate at Salesforce. You can follow him on GitHub or LinkedIn.
Alba Rivas works as a Principal Developer Advocate at Salesforce. You can follow her on GitHub or LinkedIn.