Imagine being able to create custom chatbots or generative AI services within your Salesforce org, effortlessly integrating with multiple large language models (LLMs) using the REST API or Apex. This is now possible with Model Builder and the Models API.
In this blog post, we will explore Model Builder and the Models API, showcase how to configure models with Model Builder, and demonstrate how to use the Models API’s core generative AI capabilities.
What is Model Builder?
Model Builder is a tool for working with LLMs in Salesforce. When you activate generative AI features in Setup, you’ll see pre-configured models hosted by Salesforce (managed models) and default configured models from external providers like Azure and OpenAI.
You can also connect custom external models using Bring Your Own LLM (BYOLLM) technology, which supports OpenAI, Azure OpenAI, Google Vertex AI, and Amazon Bedrock. Additionally, the BYOLLM Open Connector allows you to connect to any LLM, including custom-built models.
Configuring and connecting models with Model Builder
In Model Builder, you can test and configure various settings for each model, such as temperature, frequency penalty, and presence penalty. These settings will be applied whenever the model is invoked from the platform.
Once your models are configured and ready to be used, you can invoke them from the platform. One way to invoke them is through prompt templates. When you create a prompt template in Prompt Builder, you assign a model to the template. That’s the model that the template will use every time it is invoked. Watch this video for more information on templates.
Introducing the Models API
Prompt templates aren’t the only way to interact with your custom models. Since the Summer ’24 release, models can be invoked directly via Apex or REST through the Models API (currently in beta). When invoking models through the Models API, there’s no prompt template involved. You craft your prompts directly in code.
Note that prompt templates can also be invoked from code using the Connect API, but generally, the Models API and prompt templates are suitable for different use cases. Prompt templates are a low-code tool that helps admins combine generative AI and CRM data to solve CRM-centric use cases. The Models API is an extensible and flexible set of tools to solve custom AI use cases in Salesforce and beyond.
LLMs and the Einstein Trust Layer
Something important to keep in mind is that every interaction between the Salesforce Platform and a model goes through the Einstein Trust Layer. This is true for all entry points through which models are invoked, including prompt templates and the Models API, and for all models. This means that every time you use generative AI in Salesforce, you get all the security benefits that the trust layer provides, along with auditing and feedback-logging capabilities. Watch this video to learn more about the trust layer.
The benefits of a unified interface
One more benefit of the Models API is that you can swap the model being used behind the scenes without having to change your code. The APIs act as interfaces that decouple the calling code from the model. This makes writing apps much easier than if you had to write specific code to call OpenAI or Azure APIs directly.
The endpoints available to work with models on the Connect API and the Models API (see Connect REST API Developer Guide and Models REST API Reference) have equivalent Apex classes available on the platform (see Connect API Apex Reference and Models API Apex Reference). You can use these classes in your Apex business logic and Lightning web components to build generative AI-powered apps. And because the APIs can also be reached from outside Salesforce, you can create custom external apps that invoke models through the Einstein Trust Layer. Cool, right?
The Models API at a glance
The Models API is a comprehensive suite of services built to support generative AI applications in Salesforce and elsewhere. It includes:
- Generations API
- Chat Generations API
- Feedback API
- Embeddings API
Let’s take a tour of each to understand how and when we might want to use them.
Generations API
You can use the Generations API to generate text in response to a single-turn interaction that doesn’t require context from previous responses or CRM data. Which generative AI use cases are single-turn interactions? Many, if not most of them. Examples of single-turn interactions include summarizing notes from a transcript, generating SOQL queries using natural language, and translating text to different languages.
Below, we use the Generations API to write a custom SOQL query. This is exactly the kind of use case that works best with the Generations API because the entire context (the SOQL query) can be contained within a single interaction and doesn’t require Salesforce data.
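Here’s a minimal Apex sketch of that call, following the pattern of the aiplatform namespace wrapper classes in the Models API Apex Reference. The model name is a placeholder for one of your configured models, and the exact class and property names may shift while the API is in beta, so verify them against your org.

```apex
// Minimal sketch of a single-turn generation call (Models API, beta).
// Class/property names follow the aiplatform wrapper pattern documented in
// the Models API Apex Reference; the model name is a placeholder.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

// Build the request against a configured model
aiplatform.ModelsAPI.createGenerations_Request request = new aiplatform.ModelsAPI.createGenerations_Request();
request.modelName = 'sfdc_ai__DefaultGPT4Omni';

aiplatform.ModelsAPI_GenerationRequest body = new aiplatform.ModelsAPI_GenerationRequest();
body.prompt = 'Write a SOQL query that returns the 10 most recently created '
    + 'Opportunities with an Amount greater than 50,000.';
request.body = body;

try {
    // Invoke the model and read back the generated text
    aiplatform.ModelsAPI.createGenerations_Response response = modelsAPI.createGenerations(request);
    System.debug(response.Code200.generation.generatedText);
} catch (aiplatform.ModelsAPI.createGenerations_ResponseException e) {
    System.debug('Generation failed with response code ' + e.responseCode);
}
```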
You may be asking yourself, “Can’t I generate text with a prompt template?” The answer is yes! However, prompt templates were designed with the admin persona in mind. They allow admins to test, change, version, and ground prompts easily with clicks, not code. Meanwhile, the Models API empowers developers to build prompts dynamically in code to support more complex use cases.
Chat Generations API
The Chat Generations API is your go-to solution for any use case that involves multi-turn interactions, whether that’s a general-use chatbot for brainstorming or a specialized virtual assistant. If the use case requires long-running context or multiple interactions, then use the Chat Generations API.
Creating an Apex service for a chatbot is straightforward. The chat begins with a user’s prompt, which is sent to the Chat Generations API as a chat message request. That request contains a message list holding the user’s prompts. For each subsequent turn, append the user’s new prompt and the assistant’s previous response as ChatMessageRequest entries to the list. Voila! You have the backend for a generative AI chatbot.
In the code below, we create a conversation for a chatbot using the Chat Generations API.
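The snippet below is a hedged sketch of that backend. It uses the same aiplatform wrapper pattern as the previous example; the message classes and the response shape (Code200.generationDetails.generations) are assumptions modeled on the Models API reference, and the model name is again a placeholder.

```apex
// Sketch of a multi-turn chat backend (Models API, beta).
// Wrapper class names and the response shape are assumptions based on the
// Models API Apex Reference; verify them in your org.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

aiplatform.ModelsAPI.createChatGenerations_Request request = new aiplatform.ModelsAPI.createChatGenerations_Request();
request.modelName = 'sfdc_ai__DefaultGPT4Omni'; // placeholder configured model

aiplatform.ModelsAPI_ChatGenerationsRequest body = new aiplatform.ModelsAPI_ChatGenerationsRequest();
body.messages = new List<aiplatform.ModelsAPI_ChatMessageRequest>();
request.body = body;

// Turn 1: the user's opening prompt
aiplatform.ModelsAPI_ChatMessageRequest userMessage = new aiplatform.ModelsAPI_ChatMessageRequest();
userMessage.role = 'user';
userMessage.content = 'Help me brainstorm names for a customer loyalty program.';
body.messages.add(userMessage);

aiplatform.ModelsAPI.createChatGenerations_Response response = modelsAPI.createChatGenerations(request);
String assistantReply = response.Code200.generationDetails.generations[0].content;

// Turn 2: append the assistant's reply and the user's follow-up, then call again
aiplatform.ModelsAPI_ChatMessageRequest assistantMessage = new aiplatform.ModelsAPI_ChatMessageRequest();
assistantMessage.role = 'assistant';
assistantMessage.content = assistantReply;
body.messages.add(assistantMessage);

aiplatform.ModelsAPI_ChatMessageRequest followUp = new aiplatform.ModelsAPI_ChatMessageRequest();
followUp.role = 'user';
followUp.content = 'I like the third one. Can you suggest a few variations?';
body.messages.add(followUp);

response = modelsAPI.createChatGenerations(request);
```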
Keep in mind that the Chat Generations API isn’t limited to chats. Consider the Chat Generations API whenever the use case calls for some back-and-forth, brainstorming, or open-ended dialogue between the assistant and the user.
Feedback API
Gathering feedback from users is crucial to understanding and improving the user experience. A tweak to a system prompt or a different model selection can have a meaningful impact on your generations. Use the Feedback API to power feedback UIs wherever you deploy the Generations and Chat Generations APIs.
The Feedback API expects at least three inputs: an identifier for the feedback, an identifier for the generation, and the user’s feedback. The API includes fields to store the user’s overall sentiment (feedback) and comments (feedbackText). All data from the Feedback API is stored in Salesforce Data Cloud. This same API powers the feedback frameworks for other generative AI features like Prompt Builder and Agentforce.
Below, we build the backend to collect feedback from our generations using the Feedback API.
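The sketch below wraps that in a small, hypothetical Apex service. The submitFeedback wrapper classes and the 'GOOD'/'BAD' sentiment values are assumptions modeled on the Feedback API documentation; confirm the exact names in the Models API Apex Reference before relying on this.

```apex
// Sketch of a feedback backend (Models API, beta).
// Wrapper class names and the 'GOOD'/'BAD' sentiment values are assumptions
// based on the Feedback API docs; verify against the Models API Apex Reference.
public with sharing class GenerationFeedbackService {

    public static void submitFeedback(String generationId, Boolean isPositive, String comments) {
        aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

        aiplatform.ModelsAPI.submitFeedback_Request request = new aiplatform.ModelsAPI.submitFeedback_Request();
        aiplatform.ModelsAPI_FeedbackRequest body = new aiplatform.ModelsAPI_FeedbackRequest();

        // The three required inputs: a feedback identifier, the generation
        // identifier, and the user's overall sentiment
        body.id = 'feedback-' + String.valueOf(Datetime.now().getTime());
        body.generationId = generationId;
        body.feedback = isPositive ? 'GOOD' : 'BAD';
        body.feedbackText = comments; // optional free-form comments
        request.body = body;

        modelsAPI.submitFeedback(request);
    }
}
```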
In addition to collecting feedback, you can also collect app-specific feedback using the appFeedback and appFeedbackText fields to separate sentiments about the generation and the app.
Embeddings API
Embeddings are numerical representations of a chunk of text, an image, or a video. In Salesforce, these embeddings are a list of doubles.
Think of an embedding as coordinates on a very complicated map. Text with similar meaning sits closer together on the map, even when it uses different words, because embeddings capture meaning rather than exact wording. Using embeddings, you can programmatically determine whether two or more chunks are related, which can power a wide range of use cases, including semantic search engines and retrieval-augmented generation for prompt engineering.
With the Embeddings API, you can create embeddings of multiple chunks of text in a single API call. From there, you can store the embeddings for later or create a quick ranking or comparison between them.
In our code example below, we create embeddings of three chunks of text with a single call to the Embeddings API.
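Here’s a hedged sketch of that call. The embedding model name is a placeholder for a configured embedding model in your org, and the response shape is an assumption modeled on the Embeddings API documentation.

```apex
// Sketch of an Embeddings API call for three chunks of text (Models API, beta).
// The model name is a placeholder and the response shape (Code200.embeddings)
// is an assumption based on the Embeddings API docs; verify in your org.
aiplatform.ModelsAPI modelsAPI = new aiplatform.ModelsAPI();

aiplatform.ModelsAPI.createEmbeddings_Request request = new aiplatform.ModelsAPI.createEmbeddings_Request();
request.modelName = 'sfdc_ai__DefaultOpenAITextEmbeddingAda_002'; // placeholder embedding model

aiplatform.ModelsAPI_EmbeddingRequest body = new aiplatform.ModelsAPI_EmbeddingRequest();
body.input = new List<String>{
    'Our return policy lasts 30 days.',
    'Refunds are issued within one month of purchase.',
    'Our headquarters are in San Francisco.'
};
request.body = body;

aiplatform.ModelsAPI.createEmbeddings_Response response = modelsAPI.createEmbeddings(request);

// One embedding (a list of doubles) comes back per input chunk
for (aiplatform.ModelsAPI_EmbeddingResponse chunkEmbedding : response.Code200.embeddings) {
    System.debug('Dimensions: ' + chunkEmbedding.embedding.size());
}
```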
Note that an embedding model always returns an embedding of the same length, irrespective of how much text it receives, so you can compare any two embeddings directly. For example, OpenAI’s text-embedding-ada-002 model always returns an embedding with 1,536 values (or dimensions), no matter how long or short the chunk of text is.
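Because the vectors always have the same length, you can score how related two chunks are with standard vector math. Here’s a small, self-contained Apex helper that computes cosine similarity; it’s plain arithmetic and doesn’t depend on any Models API classes. Values close to 1 mean the underlying chunks of text are closely related.

```apex
// Cosine similarity between two embeddings of equal length.
public static Double cosineSimilarity(List<Double> a, List<Double> b) {
    Double dotProduct = 0;
    Double normA = 0;
    Double normB = 0;
    for (Integer i = 0; i < a.size(); i++) {
        dotProduct += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
}
```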
Conclusion
Model Builder and the Models API are exciting and flexible solutions for extending Salesforce’s generative AI capabilities on the platform and beyond. The Models API’s unified interface simplifies the process of integrating a growing list of generative AI capabilities into custom applications to solve a variety of use cases. You can use the Generations API for single-turn interactions, the Chat Generations API for multi-turn conversations, the Feedback API for gathering user feedback, and the Embeddings API for semantic similarity tasks.
By using Model Builder and the Models API, you have the opportunity to be at the forefront of building custom AI applications while benefiting from the security of Salesforce’s Einstein Trust Layer. Whether you’re building chatbots, generating content, analyzing sentiment, or powering semantic search, these tools empower you to bring your ideas to life in Salesforce and beyond.
Ready to learn more? Check out the Get Started With the Models API Trailhead module.
Resources
- Trailhead: Get Started with the Models API
- Guide: Models API Developer Guide (Beta)
- Documentation: Create, Connect, and Activate Models
- Video: Inside the Einstein Trust Layer
About the authors
Charles Watkins is a Lead Developer Advocate at Salesforce. You can follow him on GitHub or LinkedIn.
Alba Rivas works as a Principal Developer Advocate at Salesforce. You can follow her on GitHub or LinkedIn.