Inside the Einstein Trust Layer

Today, companies are looking to adopt generative AI at a record pace. For Salesforce Developers, the Einstein 1 Platform enables this rapid technology shift without compromising trust, thanks to the Einstein Trust Layer. As you get to know the new Einstein 1 Platform, you likely want to learn more about how the Trust Layer works. In this blog post, we’ll take a deep dive into the Trust Layer and learn how your data flows through it to create rich, powerful, and correct responses.

Generating content with large language models (LLMs)

There are many degrees of complexity that come along with bringing generative AI into your tech stack. It can be as complex as training your own model or as simple as using existing LLMs through their APIs. In fact, extending existing APIs is the fastest and most common strategy for getting started with LLMs. You can learn more about these methods in our previous blog post, Building Apps with LLMs and Einstein.

The Einstein 1 Platform makes it easier for Salesforce Developers to build apps powered by LLMs. It provides a secure entry point into LLM offerings from many of our AI partners, which enables you to deliver powerful AI apps to your company more quickly.

There are three ways to generate content within Salesforce, including:

Our CRM solutions, like Sales Cloud and Service Cloud, are packed full of features that use generative AI to assist users and create content. For example, Einstein Reply Recommendations uses generative AI to help users write the perfect chat response based on the history of the customer and the conversation.
We have recently launched the Einstein Copilot Studio, which brings together a number of tools including Prompt Builder, which helps you construct prompt templates using merge fields from records and data provided by Flow and Data Cloud. This allows you to generate text responses for field values and emails, and even generate responses inside of your flows.
Developers will soon be able to make calls to Einstein right inside of Apex using the Einstein LLM Generations API. This will allow you to generate responses from an LLM anywhere in Salesforce and very easily bring AI into all of the apps that you build.

The most important aspect of these AI capabilities is that every single LLM generation is created through the Trust Layer.

The Einstein Trust Layer

The Trust Layer is a secure intermediary for user interactions with LLMs, masking personally identifiable information (PII), checking output toxicity, ensuring data privacy, preventing user data persistence or its use in further training, and standardizing differences among model providers. Let’s take a deeper look at how data flows through the Trust Layer and how it ensures that each transaction is controlled.

Securing a prompt before generation

The first step to generating a response from the Trust Layer is to provide it with a prompt. That prompt can come from any of the out-of-the box CRM apps, it could be defined in Prompt Builder, or could be passed from Apex. No matter the source, it then runs through the Trust Layer.

If the concept of a prompt is new to you, think of it as the input given to the model to instruct or guide its response. It’s essentially the starting point or trigger for the model to generate content. Depending on the prompt, the model can produce a wide range of outputs, from answering questions and writing emails to creating poetry or even generating code.

The Einstein Trust Layer handling the prompt before generation

When the prompt is provided, the Trust Layer is activated. Let’s follow along with an example of a prompt created in Prompt Builder. This prompt takes a contact record and creates an account overview for the customer based on quite a bit of data.

The sample prompt parameterizes the following data:

Customer name
Customer description
Lifetime spend
The year they joined as a customer
A list of Salesforce tasks that reference the customer
A list of the customer’s recent orders

All of this data can be used to generate a powerful response, but it also contains sensitive data that you may not want to share with a third-party provider or even have in transit. Let’s take a look at how we protect information before it is processed.

Secure data retrieval with the Einstein Trust Layer

The first step in the Trust Layer is secure data retrieval. This takes any direct merge fields inside of the prompt and grounds it with record data from Salesforce. Grounding is the process of adding additional context to the prompt that will enable the LLM to generate a response with much more relevance and less chance of hallucination. Hallucination occurs when an AI model “makes-up” information that isn’t based on its training data or the given prompt. This is necessary because many commercially available models are only trained up to a certain date, and providing recent data inside of the prompt will result in a higher level of accuracy.

There are two types of secure data retrieval that may happen:

Client-side grounding: Also referred to as grounding with the page context, this occurs when a prompt is being selected in the context of a record page and populates the merge fields with the currently stored data. For example, grounding occurs when generating an email to a contact in the context of the current record.
Server-side grounding: This occurs when a response is being generated behind the scenes. For example, if a prompt is being populated through a Flow or Apex, the Trust Layer will be compiled by querying the database directly.

In the account overview example above, all of the simple fields that are direct references to the target contact record would be grounded in the secure data retrieval phase.

In the example, the fields include:

{{{Contact.Name}}}
{{{Contact.Description}}}
{{{Contact.LifetimeSpend__c}}}
{{{Contact.FirstPurchaseYear__c}}}

This phase does not include the related data retrieving mechanisms like querying data from Flow or Data Cloud. These are both part of the next step: dynamic grounding.

Dynamic grounding with the Einstein Trust Layer

Through the process of dynamic grounding, additional data is brought into the prompt that incorporates business logic or external data sources. This level of grounding only occurs on the server side of the transaction. When the prompt is being hydrated, each data provider, like Flow or Data Cloud, is called to add additional information.

There is currently support for data providers through Flow, that can be made available to prompts. These flows can take the input of a target recordId and return data to the prompt, for example, if you want to retrieve a list of related cases and add them to give further context to the prompt. This also opens up a new world of bringing data from external integrations into the prompt through an API call inside of the flow.

We are also working on supporting vector search and direct MuleSoft API calls through dynamic grounding. This could allow you to perform semantic retrieval and search against your knowledge base to find the most relevant snippets of information.

In the account overview prompt, we are using dynamic grounding in two ways:

The {{{Flow.Get_Tasks_from_Contact}}} data retriever is querying data directly from the CRM and returning related tasks.
The {{{DataCloudRetrieve:RealTimePersonalizationModel:TYPE:Contact_00D8Z000001rteH_dll.recentOrders[0]}}} data retriever is used to get recent orders from Data Cloud. These orders exist in external systems and can be included via Data Cloud.

Once this process has been completed, we can now move on to the next step of the process: data masking.

Data masking with the Einstein Trust Layer

In many instances, companies don’t want PII to be shared with a third party, or to even be in transit over a connection. Through the process of data masking, we use a named entity detection tool that offers broader coverage, including government ID and payment card industry (PCI) entities, and helps protect a customer’s sensitive information.

When we identify a PII element within a prompt, we substitute it with a designated placeholder. Each detected entity is masked using a combination of its type and a sequential number. For example, the first detected name becomes PERSON_0, the next becomes PERSON_1, and so on. The Trust Layer temporarily stores the relationship between the original entities and their respective placeholders.

The entity types that are currently detected and masked are available in the help document for the Einstein Trust Layer and updated periodically.

In the account overview prompt, the data that has been populated in the secure data retrieval and dynamic grounding steps is then replaced with the relevant masking.

For example:

Write an account overview for the customer William Clark.

is masked and then becomes:

Write an account overview for the customer PERSON_0.

Once the data has been masked, it then goes into the final step before it is ready for generation.

Prompt defense with the Einstein Trust Layer

We follow masking with a set of prompt defense heuristics, such as instruction defense, to steer the model output toward a desirable outcome, along with post-prompting instructions that further guard against prompt injection attacks. These not only safeguard the prompt, but also ensure that if a model does not know the answer to a question based on the context provided, it will be steered away from hallucinating an incorrect response.

Here is a simple example showcasing how we are appending instructions to the start and end of the prompt:

Generating a completion with the LLM gateway

Once a prompt has been fully compiled and secured, it is ready to be sent through the LLM gateway. The gateway governs interactions with different model providers and represents a single way to communicate with multiple data sources.

The Einstein Trust Layer highlighting the large language model gateway

When the prompt hits the gateway, it is sent in the direction of the model that is required for processing. If the prompt is sent to external models that are part of our shared trust architecture, it is encrypted in flight and the data within them is not retained by the model that it is calling.

The first LLM partner that we have launched with is OpenAI. This “zero retention” architecture ensures that customer data is never stored outside of Salesforce.

Additionally, OpenAI has an enterprise API for content moderation that can alert us to unusual or abusive inputs or outputs of their model, then OpenAI forgets both the prompt and the output the moment their API processes our request. If abuse is detected, Salesforce is notified.

There are a number of ways that you can incorporate AI models into the LLM gateway. In the current release of the gateway, we have set up easy access to OpenAI models through the trust boundary. In the roadmap, we are working to enable teams to bring their own LLM and/or use additional external models like Anthropic. We also have the power to host our own models, like CodeGen which powers Einstein for Developers, which helps you generate code based on natural language.

We are also supporting a bring-your-own-model approach that enables teams to create predictive AI models using their existing machine learning platforms – training these models with data in Data Cloud – then activate those predictions across Salesforce. Customers can create these predictive models within Amazon SageMaker today, and we are additionally in Pilot with Google Vertex AI and Databricks.

Delivering a response after generation

Once the prompt has been passed into a model for generation, a response is generated and prepared to send back to the user. In this case, it is a text generation. The Trust Layer can now start to process and filter the response, so that it is acceptable by the time it gets back to the destination that the generation originated.

The Einstein Trust Layer handling the response

Toxicity detection

The first step in processing the response is for the Trust Layer to perform toxicity detection on the response. The API passes score information back to the caller and then stores this information in the data storage system for review. Most LLMs are trained on a vast corpus of text that includes a wide range of problematic content, and in some cases, this can lead to responses that are not suitable for companies. While we can do our best to safeguard against this type of response when preparing the prompt, the toxicity scoring adds a second layer of protection from inappropriate content.

Each response is tested based on the following categories:

Toxicity: A rude, disrespectful, or unreasonable comment that will likely cause people to leave the discussion
Hate: Content that expresses, incites, or promotes hatred based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, caste, or violence or severe harm toward the targeted group
Identity: Negative or hateful comments targeting someone because of their identity
Violence: Content that promotes or glorifies violence or celebrates the suffering or humiliation of others
Physical harm: Content that provides unsafe advice that may harm the user or others physically, or content that promotes, encourages, or depicts acts of self-harm
Sexual: Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness)
Profanity: Swear words, curse words, or obscene or profane language

The gateway also provides an overall safety score from 0 (least safe, most toxic) to 1 (most safe), and represents an ensemble of all category scores.

The Einstein toxicity detector uses a hybrid solution combining a rule-based profanity filter and an AI model developed by Salesforce Research (Transformer / Flan-T5-base model trained on 2.3 M prompts from seven legal-approved datasets). Currently, toxicity confidence scoring is only supported for English.

Data demasking

Once toxicity detection is complete, we can then re-hydrate the prompt with the PII that was filtered out of the prompt in the data demasking step. This will allow us to use the same process of matching the keys back to the initial prompt and providing the user with a fully demasked response.

At this point, the response is delivered.

Feedback framework & audit trail

When a response is presented to a user in Salesforce, they have the ability to provide feedback on the quality of the content through the feedback framework. Users can accept, modify, or reject the result, the outcome of which is stored as part of our feedback framework. It is then logged in our audit trail, enabling us to refine our own internal and Salesforce-hosted models once those become available later this year.

The audit trail includes timestamped metadata detailing the context of the interaction with the LLM, including the original prompt, safety scores logged during toxicity detection, and the original output from the LLM. It also includes any action taken by the end user, such as whether or not they accepted or rejected the output, and any modifications they may have made before using or rejecting that generation. The audit trail helps to simplify compliant use of generative AI at scale.

Summary

Companies are eager to securely bring generative AI into their organizations, and we are making that possible with the Einstein Trust Layer.

To recap, the process that each and every prompt goes through includes: secure data retrieval; grounding the prompt by adding context; dynamic grounding, which fetches additional data; data masking to hide PII; and prompt defense to further secure the prompt against unwanted injections.

Once a prompt is compiled and secured, it’s transmitted through the LLM gateway, which oversees interactions with different model providers and ensures that the data remains within Salesforce. After a response is produced, the Trust Layer assesses it for toxicity and scores the response for overall safety.

After toxicity detection, the response is demasked and presented to the user, who can provide feedback on its quality. This feedback, alongside the entire interaction’s metadata, is logged in an audit trail, helping Salesforce refine its models and ensuring compliant use of generative AI at scale.

At the end of the day, the Einstein Trust Layer is a safety net, ensuring that our interactions with AI are both meaningful and trusted.

Resources

Learn more about Einstein Copilot
Learn more about Prompt Builder
Learn more about Salesforce Data Cloud

About the author

Stephan Chandler-Garcia is the Director of Strategic Content at Salesforce. He has been in the Salesforce ecosystem for over 10 years as a customer, partner, and ISV. You can find Stephan in person at a Trailblazer Community Group or at one of our conferences around the world. Alternatively, follow him on X(Twitter) or GitHub.