Generative AI is the most transformative technology since the Internet, revolutionizing the way we create and interact with information. For developers, this raises new questions: from the practical “How do I build AI-powered apps with Large Language Models (LLMs)?” to the deeper, “How will generative AI change the nature of applications?” We explore these two questions in this blog post.
How do I build AI-powered apps with LLMs?
Let’s start with the first question, “How do I build apps with LLMs?” and explore three options that are commonly considered:
- Train your own model
- Customize an open-source model
- Use existing models through APIs
Train your own model
Training your own model gives you full control over the data your model learns from. For example, you may train a model on data specific to your industry. A model trained on domain-specific data will generally be more accurate than a general-purpose model for use cases centered around that domain. While training your own model offers more control and accuracy, it may not always be the best approach. Here are a few things to consider:
- Time and Resources: Training your own LLM from scratch can take weeks or even months. As a point of reference, the GPT-3 model from OpenAI took 1.5 million GPU hours to train; your model is likely to be much smaller, but training it will still require substantial compute.
- Expertise: To train your model, you will also need a team of specialized Machine Learning (ML) and Natural Language Processing (NLP) engineers.
- Data Security: The power of LLMs makes it tempting to create models that learn from all your data, but this is not always the right thing to do from a data security standpoint. There can be tension between the way LLMs learn and the way data security policies are implemented at your company. LLMs learn from large amounts of data: the more data, the better! However, with field-level security (FLS) and strict permissions, corporate data security policies are often based on the principle of least privilege: users should only have access to the data they need to do their specific job. The less data, the better! A model trained on all available customer data and made available to everyone at your company may therefore breach your company’s data security policies. A model trained on product specifications and past support ticket resolutions, on the other hand, can help agents resolve new tickets without compromising data security.
Customize an open-source model
Customizing an open-source model typically takes less time and is less costly than training your own model from scratch. However, you still need a team of specialized Machine Learning (ML) and Natural Language Processing (NLP) engineers. Depending on the use case, you may also still experience the data security tension described above.
Use existing models through APIs
Using existing models through APIs is the easiest way to build applications with LLMs. This is also the most commonly used option at the moment. However, these models have not been trained on your own contextual or private company data and the output they produce may therefore be too generic to be useful.
In this blog post, we explore different techniques to add contextual or private company data through the prompt. Because the prompt is created dynamically on behalf of the user, it only includes data the user has access to, addressing the data security tension described above. You may be concerned about passing private data to a third-party API, but there are techniques to address that concern, and we describe them in this blog post as well.
Building AI-powered apps using existing models through APIs
Basic API call
Major model providers like OpenAI, Anthropic, Google, Hugging Face, and Cohere offer APIs to work with their models. In the most basic implementation, your application captures a prompt from the user, passes it as part of the API call, and displays the generated output to the user.
For example, here is what the API call may look like using the OpenAI API:
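Below is a minimal sketch of such a call written as an Apex HTTP callout. The model name, the hypothetical LLM_Settings__c custom setting holding the API key, and the response parsing are illustrative, and error handling is omitted for brevity.

public with sharing class SimpleLLMCall {
    public static String generate(String prompt) {
        // Hypothetical protected custom setting holding the API key
        String apiKey = LLM_Settings__c.getOrgDefaults().OpenAI_Key__c;
        HttpRequest req = new HttpRequest();
        req.setEndpoint('https://api.openai.com/v1/chat/completions');
        req.setMethod('POST');
        req.setHeader('Content-Type', 'application/json');
        req.setHeader('Authorization', 'Bearer ' + apiKey);
        // The user's prompt is sent as a single chat message
        req.setBody(JSON.serialize(new Map<String, Object>{
            'model' => 'gpt-3.5-turbo',
            'messages' => new List<Object>{
                new Map<String, Object>{ 'role' => 'user', 'content' => prompt }
            }
        }));
        HttpResponse res = new Http().send(req);
        // Extract the generated text from the first choice in the response
        Map<String, Object> body = (Map<String, Object>) JSON.deserializeUntyped(res.getBody());
        Map<String, Object> choice = (Map<String, Object>) ((List<Object>) body.get('choices')).get(0);
        return (String) ((Map<String, Object>) choice.get('message')).get('content');
    }
}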
This option may work for simple use cases that only require a general output based on general knowledge. For example, “Write a haiku about winter” or “Write a sample SQL statement with an outer join.” But if you need an answer that is tailored to your own contextual or private company data, the generated output is likely to be too generic to be useful.
For example, let’s say a user enters the following prompt:
Write an introduction email to the Acme CEO.
The generated email would not be personalized or relevant because the model doesn’t know anything about your relationship with Acme and the business you’ve done with them.
Grounding the LLM
To make the response more relevant and contextual, the user can ground the LLM with additional information. For example, they may enter the following prompt:
You are John Smith, Account Representative at Northern Trail Outfitters.
Write an introduction email to Lisa Martinez, CEO at Acme.
Here is a list of the last three orders Acme placed with Northern Trail Outfitters:
Summer Collection 2023: $375,286
Spring Collection 2023: $402,255
Winter Collection 2022: $357,542
This allows the LLM to generate a much more relevant output. There are, however, two problems with this approach:
- The user has to enter a lot of grounding information manually. The quality of the output is therefore highly dependent on the quality of the question entered by the user.
- You are passing confidential information to the model provider, where it could be persisted or used to further train the model, which means that your private data could potentially show up in someone else’s model-generated response.
Prompt construction and dynamic grounding
To address the first limitation above, you can programmatically construct the prompt. The user enters a minimal amount of information or simply clicks a button in the app, and you then create the prompt programmatically by adding relevant data. For example, in response to a click on the “Write Intro Email” button, you could:
- Call a service to get information about the user.
- Call a service to get information about the contact.
- Call a service to get the list of recent opportunities.
- Construct the prompt using the information obtained from the data services above.
Here is what these prompt construction steps may look like in Apex:
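A minimal sketch, assuming the standard User, Contact, and Opportunity objects; the class name, the exact fields queried, and the filter on won opportunities are illustrative.

public with sharing class IntroEmailPromptService {
    public static String buildPrompt(Id contactId) {
        // 1. Get information about the current user
        User u = [SELECT Name, Title, CompanyName FROM User WHERE Id = :UserInfo.getUserId()];
        // 2. Get information about the contact
        Contact c = [SELECT Name, Title, AccountId, Account.Name FROM Contact WHERE Id = :contactId];
        // 3. Get the list of recent opportunities for the contact's account
        List<Opportunity> opps = [SELECT Name, Amount FROM Opportunity
                                  WHERE AccountId = :c.AccountId AND IsWon = true
                                  ORDER BY CloseDate DESC LIMIT 3];
        // 4. Merge the dynamic data into the static prompt text
        String prompt = 'You are ' + u.Name + ', ' + u.Title + ' at ' + u.CompanyName + '.\n'
            + 'Write an introduction email to ' + c.Name + ', ' + c.Title + ' at ' + c.Account.Name + '.\n'
            + 'Here is a list of the last orders ' + c.Account.Name + ' placed:\n';
        for (Opportunity opp : opps) {
            prompt += opp.Name + ': $' + opp.Amount.format() + '\n';
        }
        return prompt;
    }
}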
The main drawback of this approach is that it requires custom code for each prompt in order to perform the simple task of merging dynamic data into static text.
Prompt templates
To facilitate the construction of the prompt, we can use templates: a well-known software development pattern that is commonly used to merge dynamic data into static documents. Using a template, you write the prompt as static text with placeholders that are replaced with dynamic data at runtime.
Here is what the Apex example above would look like using a generic template language:
You are {{user.Name}}, {{user.Title}} at {{user.CompanyName}}
Write an introduction email to {{contact.Name}}, {{contact.Title}} at {{contact.Account.Name}}
Here are the {{contact.Account.Name}} opportunities:
{{#opportunities}}
{{Name}} : {{Amount}}
{{/opportunities}}
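Under the hood, a template engine merges the bound data into the static text at runtime. Here is a minimal sketch of that idea, handling only simple scalar placeholders; a real engine such as Mustache also supports sections like {{#opportunities}} for iterating over lists.

public with sharing class SimpleTemplateRenderer {
    // Replaces each {{key}} placeholder with its bound value
    public static String render(String template, Map<String, String> bindings) {
        String result = template;
        for (String key : bindings.keySet()) {
            result = result.replace('{{' + key + '}}', bindings.get(key));
        }
        return result;
    }
}

For example, render('You are {{user.Name}}', new Map<String, String>{ 'user.Name' => 'John Smith' }) returns 'You are John Smith'.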
Prompt templates are not only helpful for constructing prompts programmatically, but they can also be used as the foundation for graphical tools that support prompt creation in a drag-and-drop environment.
Prompt Builder
That’s why we created Prompt Builder, a new Salesforce builder that facilitates the creation of prompts. It allows you to create prompt templates in a graphical environment, and bind placeholder fields to dynamic data made available through record page data, a flow, Data Cloud, an Apex call, or an API call. Once created, a prompt template can be used in different places to query the model, including record pages and Apex code.
Einstein Trust Layer
Prompt Builder allows you to define dynamically grounded prompts in a graphical environment. But how do you send that prompt safely to an LLM provider?
You could send the prompt to the LLM provider’s API directly, but there are a number of questions to consider with that approach:
- What about compliance and privacy issues if you pass personally identifiable information (PII) data in the prompt? Could the PII data be persisted by the model provider or even used to further train the model?
- How do you avoid hallucinations, toxicity, and bias in the output generated by LLMs?
- How do you track and log the prompt creation steps for auditing purposes?
If you use the LLM provider’s API directly, you will have to write custom code to handle these questions. There are many things to consider and it can be hard to get it right for all use cases.
Enter the Einstein Trust Layer. The Einstein Trust Layer allows you to send requests to LLMs in a trusted way, addressing the concerns mentioned above.
Here is how it works:
- Instead of making direct API calls, you use the LLM Gateway to access the model. The LLM Gateway supports different model providers and abstracts the differences between them. You can even plug in your own model.
- Before the request is sent to the model provider, it goes through a number of steps, including data masking, which replaces PII data with fake data to ensure data privacy and compliance.
- To further protect your data, Salesforce has zero retention agreements with model providers, meaning model providers will not persist or further train their models with data sent from Salesforce.
- When the output is received from the model, it goes through another series of steps, including demasking, toxicity detection, and audit trail logging. Demasking restores the real data that was replaced by fake data for privacy. Toxicity detection checks for any harmful or offensive content in the output. Audit trail logging records the entire process for auditing purposes.
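To make the masking and demasking steps concrete, here is a conceptual sketch of the idea. This is not the Einstein Trust Layer implementation, and real PII detection relies on specialized models rather than the hardcoded mapping shown here.

public with sharing class MaskingSketch {
    public static String generateWithMasking(String prompt) {
        // Illustrative mapping of placeholder tokens to detected PII values
        Map<String, String> maskMap = new Map<String, String>{
            'PERSON_1' => 'Lisa Martinez',
            'COMPANY_1' => 'Acme'
        };
        // Mask: swap real values for placeholder tokens before the call
        String maskedPrompt = prompt;
        for (String token : maskMap.keySet()) {
            maskedPrompt = maskedPrompt.replace(maskMap.get(token), token);
        }
        // Only the masked prompt is sent to the model provider
        String output = SimpleLLMCall.generate(maskedPrompt);
        // Demask: restore the real values in the generated output
        for (String token : maskMap.keySet()) {
            output = output.replace(token, maskMap.get(token));
        }
        return output;
    }
}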
Looking ahead: Building applications in a whole new way
Now let’s take a peek at what’s coming and address the second question raised at the beginning of this article: How will generative AI change the nature of applications?
Prompt chaining
The logic involved in creating a prompt can sometimes get complex. It may involve multiple API or data service calls like in the dynamic grounding example above. Answering a single user question may even involve multiple calls to the LLM. This is called prompt chaining. Consider the following example:
To construct the prompt:
- We make a first API or data service call to get contextual company data.
- The data coming back from that first call is used to create a first prompt, which we use to query the LLM.
- The output of the LLM is used as the input for a second data service call.
- The data coming back from that second call is used to create a second prompt, whose response is sent back to the user.
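Here is a minimal sketch of such a chain, reusing the hypothetical SimpleLLMCall helper from earlier; the case resolution scenario, the DataService methods, and the prompts are all illustrative.

public with sharing class PromptChainSketch {
    public static String resolveCase(Id caseId) {
        // 1. First data service call: retrieve contextual company data
        String caseData = DataService.getCaseDetails(caseId);
        // 2. First prompt, first LLM call
        String issueSummary = SimpleLLMCall.generate(
            'Summarize the issue described in this case:\n' + caseData);
        // 3. The model output drives the second data service call
        String articles = DataService.findKnowledgeArticles(issueSummary);
        // 4. Second prompt; this response is sent back to the user
        return SimpleLLMCall.generate(
            'Using these knowledge articles:\n' + articles
            + '\nDraft a reply that resolves this issue:\n' + issueSummary);
    }
}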
The possibilities of combining data service calls and LLM calls to generate an output are endless.
AI orchestration
The approach described so far works well, but as these workflows become more complex, we can see the need for some form of orchestration. As a developer, you’d then build a series of building blocks that perform granular tasks: retrieve data about a customer, update a record, perform some computational logic, etc. These building blocks can then be orchestrated or remixed in different ways using an orchestration tool. This could be done using a traditional orchestration tool that lets you define which building blocks to use, in what order, and when (with different “if” branches). But what if the orchestration itself were powered by AI with an orchestrator that can reason and choose which building blocks to use and how to compose them to perform a specific task? AI-powered orchestration is a powerful new paradigm that has the potential to revolutionize the way we interact with AI systems and build applications.
The diagram below describes this new AI-orchestrated building blocks paradigm at a high level.
In this diagram, actions are the building blocks described above. They could be Apex invocable actions, MuleSoft APIs, or prompts. Some foundational actions are available by default, and developers can build their own. This also creates an opportunity for a marketplace of actions built by developers and partners.
The planner is the AI-powered orchestrator. When the prompt is passed to the orchestration runtime, the planner chooses (creates a plan for) which actions to use and how to compose them to best answer the user request.
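To make this concrete, here is a conceptual sketch of an action and a single-step planner. The Action interface and the planner prompt are illustrative and not a Salesforce API; the two types are shown together for brevity, although in an org each top-level type lives in its own class file.

public interface Action {
    String describe();            // what the action does, in natural language
    String execute(String input); // performs the granular task
}

public with sharing class Planner {
    public static String run(String userRequest, Map<String, Action> actions) {
        // Describe the available actions to the model and ask it to pick one
        String catalog = '';
        for (String name : actions.keySet()) {
            catalog += name + ': ' + actions.get(name).describe() + '\n';
        }
        String chosenName = SimpleLLMCall.generate(
            'Available actions:\n' + catalog
            + 'Which single action best answers the following request? '
            + 'Answer with the action name only.\nRequest: ' + userRequest).trim();
        Action chosen = actions.get(chosenName);
        return chosen == null ? 'No suitable action found.' : chosen.execute(userRequest);
    }
}

A more complete planner would chain multiple actions, feeding each action’s output back to the model until the request is fully answered.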
AI orchestration is an active area of research at Salesforce and in the industry as a whole.
Summary
Using existing models through APIs is a common way to build AI-powered apps with LLMs. Using this approach, you need to ground the model with private or contextual company data to get more relevant and useful outputs. Instead of asking the user to enter a lot of grounding information manually, you can construct the prompt programmatically by calling data services and adding contextual data to the prompt. Prompt Builder is a new Salesforce builder that facilitates prompt creation by allowing you to create prompt templates in a graphical environment, and bind placeholder fields to dynamic data. The Einstein Trust Layer allows you to send prompts to LLM providers’ APIs in a trusted way, addressing data privacy, bias, and toxicity concerns. AI-powered orchestration is an emerging paradigm that could change the way we interact with AI systems and build applications.
About the author
Christophe Coenraets is the Senior Vice President of Trailblazer Enablement at Salesforce. He is a developer at heart with 25+ years of experience building enterprise applications, enabling technical audiences, and advising IT organizations.