The AI landscape is changing so rapidly that futuristic technologies like autonomous AI are much closer than you may think. This is because large language models (LLMs) are starting to be incorporated into almost every way that you interact with apps. For developers, this brings a shift in how we approach building applications: from assembling traditional user interfaces to building with an entirely new conversational UX.
In this blog post, we’ll take a look at how autonomous agents bring AI into the way that applications function while bringing us closer to an autonomous world.
What are autonomous agents?
In our technological landscape, agents are advanced systems that harness the power of language models for reasoning and decision-making. What sets them apart from just another bot or framework is the fact that agents can perform tasks on your behalf using tools and memory.
Tools are extensions of a language model’s capabilities, bridging gaps in its knowledge and enabling it to interact with external data sources or computational resources. With these tools, a language model can fetch real-time data, execute tasks, and use the outcomes to inform its subsequent actions. For instance, if a language model is aware of information only up to a certain date, tools can provide it with more current information from the web, databases, or other external sources.
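As a concrete illustration, here’s a minimal tool sketch using LangChain’s `@tool` decorator. The weather dictionary is a hypothetical stand-in for a real external API call:

```python
from langchain.tools import tool

# Hypothetical external data source; a real tool would call a live weather API.
WEATHER = {"San Francisco": "62°F, foggy", "Austin": "85°F, sunny"}

@tool
def current_weather(city: str) -> str:
    """Return the current weather for a city. Use this for questions
    about conditions after the model's knowledge cutoff."""
    return WEATHER.get(city, f"No weather data available for {city}")
```

The docstring doubles as the tool’s description, which is what the language model reads when deciding whether this tool fills a gap in its knowledge.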
Memory provides the agents with the ability to recall past interactions, which can be essential for continuity in tasks and learning from previous actions. This memory can be short-lived, focusing on recent interactions, or long-term, recalling significant past events or patterns that are relevant to current situations.
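For example, LangChain’s `ConversationBufferMemory` offers a simple form of short-term memory, keeping recent turns available to the agent. A minimal sketch:

```python
from langchain.memory import ConversationBufferMemory

# Short-term memory: keeps the recent chat history available as context.
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
memory.save_context(
    {"input": "My order number is 1234."},
    {"output": "Thanks! I've noted order 1234."},
)

# Later turns can recall the earlier interaction.
print(memory.load_memory_variables({})["chat_history"])
```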
Together, these elements transform a language model into an agent that can not only understand and generate text, but also act on that understanding in real-world contexts. Such agents can autonomously execute solutions for users, but they can also integrate human intervention, especially in scenarios where there are uncertainties or exceptions.
How do agents work?
There are many frameworks that have been built to support the advancement of agents, some of the most popular being AutoGPT and LangChain. Generally, agents follow a similar pattern: the ReAct framework (Reasoning and Acting in Language Models).
This framework consists of a series of steps (sketched in code after the list):
1. The user provides input
2. The agent “thinks” of the appropriate response
3. The agent determines the action, selects the relevant tool, and decides on the input for that tool
4. The tool delivers an output
5. The process cycles through steps 2 to 4 until the agent determines that the task is complete
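Stripped of any particular framework, the loop might look like this minimal sketch, where `llm_think` is a hypothetical stand-in for an LLM call that returns a structured decision:

```python
# A minimal, library-agnostic sketch of the ReAct loop. llm_think is a
# hypothetical stand-in for an LLM call that returns a structured decision.

def run_agent(user_input, tools, llm_think):
    scratchpad = [f"User: {user_input}"]
    while True:
        decision = llm_think(scratchpad)            # step 2: the agent "thinks"
        if decision["action"] == "finish":          # step 5: task is complete
            return decision["response"]
        tool = tools[decision["action"]]            # step 3: pick tool + input
        observation = tool(decision["tool_input"])  # step 4: tool delivers output
        scratchpad.append(f"Observation: {observation}")
```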
This process is what starts to make the agent autonomous. By relying on the LLM to reason about the response and determine the appropriate actions, the agent acts on its own to produce the desired outcome.
Using LangChain as an example, let’s say that we want to build an app that allows a customer to manage their orders. We could first give the app access to our order database, customer database, and shipping partner APIs. Then, we’d set up a number of tools the app can access that can query data, update data, and use generative AI to draft a response.
This order management agent has six tools that it can use “within its domain of knowledge” (declared in a code sketch after the list):
- Query Orders is a tool that can query orders from a database through an API connected to a PostgreSQL database
- Update Order is a tool that can update a single order from a database through an API connected to a PostgreSQL database
- Manage Tracking Info is a tool that can manage a shipment through an API provided by a shipping company
- Get Customer is a tool that can query customer data from an API connected to a CRM system
- Update Customer is a tool that can update customer data through an API connected to a CRM system
- Compose Response is a tool that can pass prompts to an LLM and return a response
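In LangChain, these six tools could be declared along these lines. The wrapper functions are hypothetical placeholders that return canned data instead of calling the real orders, CRM, and shipping APIs:

```python
from langchain.agents import Tool

# Hypothetical wrappers around the orders database, CRM, and shipping APIs.
# Real implementations would call those services; these return canned data.
def query_orders(customer_id: str) -> str:
    return '[{"order_id": "1234", "status": "shipped"}]'

def update_order(order_json: str) -> str:
    return "Order updated."

def manage_tracking_info(order_id: str) -> str:
    return '{"carrier": "ShipCo", "eta": "2023-11-20"}'

def get_customer(email: str) -> str:
    return '{"customer_id": "C-42", "name": "Pat"}'

def update_customer(customer_json: str) -> str:
    return "Customer updated."

def compose_response(context: str) -> str:
    return f"Drafted reply based on: {context}"

tools = [
    Tool(name="Query Orders", func=query_orders,
         description="Query a customer's orders from the orders database."),
    Tool(name="Update Order", func=update_order,
         description="Update a single order in the orders database."),
    Tool(name="Manage Tracking Info", func=manage_tracking_info,
         description="Fetch shipment tracking from the shipping partner's API."),
    Tool(name="Get Customer", func=get_customer,
         description="Query customer data from the CRM."),
    Tool(name="Update Customer", func=update_customer,
         description="Update customer data in the CRM."),
    Tool(name="Compose Response", func=compose_response,
         description="Draft a customer-facing reply from the gathered context."),
]
```

Note that the description on each tool matters: it is what the LLM reads when deciding which tool to use for a given step.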
Let’s now take a look at how an agent would handle use cases related to order management. For example, how can the agent help a user get an update on the status of their order? (A sketch of wiring this together follows the steps below.)
- The user asks for the latest information from their order via a chatbot
- The agent “thinks” and determines the correct action that it needs to take
- The agent first uses the Get Customer tool to query the customer’s details
- The agent then uses the Query Orders tool to query orders from a database
- The agent then uses the Manage Tracking Info tool to get the latest shipping information from their shipping partner
- The agent then takes both of those outputs and uses the Compose Response tool to generate a response
- The response is sent back to the user
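Wiring this together in LangChain might look like the following sketch, which hands the six tools defined above to a ReAct-style agent. The model choice here is an assumption; any chat LLM supported by LangChain would work:

```python
from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI  # assumption: any supported chat LLM works

llm = ChatOpenAI(temperature=0)

# The agent reasons (ReAct-style) about which tool to call and in what order.
agent = initialize_agent(
    tools=tools,  # the six tools defined above
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent.run("Where is my order? My email is pat@example.com")
```

Setting `verbose=True` prints the agent’s intermediate thoughts and tool calls, which makes the ReAct loop visible as it runs.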
In this scenario, the agent was able to take the tools that we provided and determine the order and parameters needed to create the correct output for the user: in this case, all of their order and shipping information. What is important to note here is that the user can ask the agent any question about their order, and the agent can use AI to reason and apply the tools in whatever order it needs.
As a developer, your role becomes more focused on creating the tools and letting the agent manage the orchestration.
Keeping a human in the loop
The ethical challenge with autonomous agents is that there is no human in the loop when it comes to executing the actions. At Salesforce, we are committed to the ethical use of AI and want to make that clear in our implementations of this type of technology. Certain rules mandate that a person be responsible for making the ultimate determination in matters with legal or comparably impactful consequences, including job recruitment, loan approvals, educational admissions, and suggestions in criminal justice. This insistence on human oversight, instead of automated decisions, aims to better identify and reduce potential biases and harm.
What does this mean for the future of Salesforce?
At Dreamforce this year, we gave you a glimpse at what the future of Salesforce and autonomous AI looks like on the Einstein 1 Platform. Einstein Copilot is our answer to an agent-based, generative AI conversational assistant that utilizes skills and actions to guide users through interacting with Salesforce. This introduces a whole new development paradigm to Salesforce, one where we are creating smaller pieces of functionality that can be orchestrated by Einstein Copilot.
How does Einstein Copilot compare to an AI agent?
While there are several similarities between Copilot and an open-source agent framework, the real difference is Copilot’s access to Salesforce’s entire metadata platform. Not only that, but the scope is much larger. Instead of individual agents, you have many skills, and instead of tools you have actions.
For example, if you would like to update an order using Copilot, you would create an order management skill. With other frameworks, you would need to create an entire agent for order management.
When it comes to actions, you have the power of the Einstein 1 Platform behind you. You will be able to use Apex, Flow, the many platform APIs, SOQL, and much more to give your skill the ability to bring together data from anywhere. You also have direct access to data from across the platform.
Einstein Copilot Studio
These skills and actions are brought together in the Einstein Copilot Studio, which allows you to assemble flows, prompts, Apex, and more into collections of functionality!
There are currently three tools within the Einstein Copilot Studio:
- Prompt Builder allows you to construct prompt templates using merge fields from records and data provided by Flow and Data Cloud
- Skills Builder allows you to assemble actions, such as Apex invocable methods, flows, and MuleSoft API callouts, and grant them to an agent
- Model Builder allows you to bring your own AI models to Salesforce
Together, you will be able to build powerful agents in Salesforce that can use your code to answer questions and assist users.
The Einstein Trust Layer
One huge advantage of Einstein Copilot is the Einstein Trust Layer. The Trust Layer provides a secure environment for data processing through a large language model, ensuring user data remains confidential by masking personally identifiable information, checking output for inappropriate content, and ensuring no data persistence outside Salesforce.
The Trust Layer runs through a multi-step process to ensure that data is grounded and masked before being processed by a third-party LLM provider, and it provides a secure gateway for interacting with those LLMs. Once a response has been generated, the Trust Layer checks it for toxic content and de-masks the data before presenting it back to the user. You can get a closer look at the Trust Layer in our blog post Inside the Einstein Trust Layer.
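To make the flow concrete, here is a purely illustrative sketch of that request pipeline. It is not Salesforce’s implementation, and the masking here covers only email addresses:

```python
import re

def mask_pii(prompt: str) -> tuple[str, dict]:
    """Replace email addresses with placeholder tokens (illustrative only)."""
    mapping = {}
    def repl(match):
        token = f"<PII_{len(mapping)}>"
        mapping[token] = match.group(0)
        return token
    masked = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", repl, prompt)
    return masked, mapping

def demask(text: str, mapping: dict) -> str:
    """Restore the original values after the LLM response comes back."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text

def trusted_completion(prompt: str, llm, is_toxic) -> str:
    masked, mapping = mask_pii(prompt)   # mask PII before the prompt leaves
    response = llm(masked)               # secure gateway to the external LLM
    if is_toxic(response):               # check the output for toxic content
        return "The generated response was blocked by a content check."
    return demask(response, mapping)     # de-mask before showing the user
```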
Summary
Agents bring autonomous AI much closer to reality, ushering in a new era of technology where reasoning and decision-making are empowered by tools and memory. Salesforce’s Einstein Copilot introduces this agent-driven approach to the platform, offering a conversational AI assistant that guides users, leverages Salesforce’s vast metadata, and ensures data integrity through the Einstein Trust Layer. This shift signifies not just an evolution in AI interactions, but a promise of secure, efficient, and seamless experiences for Salesforce users.
Resources
- Large Action Models: Toward Actionable Generative AI
- Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
About the author
Stephan Chandler-Garcia is the Director of Strategic Content at Salesforce. He has been in the Salesforce ecosystem for over 10 years as a customer, partner, and ISV. You can find Stephan in person at a Trailblazer Community Group or at one of our conferences around the world. Alternatively, follow him on X (Twitter) or GitHub.