Alba Rivas, a passionate Salesforce Developer Advocate, shares her insights on the power of Salesforce’s Models API. Discover how this API enables developers to seamlessly integrate large language models via REST API or Apex, without prompt templates or custom integration code. Alba highlights the critical role of the Einstein Trust Layer in ensuring data security, compliance, and efficient integration with external models. Gain knowledge on AI implementation, including maintaining audit trails, monitoring data for toxicity, and using human feedback to refine AI systems.

We also explore the flexibility and capabilities of the Models API, offering a centralized interface for developers, along with resources like Postman collections, codeLive sessions, and an upcoming Trailhead module to support your journey in mastering generative AI with Salesforce.

Show Highlights:

  • Discussion on the Einstein Trust Layer’s role in ensuring security, data masking, and compliance for generative AI applications.
  • Comparison between using prompt templates and the Models API.
  • Importance of maintaining an audit trail.
  • Focus on the value of human feedback in refining AI systems and the benefits of a centralized interface for developers.
  • Insights into practical AI implementation and the flexibility offered by the Models API.

Links:

Transcript

René Winkelmeyer:

Welcome to the Salesforce Developer podcast. My name is René Winkelmeyer, and in this podcast I’m hosting Salesforce Developer Advocates who are going to share insightful stories, new tips and techniques, and everything you actually have to know these days as a Salesforce developer. Today, I’m very, very happy to turn to one of my, I wouldn’t say favorite, but actually yes, favorite members of the team, Alba Rivas.

Alba Rivas:

Thank you, René, and hello everyone. I’m also very happy to be here with you, sharing what I know to help developers better understand the features of the platform and what we have recently released. Sharing knowledge is one of my favorite tasks as well.

René Winkelmeyer:

When I said favorite members, that was just for the other members of my team. Alba has been in the ecosystem for a very, very long time. She worked for an ISV, has now been in the developer advocate role for a couple of years already, and she’s very, very passionate about the Salesforce platform.

One thing we’re going to talk about is actually a new feature that was introduced for the Einstein 1 Platform, which is the new Models API, which is in beta. Alba, tell us about this. What is that actually?

Alba Rivas:

Sure. The new Models API is a new API that you can use like a regular REST API, or also from Apex, because there are Apex classes that mirror its behavior. This API allows you to directly use the large language models that you have imported using Model Builder, without using prompt templates and without having to write your own integration code. You can import your own large language models, or use the out-of-the-box models that are offered as a service there, and then create applications that use either the REST API or Apex to interact with those models and execute actions with them.

There are different actions available, different endpoints that you can reach out to. We’re going to describe what you can do with this API and how it’s different from what you can do with prompt templates and when you should be using one feature or the other.
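To make that description concrete, here is a minimal Python sketch of what a direct REST call to a Models API text-generation endpoint could look like. The base URL, model API name, headers, and response fields below are illustrative assumptions, not values confirmed in this episode; check the official Models API documentation for the exact details.

```python
import requests

# Minimal sketch: send one prompt to a Models API generations endpoint.
# Base URL, model name, and response fields are assumptions; see the docs.
API_BASE = "https://api.salesforce.com/einstein/platform/v1"  # assumed base URL
MODEL = "sfdc_ai__DefaultGPT35Turbo"                          # assumed model API name
ACCESS_TOKEN = "<OAuth access token from your connected app>"

def generate_text(prompt: str) -> str:
    """Send a single prompt and return the generated text."""
    response = requests.post(
        f"{API_BASE}/models/{MODEL}/generations",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/json",
        },
        json={"prompt": prompt},
        timeout=30,
    )
    response.raise_for_status()
    # Assumed response shape; the actual schema may differ.
    return response.json()["generation"]["generatedText"]

if __name__ == "__main__":
    print(generate_text("Summarize our product return policy in two sentences."))
```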

René Winkelmeyer:

Just to recap in my own words: when we look at the suite, which is Einstein 1 Studio with Prompt Builder, Copilot Builder, as well as Model Builder, we often spoke about Prompt Builder and Copilot Builder, but we really only talked about Model Builder in the context of Data Cloud, for example to import SageMaker or Google Vertex AI models.

Now, the new capability that you just described is that Model Builder can also be used, or will be used, to import any other large language model that you have in your company, open source or whatever you want to use, and then basically hook it up through the Einstein Trust Layer, I assume, with an API, either REST or Apex. Is that correct?

Alba Rivas:

Exactly. The main differentiator of the Models API, compared with writing your own custom code and reaching out to, say, the OpenAI API directly, is the Einstein Trust Layer. Every generative AI request that happens from Salesforce, no matter if it comes from Copilot, Prompt Builder, the Models API, or any new functionality that we release, goes through the Einstein Trust Layer. That’s because of security and because we want all those features, data masking, toxicity detection, scoring and so on, to be applied to our CRM data, our Data Cloud data, and our private customer data at the time of communicating with a large language model, which could be an external large language model.

If you are using the Salesforce models offered as a service, we additionally have, through the Einstein Trust Layer, a zero-retention agreement, so that those models don’t keep your data and don’t use it for training. It’s a full suite of security features that the Einstein Trust Layer offers you when using generative AI features or creating your own generative AI applications, and that you wouldn’t have in any other way.

René Winkelmeyer:

That is very, very compelling, I would say. If I’m a customer who uses Salesforce and wants to use generative AI, be it through the standard functionality, through custom generative AI applications in my organization, or even with custom LLMs, because the bigger the enterprise gets, the more you may want domain-specific LLMs for certain areas, I can now hook that up through Model Builder, expose it, and leverage all the benefits of the Einstein Trust Layer with it. I think that is really compelling, because you don’t have to spin up your own infrastructure and everything around it.

Now, one thing you mentioned, and I think it would be really interesting, is that there are also reasons when you should use prompt templates or not. Because in the past we said: okay, if you want to use GenAI, go with Prompt Builder, here are the built-in or pre-defined LLMs that we give you, and that is the best way to interact with GenAI. But you mentioned there are also reasons not to, or cases when one is better than the other. What are those?

Alba Rivas:

Yeah, so I like to imagine prompt templates and the Models API as two different layers that are meant to be used by slightly different personas. Prompt templates are mainly created by admins, who can test them just with point and click, can change those prompt templates easily in Prompt Builder, can version them easily, and can connect them very easily to the platform, through Flow for instance.

Another benefit of using prompt templates is obviously grounding, because if you want to incorporate your CRM data or your Data Cloud data into prompt templates, it’s super easy through the builder, using all the data providers that we have available, and even using semantic search retrieval-augmented generation, which is something that we are making generally available for Data Cloud now at Dreamforce. That’s another super interesting feature that we should have a podcast about.

Developers are somewhere in the middle, because developers can also create their own applications using prompt templates. They have a space there, since prompt templates are exposed through the Connect API, both in the REST API and in Apex. This means you can invoke prompt templates from these APIs and create your custom applications on top of them.
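As a rough illustration of invoking a prompt template from code, here is a small Python sketch against the Connect REST API. The endpoint path, API version, and input parameter names are assumptions for illustration only; the exact resource is described in the prompt template Connect API documentation.

```python
import requests

# Sketch: invoke a prompt template from code via the Connect REST API.
# Endpoint path, API version, and parameter names are assumptions.
INSTANCE_URL = "https://yourInstance.my.salesforce.com"
HEADERS = {
    "Authorization": "Bearer <OAuth access token>",
    "Content-Type": "application/json",
}

def run_prompt_template(template_api_name: str, inputs: dict) -> dict:
    """Execute a prompt template with the given input parameters."""
    url = (f"{INSTANCE_URL}/services/data/v61.0/einstein/"
           f"prompt-templates/{template_api_name}/generations")
    response = requests.post(url, headers=HEADERS,
                             json={"inputParams": inputs}, timeout=30)
    response.raise_for_status()
    return response.json()

# Hypothetical template name and record id, for illustration only.
result = run_prompt_template("Account_Summary_Template",
                             {"accountId": "001XXXXXXXXXXX"})
print(result)
```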

The Models API is a little bit closer to the developer persona, because the Models API is only code-based. You cannot use the Models API anywhere in Salesforce with just clicks. It could be the way to go if you want to build something completely custom, for instance if you want to build prompts dynamically at runtime and you don’t care about reusability.

With prompt templates, you create a prompt template and you reuse it many times, or in many places in the platform. But with the Models API, you build the prompt yourself each time. On one hand that’s less reusable, but on the other hand it’s more flexible, because you have full control of what you put into the prompt, and you can build those prompts dynamically at runtime.

Also, with prompt templates you execute a prompt template and you generate text. That’s the only use case: generate text. But in the Models API, there are more possibilities available. There is an endpoint that allows you to run a chat session with the model, which keeps track of the conversation that the user is having with your agent or with your model.

It also exposes an endpoint that allows you to generate embeddings, for creating your own semantic search capabilities for instance. And it provides the capability of submitting feedback. I don’t know if people are aware of this, but in the platform you can activate something called, I think, the generative AI audit trail, which lets the platform store all the interactions that users have with a large language model, and it also stores feedback. For instance, if in Prompt Builder you received a response that you didn’t like, there is a thumbs up, thumbs down functionality that, behind the scenes, stores feedback in that audit trail, and that audit trail data lives in Data Cloud.

That’s done automatically for prompt templates and for Copilot. But if you want to provide feedback from a custom app or a custom implementation, then you need to use the Models API because the Models API also has an endpoint exposed that allows you to provide feedback.
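The following Python sketch illustrates the variety of endpoints mentioned here: chat generations, embeddings, and feedback. Endpoint names, payload shapes, and field names are assumptions for illustration; consult the Models API reference for the real resources.

```python
import requests

API_BASE = "https://api.salesforce.com/einstein/platform/v1"  # assumed base URL
MODEL = "sfdc_ai__DefaultGPT35Turbo"                          # assumed model API name
HEADERS = {
    "Authorization": "Bearer <OAuth access token>",
    "Content-Type": "application/json",
}

def chat(messages: list[dict]) -> dict:
    """Multi-turn chat: pass the running conversation as role/content messages."""
    r = requests.post(f"{API_BASE}/models/{MODEL}/chat-generations",
                      headers=HEADERS, json={"messages": messages}, timeout=30)
    r.raise_for_status()
    return r.json()

def embed(texts: list[str]) -> dict:
    """Generate embeddings, e.g. for building your own semantic search."""
    r = requests.post(f"{API_BASE}/models/{MODEL}/embeddings",
                      headers=HEADERS, json={"input": texts}, timeout=30)
    r.raise_for_status()
    return r.json()

def send_feedback(generation_id: str, liked: bool) -> dict:
    """Record thumbs up / thumbs down for a generation in the audit trail."""
    r = requests.post(f"{API_BASE}/feedback", headers=HEADERS,
                      json={"generationId": generation_id,
                            "feedback": "GOOD" if liked else "BAD"},
                      timeout=30)
    r.raise_for_status()
    return r.json()
```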

Additionally, there is a little bit more flexibility when interacting with the model from your Models API applications, because you can send things such as system prompts. System prompts are instructions that you give to the model behind the scenes, which the model should take into consideration on top of the user prompt, on top of what the user wrote. These system prompts are sent automatically by Prompt Builder, or by the apps that use prompt templates, when we execute them, and in those system prompts we may be telling the model things like: don’t hallucinate, or use appropriate language because you are in an enterprise ecosystem.

But with the Models API, we have the flexibility to specify those system prompts ourselves and, in the end, have a much more customized and controlled application, of course losing the reusability and configurability benefits that we get with prompt templates. I hope that answers your question. I know it was a lot of information. We can explain it in depth step by step.
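Here is a small Python sketch of sending a custom system prompt together with a user message, as described above. The endpoint name and payload shape are assumptions; only the general idea of supplying your own system instructions comes from the conversation.

```python
import requests

API_BASE = "https://api.salesforce.com/einstein/platform/v1"  # assumed base URL
MODEL = "sfdc_ai__DefaultGPT35Turbo"                          # assumed model API name
HEADERS = {
    "Authorization": "Bearer <OAuth access token>",
    "Content-Type": "application/json",
}

user_question = "How should I respond to this unhappy customer email?"

# The system prompt is written by our own code, not injected by Prompt Builder,
# so we fully control the instructions the model receives.
payload = {
    "messages": [
        {"role": "system",
         "content": ("You are an assistant inside an enterprise CRM. "
                     "Use professional language and answer only from the "
                     "information you are given.")},
        {"role": "user", "content": user_question},
    ]
}

# Endpoint name and payload shape are assumptions; consult the Models API docs.
response = requests.post(f"{API_BASE}/models/{MODEL}/chat-generations",
                         headers=HEADERS, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```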

René Winkelmeyer:

I think that was a very good explanation. I think it also captures some of the complexity that everyone who is going to work with GenAI faces: how do I actually define my architecture for this, and which approach is best for this use case? Because when we look at all the things that GenAI brought to us, a large language model is not the solution for everything. You will have different use cases, and you will need to implement different elements and different approaches, whether that’s using the API directly or just using a builder, like prompt templates, depending on the use case.

I think a good example is the one you just brought up: if you use Prompt Builder and its grounding capabilities, you already have built-in capabilities, for example with Data Cloud and RAG. It’s just there. Whereas if you go two levels deeper and want to do your own embeddings, that’s a very different level of architecture that you’re looking at as a developer building with GenAI. It adds more complexity, which I guess is also really needed in the more complex use cases.

Alba Rivas:

Exactly, yes.

René Winkelmeyer:

One item that I want to highlight, which is a bit unrelated but also related, is that you brought up the Einstein feedback capabilities. I think that’s really key, because with everything GenAI, one of the most important things for me, and I think that aligns with what I’m hearing from my team and from others, is AI with humans. The human makes a decision and gives feedback on a response, no matter if it comes through the Models API or, for example, through a template invoked by Prompt Builder. There will be explicit or implicit feedback. Not using the generated response at all is very implicit: no, it’s not okay, give me something else. Or I use thumbs up, thumbs down. And that data, the feedback, but also toxicity detection and all the other things, is then actually stored in Data Cloud.

Coming back to how I, as an enterprise, refine all the things that I’m going to generate with the Models API and how people in my company are actually using it: that comes back on the Data Cloud side, where I can say, okay, give me the actual usage data, so that I can refine this very cutting-edge technology and adapt it to my company.

There’s a blog post that we published, I think it was at the end of July, and I was working with the author on it. It goes more in depth on how you can pull that data from all the invocations and how users interact with it, to actually see what does and doesn’t work in the way you implement AI in your company, which I think is really, really exciting, because the feedback loop around this is also very, very crucial.

Alba Rivas:

Yeah, yeah, totally. To give a little bit more information to the audience: whenever you activate the generative AI audit trail, every generative AI request that is performed is stored, and the information that we store includes the prompt details, but also the hydrated prompt, which is the final version of the prompt that was sent to the model, because we do a lot of things behind the scenes. The Einstein Trust Layer really does this to make sure that your prompt is safe and the response of the large language model is appropriate.

We also store the masked values and the unmasked data as well, so that you can verify whether data masking is working well for your use case. And we store toxicity and safety scores, because when you make a request to a model, no matter if it’s through prompt templates or through the Models API, we always pass the response through a toxicity detection model and compute the hate score, physical score, profanity score, safety score, sexual score, toxicity score, and violence score, all of that.

Those are parameters with a value from zero to one that you can inspect in the model response if you are building a custom application. I don’t know if it’s available yet, but I know that we were working on surfacing some of these scores in the Prompt Builder app, so that when you are testing your prompts it gives you warnings and tells you, “Hey, be careful with this prompt, because the toxicity score was very high.”
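A short Python sketch of how a custom application might inspect those zero-to-one scores in a Models API response. The field names used here (such as 'contentQuality' and 'scanToxicity') are assumptions for illustration and may not match the actual response schema.

```python
def warn_on_high_scores(models_api_response: dict, threshold: float = 0.5) -> None:
    """Print a warning for any Trust Layer category score above the threshold.

    Field names ('contentQuality', 'scanToxicity', 'categories') are assumptions
    for illustration; verify them against the actual response schema.
    """
    quality = models_api_response.get("generation", {}).get("contentQuality", {})
    categories = quality.get("scanToxicity", {}).get("categories", [])
    for category in categories:
        name = category.get("categoryName", "unknown")  # e.g. hate, profanity, violence
        score = category.get("score", 0.0)              # each score ranges from 0 to 1
        if score >= threshold:
            print(f"Warning: {name} score is {score:.2f}; review this prompt/response.")
```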

René Winkelmeyer:

Awesome. I think that also really gives you the choice when building these applications, and the feedback, again, I think it’s AI with humans. When I’m building my prompt, and I don’t want to stray from the topic, but I think it’s really important to highlight that when you build with AI, you want to see feedback before you roll it out. We haven’t even spoken about, and I guess it will be a topic for a different podcast, blog, or video, how you implement DevOps around GenAI features, because just doing everything in production is also potentially something you don’t want to do.

But I think that’s really key, and I’m looking forward to the feature you just described. Whether I’m building my prompts in an AI tool like Prompt Builder, or doing it via code or REST, I will definitely see the results and can then reuse them as I see fit. I think that will be really great.

I would say that was a lot to learn, and I’m really excited that we expose this capability to developers, even in the sense that it goes beyond what we provide out of the box. I think it also shows the openness of the Einstein 1 Platform: you can bring your own large language model in addition to the ones that we support, so that you’re not restricted to what we do, but can also cover what is going to happen in your company.

Alba Rivas:

Yeah, yeah, totally. Regarding bring your own large language model, also consider that, thanks to the Models API, we have a single interface that allows us to work with any model. This means that you can write your code once and then change the model that is used behind the scenes without having to change your code. It’s going to give you a lot of flexibility in terms of model selection, because the model that you use is independent from the API, the interface that you are calling in your custom code. This allows you to decouple your implementation from the model being used, and that’s also really interesting for developers.
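A minimal Python sketch of the decoupling described here: because the interface stays the same, the model becomes a parameter, and swapping it is a configuration change rather than a code change. The base URL and model API names are assumptions for illustration.

```python
import requests

API_BASE = "https://api.salesforce.com/einstein/platform/v1"  # assumed base URL
HEADERS = {
    "Authorization": "Bearer <OAuth access token>",
    "Content-Type": "application/json",
}

def generate(model_name: str, prompt: str) -> dict:
    """Same code path for every model: only the model name in the URL changes."""
    r = requests.post(f"{API_BASE}/models/{model_name}/generations",
                      headers=HEADERS, json={"prompt": prompt}, timeout=30)
    r.raise_for_status()
    return r.json()

# Swapping the model is a configuration change, not a code change.
# Both model API names below are assumptions for illustration.
case_prompt = "Summarize the open support case in three bullet points."
out_of_the_box = generate("sfdc_ai__DefaultGPT35Turbo", case_prompt)
custom_model = generate("my_org__DomainSpecificSupportModel", case_prompt)
```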

René Winkelmeyer:

It actually makes my heart sing as a developer. I remember, many years ago, when there were similar services, every developer had to go through this: there are three things doing the same job in your company, but you have to write three different implementations, because there is no common interface. Sometimes a vendor would provide one, but in general you just had to do it yourself.

Having a centralized, independent interface really helps when implementing any kind of technology, for sure. But also, when you work with GenAI to build your custom apps, if your company is going to swap out an LLM for a use case to something else, it doesn’t really change the implementation side of the applications that you’ve built on top. I think that’s a very, very good example actually.

Alba, you also shared a bunch of resources, which I’m going to link in the show notes. There’s definitely something for everyone, from the Postman collection to codeLive and a couple of other resources. I think our audience will have enough material to learn more about this exciting topic, which is bringing your own models to Salesforce.

Alba Rivas:

Yeah, I wanted to add that there is a Trailhead module coming out for Dreamforce on the Models API, so stay tuned, because around September 16th or 17th this new module will be available, and I’m really looking forward to getting hands-on with it.

René Winkelmeyer:

Awesome. Thank you very much, Alba. As always, I enjoyed our conversation, and I would say, “Happy day.”

Alba Rivas:

Same, and thank you so much to all those who listen to this podcast. You can find me on LinkedIn if you have more questions.

René Winkelmeyer:

If you’re still here, thanks for listening, and head over to developer.salesforce.com/podcast to check out all our episodes. Also check out the show notes because that’s where you will find all the great links to provide you more information and insights about all the topics we talked about today. Bye bye.
