Use Model Builder to Integrate Databricks Models with Salesforce

Model Builder, a capability of Einstein 1 Studio, is a user-friendly platform that enables you to create and operationalize AI in Salesforce. The platform uses and amplifies the power of other AI platforms, enabling you to build, train, and securely deploy custom AI models externally using data in Salesforce. In August 2023, Salesforce announced the launch of Einstein Studio’s integration with Amazon SageMaker, and in November 2023, its integration with Google Cloud Vertex AI. We are pleased to announce our new integration with Databricks. In this blog post, we’ll demonstrate how to use Model Builder for product recommendation predictions using Databricks.

About Model Builder

Model Builder’s Bring Your Own Model (BYOM) capabilities enable data specialists to build and deploy custom AI models in Salesforce Data Cloud. The custom models are trained on Databricks and registered in Salesforce. With the zero-copy approach, data from Data Cloud is used to build and train the models in Databricks. Using clicks, admins can then connect and use the models in Salesforce. Once deployment happens, predictions are automatically and continuously updated in near real-time to generate accurate, personalized insights.

Predictions and insights can also be directly embedded into business processes and applied by business users. For example, marketing analysts can create segments and customize the end-user experience across various channels using the predictions and insights from the model. Also, Salesforce developers can automate processes using flows from model predictions.

Screenshot showing the lifecycle of building AI models

Key benefits of Model Builder

Some of the key benefits of Model Builder include:

Supports diverse AI and ML use cases across Customer 360: Build expert models to optimize business processes across Customer 360. Examples include customer segmentation, personalization, lead conversion, case classification, automation, and more.
Leverages your chosen ML/AI platform: Access familiar modeling tools in Databricks. Customers can use frameworks and libraries like TensorFlow, PyTorch, and XGBoost to build and train models that deliver optimal predictions and recommendations.
Provides AI-based insights for optimization in Salesforce workflows: Easily operationalize the models across Customer 360 and embed the results into business processes to drive business value without latency.

Screenshot showing the Databricks Predicted Model in Model Builder

Architectural overview

Data from diverse sources can be consolidated and prepared using Data Cloud’s lakehouse technology and batch data transformations to create a training dataset. The dataset can then be used in Databricks to query, conduct exploratory analysis, and establish a preprocessing pipeline where the AI models are trained and built.

To complete the process, create an endpoint for model deployment and scoring in Data Cloud. Once records are scored, the potential of the Salesforce Platform comes into play through its powerful flow automation functionality. This enables the creation of curated tasks for Salesforce users or the automatic inclusion of customers in personalized and tailored marketing journeys.

Architectural overview of Databricks and Einstein Copilot Studio Model Builder

Operationalize a product recommendation model in Salesforce

Let’s look at how to bring inferences for product recommendations from Databricks into Salesforce using an XGBoost classification model.

In our example use case, fictional retailer Northern Trail Outfitters (NTO) uses Salesforce Sales, Service, and Marketing Clouds. The company wants to predict its customers’ product preferences to deliver personalized recommendations of products that are most likely to spark interest.

In this use case, we’ll leverage Customer 360 data in Data Cloud’s integrated profiles to develop AI models to forecast an individual’s product preferences. This will allow for precise marketing campaigns driven by AI insights, resulting in improved conversion and increased customer satisfaction, particularly among NTO’s rewards members. It will also increase customer engagement via automated tasks for service representatives to reach out to customers proactively.

Step 1: Prepare training data in Data Cloud

The AI model for the product recommendations use case is constructed based on a dataset of historical information encompassing the following information in Data Cloud data model objects (DMOs):

Customer demographics: Customer-specific information, such as location, age range, Customer Satisfaction (CSAT), or Net Promoter Score (NPS), and loyalty status
Case records: Prior support history, including the total number of support cases, and whether any of the cases were escalated for resolution
Purchase history: Comprehensive information about products purchased and the purchase dates
Website and engagement metrics: Metrics related to the customer’s website interactions, such as the number of visits, clicks, and engagement score

A table of customer data containing customers’ purchase history, engagement data, and other information.

Step 2: Set up, build, and train in Databricks

Once data is curated in Data Cloud, model training and deployment then take place in Databricks. Using a Python SDK connector, you can bring the Data Cloud DMO into Databricks.
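As a rough sketch of this step, the snippet below assumes the open-source salesforce-cdp-connector package, using its SalesforceCDPConnection class and get_pandas_dataframe method. The DMO name, field names, and credential parameters are hypothetical placeholders rather than values from this walkthrough, so adjust them to your own org.

```python
def build_dmo_query(dmo_name, fields, limit=None):
    """Build a SQL SELECT over one Data Cloud DMO (names are hypothetical)."""
    if not fields:
        raise ValueError("at least one field is required")
    query = "SELECT {} FROM {}".format(", ".join(fields), dmo_name)
    if limit is not None:
        query += " LIMIT {}".format(int(limit))
    return query


def load_dmo_as_dataframe(query, login_url, username, password,
                          client_id, client_secret):
    """Run the query through the connector and return a pandas DataFrame."""
    # Imported lazily so the query helper works without the package installed.
    # Assumption: the salesforce-cdp-connector package is available on the cluster.
    from salesforcecdpconnector.connection import SalesforceCDPConnection

    conn = SalesforceCDPConnection(
        login_url, username, password, client_id, client_secret
    )
    return conn.get_pandas_dataframe(query)


# Hypothetical DMO and field API names, for illustration only.
TRAINING_QUERY = build_dmo_query(
    "Account_Contact__dlm",
    ["Id__c", "Age_Range__c", "Loyalty_Status__c", "Engagement_Score__c"],
    limit=1000,
)
```

In a notebook, you would pass TRAINING_QUERY and your connected app credentials to load_dmo_as_dataframe, ideally reading the secrets from a secret store instead of hard-coding them.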

Once you have data from Data Cloud in Databricks, you can use Databricks notebooks to build and train your AI model. In the screenshot from Databricks below, you can see how to query for the input features that go into the model, such as products purchased, club member status, and so on.

Notebook interface to ingest, explore, and select statistically relevant predictors

Next is hyperparameter tuning, which is crucial for systematically adjusting the parameters and selecting the best algorithm. Hyperparameter tuning helps to maximize the performance of an AI model on a dataset. The optimization involves techniques such as grid search or random search, cross-validation, and careful evaluation of performance metrics, ensuring the model’s ability to perform well on new data. Databricks logs the results of the tuning process as ML experiments and recommends the best model from an accuracy perspective.

Screenshot showing a Notebook instance to train the AI model
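To make the grid search and cross-validation idea concrete, here is a small, self-contained sketch that uses only the Python standard library. It is not the XGBoost model from this use case: a toy k-nearest-neighbors classifier and synthetic two-cluster data stand in for the real model and features, and all function names are illustrative.

```python
import itertools
import random
from collections import Counter


def knn_predict(train, point, k, metric):
    """Classify `point` by majority vote of its k nearest training rows."""
    def dist(a, b):
        if metric == "manhattan":
            return sum(abs(x - y) for x, y in zip(a, b))
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(train, key=lambda row: dist(row[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k_folds, params, seed=0):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k_folds] for i in range(k_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        hits = sum(knn_predict(train, x, **params) == y for x, y in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)


def grid_search(data, grid, k_folds=5):
    """Try every hyperparameter combination and keep the best CV score."""
    best_params, best_score = None, -1.0
    keys = sorted(grid)
    for values in itertools.product(*(grid[key] for key in keys)):
        params = dict(zip(keys, values))
        score = cross_val_accuracy(data, k_folds, params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic stand-in data: two well-separated clusters labeled 0 and 1.
rng = random.Random(42)
DATA = [
    ((rng.gauss(center, 0.5), rng.gauss(center, 0.5)), label)
    for label, center in ((0, 0.0), (1, 3.0))
    for _ in range(30)
]
BEST_PARAMS, BEST_SCORE = grid_search(
    DATA, {"k": [1, 3, 5], "metric": ["euclidean", "manhattan"]}
)
```

The same loop structure applies to a real model: swap knn_predict for a fit-and-predict step on your training framework and widen the grid to the hyperparameters you care about.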

Deploy the model in Databricks

The final task in this step is to serve the model to enable the scoring of records in Data Cloud. A model endpoint is a URL that can request or invoke an AI model. It provides an interface to send requests (input data) to a trained model and receive the inferencing (scoring) results back from the model.

Model serving in Databricks
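The request and response cycle described above can be sketched with the Python standard library. This example assumes a Databricks serving endpoint that accepts a JSON body with a dataframe_records key and returns a JSON object with a predictions key; the endpoint URL, token, and feature names are placeholders.

```python
import json
import urllib.request


def build_scoring_request(endpoint_url, token, records):
    """Build a POST request carrying input rows for the served model."""
    payload = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=payload,
        headers={
            "Authorization": "Bearer " + token,  # personal access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_predictions(response_body):
    """Extract the list of predictions from the endpoint's JSON response."""
    return json.loads(response_body)["predictions"]


def score(endpoint_url, token, records, timeout=30):
    """Send the records to the endpoint and return its predictions."""
    request = build_scoring_request(endpoint_url, token, records)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_predictions(response.read())
```

Calling score with your own endpoint URL and token is a quick way to confirm that the endpoint responds before you register it in Data Cloud.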

Step 3: Set up the model in Model Builder

Once the serving endpoint is created in Databricks, it’s simple to set up the model in Data Cloud using the no-code interface.

Navigate to Data Cloud → Einstein Studio → New

Interface for creating a model in Einstein Studio

Enter the following information: the model name, the endpoint URL from Databricks, the auth header “Authorization”, and the secret key, which is Bearer <<your personal access token>>.

Screenshot showing how to set up the model in Einstein Studio

Click Next to set up the schema for input-output features. Ensure that the data types of the predictor fields are the same as the scoring DMO.

Screenshot showing how to set up the model schema
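Before mapping the schema, it can help to compare the model's input types with the field types of the scoring DMO. The sketch below is only an illustration: the dictionaries are hand-written stand-ins, and the field names and type labels are hypothetical rather than read from Data Cloud.

```python
def find_type_mismatches(model_inputs, dmo_fields):
    """Return (field, model_type, dmo_type) for every input that does not match.

    dmo_type is None when the scoring DMO has no field with that name.
    """
    mismatches = []
    for name, model_type in model_inputs.items():
        dmo_type = dmo_fields.get(name)
        if dmo_type is None or dmo_type.lower() != model_type.lower():
            mismatches.append((name, model_type, dmo_type))
    return mismatches


# Hypothetical predictor fields and types, for illustration only.
MODEL_INPUTS = {
    "age_range": "Text",
    "engagement_score": "Number",
    "club_member": "Text",
}
SCORING_DMO = {"age_range": "Text", "engagement_score": "Text"}

MISMATCHES = find_type_mismatches(MODEL_INPUTS, SCORING_DMO)
```

Here MISMATCHES flags engagement_score (Number in the model, Text in the DMO) and club_member (missing from the DMO), which are the fields to fix before saving the schema.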

Review and save.

Screenshot showing how to save the model

Click Activate to activate the model.

Screenshot showing how to activate the model

Navigate to the Usage tab.

Screenshot showing how to set up a prediction job

Create a new prediction job.

Screenshot showing how to set up a new prediction job

Select the data space and primary object where the predictors are stored, and click Next.

Screenshot showing how to select primary DMO

Map the schema objects to the predictors in the primary and related DMOs, and click Next.

Screenshot showing how to map fields to the schema

Select Update type. With streaming insights, you can automatically initiate inference calls based on changes in certain predictor values. The Batch update type ensures that the inference calls are made once every 60 minutes.

Screenshot showing how to select update type
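The difference between the two update types can be illustrated with a small conceptual sketch. This is not how Data Cloud implements prediction jobs internally; the function names and the record layout are hypothetical and only model the behavior described above.

```python
from datetime import datetime, timedelta


def changed_predictors(previous, current, watched_fields):
    """List the watched predictor fields whose values differ between versions."""
    return [f for f in watched_fields if previous.get(f) != current.get(f)]


def needs_streaming_inference(previous, current, watched_fields):
    """Streaming: score again only when a watched predictor value changed."""
    return bool(changed_predictors(previous, current, watched_fields))


def next_batch_run(last_run, interval_minutes=60):
    """Batch: the next inference run is a fixed interval after the last one."""
    return last_run + timedelta(minutes=interval_minutes)


# Hypothetical record versions, for illustration only.
BEFORE = {"club_member": "Silver", "engagement_score": 40, "city": "Austin"}
AFTER = {"club_member": "Gold", "engagement_score": 40, "city": "Dallas"}
WATCHED = ["club_member", "engagement_score"]
```

With these values, the change to club_member would trigger a streaming inference call, while the change to city would not, because city is not a watched predictor.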

Review and save the prediction job.

Screenshot showing review of prediction job

Save the prediction job and give it a name.

Screenshot showing a saved prediction job

Activate the prediction job.

Screenshot showing how to activate a prediction job

Once the prediction job is activated, you can run the job to get the predictions.

Screenshot showing how to run the prediction job

The inferences are stored as a stand-alone DMO with the same name as the prediction job.

Screenshot showing inferences DMO

The DMO has the ID field of the primary DMO that the model was set up on, along with the prediction field. Note that you can have the model return multiple predictions with one inference call.

Screenshot showing the DMO structure of a prediction DMO

The prediction DMO is automatically related to the parent DMO, so that it can be used in downstream processes, such as calculated insights, segmentation, etc., to act on these insights at scale.

Screenshot showing a prediction DMO related to the parent DMO
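To picture how the prediction DMO relates to its parent, the sketch below joins prediction rows to parent rows on the shared ID field. The rows are plain dictionaries with hypothetical field names; in Data Cloud the relationship is created for you, so this only illustrates the resulting shape.

```python
def attach_predictions(parent_rows, prediction_rows, id_field="id"):
    """Merge each parent row with the prediction row that shares its ID."""
    by_id = {row[id_field]: row for row in prediction_rows}
    joined = []
    for parent in parent_rows:
        prediction = by_id.get(parent[id_field], {})
        merged = dict(parent)
        # Copy every prediction field; one inference call may return several.
        merged.update({k: v for k, v in prediction.items() if k != id_field})
        joined.append(merged)
    return joined


# Hypothetical rows, for illustration only.
PARENTS = [
    {"id": "001", "loyalty_status": "Gold"},
    {"id": "002", "loyalty_status": "Silver"},
]
PREDICTIONS = [
    {"id": "001", "predicted_product": "Trail Shoes", "predicted_score": 0.87},
]

JOINED = attach_predictions(PARENTS, PREDICTIONS)
```

The first joined row carries both prediction fields alongside the parent attributes, while the second row is unchanged because it has no prediction yet. This is the kind of combined view that segments and calculated insights can filter on.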

Step 4: Create flows to automate processes in Salesforce

Navigate to Setup → Flows.

Screenshot showing how to set up Salesforce flows

Select New → Data Cloud-Triggered Flow.

Screenshot showing how data-triggered flows capture changes in DMOs and kick-start the flow

Click Create. The system will ask you to associate it with a Data Cloud object. Select the DMO that stores the predictors. In this case, it is the Account Contact Object DMO.

Screenshot showing how to select the DMO where output predictions are stored

All the records that have a prediction value change, or records with new predictions, will now be reflected in this flow. Now, you can create automated tasks in Salesforce core based on specific criteria.

Diagram showing how you create automated tasks in core Salesforce based on business criteria

Step 5: Create segments and activations in Data Cloud for targeted marketing campaigns

In Data Cloud, navigate to Segments.

Screenshot showing how to segment your data based on predicted inferences

Select New and give the segment a name.

Screenshot showing how to set up segmentation

Click Next. Choose publish type and schedule.

Screenshot showing how to specify the publish schedule

Once you click Save, you can edit the segmentation rules. Add the segmentation rule and click Save.

Screenshot showing how to set up business rules for targeted segmentation

Now add the segmentation to an activation. Navigate to the Activations tab in Data Cloud.

Screenshot showing how to create activations

To add segmentation to an Activation, choose the segment and activation target; for example, Google Ads or Marketing Cloud. Then select the unified individual as the activation membership, and click Continue.

Screenshot showing how to set up activations from segment population

The activations have now been created. As the predictions change, the activations will automatically be refreshed and sent to the activation targets. This means that Northern Trail Outfitters (NTO) is now able to predict its customers’ product preferences, so it can deliver personalized recommendations of products that are most likely to spark its customers’ interest.

Conclusion

Model Builder is an easy-to-use AI platform that enables data science and engineering teams to build, train, and deploy AI models using external platforms and data in Data Cloud. External platforms include Amazon SageMaker, Google Cloud Vertex AI, Databricks, and other predictive or generative AI services. Once deployed, you can use the AI models to power Sales, Service, Marketing, and Commerce Clouds, and other Salesforce applications.

To elevate your AI strategy using Model Builder, watch our on-demand webinar with AI experts from Salesforce and Google Cloud.

Additional Resources

Newsroom release announcement
Einstein Studio Release Notes
Einstein Studio Salesforce documentation
Einstein Studio GA announcement with Google Cloud Vertex AI
Einstein Studio GA with Amazon SageMaker
Learn about Generative AI and Large Language Models (LLMs) on the Salesforce 360 blog and Building AI-Powered Apps with LLMs and Einstein

About the authors

Sharda Rao is a Distinguished Technical Architect for Data Cloud. She has over 20 years of experience in the financial industry, specializing in implementing data science and machine learning solutions.

Daryl Martis is the Director of Product at Salesforce for Einstein. He has over 10 years of experience in planning, building, launching, and managing world-class solutions for enterprise customers, including AI/ML and cloud solutions. Follow him on LinkedIn or Twitter.

Anastasiya Zdzitavetskaya is the Director of Product at Salesforce for Einstein. She has extensive experience in enterprise software designing and building no-code and pro-code AI solutions. Follow her on LinkedIn.