Salesforce Data Cloud empowers developers to harness the power of big data for businesses. By utilizing Data Cloud, customers can consolidate customer data from multiple systems into a single Salesforce instance, creating a unified view of data across the entire enterprise. This data can be utilized for analytics, machine learning, and automated actions. In this first blog of our two-part series, we will explore different Apex utilities for querying data in Data Cloud and provide guidance on how to effectively utilize them.
Apex offers a range of utilities for Data Cloud. For example, developers building with Lightning Web Components can use it to customize standard Data Cloud user experiences, and ISVs can use it to automate Data Cloud-specific operations such as identity resolution, creating and running calculated insights, or segmentation.
Salesforce Data Cloud objects vs standard/custom objects
Before we look into how to query data from Data Cloud, let’s understand a bit about Salesforce Data Cloud objects and how they differ from the standard and custom objects of the Salesforce Platform.
Salesforce Data Cloud has a canonical data model that includes data lake objects (DLOs) and data model objects (DMOs). You can read about how these objects are mapped to each other and their purposes in the help documentation.
Data Cloud objects can ingest and store much larger volumes of data (in the magnitude of billions of records) compared to regular custom and standard objects on the Salesforce Platform. Standard/Custom objects are designed for transactional use cases and are not suitable for storing and processing big data. On the other hand, Data Cloud objects add data lakehouse-like capabilities.
Another key distinction is that Data Cloud objects do not support synchronous Apex triggers. However, you can still achieve process automation by subscribing to Change Data Capture (CDC) and using Flows or Apex. What Data Cloud objects and platform objects have in common is that they are built on the same metadata-driven foundation, which makes it possible to use platform features such as Salesforce Flow, Apex, and Platform Events.
How to query data from Data Cloud in Apex
Before we deep dive into some code, let’s explore an example use case of a Data Cloud app.
Sample use case and assumptions
For our code examples in this blog post, let’s assume that we are working for a fictional company called Solar Circles that captures data from all of its installed Solar Panels in Data Cloud. Each month, there are tens of millions of data points generated from these panels. By having this data in Data Cloud, Solar Circles gains the capability to perform analytics, utilize machine learning techniques, and derive actionable insights from the data.
The Apex code in this post assumes an important condition: Data Cloud is enabled and the Apex code is running in the Data Cloud org and not on Salesforce orgs that are connected to the Data Cloud org.
Query data from Data Cloud using SQL
To access data from Data Cloud objects (DLO or DMO), use the CdpQuery class (see docs) in Apex. This class is available under the ConnectApi namespace (see docs).
Below is an example snippet of code that shows how to access the data from a Data Cloud object using a SQL statement.
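The following is a minimal sketch of such a snippet, reconstructed from the code highlights in this post rather than copied from a verified implementation. The class name SolarPanelDataService, the InvalidCustomerIdException type, the allowed-character check, and the LIMIT 50 row cap are assumptions made for illustration. Wiring the method to a Lightning web component and mapping rows to fields are left out.

```apex
public with sharing class SolarPanelDataService {
    // Custom exception for rejected input (hypothetical name).
    public class InvalidCustomerIdException extends Exception {}

    // Builds the SQL text. Kept separate from the query call so it can be checked on its own.
    public static String buildQuery(String customerId) {
        // The value is concatenated into the SQL string, so accept only a conservative
        // character set (an assumption about what customer IDs look like).
        if (String.isBlank(customerId) || !Pattern.matches('[A-Za-z0-9_-]+', customerId)) {
            throw new InvalidCustomerIdException('Unexpected customerId format');
        }
        return 'SELECT * FROM Solar_Panel_Events_solar_panel_F4C03__dlm ' +
            'WHERE CustomerId__c = \'' + customerId + '\' ' +
            'LIMIT 50'; // Row cap is an assumption to keep the response small
    }

    public static ConnectApi.CdpQueryOutputV2 getSolarPanelData(String customerId) {
        // Define the query operation.
        ConnectApi.CdpQueryInput queryInput = new ConnectApi.CdpQueryInput();
        queryInput.sql = buildQuery(customerId);

        // Execute the SQL statement against Data Cloud.
        ConnectApi.CdpQueryOutputV2 response = ConnectApi.CdpQuery.queryAnsiSqlV2(queryInput);

        // Metadata of the query response, logged here only for inspection.
        Map<String, ConnectApi.CdpQueryMetadataItem> responseMetadata = response.metadata;
        System.debug(responseMetadata.keySet());
        return response;
    }
}
```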
In the above example, we are retrieving data for a custom Lightning web component placed on the Lightning record page of the standard Case object for a service agent. The component shows recent device data coming from the panels installed on the customer site.
Code highlights
- The method takes a customerId parameter, indicating that it retrieves solar panel data for a specific customer.
- An instance of ConnectApi.CdpQueryInput called queryInput is created to define the query operation.
- The queryInput.sql property is set with a SQL query that selects all fields from the Solar_Panel_Events_solar_panel_F4C03__dlm data object, filtered by CustomerId__c.
- The query is executed using ConnectApi.CdpQuery.queryAnsiSqlV2(queryInput), which returns a ConnectApi.CdpQueryOutputV2 object named response.
- The response.metadata is assigned to responseMetadata, which stores the metadata of the query response.
Important Considerations
- Apex has a CPU limit of 10 seconds for synchronous transactions. Data Cloud can hold billions of rows of data. When retrieving data from Data Cloud in Apex, make sure you add sufficient filters and provide context (like the recordId you are working with) to limit the number of rows and avoid hitting the 10-second CPU limit.
- If you are retrieving a large amount of data, use Queueable Apex to run the process asynchronously and take advantage of the CPU limit of 60 seconds.
- We recommend using queryAnsiSqlV2 (see docs) instead of queryAnsiSql to take advantage of subsequent requests and larger response sizes for use cases where you need to pull large volumes of data.
- Use nextBatchAnsiSqlV2(nextBatchId) (see docs), passing the batch ID from the previous response, to retrieve the next set of results.
- You can also use SOQL instead of SQL, but make sure you obtain your SOQL using the Data Explorer since there are SOQL functions that may not be applicable for Data Cloud objects.
How to search for profile information
Before we look into how to search profile information from Data Cloud in Apex, we need to understand what a unified profile is.
Unified profile and identity resolution
Let’s say that Solar Circles, our fictitious solar panel manufacturer, has data about a customer named Martha in multiple systems. Each system has different information about her, such as different email addresses. These unique pieces of data are called contact points. Customers like Martha are represented by multiple contact records and system-specific profiles across various systems. This is necessary for each cloud and product to operate independently, but it can create data silos.
Data Cloud provides an identity resolution feature to solve this problem. By using identity rules, the system creates unified individual profiles that can be used for segmentation and activations across various other systems.
Search profile information from Data Cloud
Below is an example Apex utility that searches for profile information. Note that it uses the queryProfileApi method of the ConnectApi.CdpQuery class.
Here is an example snippet of code that invokes the above utility code by passing in the parameters.
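Both pieces can be sketched as follows, based on the code highlights in this post. The class and method names (DataCloudProfileUtils, queryProfileData) are hypothetical, the batch size of 100 and offset of 0 are illustrative values, and the rows are treated as generic objects because their exact element type is not confirmed here.

```apex
public with sharing class DataCloudProfileUtils {
    public static List<Object> queryProfileData(
        String dataModelName,
        String childDataModelName,
        String searchKey,
        String customerName
    ) {
        ConnectApi.CdpQueryOutput response = ConnectApi.CdpQuery.queryProfileApi(
            dataModelName,      // Data model object to search
            customerName,       // Value to look up in the search key field
            childDataModelName, // Child data model object to include
            searchKey,          // Key field to search on
            null,               // Equality expressions (filters)
            null,               // Child object field names to return
            100,                // Number of items to return (illustrative)
            0,                  // Number of rows to skip (illustrative)
            null                // Sort order for the result set
        );
        // Rows are handled generically; their concrete type is an assumption left open.
        List<Object> rows = response.data;
        return rows;
    }
}
```

The utility could then be invoked like this. The child object and search key names below are assumptions about the org's data model and should be checked against it.

```apex
List<Object> rows = DataCloudProfileUtils.queryProfileData(
    'UnifiedIndividual__dlm',        // Data model object
    'UnifiedContactPointEmail__dlm', // Child data model object (assumed)
    'ssot__FirstName__c',            // Search key field (assumed)
    'Martha'                         // Customer name
);
System.debug(rows);
```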
The code searches for the profile information of customer Martha on the UnifiedIndividual__dlm data model object.
Code highlights
- The method uses ConnectApi.CdpQuery.queryProfileApi() to execute the query for profile data in Data Cloud.
- The query parameters include the names of the data model object (dataModelName), child data model object (childDataModelName), search key field (searchKey), and customer name (customerName).
- Additional optional parameters can be provided, such as equality expressions, child object field names, the number of items to return, the number of rows to skip, and the sort order for the result set.
- The query response is stored in a ConnectApi.CdpQueryOutput object named response.
- The method returns response.data, which represents the data retrieved from the query.
Important consideration
- Double-check the field and object names before using them in the Apex code as the method can otherwise throw exceptions and internal server errors.
How to query data from calculated insights
Calculated insights let you define and calculate multidimensional metrics on your entire digital state in Data Cloud. You can create calculated insights by writing SQL, declaratively using Insights Builder, or using Apex.
Streaming vs calculated insights
There are two types of insights in Data Cloud: streaming and calculated insights.
Calculated insights are functions that can calculate metrics on historical data. They are processed in batches. For example, in our Solar Circles application, we can have a calculated insight that measures the total power generated by the panels grouped by every customer.
Streaming insights are generated in near real-time by analyzing the continuous incoming data stream. These insights enable the immediate triggering of actions in downstream systems. For instance, streaming insights can be utilized to identify customers whose solar panels are generating minimal power output. By leveraging a data action on streaming insights, we can proactively create a case for such customers in the Salesforce Service Cloud.
Query data from a calculated insight
To query data from a calculated insight, use the queryCalculatedInsights method of the CdpQuery class. Below is an example code snippet that shows how to query for data from a known calculated insight.
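A minimal sketch of such a call is shown here, following the parameter order described in the code highlights. The class name CalculatedInsightReader, the queryInsight method, and the early check on the __cio suffix are assumptions added for illustration.

```apex
public with sharing class CalculatedInsightReader {
    // Custom exception for rejected input (hypothetical name).
    public class InvalidInsightNameException extends Exception {}

    public static ConnectApi.CdpQueryOutput queryInsight(String insightApiName) {
        // Calculated insight API names end with __cio, so fail early on anything else.
        if (String.isBlank(insightApiName) || !insightApiName.endsWith('__cio')) {
            throw new InvalidInsightNameException('Expected an API name ending in __cio');
        }
        ConnectApi.CdpQueryOutput response = ConnectApi.CdpQuery.queryCalculatedInsights(
            insightApiName, // API name of the calculated insight
            null,           // Dimensions, null includes all
            null,           // Measures, null includes all
            null,           // Sort order for the result set
            null,           // Filters to narrow the result set
            null,           // Number of items to return
            null            // Number of rows to skip before returning results
        );
        return response;
    }
}
```

For the Solar Circles example, the call might be CalculatedInsightReader.queryInsight('totalpowergenerated__cio').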
Code highlights
- The queryCalculatedInsights method from ConnectApi.CdpQuery is used to retrieve calculated insights from Data Cloud.
- The first parameter is the API name of the calculated insight, which should end with __cio. For example, <calculated insight api name> could be replaced with totalpowergenerated__cio.
- The next parameters specify dimensions and measures. A dimension represents a field or attribute on which the insight is based, while a measure represents the calculated metric. Providing null for these parameters includes all available dimensions and measures.
- The sort order for the result set can be specified, but in this code snippet, it is set to null.
- Additional optional parameters include filtering the result set to a more specific scope or type, and specifying the number of items to return and the number of rows to skip before returning results.
- The resulting data is stored in a ConnectApi.CdpQueryOutput object named response.
Important consideration
- Make sure you provide the correct API name for the insights. An incorrect API name results in a system error.
Conclusion
In this blog post, we provided an overview of how you can harness the power of Salesforce Data Cloud and Apex to leverage big data for businesses. The code examples and highlights demonstrate practical approaches to accessing and querying data from Data Cloud objects.
The post also highlights best practices and limitations to consider when working with Data Cloud and Apex, such as managing CPU limits, utilizing asynchronous processing for large data sets, and ensuring correct API naming for calculated insights.
In the next part of the series, we will deep dive into Apex classes like CdpCalculatedInsight (see docs), CdpIdentityResolution (see docs), and CdpSegment (see docs) that can be used to manage calculated insights, create identity resolution rules, and handle segmentation in Data Cloud using Apex.
Additional references
- CdpQuery Class Apex reference
- Data Cloud Decoded video series
- Data-Cloud Powered Experiences Trailhead module
- Explore Data Cloud trail
About the author
Mohith Shrivastava is a Developer Advocate at Salesforce with a decade of experience building enterprise-scale products on the Salesforce Platform. He is presently focusing on the Salesforce Developer Tools, Flow, Apex, and Lightning Web Components at Salesforce. Mohith is currently among the lead contributors on Salesforce Stack Exchange, a developer forum where Salesforce Developers can ask questions and share knowledge. You can follow him via his Twitter @msrivastav13.