Integrate Google BigQuery with Salesforce | Salesforce Developers Blog

If you research solutions that enable you to store and analyze big sets of data (and I mean REALLY big), you will likely come across BigQuery, a cloud-based data warehouse offered by our strategic partner Google. This blog post shows how to integrate transactional data between a Salesforce org and BigQuery using Apex and Lightning.

Use case

For this sample implementation we want to store changes on order records into BigQuery and later read them for display in a Lightning component. This diagram shows the functionality of the demo sample:


We’re using an Apex trigger to store data in BigQuery on every order record change. Even though a similar order history mechanism is already built into the Salesforce Platform, it’s a good example of how to store this kind of data (and more) externally. For authentication against Google OAuth, the Apex JWT library is used. You can grab the full code example, including a detailed setup description, here on GitHub.
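As a quick aside on the auth piece: Google’s JWT bearer flow expects a signed claim set posted to its OAuth token endpoint. The sketch below assembles just the (unsigned) claim set in JavaScript. The claim names are Google’s documented ones for service accounts; the function name is illustrative, and the actual RS256 signing (which the Apex JWT library handles in the demo) is omitted:

```javascript
// Sketch: build the JWT claim set Google's OAuth token endpoint expects
// for a service account. Signing with the private key from the service
// account JSON file is omitted; in the demo, the Apex JWT library does that.
function buildClaimSet(serviceAccountEmail, nowSeconds) {
    return {
        iss: serviceAccountEmail,                          // service account identity
        scope: 'https://www.googleapis.com/auth/bigquery', // BigQuery API scope
        aud: 'https://oauth2.googleapis.com/token',        // token endpoint
        iat: nowSeconds,                                   // issued-at timestamp
        exp: nowSeconds + 3600                             // maximum lifetime: 1 hour
    };
}
```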

BigQuery setup

Before you can start using BigQuery, you have to set up the API and access to it. I highly recommend that you take the Trailhead module Query GitHub Data using BigQuery, where the first unit leads you through the initial steps of the setup on Google Cloud Platform. In addition, you can also check out Google’s Quickstart for BigQuery.

The next step, after enabling the API, is to create a service account that can access the API. You can find instructions for that on the BigQuery Client Libraries site. Make sure to keep the generated JSON key file handy, as it’ll be needed later.

For the demo integration I’ve set up a dataset OrderHistory in BigQuery, as well as a new table Salesforce. To that table I added the fields OrderNumber, Status, and TotalAmount. Note that the data type NUMERIC (used for TotalAmount) is, as of the time of this writing, in beta. You likely recognize the schema field names, as they reflect their Salesforce field name counterparts from the Order object.
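For reference, the same table structure can be written as a BigQuery JSON schema file. The types and modes below are assumptions inferred from the fields used in this post:

```json
[
    { "name": "OrderNumber", "type": "STRING",  "mode": "NULLABLE" },
    { "name": "Status",      "type": "STRING",  "mode": "NULLABLE" },
    { "name": "TotalAmount", "type": "NUMERIC", "mode": "NULLABLE" }
]
```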

Now we’re set up to insert data into and read data from BigQuery using its REST API.
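Under the hood, an insert is a POST against the tabledata.insertAll endpoint. Here’s a rough sketch of the URL and request body that such a call carries; the function name is illustrative, the endpoint host reflects Google’s current documentation, and the Apex wrapper in the demo assembles an equivalent payload:

```javascript
// Sketch: the shape of a BigQuery tabledata.insertAll request.
// One row per record; the 'json' key holds the field/value pairs.
function buildInsertAllRequest(projectId, datasetId, tableId, record) {
    return {
        url: 'https://bigquery.googleapis.com/bigquery/v2/projects/' + projectId +
             '/datasets/' + datasetId + '/tables/' + tableId + '/insertAll',
        body: {
            kind: 'bigquery#tableDataInsertAllRequest',
            rows: [ { json: record } ]
        }
    };
}
```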

Inserting and reading data using Apex

Let’s check out the first snippet, which represents the code that’s called from the Apex Trigger:

String projectId = 'bigquery-sf-muenzpraeger';
String datasetId = 'OrderHistory';
String tableId = 'Salesforce';

// Get Order record
List<Order> orders = [SELECT OrderNumber, Status, TotalAmount FROM Order WHERE Id = :orderId];

if (orders.size()>0) {
    // Setup connector
    GoogleBigQuery google = new GoogleBigQuery(projectId, datasetId, tableId);
    GoogleBigQuery.InsertAll insertAll = new GoogleBigQuery.InsertAll();
    insertAll.addObject(orders.get(0));
    if (!google.add(insertAll)) {
        System.debug('Error: ' + google.getResponse());
    }
}

Code explanation:

  • Lines 1-3: These three variables represent the configuration values needed to insert into the correct BigQuery table.
  • Line 6: As the method is called from an Apex Trigger, we’re selecting the order record that has changed based on the record id.
  • Line 10: A new GoogleBigQuery object, which wraps the REST HTTP call to the BigQuery API, is created.
  • Line 11: The newly instantiated GoogleBigQuery.InsertAll object represents the payload that gets sent to the BigQuery API.
  • Line 12: We’re adding the order record via the addObject method to the GoogleBigQuery.InsertAll object. This method adds the record as a value to a Map<String, Object> variable within the GoogleBigQuery.InsertAll object, which automatically converts the record data to JSON, so we don’t have to take care of that manually. It also allows generic use, as nothing is bound to a specific object.
  • Line 13: The add method on the GoogleBigQuery object gets called and inserts the record data into BigQuery.

You’ve created a way to get data into BigQuery; you’ll probably also want a way to query it. For that we use another Apex class as well as a lightning:datatable for displaying the data:


For retrieving the data from BigQuery we’re using this Apex code:

String projectId = 'bigquery-sf-muenzpraeger';

GoogleBigQuery google = new GoogleBigQuery(projectId);
GoogleBigQuery.JobsQuery query = new GoogleBigQuery.JobsQuery();
query.query = sql;
if (!google.query(query)) {
    throw new AuraHandledException('Error when Querying: ' + google.getResponse());
}
return google.getResponse();

Code explanation:

  • Line 3: A new GoogleBigQuery object is created. As you’ll notice, only the project ID is passed as a parameter (compared to the previous insertAll example), because the SQL query already contains the dataset ID and the table ID.
  • Lines 4 + 5: The newly instantiated GoogleBigQuery.JobsQuery object represents the object which gets sent to the BigQuery REST API. In line 5 the SQL query, which is passed down as a parameter from the Lightning component, is set.
  • Line 6: The query method on the GoogleBigQuery object gets called. In case of an error, a new AuraHandledException is created, so that we can give appropriate feedback in the UI.
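For comparison with the insert, a query goes to the jobs.query endpoint. A minimal sketch of the request body it carries (the function name is illustrative; useLegacySql set to false is an assumption, since the demo queries use standard SQL syntax):

```javascript
// Sketch: the shape of a BigQuery jobs.query request body.
function buildQueryRequest(sql) {
    return {
        kind: 'bigquery#queryRequest',
        query: sql,              // the SQL string passed down from the Lightning component
        useLegacySql: false      // assumption: standard SQL syntax
    };
}
```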

You will notice that we return the raw JSON response from BigQuery. This is because we want to use that data to dynamically configure a lightning:datatable, as you can see in this code snippet:

updateDatatable: function(component, response) {
    const jsonResponse = JSON.parse(response);
    let columns = [];
    let rows = [];
    jsonResponse.schema.fields.forEach(field => {
        const column = {
            label: field.name,
            fieldName: field.name,
            type: 'text',
        };
        columns.push(column);
    });
    component.set('v.columns', columns);
    jsonResponse.rows.forEach(row => {
        let data = {};
        columns.forEach((column, index) => {
            data[column.fieldName] = row.f[index].v;
        });
        rows.push(data);
    });
    component.set('v.rows', rows);
}

Code explanation:

  • Line 2: The response from the Apex method comes back as a string, so we have to parse it into a JSON object first.
  • Lines 3 + 4: For the datatable we’re using aura:attributes to configure the columns and rows. The arrays columns and rows are used to store the parsed data from the JSON response.
  • Lines 5 – 11: For every field object within the fields array of the JSON response, a new object literal is created, based on the requirements of lightning:datatable.
  • Lines 14 – 20: Similar to the fields array for the column definition, the data for the individual rows is parsed. As BigQuery returns the data in strict field order, we can leverage the index of the previously created columns array to properly assign the values to field names.
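To make BigQuery’s schema/rows structure concrete, here is a standalone sketch that flattens a hypothetical response the same way updateDatatable does (the function name and sample data are illustrative):

```javascript
// Sketch: flatten BigQuery's schema.fields / rows[].f[].v response structure
// into plain objects keyed by field name, mirroring the updateDatatable logic.
function flattenRows(jsonResponse) {
    const names = jsonResponse.schema.fields.map(field => field.name);
    return jsonResponse.rows.map(row => {
        const data = {};
        row.f.forEach((cell, index) => {
            data[names[index]] = cell.v; // values arrive in strict schema field order
        });
        return data;
    });
}
```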

As you can see, it’s not hard to connect the Salesforce Platform with BigQuery using Apex.

Summary

BigQuery is a performant, NoOps, large-scale data warehouse solution that can be used to store many different kinds of datasets. The use cases shown here are only an excerpt of the integrations with Salesforce you can build. For example, you can take the Apex code from the demo sample and integrate it into a custom Apex Connector, so that you can interact with BigQuery using Salesforce Connect. Or you can leverage the BigQuery connector for Einstein Analytics to replicate data into Analytics for further combined analysis with other data. In an upcoming blog post we’ll dive into how to build even deeper integrations across Salesforce clouds and BigQuery. Stay tuned!


About the author

René Winkelmeyer works as Principal Developer Evangelist at Salesforce. He focuses on enterprise integrations, Lightning, and all the other cool stuff that you can do with the Salesforce Platform. You can follow him on Twitter @muenzpraeger.
