Salesforce and Google have partnered so that customers of both platforms can seamlessly integrate their data between Salesforce and Google Cloud in a variety of ways and activate it to power data-driven business processes, workflows, automations, AI, and analytics. You can bring AI models built on Google Cloud to Data Cloud with Model Builder and use your Data Cloud data for training and predictions in Vertex AI. You can share data bidirectionally between Google BigQuery and Data Cloud for analytics and more, without copying or moving it. And you can connect Data Cloud and Google Cloud Storage to ingest and export structured data between the two.

This blog post describes a workflow for using the Google Cloud Storage connector to ingest data into Data Cloud from Google Cloud. Although this workflow does copy and move data out of Google Cloud, it offers real value, especially for customers who already store data in Google Cloud, because it provides a straightforward alternative path for ingesting that data into Data Cloud.

Let’s dive in!

Create a Google Cloud Storage bucket

The first step in ingesting data from Google Cloud into Data Cloud is to create a Google Cloud Storage bucket.

  • In the Google Cloud console, navigate to the Cloud Storage Buckets page and click +Create.

Image of Google Cloud Storage Bucket Page

  • Next, give your bucket a name. For simplicity, you can select the default options for location, location type, and storage class. If you’re not sure what a setting does, Google has great documentation to guide you!
  • Then, configure the bucket to allow public access (subject to ACLs) by unchecking the “Enforce public access prevention on this bucket” option and setting the Access control to “Fine-grained.”
  • Click Create.

Screen for Creating a Google Cloud Storage bucket on the “Choose how to control access to objects” section. The Prevent public access option is unchecked and Access control is “Fine-grained”.

You now have a Google Cloud Storage bucket configured for Data Cloud! You can find up-to-date information about configuring Google Cloud Storage buckets for use with Data Cloud in the official documentation.

Details for a Google Cloud Storage bucket, datacloud-demo-test-2. The Location is us (multiple regions in the United States), the storage class is Standard, the public access settings are “Subject to object ACLs”, and protection is “None”. It does not contain any folders or objects.
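If you prefer to script this setup, you can create an equivalent bucket with the Cloud Storage Python client. Below is a minimal sketch of the console steps above; the project ID and bucket name are placeholders for your own values.

```python
# A minimal sketch: create a bucket configured like the one above.
# The project ID and bucket name are placeholders.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")  # hypothetical project ID

bucket = storage.Bucket(client, name="datacloud-demo-test-2")
bucket.storage_class = "STANDARD"
# "Fine-grained" access control = uniform bucket-level access disabled
bucket.iam_configuration.uniform_bucket_level_access_enabled = False
# "inherited" lifts enforced public access prevention, so access is
# subject to object ACLs (the unchecked option in the console)
bucket.iam_configuration.public_access_prevention = "inherited"

client.create_bucket(bucket, location="us")  # US multi-region, the default
```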

Now you just need to add some data to your bucket.

Add a file and folder to the bucket

A storage bucket with no data is no fun at all. Next, you’ll add a file with data to the Google Cloud Storage bucket, so that you have something to pull into Data Cloud.
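For reference, here’s what a minimal test.csv might look like. The columns and values below are illustrative assumptions; any CSV will work, although a unique id column will come in handy later when you set the data stream’s primary key.

```
id,first_name,last_name,email
1,Ada,Lovelace,ada@example.com
2,Grace,Hopper,grace@example.com
3,Alan,Turing,alan@example.com
```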

  • First, go to Cloud Storage in the Google Cloud console and click the bucket created in the previous step.
  • Next, click Create Folder to create a folder. Data Cloud requires that you ingest your data from a specific folder. This lets you separate out the data you want Data Cloud to ingest and ensures that you don’t mistakenly ingest the entire contents of your bucket. If you need inspiration, consider naming it “demo”.
  • Navigate to the newly created folder within your bucket.
  • Finally, upload a file to the folder. In the example shown here, the uploaded file is named “test.csv.”

Image of the Google Cloud Storage bucket home page with the test.csv file
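The folder and file can also be created programmatically. In Cloud Storage, folders are really just object-name prefixes, so uploading to a key like “demo/test.csv” creates the “demo” folder implicitly. Here’s a minimal sketch, assuming the bucket from the previous step and a test.csv in your working directory:

```python
# A minimal sketch: upload a local CSV into the bucket's "demo/" folder.
# The project ID, bucket name, and file paths are placeholders.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")  # hypothetical project ID
bucket = client.bucket("datacloud-demo-test-2")

# The "demo/" prefix in the object key acts as the folder Data Cloud reads from
blob = bucket.blob("demo/test.csv")
blob.upload_from_filename("test.csv")  # path to the local CSV

print(f"Uploaded gs://{bucket.name}/{blob.name}")
```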

In the next step, you’ll create a Google Cloud service account so that Data Cloud can access the file just uploaded to Google Cloud Storage.

Create a service account for Data Cloud

Service accounts are used to provide and manage access between Google Cloud services, such as Google Cloud Storage, and third-party services and APIs, including Data Cloud. You can think of a service account as a special access key that you can give Data Cloud to safely and securely pull data from Google Cloud.

Image of Cloud Storage Settings page with the Project Access Tab as the actively selected tab.

  • Go to Cloud Storage Settings in the Google Cloud console and open the “Interoperability” tab. Under “Service account HMAC”, click Create a key for another service account. (If you have not yet created any keys, the button is labeled Create a key for a service account.)
  • In the modal that appears, click Create new account to generate a new service account for Data Cloud.
  • Give the service account a name, skip the optional steps for granting the account access, and click Done.
  • After creating the service account, you are redirected back to Cloud Storage Settings and shown a modal with the newly created HMAC key. Copy the service account HMAC details (the service account email, access key, and secret) and store them somewhere safe for later use. Once you close this modal, you cannot view the secret again, so it is very important to save these credentials before continuing; you’ll need them to make the connection with Data Cloud in the next section.

The modal that appears upon creating a service account. It shows the service account email at the top, an access key, and a secret. The details are obscured for security. The secret must be copied before closing the modal, because it is only shown once during creation.
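You can also create the HMAC key programmatically for an existing service account (creating the service account itself is a separate IAM operation). A minimal sketch, with a placeholder service account email:

```python
# A minimal sketch: create an HMAC key for an existing service account.
# The project ID and service account email are placeholders.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")  # hypothetical project ID
sa_email = "datacloud-demo@my-gcp-project.iam.gserviceaccount.com"

hmac_key, secret = client.create_hmac_key(service_account_email=sa_email)

print("Access key:", hmac_key.access_id)
# The secret is returned exactly once; store it securely right away
print("Secret:", secret)
```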

  • Next, go to the Google Cloud Storage bucket you created earlier and click the “Permissions” tab. Scroll down to the “Permissions” section and click Grant access.
  • Copy your service account’s email into the “New principals” field, give it the “Storage Legacy Bucket Reader” and “Storage Legacy Object Reader” roles, and click Save.

Image of Access page to grant role access of Storage Legacy Bucket Reader and Storage Legacy Object Reader
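The same grant can be made in code by appending the two legacy reader roles to the bucket’s IAM policy. A minimal sketch, again with placeholder names:

```python
# A minimal sketch: grant the service account read access to the bucket
# by adding the two legacy reader roles to the bucket's IAM policy.
from google.cloud import storage

client = storage.Client(project="my-gcp-project")  # hypothetical project ID
bucket = client.bucket("datacloud-demo-test-2")
member = "serviceAccount:datacloud-demo@my-gcp-project.iam.gserviceaccount.com"

policy = bucket.get_iam_policy(requested_policy_version=3)
for role in ("roles/storage.legacyBucketReader", "roles/storage.legacyObjectReader"):
    policy.bindings.append({"role": role, "members": {member}})

bucket.set_iam_policy(policy)
```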

To recap, you’ve created a service account for the Data Cloud integration, saved its email address, access key, and secret, and granted it the permissions it needs to read your bucket’s folders and files. In the next two sections, you’ll connect the Google Cloud Storage bucket and Data Cloud.

Connect Data Cloud to your Google Cloud Storage bucket

Data Cloud provides a Google Cloud Storage connector that allows you to import data into Data Cloud from Google Cloud Storage (and export it back out) without writing a line of code. In this section, you’ll create a Google Cloud Storage connector to import data from a Google Cloud Storage bucket; later, you’ll create a data stream to make that data accessible in Data Cloud.

  • Go to Data Cloud Setup and search for “More Connectors” in the Quick Find search box.
  • While on the “More Connectors” page, click New. In the modal that appears, select the Google Cloud Storage data source. Note: When trying this out yourself, make sure that you’re on the “Source” tab.
  • Next, configure the connector by giving it a name and entering your Google Cloud Storage bucket name. Then paste your service account secret in the “Secret Key” field and the service account access key in the “Access Key” field. Set the parent folder name to your bucket’s folder name and add a trailing slash (/); for example, “demo/”.
  • Click Test Connection to verify that the bucket name and service account credentials work. Note that this test doesn’t verify that your parent directory (folder name) is accurate.
  • If the test is successful, click Save. If not, recheck that your service account credentials and bucket name are accurate.

Image of New Google Cloud Storage Source page with fields populated.

You can see that you have successfully connected your Google Cloud Storage account when the status is “Active.”

 Image of DataCloudDemo and GCS Connectors with Google Cloud Storage as the connection type, connector method as source, and status as active.
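If the connection test ever fails, one way to rule out credential problems is to try the HMAC key pair outside of Data Cloud. HMAC keys work with Cloud Storage’s S3-compatible XML API, so you can do a quick check with boto3 pointed at the Cloud Storage endpoint. A minimal sketch, with placeholder credentials:

```python
# A minimal sketch: verify the HMAC access key and secret by listing the
# "demo/" folder through Cloud Storage's S3-compatible XML API.
# The bucket name and credentials are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://storage.googleapis.com",  # GCS interoperability endpoint
    aws_access_key_id="GOOG1EXAMPLEACCESSKEY",      # service account HMAC access key
    aws_secret_access_key="your-hmac-secret",       # service account HMAC secret
)

response = s3.list_objects_v2(Bucket="datacloud-demo-test-2", Prefix="demo/")
for obj in response.get("Contents", []):
    print(obj["Key"])  # should include demo/test.csv
```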

In the next (and final) step, you’ll use the new connection between your Google Cloud Storage bucket and Data Cloud to import the data from the file in the bucket.

Create a data stream

In Data Cloud, a data stream represents a connection and the data it ingests via a connector. In this section, you’ll create a data stream from the data within your Google Cloud Storage bucket, using the Google Cloud Storage source you created in the previous step.

  • In Data Cloud, navigate to the Data Streams tab, click New, and then click the Google Cloud Storage source that you just created.
  • In the first step of the “New Data Stream” modal, choose a name for the Connection. The bucket name and parent directory will automatically be populated using the information you provided when you created your Google Cloud Storage source.
  • In the example use case, the Google Cloud Storage bucket only has one folder, so you can leave the “Import from Directory” field empty. If you had multiple folders within your Google Cloud Storage bucket’s “demo/” folder, you could specify a particular folder with the “Import from Directory” option.
  • Leave the “File Name” field as *. This wildcard automatically pulls in all of the files within the folder. If you had multiple files in your “demo/” folder and wanted to pull a specific one or a subset of them, you could specify a file name or name pattern here.
  • The Source Details will be populated automatically for you.

Image of New Data Stream Creation page with Connection as GCSConnectors.

  • In the second step of the “New Data Stream” modal, choose the category that best represents your data. In the example use case, the CSV data most resembles profile data, so it makes sense to choose “Profile” and set the “Primary Key” to “id”. If the “Data Lake Object API Name” is populated with a name that includes an underscore (_), remove the underscore so that the name passes field validation, and click Next.

Image of New Data Stream Creation page with category selected as profile and primary key set to id.

  • In the final step of the “New Data Stream” modal, choose the refresh type that best suits your use case. Here, the selected option is “Upsert” and the “File Type” is “CSV”. Click Deploy.

Image of New Data Stream page with New Data Stream name, data space set to default, and refresh mode set to upsert

And voilà! You have created a data stream using the Google Cloud Storage connector. You can see the data stream’s fields and when it was last refreshed by clicking on it in the Data Streams tab.

Image of Data Cloud Demo data lake object with associated fields and field types.

Conclusion

Now that you know how to connect Google Cloud Storage and Data Cloud and how to ingest data via a data stream, you can start ingesting data from your Google Cloud Storage buckets into Data Cloud to keep building a single source of truth for your customers. And you don’t have to stop here! To learn how to ingest data from Amazon Web Services, see How to Use the Amazon S3 Storage Connector in Data Cloud. And get hands-on with the Create a Data Stream Trailhead module.

We would like to give a special acknowledgment and thanks to Chris Zullo for setting us on the path to producing this blog post.

FAQs

Q: How can I troubleshoot a failed test connection between my Google Cloud Storage bucket and Data Cloud?
A: First, check that your service account access key and secret exactly match the values shown when you created your HMAC key. If needed, create a new HMAC key for your service account in Cloud Storage Settings. Next, verify that your service account was added as a principal on your Google Cloud Storage bucket and that it has the “Storage Legacy Bucket Reader” and “Storage Legacy Object Reader” roles.

Q: My test connection is successful, but I cannot create a Data Stream. What’s wrong?
A: The Google Cloud Storage bucket’s folder name may be misspelled or your service account may be missing the correct roles. First, verify that your bucket has a folder with a file. Next, check that the folder name matches the parent directory field used when creating the Google Cloud Storage connector. Finally, verify that the service account that you’re using has the “Storage Legacy Bucket Reader” and “Storage Legacy Object Reader” roles for your Google Cloud Storage bucket.

About the Authors

Danielle Larregui is a Senior Developer Advocate at Salesforce focusing on the Data Cloud platform. She enjoys learning about cloud technologies, speaking at and attending tech conferences, and engaging with technical communities. You can follow her on X.

Charles Watkins is a Lead Developer Advocate at Salesforce and a full-stack software developer focused on the core Salesforce Platform. You can find him on X.
