
Salesforce Data Cloud offers pre-built connectors that allow you to configure data to flow into or out of Data Cloud through third-party integrations. If you are using Azure Blob Storage, you can bring that data into Data Cloud with the Microsoft Azure Blob Storage Connector.

This blog post walks through creating the connection in Data Cloud and then using the connector to start ingesting your Microsoft Azure Blob Storage data.

About Azure Blob Storage

Azure Blob Storage is Microsoft’s object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of structured, semi-structured, and unstructured data. Unstructured data is data that doesn’t adhere to a particular data model or definition, such as email and chat transcripts. This is unlike structured data, which typically fits into databases and is organized into rows and columns.

A common use case for Blob Storage is storing large data sets for big data analytics, as it provides an optimal environment for that kind of workload.

About the Microsoft Azure Blob Storage Connector

Data from Azure Blob Storage is a great source of information for Data Cloud, enabling informed decision-making and new insights. Bringing this data into Data Cloud with a connector unlocks a data source that may previously have been inaccessible. Data Cloud supports various file and compression formats for your data stream sources.

Once setup is complete, the Microsoft Azure Blob Storage Connector appears as an option when you configure a data stream. For the connector to be available for selection, you’ll need to create an Azure container and blob, configure access to your Azure resources, and then connect to the Azure storage account from Data Cloud.

Microsoft Azure Blob Storage is now available as a data source

Let’s look at the steps needed.

How to set up and use the Microsoft Azure Blob Storage Connector

This section walks through the setup process and the considerations for a smooth integration.

Our example ingests a file containing a list of animals, their characteristics, and a Boolean value indicating whether they were adopted. With this data source in Data Cloud, we can use artificial intelligence to start predicting the chance of an animal being adopted in the future. This is an example of how Data Cloud unlocks your data sources to provide insights and take action when needed.
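For illustration only, here’s a minimal Python sketch of what such a file might look like. The column names and rows below are hypothetical; your own export will contain whatever characteristics you track, plus an adopted column.

```python
import csv

# Hypothetical columns and rows for the example animal data set.
rows = [
    {"name": "Bella", "species": "dog", "age_years": 3, "adopted": True},
    {"name": "Milo", "species": "cat", "age_years": 1, "adopted": False},
]

with open("all-animals.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "species", "age_years", "adopted"])
    writer.writeheader()
    writer.writerows(rows)
```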

Step 1: Create a storage account

Open your Azure portal and create a storage account. For help creating a storage account, see Create a storage account. In this post, we’re calling our storage account datacloudstorage.
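If you prefer to script this step instead of using the portal, the azure-mgmt-storage Python SDK can create the account. This is a minimal sketch under assumptions not in the original walkthrough: the subscription ID, resource group name, region, and SKU below are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

subscription_id = "<your-subscription-id>"   # placeholder
resource_group = "my-resource-group"         # hypothetical resource group

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Create a general-purpose v2 account; the name must be globally unique and lowercase.
poller = client.storage_accounts.begin_create(
    resource_group_name=resource_group,
    account_name="datacloudstorage",
    parameters=StorageAccountCreateParameters(
        location="eastus",                   # assumed region
        kind="StorageV2",
        sku=Sku(name="Standard_LRS"),        # assumed SKU
    ),
)
account = poller.result()
print(account.primary_endpoints.blob)
```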

Step 2: Create a container

Navigate to your new storage account in the Azure portal.

In the left menu for the storage account, scroll to the Data storage section, then select Containers.

Creating data storage in your storage account in the Azure portal

Select the + Container button.

Type a name for your new container. In this example, we’ll use the container name animals. For more information about container and blob names, see Naming and referencing containers, blobs, and metadata.

Creating a new container in the Azure portal

Set the level of anonymous access to the container. The default level is Private (no anonymous access).

Select Create to create the container.
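The same container can also be created with the azure-storage-blob Python SDK. This is a minimal sketch assuming you authenticate with the storage account key; the account_key value is a placeholder.

```python
from azure.storage.blob import BlobServiceClient

account_key = "<your-storage-account-key>"  # placeholder; keep secrets out of source control

service = BlobServiceClient(
    account_url="https://datacloudstorage.blob.core.windows.net",
    credential=account_key,
)

# New containers are private (no anonymous access) by default.
container_client = service.create_container("animals")
print(container_client.container_name)
```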

Step 3: Upload a block blob

Block blobs consist of blocks of data assembled to make a blob. Most scenarios using Blob Storage employ block blobs. They are ideal for storing text and binary data in the cloud, like files, images, and videos. We’ll upload our animal data as a block blob.

In the Azure portal, navigate to the container you created in the previous section.

Select the container to show a list of blobs it contains. The animals container is new, so it won’t yet contain any blobs.

Select the Upload button to open the upload panel and browse your local file system to find a file to upload as a block blob. We will upload the animal data into a new virtual folder called all-animals.

Uploading a file to your new container in the Azure portal

Select the Upload button to upload the blob.

After the upload completes, the all-animals.csv file is available in the animals container in a virtual folder called all-animals.

Viewing your uploaded file in the Azure portal
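If you’d rather script the upload, here’s a minimal sketch with the azure-storage-blob Python SDK (again assuming the placeholder account_key). The all-animals virtual folder is simply a path prefix in the blob name.

```python
from azure.storage.blob import BlobServiceClient

account_key = "<your-storage-account-key>"  # placeholder

service = BlobServiceClient(
    account_url="https://datacloudstorage.blob.core.windows.net",
    credential=account_key,
)
container_client = service.get_container_client("animals")

# Upload the CSV as a block blob inside the all-animals virtual folder.
with open("all-animals.csv", "rb") as data:
    container_client.upload_blob(
        name="all-animals/all-animals.csv",
        data=data,
        overwrite=True,
    )
```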

Step 4: Generate a shared access token

In the Azure portal, select Shared access token from the menu.

Creating a shared access token in the Azure portal

Create a Shared Access Signature (SAS) token with the following settings.

  • Signing method: Account key
  • Required permissions: Read, delete, list
  • Expiry: Set an expiration date that matches your organization’s security policies. We recommend a date one to three months out.
  • Allowed protocols: HTTPS only

Getting access to your Blob SAS token in the Azure portal

Copy the Blob SAS token and Blob SAS URL as we need these in Data Cloud when we create the connector.

You must regenerate the SAS token before its expiration, or when there’s a change in the container URL or container path of the Azure Blob Storage Connector in Data Cloud. The connector doesn’t support Azure account or Azure file tokens.
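The SAS token can also be generated programmatically. Here’s a minimal sketch with the azure-storage-blob Python SDK mirroring the settings above (account key signing; read, delete, and list permissions; HTTPS only); the 30-day expiry and account_key are assumptions, so adjust them to your own policies.

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

account_key = "<your-storage-account-key>"  # placeholder

sas_token = generate_container_sas(
    account_name="datacloudstorage",
    container_name="animals",
    account_key=account_key,
    permission=ContainerSasPermissions(read=True, delete=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=30),  # assumed 30-day expiry
    protocol="https",
)

# The Blob SAS URL is the container URL plus the token as a query string.
sas_url = f"https://datacloudstorage.blob.core.windows.net/animals?{sas_token}"
print(sas_url)
```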

Step 5: Create a Microsoft Azure Blob Storage Connector in Data Cloud

Now we have all the information we need to create the connection in Data Cloud.

Sign in to your Data Cloud instance and make sure you have Data Cloud Admin or Data Cloud Marketing Admin user permissions.

Choose Setup, then Data Cloud Setup. Next, choose Connectors and click New. Select the Microsoft Azure Blob Storage Connector and click Next.

Selecting the Microsoft Azure Blob Storage Connector in Data Cloud

Enter the connection details on the Connectors page, including a name for the connection and the Blob SAS URL and Blob SAS token that you copied in Step 4.

Next, click Test Connection to verify there are no errors, then click Save.

Testing your connection in Data Cloud

Step 6: Ingest data from Azure Blob Storage in Data Cloud

Next, navigate to Data Cloud and click the Data Streams tab. Click New, select the Microsoft Azure Blob Storage data source, then click Next.

Creating a Data Cloud data stream using the Microsoft Azure Blob Storage Connector

On the next screen, we can specify the file with our animal data. In this post, we called it all-animals.csv. Then click Next.

Selecting the file to ingest in Data Cloud

Since our animal data has no clear unique identifier, we’ll create one by clicking New Formula Field.

Creating a formula field to use as a unique identifier

Create a new field called Animal Id of data type Text that uses the UUID() function to create a unique 36-character identifier, then click Save.

Using a function to create a UUID
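For context, a UUID rendered as text is 36 characters long (32 hexadecimal digits plus four hyphens). Here’s a quick Python illustration of what such a value looks like; this is not Data Cloud formula syntax, and the exact format Data Cloud’s UUID() function produces may differ.

```python
import uuid

# A version 4 UUID as text: 32 hex digits plus 4 hyphens = 36 characters.
animal_id = str(uuid.uuid4())
print(animal_id)
print(len(animal_id))  # 36
```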

Next, change the Data Lake Object Label and API Name to Animal. Select Other as the category, then select Animal_Id as the Primary Key. Then click Next.

Specifying attributes for the data lake object

Finally, click Deploy.

Your data stream will now retrieve new records from the connected Azure source approximately every 15 minutes. Review the documentation for the relevant guidelines and API limits. As soon as your Last Run Status is Success, you can view the ingested records by looking at the data lake object in Data Cloud.

Using Data Explorer in Data Cloud, we can now explore the data lake object called Animal that we created. This shows all the records ingested from the Azure Blob Storage connection.

Using Data Explorer to view the data ingested from Microsoft Azure Blob Storage

Conclusion

In this blog post, we covered how you can now ingest data from Microsoft Azure Blob Storage into Data Cloud. If you want to take it a step further, watch our Mapping Data Streams video to learn how to map the ingested data in your data lake object to data model objects in Data Cloud.

By ingesting data from enterprise data sources, we can now leverage that data to get insights. For this data set, we can use Einstein predictive modeling to identify the top predictors of animal adoption, both improving the chance of adoption and helping manage the day-to-day running of an animal shelter.


About the author

Dave Norris is a Developer Advocate at Salesforce. He’s passionate about making technical subjects broadly accessible to a diverse audience. Dave has been with Salesforce for over a decade, has over 35 Salesforce and MuleSoft certifications, and became a Salesforce Certified Technical Architect in 2013.
