Salesforce Data Cloud offers pre-built connectors that allow you to configure data to flow into or out of Data Cloud through third-party integrations. If you are using Azure Blob Storage, you can bring that data into Data Cloud with the Microsoft Azure Blob Storage Connector.
This blog post walks through the steps to create the connection in Data Cloud and then use the connector to start ingesting your Microsoft Azure Blob Storage data.
About Azure Blob Storage
Azure Blob Storage is Microsoft’s object storage solution for the cloud. Blob Storage is optimized for storing massive amounts of structured, semi-structured, and unstructured data. Unstructured data is data that doesn’t adhere to a particular data model or definition, such as email and chat transcripts. This is unlike structured data, which typically fits into databases and is organized into rows and columns.
A common use case for Blob Storage is big data analytics, as it provides an optimal environment for storing the large data sets those workloads require.
About the Microsoft Azure Blob Storage Connector
Data from Azure Blob Storage is a great source of information for Data Cloud, enabling informed decision-making and new insights. Bringing this data in with a connector unlocks a data source that may previously have been out of reach. Data Cloud supports various file and compression formats for your data stream sources.
Once setup is complete, the Microsoft Azure Blob Storage Connector appears as an available source when you configure a data stream. Before the connector can be selected, you’ll need to create an Azure container and blob, configure access to your Azure resources, and then connect to the Azure storage account from Data Cloud.
Let’s look at the steps needed.
How to set up and use the Microsoft Azure Blob Storage Connector
This section walks through the setup process and the considerations for a smooth integration.
Our example ingests a file containing a list of animals, their characteristics, and a Boolean value indicating whether each animal was adopted. With this data source in place, we can use artificial intelligence in Data Cloud to predict the chance of an animal being adopted in the future. This is an example of how Data Cloud unlocks your data sources so you can gain insights and take action when needed.
Step 1: Create a storage account
Open your Azure portal and create a storage account. For help creating a storage account, see Create a storage account. In this post, we’re calling our storage account datacloudstorage.
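If you prefer scripting to the portal, the same step can be done with the Azure SDK for Python. The sketch below is a minimal example, assuming the azure-mgmt-storage and azure-identity packages are installed; the subscription ID, resource group name, region, and SKU are placeholders or assumptions, not requirements of the connector.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountCreateParameters, Sku

# Authenticate with whatever credential is available locally (CLI login, environment variables, etc.)
credential = DefaultAzureCredential()
client = StorageManagementClient(credential, "<your-subscription-id>")

# Create the storage account used throughout this post
poller = client.storage_accounts.begin_create(
    resource_group_name="my-resource-group",  # hypothetical resource group
    account_name="datacloudstorage",
    parameters=StorageAccountCreateParameters(
        sku=Sku(name="Standard_LRS"),
        kind="StorageV2",
        location="eastus",
    ),
)
account = poller.result()
print(account.name, account.provisioning_state)
```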
Step 2: Create a container
Navigate to your new storage account in the Azure portal.
In the left menu for the storage account, scroll to the Data storage section, then select Containers.
Select the + Container button.
Type a name for your new container. In this example, we’ll use the container name animals. For more information about container and blob names, see Naming and referencing containers, blobs, and metadata.
Set the level of anonymous access to the container. The default level is Private (no anonymous access).
Select Create to create the container.
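The container can also be created programmatically. Here’s a minimal sketch using the azure-storage-blob package and the storage account’s access key (the key value is a placeholder):

```python
from azure.storage.blob import BlobServiceClient

# Point at the storage account created in Step 1; the key is a placeholder
service = BlobServiceClient(
    account_url="https://datacloudstorage.blob.core.windows.net",
    credential="<storage-account-access-key>",
)

# New containers are private (no anonymous access) by default
container = service.create_container("animals")
print("Created container:", container.container_name)
```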
Step 3: Upload a block blob
Block blobs consist of blocks of data assembled to make a blob. Most scenarios using Blob Storage employ block blobs. They are ideal for storing text and binary data in the cloud, like files, images, and videos. We’ll create our animal data using block blobs.
In the Azure portal, navigate to the container you created in the previous section.
Select the container to show a list of blobs it contains. The animals container is new, so it won’t yet contain any blobs.
Select the Upload button to open the upload panel and browse your local file system to find a file to upload as a block blob. We will upload the animal data into a new virtual folder called all-animals.
Select the Upload button to upload the blob.
Once uploaded, the all-animals.csv file is now available in the animals container in a virtual folder called all-animals.
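For completeness, the upload can also be scripted with azure-storage-blob. This is a minimal sketch, assuming the CSV file is in the current working directory; note that a “virtual folder” is simply a prefix in the blob name.

```python
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://datacloudstorage.blob.core.windows.net",
    credential="<storage-account-access-key>",  # placeholder
)

# The all-animals/ prefix acts as the virtual folder inside the animals container
blob_client = service.get_blob_client(container="animals", blob="all-animals/all-animals.csv")

# upload_blob creates a block blob by default
with open("all-animals.csv", "rb") as data:
    blob_client.upload_blob(data, overwrite=True)
```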
Step 4: Generate a shared access token
In the Azure portal, select Shared access token from the menu.
Create a Shared Access Signature (SAS) token with the following settings.
- Signing method: Account key
- Required permissions: Read, delete, list
- Expiry: Set an expiration date that matches your organization’s security policies. We recommend a date between one to three months.
- Allowed protocols: HTTPS only
Copy the Blob SAS token and Blob SAS URL as we need these in Data Cloud when we create the connector.
You must regenerate the SAS token before its expiration, or when there’s a change in the container URL or container path of the Azure Blob Storage Connector in Data Cloud. The connector doesn’t support Azure account or Azure file tokens.
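If you’d rather generate the SAS token in code than in the portal, a minimal sketch with azure-storage-blob looks like this (the 90-day expiry is an assumption within the one-to-three-month recommendation above, and the account key is a placeholder):

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import generate_container_sas, ContainerSasPermissions

container_url = "https://datacloudstorage.blob.core.windows.net/animals"

# Signed with the account key, scoped to read/delete/list, HTTPS only
sas_token = generate_container_sas(
    account_name="datacloudstorage",
    container_name="animals",
    account_key="<storage-account-access-key>",  # placeholder
    permission=ContainerSasPermissions(read=True, delete=True, list=True),
    expiry=datetime.now(timezone.utc) + timedelta(days=90),
    protocol="https",
)

print("Blob SAS token:", sas_token)
print("Blob SAS URL:", f"{container_url}?{sas_token}")
```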
Step 5: Create a Microsoft Azure Blob Storage Connector in Data Cloud
Now we have all the information we need to create the connection in Data Cloud.
Sign in to your Data Cloud instance and make sure you have Data Cloud Admin or Data Cloud Marketing Admin user permissions.
Choose Setup → Data Cloud Setup. Then choose Connectors and click New. Select the Microsoft Azure Blob Storage Connector and click Next.
Enter the following information on the Connectors page:
- Connection name: Animals
- Connection API name: Animals
- SAS token: Use the SAS token saved above in Step 4.
- Container URL: https://datacloudstorage.blob.core.windows.net/animals
- Container path: /all-animals/
Next, click on Test Connection to verify there are no errors, then click Save.
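To double-check these values outside of Data Cloud, you can also list the container’s contents locally with the same SAS token. This is an optional sanity check, not part of the connector setup; a minimal sketch using azure-storage-blob:

```python
from azure.storage.blob import ContainerClient

# The same container URL and SAS token entered on the Connectors page
container_url = "https://datacloudstorage.blob.core.windows.net/animals"
sas_token = "<blob-sas-token>"  # placeholder

container = ContainerClient.from_container_url(f"{container_url}?{sas_token}")

# List the blobs under the /all-animals/ container path used by the connector
for blob in container.list_blobs(name_starts_with="all-animals/"):
    print(blob.name, blob.size)
```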
Step 6: Ingest data from Azure Blob Storage in Data Cloud
Next, navigate to Data Cloud and click the Data Streams tab. Click New, select the Microsoft Azure Blob Storage data source, then click Next.
On the next screen, we can specify the file with our animal data. In this post, we called it all-animals.csv. Then click Next.
Since our animal data has no clear unique identifier, we’ll create one by clicking New Formula Field.
Create a new field called Animal Id of data type Text that uses the UUID() function to create a unique 36-character identifier, then click Save.
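The UUID() formula function returns a standard universally unique identifier. As a quick illustration of the format (this Python snippet is only for illustration, not part of the setup), a UUID rendered as text is 36 characters long, including four hyphens:

```python
import uuid

value = str(uuid.uuid4())
print(value)       # e.g. "3f2b8a1e-..." (random each time)
print(len(value))  # 36
```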
Next, change the Data Lake Object Label and API Name to Animal. Select Other as the category, then select Animal_Id as the Primary Key. Then click Next.
Finally, click Deploy.
Your data stream will now retrieve new records from the connected Azure source approximately every 15 minutes. Review the documentation to explore the guidelines and limits for APIs. As soon as your Last Run Status is Success, you can view the records ingested by looking at the data lake object in Data Cloud.
Using Data Explorer in Data Cloud, we can now explore the Animal data lake object that we created. This shows all the records ingested from the Azure Blob Storage connection.
Conclusion
In this blog post, we covered how to ingest data from Microsoft Azure Blob Storage into Data Cloud. If you want to take it a step further, watch our Mapping Data Streams video to learn how to map the ingested data in your data lake object to data model objects in Data Cloud.
By ingesting data from enterprise data sources, we can now leverage that data to gain insights. For this data set, we can use Einstein predictive modeling to identify the top predictors of animal adoption, not only to improve the chance of adoption but also to help manage the day-to-day running of an animal shelter.
Resources
- Documentation: Azure Blob Storage
- Documentation: Set Up an Azure Blob Storage Connection
- Video: Mapping Data Streams
- Trailhead: Build AI Models in Einstein Studio
About the author
Dave Norris is a Developer Advocate at Salesforce. He’s passionate about making technical subjects broadly accessible to a diverse audience. Dave has been with Salesforce for over a decade, has over 35 Salesforce and MuleSoft certifications, and became a Salesforce Certified Technical Architect in 2013.