Create an Amazon S3 Data Stream in Data Cloud
Create a data stream in Data Cloud to start the data flow from the Amazon S3 source.
User Permissions Needed | |
---|---|
To create a data stream: | Data Cloud admin OR Data Cloud Marketing admin |
Before you begin:
- Make sure that the Amazon S3 connection is set up.
- Make sure that S3 Bucket permissions are set.
- Add the source IP addresses listed in the AWS ID IP Table section of the IP addresses topic to your allowlist.
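As a rough pre-check for the allowlist prerequisite above, the following sketch compares a list of required source addresses against an existing allowlist. The function name `missing_from_allowlist` is hypothetical, and the addresses are reserved documentation ranges, not the real Data Cloud addresses; take the actual values from the AWS ID IP Table section.

```python
import ipaddress


def missing_from_allowlist(required, allowlist):
    """Return the required addresses or ranges that no allowlist entry fully covers."""
    allowed = [ipaddress.ip_network(entry, strict=False) for entry in allowlist]
    missing = []
    for entry in required:
        network = ipaddress.ip_network(entry, strict=False)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        covered = any(
            network.version == candidate.version and network.subnet_of(candidate)
            for candidate in allowed
        )
        if not covered:
            missing.append(entry)
    return missing


# Placeholder values only (RFC 5737 documentation ranges), not real Data Cloud addresses.
required = ["192.0.2.10", "198.51.100.0/28", "203.0.113.5"]
allowlist = ["192.0.2.0/24", "198.51.100.0/24"]
missing = missing_from_allowlist(required, allowlist)
print(missing)
```

Any entry reported as missing would still need to be added to the allowlist before the data stream can reach the bucket.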
- In Data Cloud, on the Data Streams tab, click New.
  You can also use App Launcher to find and select Data Streams.
- Select Amazon S3.
- Select the applicable connector from the Connection dropdown.
- Complete file and source details, and click Next.

  Field Label | Description |
  ---|---|
  File Type | Select CSV or Parquet. |
  Import from Directory | Path name, or folder hierarchy, pointing to a file’s location. Place your source files directly in this directory, because the data stream can’t recognize files stored in nested subdirectories. |
  File Name | Name of the file to retrieve from the specified directory. If no file is specified, the first file found is selected. After you create a data stream, it retrieves all files found in the directory. Wildcards are also supported. For example, use `*abc*.csv` to retrieve all files containing “abc” in their name. Each time the stream runs, all files that match the wildcard are imported. |
  Source | A label indicating the external system from which the data is sourced. Multiple data streams can use the same label for Source. |

- Complete Object Details. You can create a data lake object (DLO) or use an existing DLO.
  If you create a DLO, refer to the naming standards. If you use an existing DLO, refer to Using existing data lake object to create a data stream, and familiarize yourself with the guardrails to consider when using an existing DLO.
- Review and optionally edit the fields identified in the table.
- Select a data stream category and primary key.
- Add new formula fields if needed. You can map the fields to an existing DLO field or create a DLO field.
- Click Next.
- From the Data Space dropdown, select the applicable data space or the default data space.
- Fill in the deployment details and click Deploy.
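The File Name and Import from Directory rules in the steps above can be approximated locally before you deploy. This is a minimal sketch, assuming the wildcard behaves like a shell-style glob (Python's `fnmatchcase`) and that matching is case-sensitive; Data Cloud's exact matching rules may differ. The function `select_keys` and the sample object keys are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def select_keys(keys, directory, pattern):
    """Return the object keys that sit directly in `directory` and match `pattern`."""
    # An empty directory means the bucket root, which PurePosixPath reports as ".".
    prefix = directory.strip("/") or "."
    selected = []
    for key in keys:
        path = PurePosixPath(key)
        # Files in nested subdirectories are skipped, mirroring the documented limitation.
        if str(path.parent) != prefix:
            continue
        if fnmatchcase(path.name, pattern):
            selected.append(key)
    return selected


# Hypothetical bucket contents.
keys = [
    "exports/daily/abc_2024.csv",
    "exports/daily/orders_abc.csv",
    "exports/daily/orders.csv",
    "exports/daily/archive/abc_old.csv",
]
matched = select_keys(keys, "exports/daily/", "*abc*.csv")
print(matched)
```

In this sample, the file under `archive/` is excluded even though its name matches, because it is stored in a nested subdirectory.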
Map your DLO to the data model to use the ingested data in segments, calculated insights, and other use cases.
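Before you select a primary key in the steps above, it can help to confirm that the candidate column is populated and unique in a sample of the source file. The sketch below assumes a CSV source with a header row and assumes that a usable primary key has no blank or repeated values; the function `check_primary_key`, the column names, and the sample data are hypothetical.

```python
import csv
import io


def check_primary_key(csv_text, key_column):
    """Return (blank_rows, duplicate_values) for the candidate key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if key_column not in (reader.fieldnames or []):
        raise ValueError(f"column {key_column!r} not found in header")
    seen = set()
    blank_rows = []
    duplicates = set()
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        value = (row[key_column] or "").strip()
        if not value:
            blank_rows.append(row_number)
        elif value in seen:
            duplicates.add(value)
        else:
            seen.add(value)
    return blank_rows, sorted(duplicates)


# Hypothetical sample file contents.
sample = (
    "customer_id,email\n"
    "1001,a@example.com\n"
    "1002,b@example.com\n"
    "1002,c@example.com\n"
    ",d@example.com\n"
)
blank_rows, duplicates = check_primary_key(sample, "customer_id")
print(blank_rows, duplicates)
```

A column that returns blank rows or duplicate values here is probably a poor primary key choice for the data stream.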