Create a YouTube Unstructured Data Stream (Beta)
Create an unstructured data lake object (UDLO) to ingest your organization’s YouTube content into Data Cloud.
See the Unstructured Data Reference for a list of supported file formats.
This feature is a Beta Service. A customer may opt to try a Beta Service in its sole discretion. Any use of the Beta Service is subject to the applicable Beta Services Terms provided at Agreements and Terms. If you have questions or feedback about this Beta Service, contact the Data Cloud Connector team at datacloud-connectors-beta@salesforce.com.
User Permissions Needed | |
---|---|
To create a connection: | System Admin profile or Data Cloud Architect permission set |
Before you begin:
- Make sure that you’ve set up a YouTube connection and you know the name of the YouTube connection.
- Verify you have a list of tags for all the videos you want to ingest.
-
From App Launcher, select Data Cloud.
-
Select the Data Lake Objects tab and then select New.
-
Select the From External Files tile, and select Next.
-
From the New Data Lake Object screen, select the YouTube connector tile and select Next.
-
From the Connection Details dropdown, select the YouTube connection you previously created. Data Cloud auto-populates the source based on the connection that you select.
-
If you have playlists, specify which YouTube playlists to ingest videos from by entering the playlist names as a comma-separated list. Unless you apply filters, all videos in the specified playlists are ingested. If you have playlists and leave this field blank, all videos in all playlists are ingested.
-
To ingest captions along with your videos, check the Include captions checkbox.
Ingesting captions significantly increases your Google API credit consumption. All YouTube connectors that you create share the same pool of API credits. Reaching the API quota limit causes the connector to fail. You can ingest about 25–30 videos with captions within a standard quota, though this range can vary. Optionally, you can increase your Google API quota for YouTube.
-
Optionally, apply filters to limit the ingestion to the videos you want.
When you use more than one filter (even more than one filter of the same type), each filter further narrows or widens the set of ingested videos. If your videos aren't in Data Cloud after you apply filters, it’s usually because your filters are too restrictive. If you ingested videos that you didn’t intend to ingest, it’s usually because your filters are too broad.
Apply any or all of the following filters:
- Included Labels: Provide a comma-separated list of labels. Any video tagged with a provided label is ingested. If you list multiple labels, videos tagged with any of them are ingested. This field is case-sensitive. If you misspell a label, it is ignored.
- Excluded Labels: Provide a comma-separated list of labels. Any video tagged with a provided label isn't ingested. If you list multiple labels, videos tagged with any of them are excluded. This field is case-sensitive. If you misspell a label, it is ignored.
- Creation Date: Select a date from the calendar widget. Any video created on or after the provided date is ingested. Only one date can be used.
- Last Update Date: Select a date from the calendar widget. Any video updated on or after the provided date is ingested. Only one date can be used.
-
Select Next. By default, the connector runs every two hours. You can monitor sync status in Data Stream status.
-
To set up your unstructured data lake object (UDLO) and its associated unstructured data model object (UDMO), add an Object Name and an Object API Name for the UDLO, using Data Lake Object Naming Standards.
-
Map the UDLO to a UDMO.
- To create a new UDMO, click New. Then, from the Data Space dropdown list, select a data space in which to create it. Add an Object Name and an Object API Name for the UDMO, using Data Lake Object Naming Standards.
- To use an existing UDMO, click Existing, and select a data space and a UDMO from the list of existing UDMOs.
-
Optionally, leave the checkbox checked to create a search index now for the UDMO using the system defaults. Checking this checkbox automatically selects text fields and a chunking strategy for each field. You can deselect the checkbox and create a search index configuration later.
-
Select Next, or if you created a search index configuration, review the details, and Save your work.
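The filter rules from the filtering step can be sketched in code, which may help when predicting which videos a given configuration ingests. This is an illustrative sketch only, not the connector's implementation. The `Video` and `Filters` types and the `parse_labels` and `is_ingested` functions are hypothetical names. It also assumes that a video must pass every configured filter, that an excluded label overrides an included one, and that whitespace around labels is trimmed; none of these is confirmed by this guide.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Video:
    """Hypothetical stand-in for a YouTube video's filterable metadata."""
    title: str
    labels: set[str]
    created: date
    updated: date


@dataclass
class Filters:
    """Hypothetical mirror of the four filter fields described in this guide."""
    included_labels: set[str] = field(default_factory=set)
    excluded_labels: set[str] = field(default_factory=set)
    created_on_or_after: Optional[date] = None
    updated_on_or_after: Optional[date] = None


def parse_labels(raw: str) -> set[str]:
    """Split a comma-separated field into labels.

    Case is preserved because the label fields are case-sensitive.
    Trimming surrounding whitespace is an assumption.
    """
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_ingested(video: Video, filters: Filters) -> bool:
    """Return True if the video passes every configured filter (assumed AND)."""
    # Included Labels: the video needs at least one of the listed labels.
    if filters.included_labels and not (video.labels & filters.included_labels):
        return False
    # Excluded Labels: any one listed label is enough to exclude the video.
    if video.labels & filters.excluded_labels:
        return False
    # Creation Date: created on or after the selected date.
    if filters.created_on_or_after and video.created < filters.created_on_or_after:
        return False
    # Last Update Date: updated on or after the selected date.
    if filters.updated_on_or_after and video.updated < filters.updated_on_or_after:
        return False
    return True
```

Under these assumptions, a video labeled `demo` doesn't match an Included Labels entry of `Demo`, which is one way a filter set ends up more restrictive than intended.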
The data stream ingests videos from YouTube into an unstructured data lake object (UDLO), which is mapped to an unstructured data model object (UDMO). A search index is created from this UDMO, and you can use it to ground AI-generated responses.
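Because captions are limited to roughly 25–30 videos per standard quota, it can help to estimate how many quota periods a caption-enabled ingestion needs before you choose playlists and filters. The sketch below is plain arithmetic on that stated range. The function name `quota_periods_needed` is hypothetical, and the sketch assumes that ingestion can continue in a later quota period, which this guide doesn't confirm. Because all YouTube connectors share one pool of credits, the count should cover the videos across every connector.

```python
import math

# Range stated in this guide: about 25-30 videos with captions per standard quota.
VIDEOS_PER_QUOTA_LOW = 25
VIDEOS_PER_QUOTA_HIGH = 30


def quota_periods_needed(video_count: int) -> tuple[int, int]:
    """Return (best_case, worst_case) quota periods for video_count videos.

    best_case uses the high end of the range (30 videos per quota);
    worst_case uses the low end (25 videos per quota).
    """
    if video_count < 0:
        raise ValueError("video_count must be non-negative")
    best_case = math.ceil(video_count / VIDEOS_PER_QUOTA_HIGH)
    worst_case = math.ceil(video_count / VIDEOS_PER_QUOTA_LOW)
    return best_case, worst_case


if __name__ == "__main__":
    best, worst = quota_periods_needed(120)
    print(f"120 videos with captions: {best} to {worst} quota periods")
```

If the estimate is larger than you can accept, narrowing the playlists or filters, or requesting a higher Google API quota, reduces the number of caption-enabled videos per period.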
Next Step: YouTube Limitations