Create a YouTube Unstructured Data Lake Object (UDLO)

Create an unstructured data lake object (UDLO) in Data 360 to ingest your organization’s YouTube content.

See Unstructured Data File Formats and Connectors for a list of supported file formats.

User Permissions Needed
To create a connection: System Admin profile or Data Cloud Architect permission set

Before you begin:

  • Make sure that you’ve set up a YouTube connection and you know the name of the YouTube connection.
  • Verify that you have a list of tags for all the videos that you want to ingest.

  1. From App Launcher, select Data Cloud.

  2. Select the Data Lake Objects tab and then select New.

  3. Select the From External Files tile, and select Next.

  4. From the New Data Lake Object screen, select the YouTube connector tile and select Next.

  5. From the Connection Details dropdown, select the YouTube connection you previously created. Data Cloud auto-populates the source based on the connection that you select.

  6. If you have playlists, specify which YouTube playlists to take videos from by entering the playlist names in a comma-separated list. Unless you apply filters, all videos from the specified playlists are ingested. If you have playlists and leave this field blank, all videos in all playlists are ingested.

  7. To ingest captions along with your videos, select the Include captions checkbox.

    Ingesting captions significantly increases your Google API quota consumption. All YouTube connectors that you create draw from the same pool of API credits, and reaching the quota limit causes the connector to fail. You can ingest about 25–30 videos with captions within a standard quota, although this range can vary. Optionally, you can increase your Google API quota for YouTube.

  8. Optionally, apply filters to limit ingestion to the videos that you want. When you use more than one filter, even more than one filter of the same type, each filter adds videos to or removes videos from the set that is ingested. If videos are missing from Data 360 after you apply filters, it’s usually because your filters are too restrictive. If you ingested videos that you didn’t intend to ingest, it’s usually because your filters are too broad.

    Apply any or all of the following filters:

    • Included Labels: Provide a comma-separated list of labels. Any video tagged with a provided label is ingested. If multiple labels are listed, all videos tagged with any of the labels are ingested. This field is case-sensitive. A misspelled label is ignored.
    • Excluded Labels: Provide a comma-separated list of labels. Any video tagged with a provided label isn't ingested. If multiple labels are listed, all videos tagged with any of the labels are excluded. This field is case-sensitive. A misspelled label is ignored.
    • Creation Date: Select a date from the calendar widget. Any video created on or after the provided date is ingested. Only one date can be used.
    • Last Update Date: Select a date from the calendar widget. Any video updated on or after the provided date is ingested. Only one date can be used.

  9. Select Next. By default, the connector runs every two hours. You can monitor sync status in Data Stream status.

  10. Add an Object Name and an Object API Name for the UDLO. See Data Lake Object Naming Standards. The Object API Name field autopopulates based on the object name. Make sure that the object API name is unique.

  11. In the Unstructured Data Model Object Mapping section, select New.

  12. From the Data Space dropdown, leave the selection as Default.

  13. For the UDMO mapping, enter an Object Name and an Object API Name. See Data Lake Object Naming Standards. Make sure that the object API name is unique. The field autopopulates based on the object name.

  14. Optionally, select the Enable Unstructured Content Harmonization with system defaults checkbox to turn on content harmonization for the UDMO. You can leave content harmonization turned off for now and turn it on later.
    If you use this feature, go to Feature Manager and turn on both content harmonization and rendering.

    Turning on content harmonization also turns on collection of content viewer engagement data.

  15. Select Next.

  16. In the Search Index Configuration section, leave the Enable Semantic Search with System Defaults checkbox selected. The system default settings automatically select text fields and apply a chunking strategy for each field. To create a search index configuration later, deselect the checkbox.

  17. Leave the remaining fields as-is to use the default settings, or change the Search Configuration details and objects as needed.

  18. Save your work.
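
The caption limit noted in step 7 can be turned into a rough planning estimate. The sketch below is only arithmetic on the 25–30 figure given above. The function name and the idea of a "quota window" (the period after which your Google API quota resets) are assumptions made for illustration, not part of Data 360 or the YouTube connector.

```python
import math


def quota_windows_needed(video_count: int, videos_per_window: int) -> int:
    """Estimate how many quota windows are needed to ingest captions.

    videos_per_window is an assumption you supply; the range quoted in
    step 7 is roughly 25 to 30 videos, and it can vary.
    """
    if video_count < 0:
        raise ValueError("video_count must not be negative")
    if videos_per_window <= 0:
        raise ValueError("videos_per_window must be positive")
    return math.ceil(video_count / videos_per_window)


# Hypothetical example: 120 videos with captions.
pessimistic = quota_windows_needed(120, 25)  # lower end of the quoted range
optimistic = quota_windows_needed(120, 30)   # upper end of the quoted range
print(f"Between {optimistic} and {pessimistic} quota windows")
```

If the estimate is larger than you can accept, the options described in the steps are to narrow the ingestion with filters or to increase your Google API quota for YouTube.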

The data stream ingests videos from YouTube into an unstructured data lake object (UDLO) and maps it to an unstructured data model object (UDMO). A search index is created from this UDMO, and you can use it to ground AI-generated responses.
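
To reason about why a filter set is too restrictive or too broad, it can help to model the filters from step 8 before you save them. The following is a minimal sketch, not the connector's implementation. The `Video` and `Filters` types are hypothetical, and the sketch assumes that a video must pass every filter that is set, which the documentation above doesn't state explicitly. Label matching is case-sensitive, as described for the label fields.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Video:
    # Hypothetical stand-in for the metadata of one YouTube video.
    title: str
    labels: set[str]
    created: date
    updated: date


@dataclass
class Filters:
    # One field per filter in step 8; empty or None means "not applied".
    included_labels: set[str] = field(default_factory=set)
    excluded_labels: set[str] = field(default_factory=set)
    created_on_or_after: Optional[date] = None
    updated_on_or_after: Optional[date] = None


def parse_labels(raw: str) -> set[str]:
    """Split a comma-separated field into labels, preserving case."""
    return {part.strip() for part in raw.split(",") if part.strip()}


def is_ingested(video: Video, f: Filters) -> bool:
    """Assumption: a video is ingested only if it passes every set filter."""
    if f.included_labels and not (video.labels & f.included_labels):
        return False  # none of the included labels match
    if video.labels & f.excluded_labels:
        return False  # at least one excluded label matches
    if f.created_on_or_after and video.created < f.created_on_or_after:
        return False
    if f.updated_on_or_after and video.updated < f.updated_on_or_after:
        return False
    return True


# Hypothetical sample data.
videos = [
    Video("Onboarding", {"Training"}, date(2024, 1, 10), date(2024, 3, 1)),
    Video("Old demo", {"Training", "Internal"}, date(2022, 5, 2), date(2022, 5, 2)),
    Video("Launch", {"training"}, date(2024, 2, 1), date(2024, 2, 1)),
]
filters = Filters(
    included_labels=parse_labels("Training, Webinar"),
    excluded_labels=parse_labels("Internal"),
    created_on_or_after=date(2023, 1, 1),
)
selected = [v.title for v in videos if is_ingested(v, filters)]
print(selected)
```

In this sample, "Old demo" is dropped by the excluded label and "Launch" is dropped because its label differs only in case, which is the kind of mismatch to check first when expected videos don't appear in Data 360.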

Next Step: YouTube Limitations