Web Content (Crawler) Connector
Use the Web Content connector to ingest your organization’s marketing, ecommerce, documentation, or other website content into Data Cloud. AI agents can use this content to provide reliable answers in their interactions with your users.
The Web Content connector provides two options for ingesting website data:
- Web Content (Sitemap): Ingest content through the website’s sitemap. Use this option if you have a sitemap.
- Web Content (Crawler): Ingest content by following links on the website’s pages, up to 4 levels deep. The crawl depth sets how many levels of links Data Cloud follows from the starting URL. For example, if you start crawling at your home page URL and set the crawl depth to 1, Data Cloud ingests the home page and the pages it links to directly. If you set the crawl depth to 2, Data Cloud ingests the home page, the pages it links to, and the pages those pages link to. Links that point outside the website’s domain aren’t ingested.
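
To estimate which pages fall within a given crawl depth, you can model the depth rule as a breadth-first walk over a link map. The following Python sketch is illustrative only and isn't how Data Cloud implements crawling. The function name `pages_in_scope`, the `SAMPLE_LINKS` map, and the example.com URLs are hypothetical. It also assumes that "same domain" means an exact host match and that the crawl depth is a value from 1 to 4; how the connector treats subdomains isn't covered here.

```python
from collections import deque
from urllib.parse import urlparse

MAX_DEPTH = 4  # the crawler follows up to 4 levels of links

# Hypothetical link map: each page URL maps to the URLs it links to.
SAMPLE_LINKS = {
    "https://example.com/": [
        "https://example.com/docs",
        "https://example.com/shop",
        "https://other.example.org/",  # different host, so it is skipped
    ],
    "https://example.com/docs": ["https://example.com/docs/setup"],
    "https://example.com/docs/setup": ["https://example.com/docs/setup/advanced"],
}


def pages_in_scope(start_url, links, depth):
    """Return the URLs within `depth` link levels of start_url, same host only."""
    # Assumption: valid depths are 1 through MAX_DEPTH.
    if not 1 <= depth <= MAX_DEPTH:
        raise ValueError(f"crawl depth must be between 1 and {MAX_DEPTH}")

    host = urlparse(start_url).netloc
    seen = {start_url}
    in_scope = [start_url]
    queue = deque([(start_url, 0)])  # (url, level); the start page is level 0

    while queue:
        url, level = queue.popleft()
        if level == depth:
            continue  # pages at the depth limit are kept, but their links aren't followed
        for target in links.get(url, []):
            # Skip pages already collected and links that leave the site.
            if target in seen or urlparse(target).netloc != host:
                continue
            seen.add(target)
            in_scope.append(target)
            queue.append((target, level + 1))

    return in_scope
```

With the sample map, a crawl depth of 1 returns the home page plus `/docs` and `/shop`. A crawl depth of 2 also returns `/docs/setup`. The link to `other.example.org` is never returned because it points to a different host.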
It’s your obligation to ensure that you have the rights to the data collected using this feature. Salesforce disclaims all liability with respect to such data collected.
- Unstructured
The direction of available connection methods.
Create a connection for ingesting data from external systems.
Supported connection method(s):
- Batch
Instructions for adding and configuring your connector.
Prepare Your Web Content Connection (Crawler)
Set Up a Web Content Connection (Crawler)
Create an Unstructured Data Connection from the Web Content (Crawler)
Web Content (Crawler) Connector Troubleshooting Guide