MuleSoft Direct Connector Reference (Beta)

The MuleSoft Direct connector supports various external sources and file formats for ingestion into Data Cloud.

The external sources are provided as MuleSoft application assets published to MuleSoft Anypoint Exchange. Each external source supports these file formats for ingestion.

Source | Supported Content Types | Ingested Format
Confluence | Pages | HTML
Google Drive | TXT, PDF, HTML, ASPX, Log, Google Docs | Same as source; Google Docs are converted to PDF
SharePoint | TXT, PDF, HTML, ASPX, Log | Same as source
Sitemap | XML | Same as source
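
The format rules above can be restated as a small lookup. This is only an illustrative sketch: the dictionary and function names are hypothetical and are not part of the connector or of Data Cloud.

```python
# Supported content types per source, as listed in the table above.
SUPPORTED = {
    "Confluence": {"Pages"},
    "Google Drive": {"TXT", "PDF", "HTML", "ASPX", "Log", "Google Docs"},
    "SharePoint": {"TXT", "PDF", "HTML", "ASPX", "Log"},
    "Sitemap": {"XML"},
}

# The only cases where the ingested format differs from the source format.
CONVERTED = {
    ("Confluence", "Pages"): "HTML",
    ("Google Drive", "Google Docs"): "PDF",
}


def ingested_format(source: str, content_type: str) -> str:
    """Return the format a supported content type is ingested as."""
    if content_type not in SUPPORTED[source]:
        raise ValueError(f"{content_type!r} is not listed for {source}")
    # Anything without an explicit conversion is ingested as-is.
    return CONVERTED.get((source, content_type), content_type)


if __name__ == "__main__":
    print(ingested_format("Google Drive", "Google Docs"))
    print(ingested_format("SharePoint", "ASPX"))
```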

Connecting to each external source uses these credentials.

Confluence
  • Your Confluence space ID
  • Hostname of your Confluence instance
  • Your Confluence username and an API token to authenticate
Google Drive
  • Your Google Drive ID
  • An OAuth client ID set for a web application
SharePoint
  • Your SharePoint Site ID
  • Microsoft Graph Hostname
  • Microsoft Tenant ID
  • Microsoft Entra Client ID and Secret
Sitemap
  • Access only to public sitemaps that don't require authentication
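
Before setting up a connection, it can help to check that every listed credential is on hand. The sketch below is a hypothetical checklist: the field names are invented labels for the items above, not the connector's actual parameter names.

```python
# Hypothetical field names standing in for the credentials listed above.
REQUIRED_CREDENTIALS = {
    "Confluence": ["space_id", "hostname", "username", "api_token"],
    "Google Drive": ["drive_id", "oauth_client_id"],
    "SharePoint": ["site_id", "graph_hostname", "tenant_id",
                   "client_id", "client_secret"],
    "Sitemap": [],  # public sitemaps only, so nothing to collect
}


def missing_credentials(source: str, provided: dict[str, str]) -> list[str]:
    """List the fields for `source` that are absent or empty in `provided`."""
    return [name for name in REQUIRED_CREDENTIALS[source]
            if not provided.get(name)]


if __name__ == "__main__":
    # example.atlassian.net is a placeholder hostname.
    partial = {"space_id": "123", "hostname": "example.atlassian.net"}
    print(missing_credentials("Confluence", partial))
```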

When you create unstructured data lake objects (UDLOs), you can use these filters to define the files and content you want to ingest.

Source | File Name Pattern (Input) | Expected Output
Confluence, Google Drive | * | All files
Confluence, Google Drive | *.pdf | Only PDF files
Confluence, Google Drive | [A specific word] | Files matching [a specific word] exactly
Confluence, Google Drive | [A specific word]* | Files starting with [a specific word]
Confluence, Google Drive | *[A specific word]* | Files containing [a specific word]
Sitemap | * | Data from all URLs in Sitemap
Sitemap | /amf.* | Data from all URLs in Sitemap with a path starting with /amf.
Sitemap | [A specific word] | Data from all URLs in Sitemap with [a specific word]
Sitemap | [A specific word]* | Data from all URLs in Sitemap with a path starting with [a specific word]
Sitemap | *[A specific word]* | Data from all URLs in Sitemap with a path containing [a specific word]
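
The file name filters read like shell-style wildcards, where * stands for any run of characters. The sketch below previews which names such a pattern would select, using Python's standard fnmatch module as a stand-in. It assumes the connector follows the same wildcard semantics, which this reference does not state; the file names are made up, and `*report*` is the usual wildcard way to express "contains".

```python
from fnmatch import fnmatchcase

# Hypothetical file names, for illustration only.
FILES = [
    "report.pdf",
    "report-2024.txt",
    "annual-report.html",
    "notes.pdf",
    "report",
]


def select(pattern: str, names: list[str]) -> list[str]:
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for pattern in ["*", "*.pdf", "report", "report*", "*report*"]:
        print(f"{pattern!r:12} -> {select(pattern, FILES)}")
```

Trying a pattern against a sample listing this way shows how an exact word, a trailing wildcard, and a surrounding pair of wildcards differ before you commit a filter to a UDLO.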