In today’s business landscape, a significant portion of critical enterprise data remains trapped in unstructured files, such as PDFs, images, and other document types. Salesforce Data Cloud’s newest capability, Document AI, emerges as a powerful solution to extract structured data from these unstructured sources, making it available for processing and retrieval within Salesforce.

In this blog post, we’ll discuss the benefits and key use cases of Document AI, and we’ll walk you through how to use it in three simple steps. 

Overview of Document AI

Document AI is Data Cloud’s native solution designed to ingest various sources of unstructured data and extract structured data from those sources. For instance, an image of an invoice can be converted into structured data within the Invoice data lake object (DLO).

There are a few ways you can access Document AI.

  • API modality: Suitable for real-time processing needs, requiring file data and extraction schema in the API call

You can invoke the Document AI API with a POST request. The request body carries the file data and the extraction schema, and the response returns the extracted structured data.

  • Batch modality: Processes a set of documents defined in an unstructured data model object (UDMO), persisting extracted data as newly created data lake objects (DLOs) based on the provided schema
  • RAG modality (future capability): Enables Document AI as a pre-processing step before indexing, significantly improving RAG accuracy
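
As a rough illustration of the API modality, the sketch below assembles a request body that carries both the file data and the extraction schema. The field names (`file`, `mimeType`, `data`, `schema`) and the sample schema are assumptions made for this example, not the documented Document AI request format, and the code only builds the JSON body without sending it anywhere.

```python
import base64
import json


def build_extraction_request(file_bytes: bytes, mime_type: str, schema: dict) -> str:
    """Return a JSON body holding the file data and the extraction schema.

    All field names here are hypothetical placeholders.
    """
    payload = {
        "file": {
            "mimeType": mime_type,
            # Binary file content is base64-encoded so it can travel inside JSON.
            "data": base64.b64encode(file_bytes).decode("ascii"),
        },
        "schema": schema,
    }
    return json.dumps(payload)


# Hypothetical extraction schema describing the fields we want back.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number", "total_amount"],
}

body = build_extraction_request(b"%PDF-1.7 sample bytes", "application/pdf", invoice_schema)
print(body)
```

In a real integration, this body would be sent with an authenticated POST request to the endpoint given in the Data Cloud documentation.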

How to work with Document AI

Document AI leverages the latest large language models (LLMs), like GPT-4 and Gemini, and supports various file input methods and file types, including PDF and image files.

Let’s now explore how to use Document AI in three simple steps.

Step 1: Create a document schema configuration

First, we’ll need to create a document schema configuration. This can be done in the following two ways:

  • With a source object: Define the schema from files in an unstructured data model object (UDMO), which holds all your unstructured data
  • Without a source object: Define the schema without a source object, then extend it via Flow, Apex, and Agentforce

Screenshot showing the two options for creating a document schema configuration for use with Document AI
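
To make the idea of a document schema more concrete, here is a minimal sketch of what an invoice schema might look like when written in a JSON Schema style. The field names (`invoice_number`, `line_items`, and so on) and the `object_schema` helper are hypothetical, and Document AI may accept only a subset of JSON Schema, so treat this as an illustration rather than a supported format.

```python
import json


def object_schema(fields: dict, required: list) -> dict:
    """Build a JSON-Schema-style object definition (hypothetical helper)."""
    return {"type": "object", "properties": fields, "required": required}


# Each invoice line is itself a small object nested inside the invoice.
line_item = object_schema(
    {
        "description": {"type": "string"},
        "quantity": {"type": "number"},
        "unit_price": {"type": "number"},
    },
    required=["description"],
)

invoice_schema = object_schema(
    {
        "invoice_number": {"type": "string"},
        "invoice_date": {"type": "string", "description": "Date printed on the invoice"},
        "total_amount": {"type": "number"},
        "line_items": {"type": "array", "items": line_item},
    },
    required=["invoice_number", "total_amount"],
)

print(json.dumps(invoice_schema, indent=2))
```

Short field descriptions like the one on `invoice_date` can give the model extra context about what to extract.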

Step 2: Test and validate your schema config 

Next, we’ll choose the type of unstructured data that we want to process. It could either be a PDF file or an image.

Screenshot showing the selection of a PDF file to process using Document AI
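
If you also want to check extracted output programmatically, a small validator can compare a result against the schema. The sketch below is one possible approach, assuming the extraction result arrives as a plain dictionary and the schema uses only the `object`, `array`, `string`, and `number` types; it is not part of Document AI itself.

```python
def validate(value, schema, path="$"):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    kind = schema.get("type")
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        # Report missing required fields first, then check the fields present.
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: missing required field")
        for name, sub_schema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], sub_schema, f"{path}.{name}"))
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for index, item in enumerate(value):
            errors.extend(validate(item, schema.get("items", {}), f"{path}[{index}]"))
    elif kind == "string":
        if not isinstance(value, str):
            errors.append(f"{path}: expected string")
    elif kind == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{path}: expected number")
    return errors


# Hypothetical schema and extraction results used only for this example.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
    },
    "required": ["invoice_number", "total_amount"],
}

good_result = {"invoice_number": "INV-1042", "total_amount": 1250.5}
bad_result = {"invoice_number": 1042}

print(validate(good_result, schema))
print(validate(bad_result, schema))
```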

Step 3: Use your document schema configuration

Once your schema configuration is created, you can use it to get structured responses in apps and agents through APIs, Flow, Apex, Agentforce, web interfaces, and so on.

Screenshot showing the details of a PDF invoice that has been processed using Document AI
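
Once a structured response comes back, client code typically maps it onto a typed record before using it. The following sketch assumes the response is a flat JSON object with `invoice_number` and `total_amount` keys; both the response shape and the `Invoice` class are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Invoice:
    invoice_number: str
    total_amount: float


def parse_invoice(response_text: str) -> Invoice:
    """Convert a JSON response string into a typed Invoice record."""
    data = json.loads(response_text)
    return Invoice(
        invoice_number=str(data["invoice_number"]),
        # Amounts may arrive as strings, so coerce them to float.
        total_amount=float(data["total_amount"]),
    )


# Hypothetical response text standing in for real extraction output.
sample_response = '{"invoice_number": "INV-1042", "total_amount": "1250.50"}'
invoice = parse_invoice(sample_response)
print(invoice)
```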

Benefits and use cases of Document AI

Document AI offers several benefits, including support for prompt engineering, pre-processing of documents, and seamless integration with other Data Cloud capabilities. It is designed to handle different types of pre-processing needed before calling the LLM, such as extracting images within PDFs separately and handling them via multi-modal processing.

Use cases for Document AI across Salesforce clouds

Document AI can be useful in a wide variety of use cases specific to Salesforce Cloud features and capabilities.

  • Sales Cloud: Ingest contract documents and link them to customer profiles, or ingest invoices to compare them against data extracted from related loan applications
  • Service Cloud: Ingest guest survey data, help service agents get information from knowledge to resolve customer problems, review contracts and agreements, and follow up on future opportunities
  • Employee Service: Ingest tax return documents and link them to employee records
  • Health Cloud: Extract data from patient lab reports, which can be handwritten or system-generated

Current limitations of Document AI and future enhancements 

While Document AI offers robust capabilities, it has certain limitations, such as context size limits (less than 20 MB), JSON schema restrictions, and accuracy expectations. Future enhancements are planned to address these limitations, including support for additional file types and larger file sizes.

Document AI currently supports PDF and image files, and additional file formats, such as DocX, will be supported in the future.
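
Given these limits, it can help to check files before submitting them. The sketch below is a minimal pre-flight check, assuming the 20 MB limit applies to the file size in bytes and that the file type can be inferred from the file name; the `preflight` function and its messages are illustrative only.

```python
import mimetypes

# Assumption: "less than 20 MB" is interpreted as fewer than 20 * 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024


def preflight(filename: str, size_bytes: int) -> list:
    """Return a list of problems that would likely block processing."""
    problems = []
    mime_type, _ = mimetypes.guess_type(filename)
    is_pdf = mime_type == "application/pdf"
    is_image = (mime_type or "").startswith("image/")
    if not (is_pdf or is_image):
        problems.append(f"unsupported file type: {mime_type or 'unknown'}")
    if size_bytes >= MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; expected fewer than {MAX_BYTES}")
    return problems


print(preflight("invoice.pdf", 1024))
print(preflight("notes.docx", 1024))
print(preflight("big-scan.png", 25 * 1024 * 1024))
```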

Conclusion

Data Cloud’s Document AI will revolutionize how businesses handle unstructured data within Salesforce. Organizations can leverage this capability to streamline their data processing and gain valuable insights from previously untapped data sources. By understanding the features, benefits, and applications of Document AI, businesses can unlock new potential for data-driven decision-making and process automation.

About the author

Akshata Sawant is a Senior Developer Advocate at Salesforce and co-author of a book titled “MuleSoft for Salesforce Developers,” published by Packt Publication. For a more in-depth look at Akshata’s accomplishments, visit her LinkedIn profile.