DocumentExtractionDef

Represents the metadata definition used to manage and execute document data extraction processes.

Important

Where possible, we changed noninclusive terms to align with our company value of Equality. We maintained certain terms to avoid any effect on customer implementations.

Parent Type

This type extends the Metadata metadata type and inherits its fullName field.

File Suffix and Directory Location

DocumentExtractionDef values are stored in the developer_name.documentExtractionDef file in the documentExtractionDefs directory.

Version

DocumentExtractionDef is available in API version 66.0 and later.

Fields

Field Name Description
batchProcessingStartTime
Field Type
dateTime
Description
The date and time when batch processing starts. Files with earlier timestamps are excluded during processing.
batchUser
Field Type
User
Description
The user associated with the batch processing.
description
Field Type
string
Description
The description of the document extraction definition.
documentExtractionDefVer
Field Type
DocumentExtractionDefVer
Description
An array of version configurations associated with this document extraction definition.
documentExtrctDefProcFldr
Field Type
DocumentExtrctDefProcFldr
Description
The source or destination folders used for document intake and processing.
isActive
Field Type
boolean
Description
Indicates whether the document extraction definition is currently active (true) or inactive (false).
isReviewRequired
Field Type
boolean
Description
Indicates whether the extracted data requires review before it's saved (true) or not (false).
label
Field Type
string
Description
The user-facing label of the document extraction definition.
status
Field Type
DocumentExtractionDefStatus (enumeration of type string)
Description
The status of the document extraction definition.

Possible values are:

  • BatchDefined
  • BatchActive
type
Field Type
DocumentExtractionDefType (enumeration of type string)
Description
The type of document extraction definition, such as standard (pre-defined) or custom.

Possible values are:

  • Standard
  • Custom
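
As a minimal sketch of how the top-level fields above might appear together in a component file, consider the following. All values are hypothetical and chosen only for illustration, and the dateTime value assumes the ISO 8601 format commonly used in metadata files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DocumentExtractionDef xmlns="http://soap.sforce.com/2006/04/metadata">
    <!-- Hypothetical values for illustration only. -->
    <!-- Assumed format: files with earlier timestamps are excluded from batch processing. -->
    <batchProcessingStartTime>2026-01-15T08:00:00.000Z</batchProcessingStartTime>
    <description>Extracts header and line-item data from supplier invoices.</description>
    <isActive>true</isActive>
    <isReviewRequired>true</isReviewRequired>
    <label>Supplier Invoice Extraction</label>
    <!-- status and type use the enumeration values listed above. -->
    <status>BatchDefined</status>
    <type>Custom</type>
</DocumentExtractionDef>
```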

DocumentExtractionDefVer

Field Name Description
confidenceThreshold
Field Type
int
Description
The minimum confidence score, expressed as a percentage, that an extracted field value must meet to be considered accurate.
contextMappingConfig
Field Type
string
Description
A JSON string that defines the context mapping configuration for this version.
description
Field Type
string
Description
A description of the purpose of this version or the changes that it introduces.
documentExtractionDefStep
Field Type
DocumentExtractionDefStep
Description
A collection of steps that define the actions performed during extraction at run time.
isActive
Field Type
boolean
Description
Indicates whether this version is active and used at run time (true) or not (false).
largeLanguageModel
Field Type
string
Description
The name or version of the large language model (LLM) used for extraction.
llmInputPrompt
Field Type
string
Description
The instruction or system prompt template supplied to the large language model.
versionNumber
Field Type
int
Description
The unique, sequential version number of this version.
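
The following sketch shows how a single version might be nested inside a definition, using only the DocumentExtractionDefVer fields described above. The model name, prompt text, and threshold are hypothetical placeholders, not documented values. contextMappingConfig is left out because its JSON structure isn't described here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DocumentExtractionDef xmlns="http://soap.sforce.com/2006/04/metadata">
    <label>Supplier Invoice Extraction</label>
    <documentExtractionDefVer>
        <!-- Hypothetical threshold: a percentage expressed as an int. -->
        <confidenceThreshold>80</confidenceThreshold>
        <description>Initial version.</description>
        <isActive>true</isActive>
        <!-- Placeholder only; substitute a model name that is valid in your org. -->
        <largeLanguageModel>EXAMPLE_MODEL_NAME</largeLanguageModel>
        <llmInputPrompt>Extract the invoice number, invoice date, and total amount from the document.</llmInputPrompt>
        <versionNumber>1</versionNumber>
    </documentExtractionDefVer>
</DocumentExtractionDef>
```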

DocumentExtractionDefStep

Field Name Description
executionProcessReference
Field Type
string
Description
The unique developer name of the automation process that runs this step.
stepNumber
Field Type
int
Description
The position of this step in the execution order.
stepType
Field Type
DocExtractReqStepType (enumeration of type string)
Description
The type of action that this step performs.

Possible values are:

  • Extract
  • Transform
  • PostTransformAction
  • Save
  • PostSaveAction
  • Validation
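
As an illustration of how steps might be sequenced within a version, the sketch below defines two steps, assuming that stepNumber determines the order in which they run. The executionProcessReference values are hypothetical names, and the choice of an Extract step followed by a Save step is an example rather than a required sequence.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DocumentExtractionDef xmlns="http://soap.sforce.com/2006/04/metadata">
    <label>Supplier Invoice Extraction</label>
    <documentExtractionDefVer>
        <versionNumber>1</versionNumber>
        <!-- Step 1: hypothetical process reference for the extraction action. -->
        <documentExtractionDefStep>
            <executionProcessReference>Example_Extract_Process</executionProcessReference>
            <stepNumber>1</stepNumber>
            <stepType>Extract</stepType>
        </documentExtractionDefStep>
        <!-- Step 2: hypothetical process reference for the save action. -->
        <documentExtractionDefStep>
            <executionProcessReference>Example_Save_Process</executionProcessReference>
            <stepNumber>2</stepNumber>
            <stepType>Save</stepType>
        </documentExtractionDefStep>
    </documentExtractionDefVer>
</DocumentExtractionDef>
```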

DocumentExtrctDefProcFldr

Field Name Description
folderPath
Field Type
string
Description
The absolute path or URL of the folder that contains the documents to process.
folderType
Field Type
DocExtractDefFldrType (enumeration of type string)
Description
The type of storage system that hosts the folder.

Possible values are:

  • ContentFolder
  • Amazon
  • AzureBlob
lastProcessedTime
Field Type
dateTime
Description
The date and time when the folder was last processed.
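
A processing folder entry might look like the following sketch. The folder path is a hypothetical placeholder, and lastProcessedTime is left out of the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<DocumentExtractionDef xmlns="http://soap.sforce.com/2006/04/metadata">
    <label>Supplier Invoice Extraction</label>
    <documentExtrctDefProcFldr>
        <!-- Hypothetical path; the expected format can depend on folderType. -->
        <folderPath>EXAMPLE/invoices/inbound</folderPath>
        <folderType>ContentFolder</folderType>
    </documentExtrctDefProcFldr>
</DocumentExtractionDef>
```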

Declarative Metadata Sample Definition

The following is an example of a DocumentExtractionDef component.

<DocumentExtractionDef xmlns="http://soap.sforce.com/2006/04/metadata">
    <label>Standard Invoice Extraction</label>
    <isActive>true</isActive>
</DocumentExtractionDef>

The following is an example package.xml that references the previous definition.

<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>*</members>
        <name>DocumentExtractionDef</name>
    </types>
    <version>66.0</version>
</Package>