Initiate Text Extraction Action | Industries Common Resources Developer Guide

You can automate the Intelligent Form Reader’s text detection and extraction step using this invocable action.

Special Access Rules

This action is available in API version 58.0 and later for users with the AWSTextract1000LimitAddOn or IntelligentDocumentReaderAddOn license.

Supported REST HTTP Methods

URI: /services/data/vXX.X/actions/standard/initiateTextExtraction
Formats: JSON, XML
HTTP Methods: POST
Authentication: Authorization: Bearer token

Inputs

Input	Details
contentDocumentId	Type: string. Description: Required. The unique content document ID of the uploaded document on which to initiate text extraction. You can specify up to 20 content document IDs.
endPageIndex	Type: integer. Description: Optional. The page number up to which the text must be extracted. The default value is the last page number in the specified document.
ocrService	Type: picklist. Description: Optional. The name of the OCR service that extracts text from the document. Valid values are: `AMAZON_TEXTRACT` - Indicates AWS Document service. `AMAZON_TEXTRACT_ANALYZE_ID` - Indicates AWS Analyze ID service.
startPageIndex	Type: integer. Description: Optional. The page number at which to start text extraction. By default, the starting page number is 1. You can extract text from up to 20 pages in a specified document.
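
The input rules above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the Salesforce API: the helper name `build_input` is hypothetical, and it assumes that the 20-page limit applies to the inclusive range from startPageIndex to endPageIndex.

```python
from typing import Optional

# Assumption: the 20-page limit covers the inclusive start..end page range.
MAX_PAGES = 20
# Valid ocrService values as listed in the Inputs table.
VALID_OCR_SERVICES = {"AMAZON_TEXTRACT", "AMAZON_TEXTRACT_ANALYZE_ID"}


def build_input(
    content_document_id: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    ocr_service: Optional[str] = None,
) -> dict:
    """Build one entry for the "inputs" array (hypothetical helper)."""
    if not content_document_id:
        raise ValueError("contentDocumentId is required")
    item = {"contentDocumentId": content_document_id}

    if ocr_service is not None:
        if ocr_service not in VALID_OCR_SERVICES:
            raise ValueError(f"unsupported ocrService: {ocr_service}")
        item["ocrService"] = ocr_service

    if start_page is not None:
        if start_page < 1:
            raise ValueError("startPageIndex must be 1 or greater")
        item["startPageIndex"] = start_page

    if end_page is not None:
        # startPageIndex defaults to 1 when it is omitted.
        first = start_page if start_page is not None else 1
        if end_page < first:
            raise ValueError("endPageIndex must not precede startPageIndex")
        if end_page - first + 1 > MAX_PAGES:
            raise ValueError(f"page range exceeds {MAX_PAGES} pages")
        item["endPageIndex"] = end_page

    return item
```

When endPageIndex is omitted, the server defaults it to the last page, so this sketch cannot check the page count in that case.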

Outputs

Output	Details
ocrDocumentScanResultDetails	Type: string. Description: A comma-separated list containing an OcrDocumentScanResult ID and a page number for each extracted page of the specified document.

Example

Sample Request

{
   "inputs":[
      {
         "contentDocumentId":"069T10000004FnoIAE",
         "startPageIndex":1,
         "endPageIndex":20,
         "ocrService":"AMAZON_TEXTRACT"
      }
   ]
}
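
As an illustration, the sample request could be assembled in Python with the standard library alone. This is a sketch under stated assumptions: `build_request` is a hypothetical helper, the instance URL and access token are placeholders you supply, and the API version defaults to 58.0, the earliest version that supports this action.

```python
import json
import urllib.request


def build_request(instance_url, access_token, inputs, api_version="58.0"):
    """Prepare the POST request for the initiateTextExtraction action.

    instance_url and access_token are placeholders for your org's values.
    """
    url = (
        f"{instance_url.rstrip('/')}/services/data/v{api_version}"
        "/actions/standard/initiateTextExtraction"
    )
    # The action expects a JSON object with an "inputs" array.
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Usage (requires a real org and token, so it is left commented out):
# req = build_request(
#     "https://example.my.salesforce.com",  # hypothetical instance URL
#     "<access token>",
#     [{"contentDocumentId": "069T10000004FnoIAE",
#       "startPageIndex": 1, "endPageIndex": 20,
#       "ocrService": "AMAZON_TEXTRACT"}],
# )
# with urllib.request.urlopen(req) as resp:
#     results = json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before any network call is made.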

Sample Response

[
   {
      "actionName":"initiateTextExtraction",
      "errors":null,
      "isSuccess":true,
      "outputValues":{
         "ocrDocumentScanResultDetails":{
            "ocrDocumentScanResults":[
               {
                  "pageNumber":1,
                  "ocrDocumentScanResultId":"0ixT100000000bv"
               }
            ]
         }
      },
      "version":1
   }
]
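
A response of this shape can be reduced to page-number and scan-result-ID pairs. The sketch below assumes the nested object structure shown in the sample response; the Outputs table describes ocrDocumentScanResultDetails as a string, so a real response may need an extra parsing step. The function name `scan_results` is hypothetical.

```python
def scan_results(response):
    """Return (pageNumber, ocrDocumentScanResultId) pairs from a response.

    Assumes the structure of the sample response: a list of action results,
    each holding outputValues.ocrDocumentScanResultDetails as an object.
    """
    pairs = []
    for result in response:
        if not result.get("isSuccess"):
            # Surface the errors reported for a failed action result.
            raise RuntimeError(
                f"initiateTextExtraction failed: {result.get('errors')}"
            )
        details = result["outputValues"]["ocrDocumentScanResultDetails"]
        for page in details["ocrDocumentScanResults"]:
            pairs.append((page["pageNumber"], page["ocrDocumentScanResultId"]))
    return pairs


sample_response = [
    {
        "actionName": "initiateTextExtraction",
        "errors": None,
        "isSuccess": True,
        "outputValues": {
            "ocrDocumentScanResultDetails": {
                "ocrDocumentScanResults": [
                    {"pageNumber": 1, "ocrDocumentScanResultId": "0ixT100000000bv"}
                ]
            }
        },
        "version": 1,
    }
]

print(scan_results(sample_response))  # [(1, '0ixT100000000bv')]
```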