DocumentScanner Data Types

DocumentScanner defines and uses several data types.

An object representing a scanned document. Returned as the result of a successful scan operation.

The following image illustrates how DocumentScanner might scan and structure a document (in this case, some sample text). There are three TextBlock elements, each outlined in purple, and each TextBlock is further broken down by line (TextLine) and word or character (TextElement).

[Image: A representation of the data model used to return scanned document results.]

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| imageBytes | String | A string containing the base64 image data of the scanned document. Only provided when returnImageBytes is set to true in your DocumentScannerOptions configuration object. | "2432BcYPJhSkzZjHS-Uiz1g8iZdQnRHtPnFKbMltJEc" |
| text | String | A string value providing the recognized text from the scanned document. | "This is a demonstration of how the text in a document is detected and broken down to TextBlock objects." |
| blocks | TextBlock[] | An array of TextBlock objects that represent a structured text result that is visually aligned with the corresponding image. See TextBlock for details of this structured text data. | blocks[0] = [ [ ["This"], ["is"], ["a"], ["demonstration"], ["of"], ["how"], ["the"], ["text"], ["in"], ["a"], ["document"], ], [ ["is"], ["detected"], ["and"], ["broken"], ["down"], ["to"], ["TextBlock"], ["objects."], ], ] |
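
The nesting described in the table above can be sketched as TypeScript types, together with a small traversal. This is an illustration only: the property names come from the tables on this page, while the names ScannedRegion, ScannedDocument, and collectWords are hypothetical and not part of the plugin.

```typescript
interface Point { x: number; y: number }
interface Frame { x: number; y: number; width: number; height: number }

// Properties shared by TextBlock, TextLine, and TextElement.
// "ScannedRegion" is a hypothetical helper name, not part of the plugin.
interface ScannedRegion {
  text: string;
  recognizedLangCodes: string[];
  frame: Frame;
  cornerPoints: Point[];
}

type TextElement = ScannedRegion;
interface TextLine extends ScannedRegion { elements: TextElement[] }
interface TextBlock extends ScannedRegion { lines: TextLine[] }

// "ScannedDocument" is a hypothetical name for the scan result object.
interface ScannedDocument {
  imageBytes?: string; // only provided when returnImageBytes is true
  text: string;
  blocks: TextBlock[];
}

// Walk blocks -> lines -> elements and collect every word in order.
function collectWords(doc: ScannedDocument): string[] {
  const words: string[] = [];
  for (const block of doc.blocks) {
    for (const line of block.lines) {
      for (const element of line.elements) {
        words.push(element.text);
      }
    }
  }
  return words;
}
```

A caller could use a traversal like this to build word-level overlays, because each element carries its own frame and cornerPoints.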

A TextBlock is an object representing a contiguous section of the scanned text. Text that is visually close together is grouped into a block of text. A document is made up of one or more blocks, and each block can be further broken down into smaller text elements: TextLine (a single line of text in a visually aligned run of text) and TextElement (an individual word or glyph).

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| text | String | A string containing the text content of the block. | "This is a demonstration of how the text in a document is detected and broken down to TextBlock objects." |
| lines | TextLine[] | An array of TextLine objects, each of which represents a visually aligned line of text within the TextBlock. | lines = blocks[0].lines; lines[0] = [ ["This"], ["is"], ["a"], ["demonstration"], ["of"], ["how"], ["the"], ["text"], ["in"], ["a"], ["document"], ] |
| recognizedLangCodes | String[] | The BCP-47 language code values for the languages detected in the recognized text. | ["en", "ja"] |
| frame | Frame | An object containing the coordinates (position and size) that represent the bounding rectangle in the scanned image that contains the TextBlock. | { x: 100, y: 100, width: 650, height: 200 } |
| cornerPoints | Point[] | An array of Point objects that define a closed shape within the scanned image that contains the TextBlock. | [ { x: 100, y: 100 }, { x: 750, y: 100 }, { x: 750, y: 300 }, { x: 100, y: 300 } ] |

frame and cornerPoints both represent shapes that enclose the TextBlock in the scanned document image. You can use these regions to build user interfaces for interacting with the scan results and processing them further. For example, you can assign different TextBlocks or TextLines to different form fields, such as when scanning a business card into name, company, and contact fields.
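
As a rough sketch of the business card idea, the following orders blocks by the position of their frame and maps them to form fields. The CardFields shape, the assignCardFields helper, and the top-to-bottom ordering rule are all assumptions made for illustration; real cards vary, so an app would normally let the user correct the assignment.

```typescript
interface Frame { x: number; y: number; width: number; height: number }

// Only the properties this sketch needs from a TextBlock or TextLine.
interface BlockLike { text: string; frame: Frame }

// Hypothetical form model for a business card.
interface CardFields { name: string; company: string; contact: string }

// Assumption: the card lists name, then company, then contact details,
// from top to bottom. Blocks at the same height are ordered left to right.
function assignCardFields(blocks: BlockLike[]): CardFields {
  const ordered = blocks
    .slice()
    .sort((a, b) => a.frame.y - b.frame.y || a.frame.x - b.frame.x);
  const [first, second, ...rest] = ordered;
  return {
    name: first?.text ?? "",
    company: second?.text ?? "",
    contact: rest.map((b) => b.text).join("\n"),
  };
}
```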

The difference between the two is that frame is the smallest rectangle that fully encloses the scanned text within the image, while cornerPoints defines a shape that isn't necessarily rectangular and that encloses the scanned text more tightly. Use frame for approximate shapes, and cornerPoints when you want to be as close as possible to the scanned text.

The preceding example values of frame and cornerPoints represent the same region of a scanned document image. That’s unlikely to happen in a real-world scan, where it’s hard to hold a camera perfectly aligned with the document. The nice, round numbers for the size of the region are similarly unlikely.
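
To make the relationship between the two shapes concrete, this sketch computes the smallest axis-aligned rectangle around a set of corner points. The boundingFrame helper is hypothetical, and the skewed coordinates are invented sample values; this page doesn't specify how the plugin itself derives frame.

```typescript
interface Point { x: number; y: number }
interface Frame { x: number; y: number; width: number; height: number }

// Smallest axis-aligned rectangle that contains every corner point.
function boundingFrame(cornerPoints: Point[]): Frame {
  if (cornerPoints.length === 0) {
    throw new Error("cornerPoints must not be empty");
  }
  const xs = cornerPoints.map((p) => p.x);
  const ys = cornerPoints.map((p) => p.y);
  const minX = Math.min(...xs);
  const minY = Math.min(...ys);
  return {
    x: minX,
    y: minY,
    width: Math.max(...xs) - minX,
    height: Math.max(...ys) - minY,
  };
}

// The perfectly aligned example values from the TextBlock table.
const aligned = boundingFrame([
  { x: 100, y: 100 }, { x: 750, y: 100 }, { x: 750, y: 300 }, { x: 100, y: 300 },
]);
console.log(aligned); // { x: 100, y: 100, width: 650, height: 200 }

// A slightly rotated region (invented values): the rectangle grows to cover it.
const skewed = boundingFrame([
  { x: 110, y: 95 }, { x: 755, y: 120 }, { x: 745, y: 318 }, { x: 100, y: 293 },
]);
console.log(skewed); // { x: 100, y: 95, width: 655, height: 223 }
```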

A TextLine is an object representing a single line of scanned text.

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| text | String | A string containing the text content of the line. | "This is a demonstration of how the text in a document" |
| elements | TextElement[] | An array of TextElement objects, each of which represents a word or glyph within the TextLine. | [ ["This"], ["is"], ["a"], ["demonstration"], ["of"], ["how"], ["the"], ["text"], ["in"], ["a"], ["document"], ] |
| recognizedLangCodes | String[] | The BCP-47 language code values for the languages detected in the recognized text. | ["en"] |
| frame | Frame | An object containing the coordinates (position and size) that represent the bounding rectangle in the scanned image that contains the TextLine. | { x: 100, y: 100, width: 650, height: 100 } |
| cornerPoints | Point[] | An array of Point objects that define a closed shape within the scanned image that contains the TextLine. | [ { x: 100, y: 100 }, { x: 750, y: 100 }, { x: 750, y: 200 }, { x: 100, y: 200 } ] |

A TextElement is an object representing a single word, individual character, or glyph.

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| text | String | A string containing the text content of the word or character. | "This" |
| recognizedLangCodes | String[] | The BCP-47 language code values for the languages detected in the recognized text. | ["en"] |
| frame | Frame | An object containing the coordinates (position and size) that represent the bounding rectangle in the scanned image that contains the TextElement. | { x: 100, y: 100, width: 45, height: 100 } |
| cornerPoints | Point[] | An array of Point objects that define a closed shape within the scanned image that contains the TextElement. | [ { x: 100, y: 100 }, { x: 145, y: 100 }, { x: 145, y: 200 }, { x: 100, y: 200 } ] |

A Frame is an object representing a bounding rectangle. When used in DocumentScanner, the Frame is the smallest rectangle that fully encloses a region of scanned text for a TextBlock, TextLine, or TextElement.

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| x | Number | The X coordinate of the top-left of the rectangle, in pixels, within the coordinate system of the scanned image. | 100 |
| y | Number | The Y coordinate of the top-left of the rectangle, in pixels, within the coordinate system of the scanned image. | 100 |
| width | Number | The width of the rectangle, in pixels. | 650 |
| height | Number | The height of the rectangle, in pixels. | 200 |
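
Because Frame values are expressed in the pixel coordinates of the scanned image, drawing an overlay on a resized copy of that image requires scaling them first. The following is a minimal sketch of that conversion. The scaleFrame helper, the Size type, and the image dimensions are hypothetical, and the sketch assumes the image is displayed without cropping or letterboxing.

```typescript
interface Frame { x: number; y: number; width: number; height: number }
interface Size { width: number; height: number }

// Convert a Frame from scanned-image pixels to on-screen pixels.
// Assumption: the image is drawn at displaySize with no cropping or letterboxing.
function scaleFrame(frame: Frame, imageSize: Size, displaySize: Size): Frame {
  const scaleX = displaySize.width / imageSize.width;
  const scaleY = displaySize.height / imageSize.height;
  return {
    x: frame.x * scaleX,
    y: frame.y * scaleY,
    width: frame.width * scaleX,
    height: frame.height * scaleY,
  };
}

// A 1000 x 800 image (invented size) shown at half size.
const onScreen = scaleFrame(
  { x: 100, y: 100, width: 650, height: 200 },
  { width: 1000, height: 800 },
  { width: 500, height: 400 },
);
console.log(onScreen); // { x: 50, y: 50, width: 325, height: 100 }
```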

A Point is an object representing a point in a coordinate system.

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| x | Number | The X coordinate of the point. | 100 |
| y | Number | The Y coordinate of the point. | 100 |
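
One common use of an array of Point objects such as cornerPoints is hit testing, for example deciding whether a tap landed inside a region of scanned text. The sketch below uses the standard ray-casting test. The isInside helper is hypothetical, and results for points that lie exactly on an edge aren't guaranteed either way.

```typescript
interface Point { x: number; y: number }

// Ray-casting test: cast a horizontal ray from p and toggle `inside`
// each time the ray crosses an edge of the polygon.
function isInside(p: Point, polygon: Point[]): boolean {
  let inside = false;
  for (let i = 0, j = polygon.length - 1; i < polygon.length; j = i++) {
    const a = polygon[i];
    const b = polygon[j];
    const straddles = (a.y > p.y) !== (b.y > p.y);
    if (straddles) {
      // X coordinate where the edge a-b meets the horizontal line through p.
      const crossingX = ((b.x - a.x) * (p.y - a.y)) / (b.y - a.y) + a.x;
      if (p.x < crossingX) {
        inside = !inside;
      }
    }
  }
  return inside;
}

// The TextBlock cornerPoints example from this page.
const region: Point[] = [
  { x: 100, y: 100 }, { x: 750, y: 100 }, { x: 750, y: 300 }, { x: 100, y: 300 },
];
console.log(isInside({ x: 400, y: 200 }, region)); // true
console.log(isInside({ x: 800, y: 200 }, region)); // false
```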

DocumentScannerOptions is an object containing configuration details for a document scanning session.

| Property Name | Type | Description | Example |
| --- | --- | --- | --- |
| permissionRationaleText | String | Optional, and only for Android implementations. The text shown in the UI when the device prompts the user to grant permission for your app to use the camera. | "Grant permission for [app name here] to use the camera to scan documents" |
| imageSource | DocumentScannerSource | Optional. Specifies the source of the document to be scanned. Defaults to "DEVICE_CAMERA". | "PHOTO_LIBRARY" |
| scriptHint | Script | Optional. Specifies the language writing system of the text to be scanned. Defaults to "LATIN". | "DEVANAGARI" |
| returnImageBytes | Boolean | Optional. Specifies whether the image data should (true) or should not (false) be returned by the plugin. Defaults to false. This setting is overridden and set to false when imageSource is set to "INPUT_IMAGE". | true |
| inputImageBytes | String[] | Optional. A stringified array of base64 image data to be scanned. Used when imageSource is set to "INPUT_IMAGE". | "2432BcYPJhSkzZjHS-Uiz1g8iZdQnRHtPnFKbMltJEc" |
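
The defaults and the returnImageBytes override described in the table can be summarized in a short sketch. The resolveOptions helper is hypothetical and only mirrors the documented behavior; it isn't part of the plugin. The union types list only the string values named on this page, and the plugin may accept others.

```typescript
// Only the values named on this page; the plugin may define more.
type DocumentScannerSource = "DEVICE_CAMERA" | "PHOTO_LIBRARY" | "INPUT_IMAGE";
type Script = "LATIN" | "DEVANAGARI";

interface DocumentScannerOptions {
  permissionRationaleText?: string; // Android only
  imageSource?: DocumentScannerSource;
  scriptHint?: Script;
  returnImageBytes?: boolean;
  inputImageBytes?: string[]; // typed as String[] in the table above
}

// Hypothetical helper: fills in the documented defaults and applies the
// documented rule that returnImageBytes is false for "INPUT_IMAGE".
function resolveOptions(options: DocumentScannerOptions = {}): DocumentScannerOptions {
  const imageSource = options.imageSource ?? "DEVICE_CAMERA";
  const returnImageBytes =
    imageSource === "INPUT_IMAGE" ? false : (options.returnImageBytes ?? false);
  return {
    ...options,
    imageSource,
    scriptHint: options.scriptHint ?? "LATIN",
    returnImageBytes,
  };
}

const resolved = resolveOptions({ imageSource: "PHOTO_LIBRARY", returnImageBytes: true });
console.log(resolved.scriptHint); // "LATIN"
console.log(resolved.returnImageBytes); // true
```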

An object representing an error that occurred when accessing DocumentScanner features.

| Property Name | Type | Description |
| --- | --- | --- |
| code | DocumentScannerFailureCode | A value representing the reason for a DocumentScanner error. See DocumentScannerFailureCode for the list of possible values. |
| message | String | A string value describing the reason for the failure. This value is suitable for use in user interface messages. The message is provided in English and isn’t localized. |
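
A minimal sketch of handling such an error object follows. The ScannerFailure type name and both helpers are hypothetical. The sketch also assumes that code can be treated as a string, because the DocumentScannerFailureCode values aren't listed on this page; "EXAMPLE_CODE" is a placeholder, not a real failure code.

```typescript
// Hypothetical type name; the properties come from the table above.
// Assumption: code is treated as an opaque string.
interface ScannerFailure { code: string; message: string }

// Type guard: checks that an unknown value has the two documented properties.
function isScannerFailure(value: unknown): value is ScannerFailure {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return typeof candidate["code"] === "string" && typeof candidate["message"] === "string";
}

// Build a user-facing message. The message text is English and not localized,
// so an app with localized UI may prefer to map the code to its own strings.
function describeFailure(error: unknown): string {
  if (isScannerFailure(error)) {
    return `Scan failed (${error.code}): ${error.message}`;
  }
  return "Scan failed for an unknown reason.";
}

console.log(describeFailure({ code: "EXAMPLE_CODE", message: "Example message." }));
// Scan failed (EXAMPLE_CODE): Example message.
```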
