Detect Text in Standard Forms
Einstein OCR (optical character recognition) can now extract text from a variety of common forms.
Einstein OCR supports these standard forms.
- Driver's license, only United States, including Washington, DC
- Form 1040, only 2019
- Form W-2
- Passport, only Australia, Canada, United Kingdom, and United States
- Pay stub, generated by ADP or Workday
- Permanent Resident Card, only United States including Washington, DC
When you call the API, you send in the form as an image or PDF, and specify the
tabulatev2 modelId. The JSON response contains key-value pairs for each field in the form.
- The key is the label of the form field detected by Einstein OCR. If you fill out the form, the key is the text that you read to see what data to put in that field.
- The value is the data of the actual form field.
The model infers the document layout and word relationships, then extracts the entity and value for each field in the form. For example, in a driver's license, Einstein links together the issue date of the license (09/13/2020), how the issue date is referred to on the license (4a ISS), and the entity type (issue_date), which is a consistent field name across all variations of driver’s licenses. In addition, when the model detects a compound field (such as an address), it normalizes the text of that field into its constituent parts.
Here are some examples of key-value pairs from a form.
|Form||Form Key||Form Value||Normalized Text|
|Form W-2||123 Sample Street Phoenix, AZ, 01234|
When we detect a form field as an address, we normalize that text and return fields populated with street, city, state, postal code, and country. When an address has multiple lines (as in the case of P.O. Box, or Apartment), normalizedText returns the original text. if there’s a line separator in the text field, it will be present in the normalizedText field.
Currently only U.S. addresses are normalized.
See the Entity Normalization for more details on how detected text is normalized.