What is Einstein OCR?
Einstein OCR (optical character recognition) provides models that detect typed and handwritten text in an image or PDF.
You access the models from a single REST API endpoint. Each model is used for specific use cases, such as business card scanning, product lookup, and digitizing documents and tables.
Model ID | Used For | Recommended Task Parameter Values |
---|---|---|
OCRModel | | |
tabulatev2 | | |
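As a rough illustration of how a model ID and task value travel together in one call, the sketch below only assembles the pieces of a request; it doesn't send anything. The endpoint URL, the form field names (`modelId`, `task`, `sampleLocation`), the bearer-token header, and the helper name `build_ocr_request` are assumptions made for this example and aren't stated in this section, so check them against the API reference before relying on them.

```python
# Assumed endpoint URL; this section only says there is a single REST endpoint.
OCR_ENDPOINT = "https://api.einstein.ai/v2/vision/ocr"


def build_ocr_request(model_id, task, image_url, token):
    """Assemble the parts of an OCR request without sending it.

    The field names below are assumptions for illustration only.
    """
    if not model_id or not task:
        raise ValueError("model_id and task are both required")
    return {
        "url": OCR_ENDPOINT,
        "headers": {"Authorization": f"Bearer {token}"},
        "fields": {
            "modelId": model_id,          # for example, OCRModel or tabulatev2
            "task": task,                 # selects which elements the response contains
            "sampleLocation": image_url,  # publicly reachable image URL (assumed field)
        },
    }


request_parts = build_ocr_request(
    "OCRModel", "contact", "https://example.com/card.png", "YOUR_TOKEN"
)
```

The returned dictionary can be handed to any HTTP client that supports multipart form posts.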
When you call the API, you send in an image or PDF, and the JSON response contains elements that depend on the value of the task parameter:
- String of alphanumeric characters that the model predicts.
- XY coordinates for the location of the character string within the image (also called a bounding box).
- Confidence (probability) that the detected bounding box contains text.
- For tabular data, the table row and column in which the text is located.
- For business cards, the entity type of the detected text, such as ORG or PERSON.
- For common forms, key-value pairs that contain the detected text, bounding box data for the detected text, and the entity (a consistent field name).
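The elements listed above could be pulled out of a response roughly as follows. This is a minimal sketch, not the documented schema: the JSON field names (`probabilities`, `label`, `probability`, `boundingBox`, `attributes`, `tag`, `cellLocation`), the sample payload, and the helper `summarize_detections` are all assumptions made for illustration.

```python
import json

# Hypothetical response body; the field names are assumed, not taken from this section.
SAMPLE_RESPONSE = """
{
  "task": "contact",
  "probabilities": [
    {
      "label": "Jane Doe",
      "probability": 0.98,
      "boundingBox": {"minX": 10, "minY": 20, "maxX": 110, "maxY": 40},
      "attributes": {"tag": "PERSON"}
    },
    {
      "label": "~~",
      "probability": 0.31,
      "boundingBox": {"minX": 5, "minY": 90, "maxX": 15, "maxY": 95},
      "attributes": {}
    }
  ]
}
"""


def summarize_detections(response_text, min_probability=0.5):
    """Return one flat record per detection at or above min_probability."""
    payload = json.loads(response_text)
    records = []
    for item in payload.get("probabilities", []):
        probability = item.get("probability", 0.0)
        if probability < min_probability:
            continue  # skip boxes the model is unsure contain text
        box = item.get("boundingBox", {})
        attrs = item.get("attributes", {})
        cell = attrs.get("cellLocation", {})  # expected only for tabular data
        records.append({
            "text": item.get("label", ""),
            "probability": probability,
            "box": (box.get("minX"), box.get("minY"),
                    box.get("maxX"), box.get("maxY")),
            "entity": attrs.get("tag"),       # expected only for business cards
            "row": cell.get("rowIndex"),
            "column": cell.get("colIndex"),
        })
    return records


detections = summarize_detections(SAMPLE_RESPONSE)
```

Filtering on the confidence value is a design choice in this sketch; the threshold of 0.5 is arbitrary and would need tuning for real images.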
Here’s a sample image. The orange boxes and text indicate the detected text.
Currently, Einstein OCR supports only English.