Einstein Vision is a new service that helps you build smarter applications by using machine learning to automatically identify image content. It provides an API that lets you harness the power of image recognition to build AI-enabled apps. This blog post shows you how to use the Einstein Vision API from Apex with a custom wrapper. Before diving in, read our Einstein Vision documentation and check out our new Trailhead Quick Start to learn how to implement your first image recognition model. Keep reading to see how Einstein Vision works with a real-world example.
How Einstein Vision works
Einstein Vision scales existing business processes and enables new ones through image-related automation. Let’s explore how Einstein Vision works with an example.
A manufacturing company can train Einstein Vision on different images of manufactured machines. When a service technician uploads a picture of a machine to a custom Customer Visit Report object using Salesforce1, the Einstein Vision API is automatically called. Based on the results of the prediction, the report object automatically gets tagged with the name and type of machine.
How does this work with Einstein Vision?
- First, the customer collects images (the more the better) of what they’d like to classify.
- They then create a dataset using the Einstein Vision API, which holds all (or a portion) of the images used to train the model.
- The most important elements of a dataset are labels. Think of a label as a category: every image that a customer wants to identify falls under a specific label.
- Once the customer has collected sufficient images, they train the dataset, and the output is a trained model.
- Now, service technicians can check images from different data sources, such as a file or URL, against this model. For every check, the identified label(s) and their probability values are returned. (See the sketch after this list for what this workflow looks like in code.)
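To make this concrete, here’s a hypothetical end-to-end sketch using the Apex wrapper covered later in this post. The method names are modeled on the wrapper’s GitHub repo, but the signatures, IDs, and URLs below are placeholder assumptions, so verify them against the repo before copying:

```java
// Hypothetical workflow sketch; IDs, URLs, and exact method signatures are assumptions.
EinsteinVision_PredictionService service = new EinsteinVision_PredictionService('<YOUR_BEARER_TOKEN>');

// 1. Create a dataset from a zip of labeled image folders (each folder name becomes a label).
service.createDatasetFromUrlAsync('https://example.com/machine-images.zip');

// 2. Train the dataset; the output is a trained model.
EinsteinVision_Model model = service.trainDataset(57, 'Machines Model', 0, 0, null);

// 3. Check a new image against the trained model and inspect the returned labels.
EinsteinVision_PredictionResult result = service.predictUrl(model.modelId, 'https://example.com/machine.jpg', '');
System.debug(result.probabilities);
```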
The most important ingredient for an accurate model is the training dataset. It should be representative of the images you’ll predict against, covering details such as angle and clarity. The more image variety you provide for each labeled class, the more accurate the model, so keep this in mind when collecting images, training the model, and running predictions.
How do I use those APIs from Apex?
There are different ways to implement Einstein Vision within your Salesforce org. You can write your own Apex code for your use cases, based on this example. Alternatively, you can use this open-source wrapper on GitHub, which can easily be deployed to any org. We focus on the latter in this blog post because it simplifies the implementation.
The class structure
All Apex classes are prepended with EinsteinVision_. This makes it simple for an administrator or a developer to identify the classes within an existing org. Because naming conventions rule!
The entry class is EinsteinVision_PredictionService. This class holds the methods for communicating with the Einstein Vision REST API. This simplifies the developer experience, as you don’t have to set the correct HTTP headers, check required values, and more.
Let’s dig into two examples of how to use the wrapper classes. We’ll start with the most common operation: predicting an image.
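Here’s a minimal sketch of that call. The bearer token, model ID, and image URL are placeholders, and the result types follow the wrapper’s naming:

```java
EinsteinVision_PredictionService service = new EinsteinVision_PredictionService('<YOUR_BEARER_TOKEN>');
EinsteinVision_PredictionResult result = service.predictUrl('<YOUR_MODEL_ID>', 'https://example.com/machine.jpg', '');
for (EinsteinVision_Probability probability : result.probabilities) {
    System.debug('Label: ' + probability.label);
    System.debug('Probability: ' + probability.probability);
}
```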
Code explanation:
- Line 1: A new EinsteinVision_PredictionService gets created. For initialization, the bearer token needs to be passed as a parameter. See the Einstein Vision documentation to learn about the steps for getting a valid token.
- Line 2: This calls the prediction for a remote URL. The first parameter of the predictUrl method is the ID of the trained model. The second is the URL of the image. The final parameter is the ID of a specific sample you may want to check against (not used here, so an empty string is passed).
- Lines 3–6: A simple iteration and console printout of the returned labels and their probability values.
You can also predict images based on a base64 string or a Blob. Check out the methods predictBase64 and predictBlob for that.
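Continuing with the service instance from the first example, those calls might look like this. The parameter order is an assumption modeled on predictUrl, so check the repo for the exact signatures:

```java
// Assumed parameter order, modeled on predictUrl; verify against the repo.
ContentVersion cv = [SELECT VersionData FROM ContentVersion LIMIT 1];
Blob imageBlob = cv.VersionData;
String imageBase64 = EncodingUtil.base64Encode(imageBlob);
EinsteinVision_PredictionResult fromBase64 = service.predictBase64('<YOUR_MODEL_ID>', imageBase64, '');
EinsteinVision_PredictionResult fromBlob = service.predictBlob('<YOUR_MODEL_ID>', imageBlob, '');
```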
In the second example, we do a bit more by querying all existing datasets to gauge training accuracy and print that to the console.
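A sketch of that query might look like the following; the method names mirror the wrapper’s conventions, but the exact signatures are assumptions:

```java
EinsteinVision_PredictionService service = new EinsteinVision_PredictionService('<YOUR_BEARER_TOKEN>');
List<EinsteinVision_Dataset> datasets = service.getDatasets();
for (EinsteinVision_Dataset dataset : datasets) {
    EinsteinVision_Model model = service.getModels(dataset).get(0);
    System.debug(service.getModelMetrics(model.modelId));
}
```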
Code explanation:
- Line 2: We’re getting a list of all trained datasets for the authenticated bearer token.
- Lines 3–6: For every dataset, we retrieve the model metrics, which contain data such as training accuracy, test accuracy, and more.
From Triggers to Lightning to Process Builder
Real simplification happens when functionality is built so that it can be used not only from other code but also from declarative tools such as Process Builder. The GitHub repo provides examples of how you can abstract the wrappers for easier access.
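For instance, a hypothetical invocable class (not part of the repo; the class name, label, and model ID are made up for illustration) could surface a prediction to Process Builder. In a real org, the callout would typically need to run asynchronously, since callouts aren’t allowed after DML in the same transaction:

```java
public with sharing class EinsteinVision_Invocable {

    // Hypothetical example that exposes a prediction to Process Builder and Flow.
    @InvocableMethod(label='Predict Image Label')
    public static List<String> predict(List<String> imageUrls) {
        EinsteinVision_PredictionService service = new EinsteinVision_PredictionService('<YOUR_BEARER_TOKEN>');
        List<String> topLabels = new List<String>();
        for (String imageUrl : imageUrls) {
            EinsteinVision_PredictionResult result = service.predictUrl('<YOUR_MODEL_ID>', imageUrl, '');
            // Assumes the probabilities come back sorted by descending probability.
            topLabels.add(result.probabilities.get(0).label);
        }
        return topLabels;
    }
}
```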
One use case is to provide a static utility method to return only labels for a specific dataset when the probability is higher than a configurable value.
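A sketch of that utility might look like the following. The class name and bearer token are placeholders, the service call’s signature is an assumption, and the threshold is treated as a percentage value:

```java
public with sharing class EinsteinVision_Example {

    public static List<String> predictBlob(Blob file, Integer minProbability) {
        EinsteinVision_PredictionService service = new EinsteinVision_PredictionService('<YOUR_BEARER_TOKEN>');
        List<String> labels = new List<String>();
        EinsteinVision_PredictionResult result = service.predictBlob('312654', file, '');

        // Keep only labels that meet the minimum probability (treated as a percentage).
        for (EinsteinVision_Probability probability : result.probabilities) {
            if (probability.probability * 100 >= minProbability) {
                labels.add(probability.label);
            }
        }
        return labels;
    }
}
```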
Code explanation:
- Line 3: The method predictBlob takes a Blob and an Integer value. The Integer determines the minimum probability a label must have to be returned.
- Line 6: The prediction of the Blob value runs against a specific dataset ID (312654).
- Line 10: We check the probability value of each returned label against the minimum value.
Your next steps
As previously mentioned, it’s important to get familiar with the API in the Einstein Vision documentation. Show off what you’ve learned by earning the Einstein Vision badge on Trailhead. There is also a great webinar about how to build smarter apps with Einstein Vision. And check out the wrappers on GitHub.
Einstein Vision is also available as a Heroku add-on (beta). You can call the API directly from your on-premises servers or native mobile apps. Stay tuned on this blog to learn more.
About the author
René Winkelmeyer works as a senior developer evangelist at Salesforce. He focuses on enterprise integrations, mobile, and security with the Salesforce Platform. You can follow him on Twitter at @muenzpraeger.