Dataset and Model Best Practices

A dataset can have up to 500 labels, but we recommend a maximum of 100 labels for better model accuracy.
If a dataset contains many labels, increase the number of examples per label so that each label is well represented.
We recommend that an Einstein Intent or Einstein Sentiment dataset contain a maximum of 100 labels. If you need more than 100 labels, consider hierarchical classification.
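The label limits above can be checked locally before a dataset is uploaded. The following is a minimal sketch, not part of the Einstein API: the function name `check_labels`, the constant names, and the `(text, label)` pair layout are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

HARD_LABEL_LIMIT = 500         # upper bound stated in the guideline above
RECOMMENDED_LABEL_LIMIT = 100  # recommended maximum stated in the guideline above


def check_labels(examples: Iterable[Tuple[str, str]]) -> Dict[str, object]:
    """Summarize label usage for (text, label) pairs and flag limit violations."""
    # Count how many examples each label has.
    counts = Counter(label for _text, label in examples)
    label_count = len(counts)

    warnings = []
    if label_count > HARD_LABEL_LIMIT:
        warnings.append(
            f"{label_count} labels exceeds the limit of {HARD_LABEL_LIMIT}."
        )
    elif label_count > RECOMMENDED_LABEL_LIMIT:
        warnings.append(
            f"{label_count} labels exceeds the recommended maximum of "
            f"{RECOMMENDED_LABEL_LIMIT}."
        )

    return {
        "label_count": label_count,
        # The smallest class size shows which labels may need more examples.
        "min_examples_per_label": min(counts.values()) if counts else 0,
        "warnings": warnings,
    }
```

A report with a low `min_examples_per_label` value suggests that some labels need more examples, particularly when the label count is high.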
We recommend that the intent or sentiment string be fewer than 150 words long. This guideline applies both to examples in a language dataset and to strings sent to a model for prediction.
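The length guideline can be applied as a simple pre-check on dataset examples and prediction strings. This sketch assumes that a "word" is a whitespace-separated token; the service may count words differently. The names `MAX_WORDS` and `within_length_guideline` are hypothetical.

```python
MAX_WORDS = 150  # guideline above: strings should stay under 150 words


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


def within_length_guideline(text: str, max_words: int = MAX_WORDS) -> bool:
    """Return True when the text has fewer than max_words words."""
    return word_count(text) < max_words
```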
During training, special text formatting, such as emojis, words in all uppercase, and punctuation, isn’t taken into account. For example, if you add a text example containing a smiley emoji to a dataset, the emoji isn’t considered during training. Only the text is used.
When you send in text for prediction, the model doesn’t consider special text formatting and punctuation. For example, when you send the string “We had a great time! :)” to the model, the model returns a prediction for the string “We had a great time”.
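To preview roughly what text remains once punctuation and emojis are disregarded, a local approximation such as the one below may help. This is only a sketch: it is not the service's actual preprocessing, and the function name `approximate_model_view` is hypothetical. It also removes apostrophes inside words, which the service may or may not do.

```python
import re


def approximate_model_view(text: str) -> str:
    """Approximate the text left after punctuation and emojis are dropped."""
    # Remove every character that is neither a word character nor whitespace.
    # In Python 3, \w is Unicode-aware, so emojis and punctuation are removed.
    stripped = re.sub(r"[^\w\s]", "", text)
    # Collapse runs of whitespace left behind by the removed characters.
    return " ".join(stripped.split())


preview = approximate_model_view("We had a great time! :)")
```

For the example string above, `preview` is `We had a great time`.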
Batch predictions aren’t supported. To get a prediction, you send one text string in a single API call to the /intent endpoint or the /sentiment endpoint.
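Because each prediction is its own call, scoring several strings means looping on the client side. The sketch below shows one way to structure that loop. `predict_each` is a hypothetical helper, and `fake_predict` is a stand-in for whatever function performs the real HTTP request to /intent or /sentiment; the response shape it returns is invented for illustration.

```python
from typing import Callable, Dict, Iterable, List


def predict_each(
    texts: Iterable[str], predict_one: Callable[[str], Dict]
) -> List[Dict]:
    """Return one prediction per text by calling predict_one once per string."""
    return [predict_one(text) for text in texts]


def fake_predict(text: str) -> Dict:
    """Hypothetical stand-in for a real call to /intent or /sentiment."""
    return {"document": text, "label": "placeholder"}


results = predict_each(
    ["We had a great time", "The service was slow"], fake_predict
)
```

Passing the request function in as a parameter keeps the loop independent of the HTTP client and makes it easy to add throttling or retries around each individual call.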