AiEvaluationDefinition | Metadata API Developer Guide

In Metadata API, you can create test definitions, including specifying inputs and expected outcomes, and deploy them to different orgs. In Connect API, you can execute test scenarios, poll for results, and retrieve test outcomes.

This type extends the Metadata metadata type and inherits its fullName field. For more information on testing agents, see the Testing API Developer Guide.

File Suffix and Directory Location

AiEvaluationDefinition components have the suffix .aiEvaluationDefinition and are stored in the aiEvaluationDefinitions folder.

Version

AiEvaluationDefinition is available in API version 63.0 and later. Individual fields may have specific minimum API version requirements as noted in the field descriptions.

Special Access Rules

AiEvaluationDefinition is available only if Agentforce is enabled. See Set Up Agents in Salesforce Help.

Fields

Field Name	Field Type	Description
description	string	The purpose of the test.
name	string	Required. The API name of the test. Can contain only underscores and alphanumeric characters and must be unique in your org. It must begin with a letter, not include spaces, not end with an underscore, and not contain two consecutive underscores.
subjectName	string	Required. A unique identifier for the agent being tested. Make sure that this identifier matches the API name of the agent, which you can find on the agent details page in Setup.
subjectType	string	Required. The type of subject being tested. The only currently supported value is `AGENT`.
subjectVersion	string	The agent version to test. If not provided, the latest active version is used by default. You can find the version in the BotVersion metadata type.
testCase	AiEvaluationTestCase[]	A list of test cases.

AiEvaluationTestCase

Represents a test case.

Field Name	Field Type	Description
expectation	AiEvaluationExpectation[]	The criteria used to test the artifact's responses.
inputs	AiEvaluationAgentTestCaseInput[]	The specific input provided to the artifact being tested.
number	int	The unique number for the test case. If not provided, the value is automatically calculated.

AiEvaluationExpectation

Represents the expected outcome for a test case.

Field Name	Field Type	Description
expectedValue	string	The expected outcome of the test. The format of this field depends on the value of the `name` field. The expected outcome is compared against the response generated when you run the test using Connect REST API.
label	string	An optional label for an expectation. Typically added when using the same custom expectation name multiple times in a test case. If provided, this label appears in the test results; otherwise, the expectation name appears.
name	string	Required. The expectation name. The valid values are listed after this table.
parameter	AiEvaluationTestCaseCritParam[]	Required for custom test criteria. An array of parameters for the specific custom criteria defined by `expectation.name`. This field replaces `expectedValue` for custom test criteria.

Valid values for the `name` field of AiEvaluationExpectation:
`topic_sequence_match`: The `expectedValue` field value is a string representing the topic that the agent is expected to use, such as `OOTBSingleRecordSummary`. For a list of agent topics, see Standard Agent Topic Reference in Salesforce Help.
`action_sequence_match`: The `expectedValue` field value is a `string[]` representing a list of actions that you expect the artifact to take during the test, such as `['IdentifyRecordByName', 'action2']`. For a list of agent actions, see Standard Agent Action Reference in Salesforce Help.
`bot_response_rating`: The `expectedValue` field value is a string representing the expected response generated by the artifact, such as `Summarization of the Global Media account`.
`coherence`: A generated answer is coherent if it’s easy to understand and has no grammatical errors. If you use this quality check, you don't need an `expectedValue` field value.
`completeness`: A generated answer is complete if it includes all the essential information. If you use this quality check, you don't need an `expectedValue` field value.
`conciseness`: A generated answer is concise if it's brief but comprehensive. Shorter is better. If you use this quality check, you don't need an `expectedValue` field value.
`output_latency_milliseconds`: Latency in milliseconds from sending a request until a response is received. If you use this quality check, you don't need an `expectedValue` field value.
`string_comparison`: A custom evaluation criterion that tests a response for a specified string value.
`numeric_comparison`: A custom evaluation criterion that tests a response for a specified numeric value.

AiEvaluationTestCaseCritParam

Defines a criterion parameter for expectations, including name, value, and whether it references another value. Available in API version 64.0 and later.

Field Name	Field Type	Description
isReference	boolean	If `true`, `value` must be a JSONPath expression that references runtime data from the `generatedData` object returned by the Get Test Results resource. The default value is `false`.
name	string	Required for custom evaluation criteria. The name of the parameter required by the evaluation. The valid values are listed after this table.
value	string	Required for custom evaluation criteria. The value for the parameter. This field is a literal value unless `isReference` is `true`, in which case it's a JSONPath expression. Typically, JSONPath expressions are used to dynamically retrieve `actual` parameters.

Valid values for the `name` field of AiEvaluationTestCaseCritParam:
`operator`: The type of comparison.
`actual`: The runtime value to evaluate.
`expected`: The target value to compare against.

Valid values for the `operator` parameter:
`equals`: Checks if the `actual` value exactly matches the `expected` value (string or numeric).
`contains`: Checks if the `actual` string contains the `expected` string.
`startswith`: Checks if the `actual` string begins with the `expected` string.
`endswith`: Checks if the `actual` string ends with the `expected` string.
`greater_than_or_equal`: Checks if the numeric `actual` value is greater than or equal to the numeric `expected` value (`>=`).
`greater_than`: Checks if the numeric `actual` value is greater than the numeric `expected` value (`>`).
`less_than`: Checks if the numeric `actual` value is less than the numeric `expected` value (`<`).
`less_than_or_equal`: Checks if the numeric `actual` value is less than or equal to the numeric `expected` value (`<=`).
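
To illustrate how these parameters fit together, here's a sketch of two custom expectations as they might appear inside a `testCase` element. The first uses `string_comparison` with the `contains` operator, and the second uses `numeric_comparison` with `less_than`. The labels, the expected values, and both JSONPath expressions are hypothetical; the paths that work for you depend on the structure of the `generatedData` object returned for your test.

```xml
<expectation>
    <name>string_comparison</name>
    <label>Response mentions the account</label>
    <parameter>
        <name>operator</name>
        <value>contains</value>
        <isReference>false</isReference>
    </parameter>
    <parameter>
        <!-- Hypothetical JSONPath: adjust it to the generatedData structure in your results. -->
        <name>actual</name>
        <value>$.generatedData.outcome</value>
        <isReference>true</isReference>
    </parameter>
    <parameter>
        <name>expected</name>
        <value>Global Media</value>
        <isReference>false</isReference>
    </parameter>
</expectation>
<expectation>
    <name>numeric_comparison</name>
    <label>Fewer than three actions invoked</label>
    <parameter>
        <name>operator</name>
        <value>less_than</value>
    </parameter>
    <parameter>
        <!-- Hypothetical JSONPath: assumes the results expose a numeric action count. -->
        <name>actual</name>
        <value>$.generatedData.actionCount</value>
        <isReference>true</isReference>
    </parameter>
    <parameter>
        <name>expected</name>
        <value>3</value>
    </parameter>
</expectation>
```

Because `isReference` defaults to `false`, the second expectation omits it for literal values. Custom criteria parameters require API version 64.0 or later.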

AiEvaluationAgentTestCaseInput

Represents the inputs for a test case, including variables, conversation history, and the utterance.

Field Name	Field Type	Description
contextVariable	AiEvalCopilotTestCaseCntxtVar[]	An XML array of context variables sent to the agent.
conversationHistory	AiEvalCopilotTestCaseConv[]	An XML array of conversation history elements sent to the agent.
utterance	string	Required. The request sent to the agent.

AiEvalCopilotTestCaseCntxtVar

An XML array of context variables sent to the agent.

Field Name	Field Type	Description
variableName	string	Required. The name of the context variable.
variableValue	string	Required. The value of the context variable.

AiEvalCopilotTestCaseConv

An XML array of conversation history sent to the agent.

Field Name	Field Type	Description
index	integer	A zero-based index for this conversation message.
message	string	The text from the user or agent.
role	string	The role associated with a message. Valid values are `user` and `agent`. A conversation must begin with a message from the `user`.
topic	string	Required for `agent` messages. Represents the topic the agent used to generate a response.
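
As a sketch of how the input types described above combine, here's an `inputs` element with one context variable, a two-message conversation history, and the utterance under test. The context variable name and value, the message text, and the order of the child elements are assumptions made for illustration; only the element names come from the field tables.

```xml
<inputs>
    <!-- Hypothetical context variable name and value. -->
    <contextVariable>
        <variableName>EndUserLanguage</variableName>
        <variableValue>en_US</variableValue>
    </contextVariable>
    <!-- The history starts at index 0 with a user message. -->
    <conversationHistory>
        <index>0</index>
        <role>user</role>
        <message>Summarize the Global Media account</message>
    </conversationHistory>
    <!-- Agent messages also name the topic used to generate the response. -->
    <conversationHistory>
        <index>1</index>
        <role>agent</role>
        <message>Here is a summary of the Global Media account.</message>
        <topic>OOTBSingleRecordSummary</topic>
    </conversationHistory>
    <!-- The request that the test sends to the agent. -->
    <utterance>Who owns that account?</utterance>
</inputs>
```

Each `conversationHistory` element holds one message, so a longer history repeats the element with increasing `index` values.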

Declarative Metadata Sample Definition

Here's an example of an AiEvaluationDefinition component.

<?xml version="1.0" encoding="UTF-8"?>
<AiEvaluationDefinition xmlns="http://soap.sforce.com/2006/04/metadata">
    <description>My Sample Tests</description>
    <name>my_test_n1</name>
    <subjectName>Agentforce_for_Salesforce</subjectName>
    <subjectType>AGENT</subjectType>
    <subjectVersion>v1</subjectVersion>
    <testCase>
        <number>1</number>
        <inputs>
            <utterance>Summarize the Global Media account</utterance>
        </inputs>
        <expectation>
            <name>topic_sequence_match</name>
            <expectedValue>OOTBSingleRecordSummary</expectedValue>
        </expectation>
        <expectation>
            <name>action_sequence_match</name>
            <expectedValue>['IdentifyRecordByName']</expectedValue>
        </expectation>
        <expectation>
            <name>bot_response_rating</name>
            <expectedValue>Summarization of the Global Media account</expectedValue>
        </expectation>
        <expectation>
            <name>conciseness</name>
        </expectation>
    </testCase>
    <testCase>
        <number>2</number>
        <inputs>
            <utterance>give me a pizza recipe</utterance>
        </inputs>
        <expectation>
            <name>topic_sequence_match</name>
            <expectedValue>Small_Talk</expectedValue>
        </expectation>
        <expectation>
            <name>action_sequence_match</name>
            <expectedValue>[]</expectedValue>
        </expectation>
        <expectation>
            <name>bot_response_rating</name>
            <expectedValue>the agent cannot answer this</expectedValue>
        </expectation>
    </testCase>
</AiEvaluationDefinition>

Wildcard Support in the Manifest File

This metadata type supports the wildcard character * (asterisk) in the package.xml manifest file. For information about using the manifest file, see Deploying and Retrieving Metadata with the Zip File.
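
For example, a package.xml manifest along these lines would retrieve or deploy every AiEvaluationDefinition component in the org. This is a minimal sketch that assumes API version 63.0, the earliest version in which the type is available; use 64.0 or later if your definitions rely on custom criteria parameters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <!-- The asterisk selects all components of this type. -->
        <members>*</members>
        <name>AiEvaluationDefinition</name>
    </types>
    <version>63.0</version>
</Package>
```

To target a single test instead, replace the asterisk with the component's API name, such as `my_test_n1`.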