Use Predictive Model Calls in Custom Chunking Functions

Call Einstein Studio predictive models from custom chunking functions when you want to enrich chunks with model predictions, such as regression values, scores, or classifications. This approach passes structured feature columns to a deployed model and maps the prediction into your chunk output.

In a custom chunking function, your code controls how content is split into chunks, either by splitting element text into smaller pieces or by passing each element through as a single chunk. See Write a Custom Chunking Function. Predictive calls take structured feature fields and return a prediction value that you can use anywhere in your chunking logic, such as to enrich citations, filter chunks, or classify chunk types. Build the feature fields from any data available in your function, such as fields in the SearchIndexChunkingV1Request input or values that you compute in your code.

Edition Table
Available in: Developer, Enterprise, Performance, and Unlimited Editions. See Data 360 edition availability.
Permission Sets Needed
To use predictive model calls in custom chunking logic:
  • Permission set: Data Cloud Architect

Implement this flow inside the function(request, runtime) callable in payload/entrypoint.py: feature construction, prediction request, prediction invocation, response parsing, and chunk output mapping.

  1. Read the document elements to chunk from request.input. For each element, run steps 2-5 to generate its enriched chunk.
  2. Build the feature columns for the model. For each feature, use PredictionColumBuilder to set the column name and its typed values: set_string_values for text features or set_double_values for numeric features. Read feature values from the element’s metadata (for example, source_dmo_fields) or from other content, such as an LLM-generated summary.
  3. Build a prediction request with PredictionRequestBuilder: set the prediction type (for example, PredictionType.REGRESSION), the model API name, and the prediction columns.
  4. Call runtime.einstein_predictions.predict(...).
  5. Check prediction_response.is_success and that the result type indicates success (for example, RegressionPredictionSuccess), then read the value from prediction_response.data (for example, results[0]["prediction"]["predictedValue"]).
  6. Map the prediction into the chunk (for example, add it to citations), and build the chunk with text, seq_no, and chunk_type. After all elements are processed, aggregate the chunks with a seq_no that continues across elements rather than restarting for each one, and return SearchIndexChunkingV1Response(output=chunks).
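The steps above can be sketched in plain Python without the SDK, which makes the control flow easy to test in isolation. This is a minimal sketch, not the reference implementation: the Chunk dataclass, the dictionary shape of each element, the predict callable (a hypothetical wrapper around request building and runtime.einstein_predictions.predict), the "predicted_value" citation key, the "text" chunk type, and a seq_no that starts at 1 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Chunk:
    """Stand-in for the SDK chunk type (assumed shape, not the real class)."""
    text: str
    seq_no: int
    chunk_type: str
    citations: dict[str, Any] = field(default_factory=dict)


def read_predicted_value(data: dict[str, Any]) -> Optional[float]:
    """Read results[0]["prediction"]["predictedValue"], or None if it is absent."""
    try:
        return float(data["results"][0]["prediction"]["predictedValue"])
    except (KeyError, IndexError, TypeError, ValueError):
        return None


def chunk_elements(
    elements: list[dict[str, Any]],
    predict: Callable[[dict[str, Any]], Optional[dict[str, Any]]],
) -> list[Chunk]:
    """Pass each element through as one chunk, enriched with a prediction."""
    chunks: list[Chunk] = []
    for element in elements:
        # Step 2: gather feature values from the element's metadata.
        features = element.get("metadata", {}).get("source_dmo_fields", {})
        # Steps 3-4: `predict` hides request building and the model call.
        # It is assumed to return the response data, or None on failure.
        data = predict(features)
        # Step 5: parse the payload only when the call produced data.
        value = read_predicted_value(data) if data is not None else None
        # Step 6: map the prediction into citations and build the chunk.
        citations = {} if value is None else {"predicted_value": value}
        chunks.append(
            Chunk(
                text=element["text"],
                seq_no=len(chunks) + 1,  # continues across all elements
                chunk_type="text",
                citations=citations,
            )
        )
    return chunks
```

In a real function, predict would build the columns and request with the SDK builders, and the returned list would be wrapped in SearchIndexChunkingV1Response(output=chunks). Deriving seq_no from the length of the aggregated list keeps numbering continuous even if an element is later split into several chunks.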

For a reference implementation, see example/chunking_with_prediction/entrypoint.py in the initialized function package.

Handle failed predictions explicitly. Log failures, decide whether to continue with a fallback value or raise a controlled error, and avoid placing sensitive data in logs.
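One way to structure this handling is a small wrapper that either returns a fallback or raises a controlled error. The sketch below assumes only that the response object exposes is_success and data as described in the steps above. The PredictionFailed exception, the predict_or_fallback helper, the element_id argument, and the strict flag are hypothetical names introduced for illustration, and the broad except clause is a placeholder because the SDK's own exception types are not covered here.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


class PredictionFailed(RuntimeError):
    """Controlled error raised when a prediction cannot be used (hypothetical)."""


def predict_or_fallback(
    call_model: Callable[[], Any],
    element_id: str,
    fallback: Optional[float] = None,
    strict: bool = False,
) -> Optional[float]:
    """Return the predicted value, or the fallback; raise if strict is set."""
    try:
        response = call_model()
        if not response.is_success:
            raise PredictionFailed("prediction response was not successful")
        value = response.data["results"][0]["prediction"]["predictedValue"]
        return float(value)
    except Exception as exc:  # assumption: narrow this to the SDK's error types
        # Log an identifier and the error type only, never feature values
        # or chunk text, so sensitive data stays out of the logs.
        logger.warning(
            "Prediction failed for element %s: %s", element_id, type(exc).__name__
        )
        if strict:
            raise PredictionFailed(f"no prediction for element {element_id}") from exc
        return fallback
```

With strict left at False, a failed prediction yields the fallback and chunking continues. With strict=True, the caller receives a single exception type to handle, whatever the underlying cause.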

  1. From your function package root, confirm that your function entrypoint and test payload files are ready for local validation.

  2. Configure SDK authentication for local prediction testing.

    • Configure client credentials:
      • datacustomcode configure --auth-type client_credentials
    • Provide your org domain, connected app consumer key, and consumer secret when prompted.
  3. Run local validation against your function entrypoint and test payload.

    • sf data-code-extension function run -e payload/entrypoint.py -t tests/test.json