CustomFeatureExtractor Interface

Use the custom Apex-based feature extractor interface to override or extend the default feature extractor implementation.

Namespace

AiAccelerator

Usage

The custom feature extractor interface takes feature extraction parameters as input, performs the business logic that the use case requires, and returns the extracted features as a map of key-value pairs. The map is expected to contain these keys:
  • columnNames:List<String>—The input feature names required by the model.
  • rawData:List<List<String>>—A two-dimensional list in which each inner list holds the feature values for one record, in the same order as columnNames. The outer list supports bulk feature retrieval for multiple records.
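To make the expected shape concrete, here is a minimal sketch in Java, chosen because the documented signature is also valid Java and the platform accepts Java extractors. Only the keys columnNames and rawData come from this page. The class name FeatureOutputSketch, the method buildOutput, and the sample values are illustrative assumptions, and the row-length check reflects the "same order" requirement rather than a documented platform rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: only the keys "columnNames" and "rawData" come from the documentation.
class FeatureOutputSketch {

    // Builds the output map and checks that every row has exactly one value per column.
    static Map<String, Object> buildOutput(List<String> columnNames, List<List<String>> rawData) {
        for (List<String> row : rawData) {
            if (row.size() != columnNames.size()) {
                throw new IllegalArgumentException("Row has " + row.size()
                        + " values but there are " + columnNames.size() + " columns");
            }
        }
        Map<String, Object> output = new LinkedHashMap<>();
        output.put("columnNames", new ArrayList<>(columnNames));
        output.put("rawData", new ArrayList<>(rawData));
        return output;
    }

    public static void main(String[] args) {
        // Two records, three features each; values follow the column order.
        Map<String, Object> output = buildOutput(
                List.of("squareFeet", "bedrooms", "location"),
                List.of(List.of("1800", "3", "Austin"),
                        List.of("950", "1", "Denver")));
        System.out.println(output);
    }
}
```

Each inner list is one record, so a bulk request for two properties produces two rows under rawData.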

Consider an ML use case that predicts the price of a real estate property. The ML model can require features such as the property’s area in square feet, location, number of bedrooms, construction year, and construction age. Assume that all of these features are available in the property record except construction age. In this case, the AI Accelerator API invokes a feature extractor to calculate the construction age at run time by subtracting the construction year from the current year. Feature extraction can also use more complex logic, such as joining data across different entities or calculating the sum or average of some data over a period.
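The construction age calculation can be sketched as follows. This is an illustration in Java under stated assumptions, not the platform's implementation: the class name ConstructionAgeSketch and the field names squareFeet, bedrooms, and constructionYear are hypothetical, and the current year is passed in as an argument so that the result doesn't depend on the system clock.

```java
import java.time.Year;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: the field names used here are assumptions, not documented keys.
class ConstructionAgeSketch {

    // Derives constructionAge = currentYear - constructionYear and adds it as a column.
    static Map<String, Object> extract(List<Map<String, Object>> records, int currentYear) {
        List<String> columnNames =
                List.of("squareFeet", "bedrooms", "constructionYear", "constructionAge");
        List<List<String>> rawData = new ArrayList<>();
        for (Map<String, Object> record : records) {
            int constructionYear =
                    Integer.parseInt(String.valueOf(record.get("constructionYear")));
            List<String> row = new ArrayList<>();
            row.add(String.valueOf(record.get("squareFeet")));
            row.add(String.valueOf(record.get("bedrooms")));
            row.add(String.valueOf(constructionYear));
            row.add(String.valueOf(currentYear - constructionYear));
            rawData.add(row);
        }
        Map<String, Object> features = new LinkedHashMap<>();
        features.put("columnNames", columnNames);
        features.put("rawData", rawData);
        return features;
    }

    public static void main(String[] args) {
        Map<String, Object> property = new LinkedHashMap<>();
        property.put("squareFeet", 1800);
        property.put("bedrooms", 3);
        property.put("constructionYear", 1995);
        // At run time the current year comes from the clock.
        System.out.println(extract(List.of(property), Year.now().getValue()));
    }
}
```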

The behavior of the custom feature extractor interface implementation depends on the value of the FeatureExtractorType property set in your use case configuration file.
  • Apex—Default feature extractor implementation is overridden by the custom feature extractor implementation.
  • Hybrid—Default feature extractor implementation is used and can be extended by the custom feature extractor implementation.
  • Java—Default feature extractor implementation is overridden by the custom feature extractor implementation.

Keep these considerations in mind when implementing the interface for a custom feature extractor.
  • Every implementation receives two parameters: a list of records (a List of Map objects) that holds the feature extraction parameters, and a Map of the previously extracted features.
  • The Map of previously extracted features is populated when multiple Apex classes are involved in feature extraction or when intermediate extracted features must be passed from one class to the next.
  • Add validations for input parameters in the implementation for your use case. Make sure that the validations check for the presence of essential keys or columns required for the implementation.
  • Every implementation must merge newly extracted features with previously extracted features, and return the merged output. The previously extracted features can be null.
  • If a use case requires the execution of multiple implementations in sequence or parallel, provide a wrapper to invoke the required feature extractors. Mention the wrapper class name in the configuration file.
  • For some use cases, some features must be extracted while others are provided as raw data in the input request. For such use cases, the implementation must produce the final output by merging the extracted features with the values from the raw data.
  • For Java and Hybrid feature extractors, all the Java implementation classes must expose a default constructor that takes no parameters.
  • The AI Accelerator platform validates the output map of feature extraction implementations to ensure that it contains a non-empty list of raw data and column names.
    • Raw data is a List of Lists where each inner list represents the values of features for a record. The outer list supports bulk extraction.
    • Column names store the list of feature names.
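Several of these considerations can be sketched together. The Java example here is illustrative only: FeatureExtractor is a local stand-in that mirrors the documented method signature, not the platform's own type, and SingleFieldStep, StoreCategoryStep, DayStep, and SequentialExtractorWrapper are hypothetical names. It shows input validation, merging with previously extracted features that can be null, a wrapper that runs extractors in sequence, no-argument constructors, and a final check that mirrors the documented non-empty output rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Local stand-in that mirrors the documented signature; it is not the platform's own type.
interface FeatureExtractor {
    Map<String, Object> extractFeatures(List<Map<String, Object>> request, Map<String, Object> previous);
}

// Hypothetical step: copies one field of every request record into a new feature column
// and merges it with the features received from the previous step.
abstract class SingleFieldStep implements FeatureExtractor {

    abstract String field();

    @Override
    @SuppressWarnings("unchecked")
    public Map<String, Object> extractFeatures(List<Map<String, Object>> request, Map<String, Object> previous) {
        if (request == null || request.isEmpty()) {
            throw new IllegalArgumentException("request must contain at least one record");
        }
        List<String> columns = new ArrayList<>();
        List<List<String>> rows = new ArrayList<>();
        boolean hasPrevious = previous != null
                && previous.get("columnNames") != null && previous.get("rawData") != null;
        if (hasPrevious) {
            columns.addAll((List<String>) previous.get("columnNames"));
            for (List<String> oldRow : (List<List<String>>) previous.get("rawData")) {
                rows.add(new ArrayList<>(oldRow)); // copy so the input is not modified
            }
        } else {
            for (int i = 0; i < request.size(); i++) {
                rows.add(new ArrayList<>());
            }
        }
        if (rows.size() != request.size()) {
            throw new IllegalArgumentException("previous features and request differ in record count");
        }
        columns.add(field());
        for (int i = 0; i < request.size(); i++) {
            // Validate that the essential key is present before using it.
            if (!request.get(i).containsKey(field())) {
                throw new IllegalArgumentException("record " + i + " is missing '" + field() + "'");
            }
            rows.get(i).add(String.valueOf(request.get(i).get(field())));
        }
        Map<String, Object> merged = new LinkedHashMap<>();
        merged.put("columnNames", columns);
        merged.put("rawData", rows);
        return merged;
    }
}

class StoreCategoryStep extends SingleFieldStep {
    @Override String field() { return "storeCategory"; }
}

class DayStep extends SingleFieldStep {
    @Override String field() { return "day"; }
}

// Hypothetical wrapper: the class whose name would go in the configuration file.
class SequentialExtractorWrapper implements FeatureExtractor {

    // Steps are created here so that the wrapper keeps a no-argument constructor.
    private final List<FeatureExtractor> steps = List.of(new StoreCategoryStep(), new DayStep());

    @Override
    public Map<String, Object> extractFeatures(List<Map<String, Object>> request, Map<String, Object> previous) {
        Map<String, Object> features = previous; // can be null
        for (FeatureExtractor step : steps) {
            features = step.extractFeatures(request, features);
        }
        // Same kind of check the platform is documented to apply to the output.
        for (String key : List.of("columnNames", "rawData")) {
            Object value = features.get(key);
            if (!(value instanceof List) || ((List<?>) value).isEmpty()) {
                throw new IllegalStateException("'" + key + "' is missing or empty");
            }
        }
        return features;
    }
}
```

Because every step returns its own column merged with what it received, the wrapper only has to pass the result of one step to the next.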

CustomFeatureExtractor Methods

The following are methods for CustomFeatureExtractor.

extractFeatures(var1, var2)

Returns the features extracted for a model at run time so that the model can make a prediction. The extracted features are returned as a map of key-value pairs.

Signature

public Map<String,Object> extractFeatures(List<Map<String,Object>> var1, Map<String,Object> var2)

Parameters

var1
Type: List<Map<String,Object>>
Represents the input parameters for feature extraction. For example, a recordId that the feature extractor's implementation logic requires for a DB query.
var2
Type: Map<String,Object>
Represents the map of previously extracted features when multiple classes are involved in feature extraction. This applies when your custom feature extractor implementation extends the default feature extractor implementation, that is, when FeatureExtractorType is set to Hybrid in the use case configuration file.

Return Value

Type: Map<String,Object>
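As a sketch of how the two parameters might be consumed together, the Java example here looks up one feature by the recordId in var1, passes through any values supplied under rawData, and places both after the features carried in var2. Everything in it is an assumption made for illustration: the class name LookupExtractorSketch, the bedrooms feature, and the in-memory BEDROOMS_BY_ID table that stands in for a DB query. It also assumes that a non-null var2 contains both columnNames and rawData with one row per record.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical extractor. The "recordId" and "rawData" request keys follow the examples on
// this page; BEDROOMS_BY_ID is assumed data that stands in for a database query.
class LookupExtractorSketch {

    private static final Map<String, String> BEDROOMS_BY_ID = Map.of("p1", "3", "p2", "1");

    @SuppressWarnings("unchecked")
    public Map<String, Object> extractFeatures(List<Map<String, Object>> var1, Map<String, Object> var2) {
        if (var1 == null || var1.isEmpty()) {
            throw new IllegalArgumentException("var1 must contain at least one record");
        }
        List<String> columnNames = new ArrayList<>();
        List<List<String>> rawData = new ArrayList<>();
        for (int i = 0; i < var1.size(); i++) {
            // LinkedHashMap keeps feature names and values in insertion order.
            Map<String, String> values = new LinkedHashMap<>();

            // 1. Features carried over from an earlier extractor (var2 can be null).
            if (var2 != null) {
                List<String> oldColumns = (List<String>) var2.get("columnNames");
                List<String> oldRow = ((List<List<String>>) var2.get("rawData")).get(i);
                for (int c = 0; c < oldColumns.size(); c++) {
                    values.put(oldColumns.get(c), oldRow.get(c));
                }
            }

            // 2. A feature looked up with the recordId from var1.
            String recordId = String.valueOf(var1.get(i).get("recordId"));
            if (!BEDROOMS_BY_ID.containsKey(recordId)) {
                throw new IllegalArgumentException("No data for recordId " + recordId);
            }
            values.put("bedrooms", BEDROOMS_BY_ID.get(recordId));

            // 3. Features supplied directly in the request under "rawData".
            Object raw = var1.get(i).get("rawData");
            if (raw instanceof Map) {
                for (Map.Entry<?, ?> entry : ((Map<?, ?>) raw).entrySet()) {
                    values.put(String.valueOf(entry.getKey()), String.valueOf(entry.getValue()));
                }
            }

            // Every record must yield the same features in the same order.
            List<String> names = new ArrayList<>(values.keySet());
            if (i == 0) {
                columnNames = names;
            } else if (!columnNames.equals(names)) {
                throw new IllegalArgumentException("record " + i + " has different features");
            }
            rawData.add(new ArrayList<>(values.values()));
        }
        Map<String, Object> features = new LinkedHashMap<>();
        features.put("columnNames", columnNames);
        features.put("rawData", rawData);
        return features;
    }
}
```

The order of the pass-through columns follows the iteration order of the rawData map in the request, so a caller that needs a fixed column order would have to supply an ordered map.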

CustomFeatureExtractor Example Implementation

This is an example implementation of the aiaccelerator.CustomFeatureExtractor interface.

  • This feature extraction implementation is only a test implementation. It doesn't run any DB queries. It expects every feature to be present in the rawData map of the request and returns those values in the final output.
  • The feature extraction parameters contain the rawData key, which can be used to pass the values of some features directly, if applicable.
  • In a real implementation, keys such as storeId and productId can be used as parameters in a DB query for feature derivation, and the rawData keys and values can be merged with the extracted features.
global virtual class SampleCustomFeatureExtractor implements CustomFeatureExtractor {

    private static final String RAW_DATA = 'rawData';
    private static final String COL_NAMES = 'columnNames';

    /**
    * A sample implementation that reads the "rawData" entry of each request record
    * and prepares a response with the columnNames list and the rawData list of lists.
    * @request:
    *   [{
    *        "storeId":"st1",
    *        "productId":"p1",
    *        "rawData":{
    *            "storeCategory":"PREMIUM",
    *            "day":"MONDAY"
    *         }
    *    }]
    * @return
    * {
    *   "rawData":[["PREMIUM", "MONDAY"]],
    *   "columnNames": ["storeCategory", "day"]
    * }
    * */
    global virtual Map<String,Object> extractFeatures(List<Map<String,Object>> request, Map<String,Object> extractedFeatures) {
        // Nothing to extract: pass the previously extracted features through unchanged.
        if (request == null || request.isEmpty()) {
            return extractedFeatures;
        }
        // A List (rather than a Set) keeps column names in the order their values are added to a row.
        List<String> cols = new List<String>();
        List<List<String>> rawDataList = new List<List<String>>();

        // Build one output row for each input record that carries a rawData entry.
        for (Map<String, Object> record : request) {
            if (!record.containsKey(RAW_DATA)) {
                continue;
            }
            List<String> row = new List<String>();
            Object value = record.get(RAW_DATA);
            if (value instanceof Map<String,Object>) {
                Map<String,Object> raw = (Map<String, Object>) value;
                for (String keyRawData : raw.keySet()) {
                    if (!cols.contains(keyRawData)) {
                        cols.add(keyRawData);
                    }
                    row.add((String) raw.get(keyRawData));
                }
            }
            rawDataList.add(row);
        }

        return mergeFeatures(extractedFeatures, cols, rawDataList);
    }

    private Map<String, Object> mergeFeatures(Map<String, Object> extractedFeatures, List<String> columnNames, List<List<String>> rawDataList) {
        Map<String, Object> features = new Map<String, Object>();
        // No usable previous features: return only the newly extracted ones.
        if (extractedFeatures == null || extractedFeatures.isEmpty() || extractedFeatures.get(COL_NAMES) == null
                || extractedFeatures.get(RAW_DATA) == null) {
            features.put(COL_NAMES, columnNames);
            features.put(RAW_DATA, rawDataList);
            return features;
        }
        // Nothing new was extracted: keep the previous features as they are.
        if (columnNames.isEmpty()) {
            return extractedFeatures;
        }
        List<String> oldCols = (List<String>) extractedFeatures.get(COL_NAMES);
        List<List<String>> oldRows = (List<List<String>>) extractedFeatures.get(RAW_DATA);
        List<String> extractedCols = new List<String>();
        List<List<String>> extractedRows = new List<List<String>>();
        extractedCols.addAll(oldCols);
        extractedCols.addAll(columnNames);

        // Append the new values to the matching previous row. This assumes that oldRows
        // holds at least as many rows as rawDataList, in the same record order.
        for (Integer i = 0; i < rawDataList.size(); i++) {
            List<String> mergedRow = new List<String>();
            mergedRow.addAll(oldRows.get(i));
            mergedRow.addAll(rawDataList.get(i));
            extractedRows.add(mergedRow);
        }
        features.put(COL_NAMES, extractedCols);
        features.put(RAW_DATA, extractedRows);
        return features;
    }
}