CustomFeatureExtractor Interface
Namespace
aiaccelerator
Usage
- columnNames (List<String>): The input feature names required by the model.
- rawData (List<List<String>>): A two-dimensional list of feature values. Each inner list holds the values for one record, in the same order as columnNames. The outer list holds one inner list per record, which allows bulk feature retrieval for multiple records.
Consider an ML use case that predicts the price of a real estate property. The ML model can require features such as the property's area in square feet, location, number of bedrooms, construction year, and construction age. Assume that all of these features are available in the property record except the construction age. In this case, the AI Accelerator API invokes a feature extractor to calculate the construction age at runtime by subtracting the construction year from the current year. Feature extraction can also use more complex logic, such as joining data across different entities or calculating the sum or average of some data over a period.
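The construction age calculation described above can be sketched in plain Java. This is a minimal illustration, not the platform's implementation: the class name, the helper method, and the key names constructionYear and constructionAge are all assumptions made for the example.

```java
import java.time.Year;
import java.util.HashMap;
import java.util.Map;

class ConstructionAgeSketch {

    // Hypothetical helper: returns a copy of the property record with a derived
    // constructionAge entry. The key names are assumptions for this sketch.
    static Map<String, Object> withConstructionAge(Map<String, Object> property, int currentYear) {
        Object year = property.get("constructionYear");
        // Validate the input before deriving the feature.
        if (!(year instanceof Integer)) {
            throw new IllegalArgumentException("constructionYear is missing or not an integer");
        }
        Map<String, Object> enriched = new HashMap<>(property);
        enriched.put("constructionAge", currentYear - (Integer) year);
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, Object> property = new HashMap<>();
        property.put("bedrooms", 3);
        property.put("constructionYear", 1998);
        // The current year is passed in so that the calculation stays easy to test.
        int currentYear = Year.now().getValue();
        System.out.println(withConstructionAge(property, currentYear).get("constructionAge"));
    }
}
```

Passing the current year as a parameter, rather than reading the clock inside the helper, keeps the derived value reproducible in tests.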
- Apex: The custom feature extractor implementation overrides the default feature extractor implementation.
- Hybrid: The default feature extractor implementation is used, and the custom feature extractor implementation can extend it.
- Java: The custom feature extractor implementation overrides the default feature extractor implementation.
- Every implementation receives two parameters: a List of Map records (recordList) and a Map of the previously extracted features.
- The Map of previously extracted features is populated when multiple Apex classes are involved in feature extraction or when intermediate extracted features must be passed from one class to the next.
- Add validations for input parameters in the implementation for your use case. Make sure that the validations check for the presence of essential keys or columns required for the implementation.
- Every implementation must merge newly extracted features with previously extracted features, and return the merged output. The previously extracted features can be null.
- If a use case requires the execution of multiple implementations in sequence or in parallel, provide a wrapper that invokes the required feature extractors. Specify the wrapper class name in the configuration file.
- For some use cases, some features must be extracted and others are provided as raw data in the input request. For such use cases, the implementation must produce the final output by merging the extracted features with the values from the raw data.
- For Java and Hybrid feature extractors, every Java implementation class must expose a default constructor that takes no parameters.
- The AI Accelerator platform validates the output map of feature extraction implementations to ensure that it contains a non-empty list of raw data and column names.
- Raw data is a List of Lists where each inner list represents the values of features for a record. The outer list supports bulk extraction.
- Column names store the list of feature names.
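The wrapper guideline above can be sketched as follows for a Java implementation. This is an illustration under stated assumptions, not the platform's API: FeatureExtractor is a stand-in interface declared locally so that the example compiles on its own, and KeyExtractor and SequentialExtractorWrapper are hypothetical classes. Each extractor validates its input, merges its new column with the previously extracted features (which can be null), and returns the merged map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExtractorChainSketch {

    // Stand-in for the platform interface (assumption, declared only for this sketch).
    interface FeatureExtractor {
        Map<String, Object> extractFeatures(List<Map<String, Object>> records, Map<String, Object> previous);
    }

    // Hypothetical extractor: appends one column whose value is read from each record by key.
    static class KeyExtractor implements FeatureExtractor {
        private final String key;

        KeyExtractor(String key) { this.key = key; }

        @Override
        @SuppressWarnings("unchecked")
        public Map<String, Object> extractFeatures(List<Map<String, Object>> records, Map<String, Object> previous) {
            List<String> columns = new ArrayList<>();
            List<List<String>> rows = new ArrayList<>();
            if (previous == null) {
                // No earlier features: start with one empty row per record.
                for (int i = 0; i < records.size(); i++) rows.add(new ArrayList<>());
            } else {
                columns.addAll((List<String>) previous.get("columnNames"));
                for (List<String> row : (List<List<String>>) previous.get("rawData")) {
                    rows.add(new ArrayList<>(row));
                }
                if (rows.size() != records.size()) {
                    throw new IllegalArgumentException("Row count does not match record count");
                }
            }
            columns.add(key);
            for (int i = 0; i < records.size(); i++) {
                Map<String, Object> rec = records.get(i);
                // Validate that the essential key is present before using it.
                if (!rec.containsKey(key)) {
                    throw new IllegalArgumentException("Missing required key: " + key);
                }
                rows.get(i).add(String.valueOf(rec.get(key)));
            }
            Map<String, Object> merged = new HashMap<>();
            merged.put("columnNames", columns);
            merged.put("rawData", rows);
            return merged;
        }
    }

    // Wrapper that runs its delegates in sequence, feeding each one the output of the previous.
    static class SequentialExtractorWrapper implements FeatureExtractor {
        private final List<FeatureExtractor> delegates = new ArrayList<>();

        // No-argument constructor, as required for Java implementation classes.
        public SequentialExtractorWrapper() {
            delegates.add(new KeyExtractor("storeCategory"));
            delegates.add(new KeyExtractor("day"));
        }

        @Override
        public Map<String, Object> extractFeatures(List<Map<String, Object>> records, Map<String, Object> previous) {
            Map<String, Object> features = previous;
            for (FeatureExtractor delegate : delegates) {
                features = delegate.extractFeatures(records, features);
            }
            return features;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> rec = new HashMap<>();
        rec.put("storeCategory", "PREMIUM");
        rec.put("day", "MONDAY");
        List<Map<String, Object>> records = new ArrayList<>();
        records.add(rec);

        Map<String, Object> features = new SequentialExtractorWrapper().extractFeatures(records, null);
        System.out.println(features.get("columnNames")); // prints [storeCategory, day]
        System.out.println(features.get("rawData"));     // prints [[PREMIUM, MONDAY]]
    }
}
```

Because every delegate returns the merged map, the wrapper only has to thread one variable through the chain. A parallel variant would need its own merge step and is not shown here.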
CustomFeatureExtractor Methods
The following are methods for CustomFeatureExtractor.
extractFeatures(var1, var2)
Signature
public Map<String,Object> extractFeatures(List<Map<String,Object>> var1, Map<String,Object> var2)
Parameters
- var1
- Type: List<Map<String,Object>>
- Represents the input parameters that drive feature extraction. For example, a recordId that the feature extractor's implementation logic needs for a DB query.
- var2
- Type: Map<String,Object>
- Represents the map of previously extracted features when multiple classes are involved in feature extraction. This applies when your custom feature extractor implementation extends the default feature extractor implementation, that is, when FeatureExtractorType is set to Hybrid in the use case configuration file.
Return Value
Type: Map<String,Object>
A map that contains the columnNames and rawData keys described in Usage.
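Before returning, an implementation can check its own output against the shape that the platform expects. The sketch below is a hypothetical self-check in plain Java, not the platform's validation logic. The non-empty checks follow the description in Usage. The check that each row has as many values as there are column names is an added assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OutputShapeSketch {

    // Hypothetical self-check: true when the map holds a non-empty columnNames list
    // and a non-empty rawData list whose rows match the column count.
    static boolean isValidOutput(Map<String, Object> output) {
        if (output == null) {
            return false;
        }
        Object cols = output.get("columnNames");
        Object rows = output.get("rawData");
        if (!(cols instanceof List) || !(rows instanceof List)) {
            return false;
        }
        List<?> colList = (List<?>) cols;
        List<?> rowList = (List<?>) rows;
        if (colList.isEmpty() || rowList.isEmpty()) {
            return false;
        }
        for (Object row : rowList) {
            // Assumption: every row carries exactly one value per column name.
            if (!(row instanceof List) || ((List<?>) row).size() != colList.size()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> output = new HashMap<>();
        output.put("columnNames", Arrays.asList("storeCategory", "day"));
        List<List<String>> rawData = new ArrayList<>();
        rawData.add(Arrays.asList("PREMIUM", "MONDAY"));
        output.put("rawData", rawData);
        System.out.println(isValidOutput(output)); // prints true
    }
}
```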
CustomFeatureExtractor Example Implementation
This is an example implementation of the aiaccelerator.CustomFeatureExtractor interface.
- This feature extraction implementation is only a test implementation. It doesn't run any DB queries. It expects every feature to be present in the rawData map of the request and returns those values in the final output.
- The feature extraction parameters contain the rawData key. Use this key to pass the values of some features directly, if applicable.
- In a real implementation, keys such as storeId and productId can be used as parameters in a DB query for feature derivation. The rawData keys and values can then be merged with the extracted features.
global virtual class SampleCustomFeatureExtractor implements aiaccelerator.CustomFeatureExtractor {

    private static final String RAW_DATA = 'rawData';
    private static final String COL_NAMES = 'columnNames';

    /**
     * A sample implementation that reads the "rawData" key from each request
     * record and builds a response with a columnNames list and a rawData
     * list of lists.
     * @request:
     * [{
     *   "storeId": "st1",
     *   "productId": "p1",
     *   "rawData": {
     *     "storeCategory": "PREMIUM",
     *     "day": "MONDAY"
     *   }
     * }]
     * @return
     * {
     *   "rawData": [["PREMIUM", "MONDAY"]],
     *   "columnNames": ["storeCategory", "day"]
     * }
     */
    global virtual Map<String,Object> extractFeatures(List<Map<String,Object>> request, Map<String,Object> extractedFeatures) {
        if (request == null || request.isEmpty()) {
            return extractedFeatures;
        }
        List<String> cols = new List<String>();
        List<List<String>> rawDataList = new List<List<String>>();

        for (Map<String, Object> record : request) {
            // Only records that carry a rawData entry contribute a row.
            if (!record.containsKey(RAW_DATA)) {
                continue;
            }
            List<String> row = new List<String>();
            Object value = record.get(RAW_DATA);
            if (value instanceof Map<String,Object>) {
                Map<String,Object> raw = (Map<String,Object>) value;
                // Take the column order from the first rawData map so that
                // every row lists its values in the same order as columnNames.
                if (cols.isEmpty()) {
                    cols.addAll(raw.keySet());
                }
                for (String col : cols) {
                    // String.valueOf avoids a cast failure on non-String values.
                    row.add(String.valueOf(raw.get(col)));
                }
            }
            rawDataList.add(row);
        }

        return mergeFeatures(extractedFeatures, cols, rawDataList);
    }

    // Merges the newly extracted columns and rows with the previously extracted features.
    private Map<String, Object> mergeFeatures(Map<String, Object> extractedFeatures, List<String> columnNames, List<List<String>> rawDataList) {
        Map<String, Object> features = new Map<String, Object>();
        // Nothing to merge with: return only the new features.
        if (extractedFeatures == null || extractedFeatures.isEmpty() || extractedFeatures.get(COL_NAMES) == null
                || extractedFeatures.get(RAW_DATA) == null) {
            features.put(COL_NAMES, columnNames);
            features.put(RAW_DATA, rawDataList);
            return features;
        }
        // No new features: return the previous features unchanged.
        if (columnNames.isEmpty()) {
            return extractedFeatures;
        }
        List<String> oldCols = (List<String>) extractedFeatures.get(COL_NAMES);
        List<List<String>> oldRows = (List<List<String>>) extractedFeatures.get(RAW_DATA);
        // Rows are merged by position, so both sides must describe the same records.
        if (oldRows.size() != rawDataList.size()) {
            throw new IllegalArgumentException('Previous and new features have different row counts.');
        }
        List<String> extractedCols = new List<String>();
        List<List<String>> extractedRows = new List<List<String>>();
        extractedCols.addAll(oldCols);
        extractedCols.addAll(columnNames);

        for (Integer i = 0; i < rawDataList.size(); i++) {
            List<String> mergedRow = new List<String>();
            mergedRow.addAll(oldRows.get(i));
            mergedRow.addAll(rawDataList.get(i));
            extractedRows.add(mergedRow);
        }
        features.put(COL_NAMES, extractedCols);
        features.put(RAW_DATA, extractedRows);
        return features;
    }
}