apsullivan

Apex CPU time limit when parsing large amount of JSON

We recently increased the size of a table in our back end by about tenfold, and the change is causing me to hit CPU limits in a batch callout job. I have attempted to rewrite the code to make several smaller calls, but ultimately I think the failure comes from assembling the large map. Is there any way around this without adding pagination to our back-end API call and then writing a bunch of classes? I'd love to contain this in a single class, but there is so much data being processed that I don't know how to get around it. Could I parse the JSON more efficiently? Or store it in a separate table and have two batch classes running: one to populate that table with values mirroring our back end, and another to do the sync?

Original Code:
global class AdminAPIAccountLookupSync_Batch implements Database.Batchable<sObject>, Database.AllowsCallouts, Database.Stateful {

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator([SELECT Id, Client_ID__c, Dash_Id__c FROM Account WHERE RecordType.Name = 'End User' OR RecordType.Name = 'Agency / Channel Partner']);
    }

    global void execute(Database.BatchableContext info, List<Account> scope) {
        //Establish Lists and Maps for later
        Map<String,String> mapIdMatch = new Map<String,String>();
        List<Account> listAccountsToUpdate = new List<Account>();

        try {
            String endpointAddition = '/clients';

            // refer to helper class to complete callout
            HttpResponse res = APICalloutHelper.getResponse(APICalloutHelper.buildAdminAPIGetRequest(endpointAddition));

            // parse JSON to extract the id-to-salesforce_id mappings
            JSONParser responseParser = JSON.createParser(res.getBody());

            String rowId;
            
            while (responseParser.nextToken() != null) {
                if (responseParser.getCurrentToken() == JSONToken.FIELD_NAME && responseParser.getText() == 'id') {
                    responseParser.nextToken();
                    rowId = responseParser.getText();
                } else if (responseParser.getCurrentToken() == JSONToken.FIELD_NAME && responseParser.getText() == 'salesforce_id') {
                    responseParser.nextToken();
                    if (responseParser.getText() != 'null') {
                        mapIdMatch.put(responseParser.getText(),rowId);
                    }
                }
            }
            
        } catch (Exception e) {
            System.debug(LoggingLevel.ERROR,'Error getting Admin response: ' + e.getMessage());
        }
        if (!mapIdMatch.isEmpty()) {
            for (Account acc : scope) {
                if (mapIdMatch.containsKey(acc.Client_ID__c)) {
                    acc.Dash_Id__c = mapIdMatch.get(acc.Client_ID__c);
                    listAccountsToUpdate.add(acc);
                }
            }
            update listAccountsToUpdate;
        }
    }

    global void finish(Database.BatchableContext info) {}
}


Rework Attempt (also hitting CPU limit, but maybe I'm on the right path?)
global class AdminAPIAccountLookupSync_Batch implements Database.Batchable<sObject>, Database.AllowsCallouts, Database.Stateful {

    public Map<String,String> mapIdMatch = new Map<String,String>();
    public List<Account> listAccounts = new List<Account>();

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator([SELECT Id, Client_ID__c, Dash_Id__c, API_Ref__c FROM Account WHERE RecordType.Name = 'End User' OR RecordType.Name = 'Agency / Channel Partner']);
    }

    global void execute(Database.BatchableContext info, List<Account> scope) {
        for (Account acc : scope) {
            listAccounts.add(acc);
        }
    }

    global void finish(Database.BatchableContext info) {
        Integer page = 1;
        while (page != null) {
            try {
                String endpointAddition = '/clients?rows_per_page=1000&page=' + String.valueOf(page);
                // refer to helper class to complete callout
                HttpResponse res = APICalloutHelper.getResponse(APICalloutHelper.buildAdminAPIGetRequest(endpointAddition));

                if (res.getBody() != null) {
                    // parse JSON to extract the id-to-salesforce_id mappings
                    JSONParser responseParser = JSON.createParser(res.getBody());
                    String rowId;
                    
                    while (responseParser.nextToken() != null) {
                        if (responseParser.getCurrentToken() == JSONToken.FIELD_NAME && responseParser.getText() == 'id') {
                            responseParser.nextToken();
                            rowId = responseParser.getText();
                        } else if (responseParser.getCurrentToken() == JSONToken.FIELD_NAME && responseParser.getText() == 'salesforce_id') {
                            responseParser.nextToken();
                            if (responseParser.getText() != 'null') {
                                mapIdMatch.put(responseParser.getText(),rowId);
                            }
                        }
                    }
                    page++;
                } else {
                    page = null;
                }
            } catch (Exception e) {
                page = null;
                System.debug(LoggingLevel.ERROR,'Error getting Admin response: ' + e.getMessage());
            }
        }

        List<Account> listAccountsToUpdate = new List<Account>();
        if (!mapIdMatch.isEmpty()) {
            for (Account acc : listAccounts) {
                if (mapIdMatch.containsKey(acc.Client_ID__c)) {
                    acc.Dash_Id__c = mapIdMatch.get(acc.Client_ID__c);
                    listAccountsToUpdate.add(acc);
                }
            }
            update listAccountsToUpdate;
        }
    }
}

 
Best Answer chosen by apsullivan
Daniel Ballinger
Batch Apex is usually pretty good with the Heap Size. I think it is currently 12MB in async contexts. That said, you need to be careful about how much data you keep on the heap as you process the JSON. There is always going to be an upper limit on how much data you can process. Maybe you could load the data from the JSON into a custom object that provides temporary storage. That could then be used as the basis of the batch job.

Certainly updating the external app to provide just the data you need will make the processing easier in Salesforce.
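As a rough sketch of that staging idea — `Dash_Client__c`, `External_Row_Id__c`, and `Salesforce_Client_Id__c` are placeholder names, not objects from the original post:

```apex
// Sketch only: parse the API response and stage each row in a custom object
// instead of holding the whole map in memory for the sync step.
public static void stageClientRows(String responseBody) {
    List<Dash_Client__c> rows = new List<Dash_Client__c>();
    JSONParser parser = JSON.createParser(responseBody);
    String rowId;
    while (parser.nextToken() != null) {
        if (parser.getCurrentToken() == JSONToken.FIELD_NAME && parser.getText() == 'id') {
            parser.nextToken();
            rowId = parser.getText();
        } else if (parser.getCurrentToken() == JSONToken.FIELD_NAME && parser.getText() == 'salesforce_id') {
            parser.nextToken();
            if (parser.getText() != 'null') {
                rows.add(new Dash_Client__c(
                    External_Row_Id__c = rowId,
                    Salesforce_Client_Id__c = parser.getText()));
            }
        }
    }
    insert rows;
}
```

Note that DML is limited to 10,000 rows per transaction, so a very large response would still need to be inserted in chunks.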

All Answers

Daniel Ballinger
I don't think moving the processing into the batch job finish method is the correct way to scale this functionality. 

My approach would be to extend the external service to provide data for the Ids you specify. E.g.
  1. Run batch job, and adjust scope size to only process a limited number of matching accounts per execution.
  2. In the execute method, call the external service to get the data for just the records being processed in that iteration
I'd justify this as being more scalable. You can always reduce the batch scope all the way down to 1 to keep the processing time down. Then it won't matter how many Account records there are. It might take longer, but it will get through all of them eventually.
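A sketch of that shape, assuming the external service were extended to accept a (hypothetical) `client_ids` filter — this would replace the `execute` method in the original class:

```apex
// Sketch: one callout per execute() chunk, fetching only the rows for
// the Accounts in this scope. The client_ids parameter is hypothetical;
// the real API would need to be extended to support it.
global void execute(Database.BatchableContext info, List<Account> scope) {
    List<String> clientIds = new List<String>();
    for (Account acc : scope) {
        clientIds.add(acc.Client_ID__c);
    }
    String endpointAddition = '/clients?client_ids=' + String.join(clientIds, ',');
    HttpResponse res = APICalloutHelper.getResponse(
        APICalloutHelper.buildAdminAPIGetRequest(endpointAddition));
    // parse only this chunk's rows into mapIdMatch, then update scope as before
}
```

Launching with a small scope, e.g. `Database.executeBatch(new AdminAPIAccountLookupSync_Batch(), 50);`, keeps the CPU time per execution down regardless of the total Account count.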
apsullivan
Hey Daniel - Thanks for the reply! I agree that moving the processing into the finish method is probably not the way to go.

I think your approach makes sense. The issue is that the id I currently have isn't one I can use to query individual records for the id I actually need. I think the solution here is going to require a change to our API.

As far as reducing batch scope, that won't help, since the bottleneck is processing the data returned from the API rather than processing the records in SFDC; with this generalized API call, every batch execution returns a huge blob of data. Ultimately, I think SFDC just isn't equipped to parse a giant JSON blob like this. I need to move to an approach where I query individual client data in a loop based on a different Id instead of trying to pull everything out of one big data set.

Thanks again.
apsullivan
Hey Daniel,

I think your Custom Object solution might be the way to go. Do you think I actually need to insert records into a Custom Object table to avoid the heap upper limit? In other words, parse the JSON, assign what matters to me to object fields, insert the records, then query my inserted records and compare the objects to my Account object to sync the ids?
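Sketching that two-step idea (again with a placeholder `Dash_Client__c` staging object), the second batch could query the staged rows and sync them back to Accounts:

```apex
// Sketch of the follow-up sync batch; Dash_Client__c and its fields
// are placeholder names for the staging object described above.
global class AdminAPIAccountSyncFromStaging_Batch implements Database.Batchable<sObject> {

    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            [SELECT External_Row_Id__c, Salesforce_Client_Id__c FROM Dash_Client__c]);
    }

    global void execute(Database.BatchableContext info, List<Dash_Client__c> scope) {
        // build a small map for just this chunk of staged rows
        Map<String, String> mapIdMatch = new Map<String, String>();
        for (Dash_Client__c row : scope) {
            mapIdMatch.put(row.Salesforce_Client_Id__c, row.External_Row_Id__c);
        }
        List<Account> toUpdate = new List<Account>();
        for (Account acc : [SELECT Id, Client_ID__c, Dash_Id__c FROM Account
                            WHERE Client_ID__c IN :mapIdMatch.keySet()]) {
            acc.Dash_Id__c = mapIdMatch.get(acc.Client_ID__c);
            toUpdate.add(acc);
        }
        update toUpdate;
    }

    global void finish(Database.BatchableContext info) {}
}
```

Because each execute chunk here is self-contained, no `Database.Stateful` is needed and the map stays small, which avoids both the heap and CPU pressure of assembling everything at once.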