Scaling Data Access With App Layer Cache

Applications like CRM are always active; requests come from sales users, service users, APIs, report executions, and community users. All of this access keeps the system busy, and because these are business transactions, the data is critical and the database is a precious resource.

The usual CRM traffic pattern is not a concern; what impacts application scale are use cases where users repeatedly request the same data to check whether records have changed or new data is available, such as users trying to get the latest lead or the most recent case. Developers, admins, and architects alike need to watch out for such use cases. Let’s explore why this is important.

Data, use cases, and impact

Impact on the health of an application comes from these types of users, use cases, and data:

  1. Thousands of sales or service agents wanting to retrieve the latest lead or case.
  2. Requests that poll, refresh, or reload pages, resulting in millions of requests in a short amount of time.
  3. Data that is dynamic, but between changes the database is queried at high frequency to check for updates.

And the impact:

  1. Database CPU spikes up during these heavy usage times
  2. Reduced capacity for other applications
  3. Performance slows down for applications and pages
  4. Dashboard and report executions time out
  5. User frustration and complaints

All of this leads to a lack of scalability for the application, a potential slowdown of business growth, and increased costs.

There are a few solutions available; however, businesses prefer solutions that are quick to implement and cost-effective. Caching is a potential solution in such situations.

Let’s talk about caching

Caching has been around for a while. It is typically used to speed up webpages and protect backend systems from too many user requests. These days some sort of cache is available in all layers of an application stack.

These caches help tremendously with scaling and performance; however, they are not sufficient for high-impact use cases. We need a new caching layer that caches data for short periods and intercepts requests from the client.

Intro to application layer caching

To address these high-impact use cases, we introduce a new cache layer called the application layer cache. This cache is connected to the application server and is accessible from the application code.

This cache layer is an in-memory cache for fast retrieval and is implemented using Redis, an open source caching technology. Data is stored as key-value pairs. Data structures such as lists, sets, and hashes can be used along with primitive datatypes such as numbers and strings. The application layer cache also supports partitioning. At Salesforce, Lightning Platform Cache is the feature we provide to enable application layer caching for applications on the platform.
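As a rough illustration of these mechanics (not the Platform Cache or Redis API; all names here are invented for the sketch), an application layer cache behaves like an in-memory key-value store with namespaced partitions and short-lived entries:

```python
import time

class AppLayerCache:
    """Minimal in-memory key-value cache sketch with partitions and TTL.

    Illustrative only; a real implementation (e.g. Redis-backed
    Platform Cache) adds eviction, size limits, and richer types.
    """

    def __init__(self):
        # partition name -> {key: (value, expiration timestamp)}
        self._partitions = {}

    def put(self, partition, key, value, ttl_seconds=300):
        # Entries are short-lived by design: each carries an expiry time.
        expires_at = time.monotonic() + ttl_seconds
        self._partitions.setdefault(partition, {})[key] = (value, expires_at)

    def get(self, partition, key):
        entry = self._partitions.get(partition, {}).get(key)
        if entry is None:
            return None                    # never stored, or already evicted
        value, expires_at = entry
        if time.monotonic() > expires_at:  # expired entry behaves like a miss
            del self._partitions[partition][key]
            return None
        return value

cache = AppLayerCache()
cache.put("offers", "cust001", {"discount": "35%"})
print(cache.get("offers", "cust001"))   # {'discount': '35%'}
print(cache.get("offers", "missing"))   # None
```

Each partition acts as its own namespace, so the same key can safely exist in two partitions used by different applications.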

Using partitions and instances

Partitioning distributes data among nodes of the cache. This enables different partitions to be allocated to different applications or use cases. Each partition acts as a unique namespace for all keys.

A partition can have two types of cache: a common cache, called the org cache, which is accessible to all users of the application; and a session cache, which stores data private to a user and is attached to each logged-in user session. The session cache is deleted at the end of the session.
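Conceptually, the two scopes can be modeled as one shared store plus one private store per session (a sketch with hypothetical names, not the Platform Cache API):

```python
class PartitionedCache:
    """Sketch of a cache partition exposing an org (shared) scope and
    per-session scopes that disappear when the session ends."""

    def __init__(self):
        self.org = {}        # org cache: visible to every user of the app
        self._sessions = {}  # session id -> private key-value store

    def session(self, session_id):
        # Lazily create the private store attached to a logged-in session.
        return self._sessions.setdefault(session_id, {})

    def end_session(self, session_id):
        # The session cache is deleted at the end of the session.
        self._sessions.pop(session_id, None)

cache = PartitionedCache()
cache.org["promo"] = "summer-sale"        # all users can read this
cache.session("sess-1")["offer"] = "35%"  # only session sess-1 sees this
cache.end_session("sess-1")               # private data is discarded
```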

On to the solution

Now that we are familiar with cache and its internals, let’s see how it can be implemented for some high impact use cases.

Use case

Let’s take one use case and see how the cache can help. In this use case, agents take reservations from customers. The process involves giving offers based on loyalty programs. The offer is written to the database by an external API call. To check whether an offer exists, the agent’s application constantly polls the server. Thousands of agents do the same, and this causes the database CPU to spike.

We will refer to this use case throughout the rest of this document as the offers use case.

Implementing the app layer cache

The cache solution works as follows. Here, the session cache is used because each offer is unique to an agent, based on the reservation taken by that agent.


The business flow:

  1. An offer record is created via an API call.
  2. A database trigger fires on insert or update.
  3. The trigger code puts the record into the app layer cache (session cache).
  4. The client retrieves the record from the session cache.
  5. The cached record is shown to the user.

This pattern is called the write-through pattern: the cache is written immediately when the data is created or updated in the database.

This implementation ensures client requests never go to the database. All requests are served by the app layer cache, protecting the database from CPU spikes.
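The numbered flow can be sketched generically (Python for brevity; the class and field names are invented for illustration):

```python
class WriteThroughStore:
    """Sketch of the write-through pattern: every database write also
    updates the cache, so client reads can be served from the cache."""

    def __init__(self):
        self.database = {}  # stand-in for the real database
        self.cache = {}     # stand-in for the app layer (session) cache

    def save_offer(self, customer_id, offer):
        # Steps 1-3: record is saved, and the "trigger" writes the cache.
        self.database[customer_id] = offer
        self.cache[customer_id] = offer

    def get_offer(self, customer_id):
        # Steps 4-5: clients read only the cache; the database is untouched.
        return self.cache.get(customer_id)

store = WriteThroughStore()
store.save_offer("cust001", {"discount": "35%"})
print(store.get_offer("cust001"))  # {'discount': '35%'}
```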

Here are Apex code samples for the offers use case: a trigger that adds data to the cache, and a method that reads it back.

trigger OfferTrigger on Offer__c (after insert, after update) {
    // Write-through: cache each new or updated offer, keyed by customer Id.
    // 'local.customers' and the Customer__c field are example names.
    Cache.SessionPartition cachePartition = Cache.Session.getPartition('local.customers');
    for (Offer__c offer : Trigger.new) {
        cachePartition.put(String.valueOf(offer.Customer__c), offer);
    }
}

public class OfferCache {
    // Reads the cached offer for a customer; returns null if no offer exists yet.
    public static Offer__c getCustomerOffer(Id customerId) {
        Offer__c result;
        Cache.SessionPartition cachePartition = Cache.Session.getPartition('local.customers');
        if (cachePartition.contains(String.valueOf(customerId))) { // cache hit
            result = (Offer__c) cachePartition.get(String.valueOf(customerId));
        } else {
            result = null; // Offer not available yet
        }
        return result;
    }
}

Challenges

Race conditions

Not all data can be cached. Data that is shared among multiple users and constantly updated can lead to race conditions.

Knock knock! Who’s there? Race Condition! Race condition who? A race condition happens when multiple threads update the same record concurrently. Each of them writes the data it has into the cache. The last update is the “winner”; however, the winner might not have the latest version of the record, resulting in users getting stale data.

Workaround

Race conditions can be worked around using a lazy load pattern. In this pattern, triggers are not used to update the cache. Instead, one user is designated to read data directly from the database and write it to the cache. Since that data is visible to all users, the rest of the users read from the cache.

There is a slight overhead in that one user making frequent database requests. However, when a large volume of users use the system concurrently, allowing one user to access the database directly has far less impact.
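A minimal sketch of this lazy load pattern (hypothetical names; the "designated reader" would typically run on a schedule or poll on behalf of everyone):

```python
class LazyLoadCache:
    """Sketch of the lazy-load workaround: only one designated reader
    hits the database and refreshes the cache; all other users read
    the cache, avoiding both database load and write races."""

    def __init__(self, fetch_from_db):
        self._fetch = fetch_from_db  # callable that queries the real database
        self.cache = {}

    def refresh(self, key):
        # Called only by the single designated reader.
        self.cache[key] = self._fetch(key)

    def read(self, key):
        # Everyone else only ever touches the cache.
        return self.cache.get(key)

database = {"cust001": "offer-35%"}   # stand-in for the database
cache = LazyLoadCache(database.get)
cache.refresh("cust001")              # one user loads from the database
print(cache.read("cust001"))          # offer-35%
```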

Cache miss

A cache miss is when the application expects data to be in the cache but can’t find it there. This causes the application to go to the database to get that data. Too many cache misses will reduce the effectiveness of the cache.

Cache misses happen because data in the cache is non-durable; cached data may be evicted when space runs short. User code should handle cache misses.

As shown in the earlier code snippet, the cache is checked to see whether a key exists. If it doesn’t, a fallback path retrieves the data from the database and caches it for subsequent requests.
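That fallback flow is the general "get or load" idiom, sketched here generically (hypothetical names, Python for brevity):

```python
def get_or_load(cache, key, load_from_db):
    """Sketch of cache-miss handling: check the cache first; on a miss,
    fall back to the database and cache the result for later requests."""
    value = cache.get(key)
    if value is not None:        # cache hit: database untouched
        return value
    value = load_from_db(key)    # cache miss: go to the database
    cache[key] = value           # populate for subsequent requests
    return value

database = {"u1": {"username": "agent1", "active": True}}
cache = {}
first = get_or_load(cache, "u1", database.get)   # miss: loads from database
second = get_or_load(cache, "u1", database.get)  # hit: served from cache
```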

Platform Cache provides the Cache.CacheBuilder interface, which helps with this use case.

//A coding best practice: use the Platform Cache Cache.CacheBuilder interface,
//which makes it easy to handle cache misses.
//Retrieve the logged-in user's information from the cache.

public class UserInfoCache implements Cache.CacheBuilder {
    // doLoad() runs only on a cache miss; its return value is cached automatically.
    public Object doLoad(String userId) {
        User u = [SELECT Id, IsActive, Username FROM User WHERE Id = :userId];
        return u;
    }
}

// To retrieve a cached value:
String userId = UserInfo.getUserId();
User loggedIn = (User) Cache.Org.get(UserInfoCache.class, userId);

// Returns the cached value if it exists; otherwise executes doLoad() to populate the cache.

Benefits

The tables below show results from monitoring the app layer cache implementation for the offers use case. Before the cache was implemented, the database was the target of requests from agents checking for new offers, causing high database CPU usage. After implementing the cache, we see a huge reduction, as shown in the second table.

Before implementation:

Total Requests/Day | DB CPU Usage | DB CPU % | Database Queries/Day | Avg Response Time
12 Million         | 83 Hours     | 7 ~ 8%   | 12 Million           | 30ms

After app layer cache implementation:

Total Requests/Day | DB CPU Usage | DB CPU % | Database Queries/Day | Avg Response Time
11.39 Million      | < 1 minute   | 0%       | 6K                   | 6 ~ 11ms

Based on this data, we see the cache has a significant impact: it reduces database usage from 83 hours to under 1 minute, and it benefits application performance by cutting the response time by more than half.

Some of the other benefits include:

Area of impact                            | Benefit
Reduced number of database nodes          | Significant cost savings
Capacity released for critical operations | Scales business processes
Simplified data access                    | Switching to cache access is easy
Better performance                        | Cache access is faster

Conclusion

Designing for scale is critical as a business grows. Scale in transaction-based apps on the Lightning Platform depends on critical resources like the database. Data can be accessed millions of times before it changes.

Application layer caching is a solution that protects databases from such heavy load. Caching needs to be planned carefully, as there can be challenges.

About the author

Anil Jacob is a Lead Software Engineer on the Frontier Scale team at Salesforce. He works on large and complex customer implementations and related scale challenges. His areas of interest are application scale, user experience, UX performance, and application development and business scale. Prior to Salesforce, he was with Intuit, Bea Weblogic, and Wells Fargo Bank.
