A while back, a few of the developer evangelists were discussing something in our Slack channel and I was talking about having found some Apex code online that wasn’t written as a bulk operation. My colleague Kevin Poorman slipped in a “Sounds like a blog post, Peter.” I replied, “Haven’t we done bulkification to death?”
The upshot was something to the effect of “You can probably never talk about this enough.” Also, platform events introduce a new place where we have to think about bulkification. So here is my blog post!
“Bulkification,” you say?
If you’re thinking “What is ‘bulkification’?” then let me give you a wholehearted “Welcome to the world of developing on the Salesforce platform!” I’m so pleased you’re taking the time to read this post!
As I contemplated this blog post, knowing I should spend a few words introducing bulkification, I was wondering if I could come up with a fresh metaphor to explain the importance of bulkification, when my children presented me with exactly what I was looking for.
Each night after they put on their pajamas, the bedroom floor is inevitably covered with their cast-off dirty clothes from the day. And each time I find myself reminding them they need to get those clothes into the laundry hamper in the hallway. So how do they do that? They’ll pick up a pair of trousers and walk that one pair of trousers to the hamper, then go back. Then a shirt, then a sock, etc.
[Disclaimer: This is not my son, but I now aspire to teach my kids to do laundry like this.]
With Salesforce, the fundamental principle of bulkification is that certain potentially resource-heavy operations must be performed on lists of data, rather than one record at a time. Most commonly these are interactions with platform subsystems such as the database. For instance, if you are querying, writing to the database, or working in a trigger, there are well-defined APIs and patterns for dealing with lists of data.
In the Chittum household, daddy calls foul when too many trips are being made to the laundry hamper. In the world of Salesforce, to gently remind you of the need to be efficient, Apex enforces resource constraints with governor limits, which define how many times you can undertake one of these potentially resource-intensive actions.
Specifically, you get a maximum number of database operations and a maximum number of database rows per execution context. For instance, you can invoke the DML (insert, update, delete, etc.) or query APIs only a fixed number of times (currently 150 DML statements and 100 SOQL queries in a synchronous transaction) before the governor limit stops your transaction with an exception you can’t catch. These are the operational limits. Triggers have their own batching rule: a trigger is invoked with up to 200 records at a time. If more than 200 records are being acted upon, the trigger is invoked again with the next 200 records, and so on.
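If you want to see how close a transaction is to those bumpers, the standard Limits class reports both current usage and the ceiling. Here is a minimal sketch meant to be run as anonymous Apex; the Account query is only there to give the counters something to count.

```apex
// Anonymous Apex sketch: inspect governor limit usage in the current transaction.
// The query below is illustrative; any SOQL statement increments the counters.
List<Account> accounts = [SELECT Id FROM Account LIMIT 10];

// Operational limits: how many times we have called into the database.
System.debug('SOQL queries:   ' + Limits.getQueries() + ' of ' + Limits.getLimitQueries());
System.debug('DML statements: ' + Limits.getDmlStatements() + ' of ' + Limits.getLimitDmlStatements());

// Row limits: how many records we have read or written so far.
System.debug('Query rows:     ' + Limits.getQueryRows() + ' of ' + Limits.getLimitQueryRows());
System.debug('DML rows:       ' + Limits.getDmlRows() + ' of ' + Limits.getLimitDmlRows());
```

Dropping a couple of these debug lines into a suspicious code path is a quick way to spot a query or DML statement hiding inside a loop.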
These platform limits lead to several well-defined patterns. The three most common patterns are around DML (modifying data in the database), SOQL Queries (retrieving data from the database), and handling of records being operated on by trigger code.
The bulk DML pattern
Whenever you change data in the database, do it outside of loops. Take deleting related records as an example. There are two steps: first collect the records to delete, then delete the collected records with a single DML statement.
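Here is one way those two steps might look. This is a sketch, not a canonical implementation: the class name RelatedRecordCleaner and the choice of Account and Contact are assumptions for illustration, and the method expects accounts that were queried with a Contacts subquery.

```apex
// Hypothetical helper illustrating the collect-then-act pattern for DML.
public with sharing class RelatedRecordCleaner {

    // Deletes all contacts under the given accounts with one DML statement.
    // Assumes each account was queried with (SELECT Id FROM Contacts).
    public static void deleteRelatedContacts(List<Account> accountsWithContacts) {
        // Step 1: collect. The loop only gathers records; no database calls here.
        List<Contact> contactsToDelete = new List<Contact>();
        for (Account acct : accountsWithContacts) {
            contactsToDelete.addAll(acct.Contacts);
        }

        // Step 2: act. One delete, no matter how many accounts came in.
        if (!contactsToDelete.isEmpty()) {
            delete contactsToDelete;
        }
    }
}
```

A caller would pass in something like the result of SELECT Id, (SELECT Id FROM Contacts) FROM Account WHERE Id IN :accountIds. Whether that is 1 account or 200, the method spends exactly one DML statement.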
The bulk query pattern
Whenever you query the database, that should also be done outside of any loop. Suppose we want to search all contacts for a set of email addresses. Again, we collect the set of email addresses first, then perform one query.
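As a sketch of that idea, assume the email addresses come from a list of leads (the ContactLookup class and the use of Lead are my own illustration, not something the platform prescribes):

```apex
// Hypothetical helper illustrating the collect-then-act pattern for SOQL.
public with sharing class ContactLookup {

    // Returns every contact whose email matches one of the given leads,
    // using a single SOQL query regardless of how many leads are passed in.
    public static List<Contact> findByLeadEmails(List<Lead> leads) {
        // Step 1: collect the email addresses. A Set removes duplicates for free.
        Set<String> emails = new Set<String>();
        for (Lead l : leads) {
            if (String.isNotBlank(l.Email)) {
                emails.add(l.Email);
            }
        }

        // Nothing to look up, so skip the query entirely.
        if (emails.isEmpty()) {
            return new List<Contact>();
        }

        // Step 2: act. One query with an IN clause bound to the collected set.
        return [SELECT Id, Email FROM Contact WHERE Email IN :emails];
    }
}
```

The early return is a small bonus: when there is nothing to search for, the method doesn’t spend a query at all.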
For both DML and SOQL, the pattern is similar — collect, then act. The antipattern is to invoke the query or the DML statement inside of the loop. Don’t do that. Just. Don’t.
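To make the contrast concrete, here is a sketch that puts the antipattern next to its bulk equivalent. The ContactPurge class and both method names are hypothetical, chosen only for this illustration.

```apex
// Hypothetical class contrasting the antipattern with the bulk pattern.
public with sharing class ContactPurge {

    // ANTIPATTERN: one query and one DML statement per account.
    // With more than 100 accounts, the SOQL limit ends the transaction.
    public static void oneAtATime(List<Account> accounts) {
        for (Account acct : accounts) {
            List<Contact> contacts = [SELECT Id FROM Contact WHERE AccountId = :acct.Id];
            delete contacts;
        }
    }

    // BULK: one query and one DML statement in total.
    // Binding a list of sObjects in an IN clause matches on their Ids.
    public static void allAtOnce(List<Account> accounts) {
        List<Contact> contacts = [SELECT Id FROM Contact WHERE AccountId IN :accounts];
        if (!contacts.isEmpty()) {
            delete contacts;
        }
    }
}
```

Both methods leave the data in the same state for a small input. Only the second one still works when the list grows to a full trigger batch.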
The bulk trigger pattern
In triggers, the Trigger.new variable stores a list of up to 200 records that are currently being acted upon in the database. You should always iterate over those records.
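A minimal sketch of a trigger written this way follows. The trigger name and the default value are assumptions for illustration; the point is that nothing in it ever assumes Trigger.new holds exactly one record.

```apex
// Hypothetical trigger: fills in a default Account Site when none is given.
trigger AccountDefaults on Account (before insert, before update) {
    // Trigger.new may hold anywhere from 1 to 200 records.
    // Never reach for Trigger.new[0] and assume it is the only one.
    for (Account acct : Trigger.new) {
        if (String.isBlank(acct.Site)) {
            // In a before trigger, field changes are saved automatically,
            // so no DML statement is needed here at all.
            acct.Site = 'Unknown';
        }
    }
}
```

If the trigger needed related data, the loop would first collect Ids into a set, and a single query after the loop would fetch everything at once.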
Platform events and EventBus
Platform events are relatively new and are an exciting innovation to the Lightning Platform. Previously, the architecture of the platform worked in either a synchronous transaction model or in an asynchronous fire-and-forget model (as found in asynchronous Apex implementations).
Platform events create a first-class event-driven publish-subscribe communication architecture on the platform. Internally, they are surfaced in Apex and Flow and externally, they can be either published to or subscribed to via APIs. To learn the basics of platform events, be sure to check out the Trailhead module Platform Event Basics and the project Build an Instant Notification App.
The first time I looked at platform events, I was doing a code review for a colleague. Platform events were a pretty new technology at the time, and he was also relatively new to the platform. His code published a single event per method call, and that method was in turn invoked by a trigger, once per iteration!
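The original listing isn’t available, so what follows is only a rough reconstruction of that shape. Everything named here is hypothetical: the platform event Order_Shipped__e, its text field Order_Number__c, the ShipmentNotifier class, and the trigger on Order.

```apex
// Hypothetical reconstruction of the antipattern. Don't copy this.
// Assumes a platform event Order_Shipped__e with a text field Order_Number__c.
public with sharing class ShipmentNotifier {

    // Publishes exactly one event per call.
    public static Database.SaveResult notifyShipped(String orderNumber) {
        Order_Shipped__e evt = new Order_Shipped__e(Order_Number__c = orderNumber);
        return EventBus.publish(evt);
    }
}

// In a separate file: the trigger that calls it once per record.
trigger OrderShipped on Order (after update) {
    for (Order o : Trigger.new) {
        // One publish call per iteration: up to 200 calls for a full batch.
        ShipmentNotifier.notifyShipped(o.OrderNumber);
    }
}
```

Each publish call counts toward a per-transaction governor limit, so a full batch of 200 orders is more than this code can handle.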
At first glance, I didn’t think anything of it. But as I looked more closely, the first thing that jumped out at me was that the return type for the publish method was Database.SaveResult. This is the same return type used by Database DML methods such as Database.insert and Database.update. Reading the API reference for EventBus.publish, I discovered that publish is overloaded and can also take and return lists!
Sure enough, if I’m going to fire platform events from an InvocableAction or any other potentially list-based source of many events, I need to ensure my invocation of EventBus.publish is also bulk. The good news: if you’ve grokked the bulk DML and SOQL examples, there is nothing new here. It’s the same pattern: collect, then act.
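Here is a sketch of the bulk version, using the list overload of EventBus.publish. The platform event Order_Shipped__e, its text field Order_Number__c, and the BulkShipmentNotifier class are all hypothetical names for illustration.

```apex
// Hypothetical bulk publisher.
// Assumes a platform event Order_Shipped__e with a text field Order_Number__c.
public with sharing class BulkShipmentNotifier {

    // Publishes one event per order number with a single EventBus.publish call.
    public static List<Database.SaveResult> notifyShipped(List<String> orderNumbers) {
        // Step 1: collect the events to publish.
        List<Order_Shipped__e> events = new List<Order_Shipped__e>();
        for (String orderNumber : orderNumbers) {
            events.add(new Order_Shipped__e(Order_Number__c = orderNumber));
        }

        // Step 2: act. One call publishes the whole list.
        List<Database.SaveResult> results = EventBus.publish(events);

        // A failed publish does not throw, so check each result.
        for (Database.SaveResult sr : results) {
            if (!sr.isSuccess()) {
                for (Database.Error err : sr.getErrors()) {
                    System.debug(LoggingLevel.ERROR,
                        'Publish failed: ' + err.getStatusCode() + ' - ' + err.getMessage());
                }
            }
        }
        return results;
    }
}
```

A trigger would then gather the order numbers from Trigger.new in its loop and call notifyShipped once after the loop ends.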
Triggers on platform events
One way to “subscribe” to a platform event on the platform is an Apex trigger. Once events are published, the trigger processes them, just like an Apex trigger on SObject records. There are two primary differences for platform event triggers:
- The triggers are asynchronous and occur after the event has been fully persisted to the Event Bus.
- Up to 2000 (not 200) events can be processed by one trigger execution.
This is yet another place where bulkification is critically important.
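A subscriber trigger written with that batch size in mind might look like the sketch below. The platform event Order_Shipped__e, its field Order_Number__c, and the idea of stamping the matching orders’ Description are all assumptions made for the example.

```apex
// Hypothetical subscriber trigger.
// Assumes a platform event Order_Shipped__e with a text field Order_Number__c.
// Platform event triggers only support the after insert event.
trigger OrderShippedSubscriber on Order_Shipped__e (after insert) {
    // Trigger.new can hold up to 2,000 events, so collect first.
    Set<String> orderNumbers = new Set<String>();
    for (Order_Shipped__e evt : Trigger.new) {
        if (String.isNotBlank(evt.Order_Number__c)) {
            orderNumbers.add(evt.Order_Number__c);
        }
    }

    // One query for every order referenced in this batch of events.
    List<Order> ordersToUpdate = new List<Order>();
    for (Order o : [SELECT Id FROM Order WHERE OrderNumber IN :orderNumbers]) {
        o.Description = 'Shipment event received';
        ordersToUpdate.add(o);
    }

    // One DML statement for the whole batch.
    if (!ordersToUpdate.isEmpty()) {
        update ordersToUpdate;
    }
}
```

With 2,000 events in a single execution, a query or update inside the first loop would hit the governor limits ten times faster than in a regular trigger.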
Keeping up with bulk
The Lightning Platform is in a constant state of evolution. At a developer group meeting back in April, I was talking with some developers and we were all remarking how five years ago, you could wrap your head around the whole platform. It seemed possible to kind of know everything. Today, it has grown to a degree where that doesn’t seem possible anymore. But the basics are still the basics.
Bulkification is not actually a Salesforce invention. Too many calls to the database (death by a thousand cuts) and too much data saved or retrieved at once (pig in a python) are well-known performance problems that can stem from software design anti-patterns. These should always be concerns in implementing your solution, regardless. With Salesforce’s multi-tenant architecture, of course we put bumpers on with specific transaction limits to protect your org from some other developer in some other tenant’s bad code (You don’t ever design or write bad code, right?). As the platform evolves, it is always good to keep an eye out for new places where code optimization is not just the right thing to do, but it’s something that you must do in order to not crash up against the bumpers.