Do you use a long list of filters in your reports or SOQL statements to exclude “noisy” data from your query results? Do you wonder why your requests take so long to run, even when they return only a few hundred rows?
You might be able to overcome your performance issue with indexed formula fields, which this blog post explains in detail.
Avoiding Inequality Filters to Improve Force.com Query Performance
Even if you have already defined your business processes and set up field validation rules, you might still have to account for “noisy” data when running a report. And while you don’t want your analysis skewed by that data, you also don’t want to continue adding values to your picklists to rule out only a few exceptions. Adding inequality filters to your query might solve this problem, but consider how they affect performance, as they do in the following example.
Alex built a customer contact list for his organization’s customer services team. Originally, the business requirement was simply to follow up with every customer who scored lower than 3.5 (out of 5) points on a customer satisfaction survey. Then, the team told Alex that there were some situations in which the team should not reach out to its customers.
Alex is still getting additional requests, and he realizes that his list of filters is getting quite long. He also recognizes that the report request is taking longer to complete as the number of records in the Customer Feedback custom object increases over time.
Does this scenario sound familiar to you? Here’s a simplified version of how Alex’s filter would look in the Report Builder. Similarly, the statement would look something like this in SOQL.
In this example, inequality filters are used to exclude certain records from the data set. While you might think that adding more filters and subsequently getting fewer returned records means that you’re being selective, the Force.com query optimizer actually cannot use indexes on inequality filters to effectively select the rows that match the conditions. As a result, the time it takes to return the data set increases as the number of records in the object increases.
Using Custom Formula Fields to Optimize Your Report and SOQL Performance
Solving this performance problem is straightforward, as long as your combined filters are selective. Just create a custom formula field that captures all of your filters, then use it as a Boolean-like filter. You might already have a formula field that serves a similar purpose. The key idea is to predetermine if the record would be a candidate for the report, then store that derived value on an index. By replacing the inequality filters, the Force.com query optimizer can consider using this new formula field’s index.
For the previous example, the formula would look like this. You can write the formula as shown simply to return “TRUE” or “FALSE,” or you can use the “CASE” function to handle complex filtering conditions and return different values.
Note that you should not include filters with date functions and cross-object lookup fields only because you want to keep the formula field deterministic. You don’t have to include more than you need in the formula field. Not having some of the inequality filters included in the formula field won’t interfere with the selectivity of the new indexed formula field filter. Just make sure the formula field is deterministic and selective enough for an index to be used. To learn more about indexing formula fields, read this blog post about SOQL best practices and Steve’s blog post about performance gotchas that you might encounter when using formula fields.
After creating your formula field, contact salesforce.com Customer Support to verify that you have an effective filter, then have them create an index for it.
If you update the formula, and the returned value changes, you must rebuild the custom index for the existing records. Values stored on the index are updated only when records are created, updated, or deleted. To rebuild a custom index, contact salesforce.com Customer Support.
Now that you have a selective filter, simply replace the combined filters with the new formula field. In the report, it would look like this.
In SOQL, replacing the combined filters with the formula field would look like this.
In addition to creating formula fields, don’t forget to consider the following best practices.
- Consider using effective filters, such as date ranges, to narrow the number of targeted records. This solution alone might resolve performance issues, even without the formula field.
- Even if this formula field solution doesn’t apply to your situation, archive your data regularly. Reducing the number of records in a target object is always an effective technique for tuning query performance.
Summary
Although building a long list of filters to refine your report output is a common practice, it also commonly causes performance issues. If you have a long list of “exception” filters and are querying against a large set of data, consider using an indexed formula field as an effective filter to improve how your reports and SOQL requests perform.
Related Resources
- Force.com Formula Fields, Indexes, and Performance Gotchas
- Long- and Short-Term Approaches for Tuning Force.com Performance
- Force.com SOQL Best Practices: Nulls and Formula Fields
- Best Practices for Deployments with Large Data Volumes
- Database Query & Search Optimization Cheat Sheet
- Architect Core Resources
About the Author
Daisuke Kawamoto is an Architect Evangelist within the Technical Enablement team of the salesforce.com Customer-Centric Engineering group. The team’s mission is to help customers understand how to implement technically sound salesforce.com solutions. Check out all of the resources that this team maintains on the Architect Core Resources page of Developer Force.