Until the Winter ‘13 release, best practices for building Salesforce SOQL queries on large data volumes (LDV) included avoiding filtering on fields with
nulls, and similarly avoiding filtering on formula fields. If you plan to implement new queries—or want to clean up some of the workarounds you implemented before the Winter ’13 release—consider the following updates related to filtering on
nulls and formula fields, which could help tune your SOQL performance.
Custom Indexes for Deterministic Formula Fields
Formula fields are custom fields that can help you dynamically calculate the value of a field based on other fields, expressions, or values.
As you can with any other field, you can use formula fields to filter SOQL queries. For example, you might write a query such as:
SELECT Id, Name FROM Contact WHERE FormulaField__c = '10';
Just because you can filter queries using a formula field doesn’t mean that you should, however. By default, formula fields don’t have underlying indexes, so they require full scans to find target records. They also have to calculate values on the fly because actual values are not stored in the database. So when an object has a lot of records, queries that filter using a formula field can be painfully slow.
To get around the pre-Summer ’12 index limitations involving formula fields, many customers created workarounds. For example, you might have created triggers or workflow rules to store formula field values in a separate custom field that you could index, and then built queries that filtered using the custom field instead of the formula field. That strategy works, but it requires overhead and is not intuitive to developers.
Since the Winter ’13 release, you have been able to contact salesforce.com Customer Support to create a custom index on a formula field, provided that the function that defines the formula field is deterministic.
Here are examples of common non-deterministic formulas. Force.com cannot index fields that:
- Reference other entities (i.e., fields accessible through lookup fields)
- Include other formula fields that span over other entities
- Use dynamic date and time functions (e.g.,
A formula is also considered non-deterministic when it includes:
- Owner, autonumber, divisions, or audit fields (except for CreatedDate and CreatedByID fields)
- References to fields that Force.com cannot index
- Multi-select picklists
- Currency fields in a multicurrency organization
- Long text area fields
- Binary fields (blob, file, or encrypted text)
- Standard fields with special functionalities
- Opportunity: Amount, TotalOpportunityQuantity, ExpectedRevenue, IsClosed, IsWon
- Case: ClosedDate, IsClosed
- Product: Product Family, IsActive, IsArchived
- Solution: Status
- Lead: Status
- Activity: Subject, TaskStatus, TaskPriority
Because values might vary over time or change when a transaction updates a related entity, Force.com cannot index non-deterministic formulas.
Remember that indexing a deterministic formula field does not alone guarantee that the Force.com platform will always use it; general selectivity rules still apply. For more information about selective SOQL queries, see Working with Very Large SOQL Queries. To ask for an indexing evaluation for building new custom indexes on your deterministic formula fields, contact salesforce.com Customer Support.
Custom Indexes Containing null Rows
Customers often need to query an object and find only the records in which a certain field is empty (i.e., null
). For example:
SELECT Id, Name FROM Contact WHERE CustomField__c = null;
That sounds simple enough, but when your query targets an object that has a lot of records, consider the performance implications. By default, underlying Force.com field indexes don’t include
nulls. Therefore, queries such as the examples in this post require full scans, which again can be painfully slow.
Since the Winter ’13 release, you have been able to work with salesforce.com Customer Support to create custom indexes that include
null rows. Even if you already have custom indexes on your custom fields, they need to be explicitly enabled and rebuilt to get the empty-value rows into index tables.
Note that this option does not apply to picklists, external IDs, and foreign key fields. If you need to query on a
null external ID field, you can work with salesforce.com Customer Support to create a two-column (compound) index instead.
Again, creating a custom index with
null records does not alone guarantee that the platform will always make effective use of it, and general selectivity rules still apply. For best practices about filtering on fields with
null, see Best Practices for Deployments with Large Data Volumes.
When building your SOQL queries, consider the following new filtering options: filtering on deterministic formula fields and filtering on
nulls. While not all types of formula fields or fields with
nulls can benefit from these new options, they’re definitely worth exploring if you want to simplify your implementation and tune your SOQL performance.
Remember that to set up custom indexes and enable the
null indexing option, you must contact salesforce.com Customer Support. And if you would like to learn more about other techniques to optimize your queries, grab an updated copy of our Best Practices for Deployments with Large Data Volumes white paper!
- Best Practices for Deployments with Large Data Volumes
- Architect Core Resources
- Working with Very Large SOQL Queries
About the Author and CCE Technical Enablement
Daisuke Kawamoto is a member of the Technical Enablement team within the salesforce.com Customer-Centric Engineering group. The team’s mission is to help customers understand how to implement technically sound salesforce.com solutions. Check out all of the resources that this team maintains on the Architect Core Resources page of Developer Force.