The Force.com Query OptimizerThe Force.com query optimizer is not unlike the cost-based optimizer you’ll find in many traditional relational database systems–it helps map out the most efficient path to the data you request from the database. However, because the Force.com database is multitenant and uses a record-ownership model, it has some remarkable differences to be aware of so that you get the best performance for your applications.

Traditional Cost-Based Query Optimizers

Most databases that employ a cost-based query optimizer use the following flow to execute database queries efficiently.

  • The database routinely collects statistics about the data in the database. For example, the number of records in each table/object, the cardinality (number of corresponding records) of particular values in an indexed field, and much more.
  • When you execute a query, the optimizer considers such statistics to calculate costs for various execution plans.
  • The optimizer then chooses the plan with the lowest cost to execute the query.

If you’ve worked with a relational database much, you are probably familiar with cost-based query execution plans driven off of filter conditions, or WHERE clause conditions. In a nutshell, all cost-based optimizers consider the following things to determine when a filter condition is selective and is a candidate for driving a query execution plan.

  • Operator: The operator in the filter condition must be optimizable. Many operators such as != are not optimizable because they inherently lead to non-selective filter conditions, and as such, short-circuit the evaluation process for filter conditions that use them.
  • Selectivity: The value, or list of values, in the filter condition must be selective relative to the total number of records in the target table/object. The optimizer relies on statistics to determine whether a filter condition is selective.
  • Index: The field in the filter condition must have an index that the database can use to efficiently fetch the IDs of target records. If an index is not available for the field in a filter condition, the only alternative is to scan the entire table/object, even when the filter condition uses an optimizable operator with a selective value.

Once the optimizer determines all candidate selective filter conditions, it compares the relative costs of corresponding query execution plans and picks the plan with the lowest cost.

Now let’s see how query execution within Force.com is different (and similar) to a traditional database system.

The Force.com Query Optimizer

Force.com query execution relies on all of the aforementioned traditional cost-based optimizer techniques. But unlike traditional single-tenant database systems, Force.com is a multitenant platform, and as such, has it’s own query optimizer and related features to support it’s unique multitenant architecture.

Multitenant Statistics

Rather than rely on native database optimizer statistics alone, Force.com collects and maintains tenant-specific optimizer statistics to provide insight into each tenant’s data distribution.

Composite Index Joins

When a query has a compound WHERE clause condition (e.g., WHERE x AND y), Force.com considers the selectivity of single-column indexes alone as well as the intersecting selectivity that results from joining two single-column indexes.

Sharing Filters

In Force.com, every record has an owner, and lets you enforce private record sharing approaches to control the visibility that specific users have to records in an object. The Force.com query optimizer considers the selectivity of sharing filters alongside the selectivity of traditional filters (i.e., WHERE clauses) to determine the lowest cost plan for query execution.

Use Your Knowledge About the Force.com Optimizer

This post has some interesting background information about the Force.com query optimizer, but how do you use it to make a difference in your application performance? Read on.

Update Statistics

Force.com automatically updates optimizer statsitics regularly so that the query optimizer has access to accurate representation of data distribution. If you manage a large volume of data (1M+ records in an object, say) and complete a bulk data load that dramatically alters your data distribution, the existing optimizer statistics can get out of sync with your data. The platform automatically recalculates stats in the background when your data set changes by 25% or more. But if you are just below this threshold and you notice slower query performance or reports, consider filing a case with Salesforce Support to see if a manual statistics recalculation for select objects in your org can return operation to peak performance.

Understand Selectivity Thresholds and Operators

Creating optimal queries and reports requires some basic knowledge of good filter condition construction, including an understanding of operators and selectivity thresholds. The Query & Search Optimization Cheat Sheet is a great tool that you can use to learn more about these important topics.

Know the Selectivity of Filter Conditions

Database Query & Search Optimization Cheat SheetJust because you build a filter condition with optimizable operators and indexed fields doesn’t mean the condition is selective. You need to understand the selectivity of specific field values and make sure they fall below the thresholds mentioned in the Query & Search Optimization Cheat Sheet. Read this blog post to learn how to investigate the selectivity of field values. Make sure you understand the selectivity of individual conditions as well as the intersecting selectivity of compound filter conditions.

Create Custom Indexes

When queries regularly filter on a field with selective values, alone or in conjunction with other fields, make sure the field has an index. Using the facts you gather from the previous section, work with Salesforce Support to request custom indexes that make sense.

Publicize Selectivity and Indexes Internally

Anyone developing custom Force.com apps, customizing a Salesforce implementation, and building reports and list views should have a basic understanding about Force.com query optimization. Then they need to be armed with the specifics of what fields have indexes and selective values that will help them build optimal queries.

To make your Salesforce implementation run smoothly, it’s your responsibility to document and publicize relevant information about your data model throughout your team to make sure all power users have and know how to use this information. Next, lock down the permissions so that only qualified personnel has the ability to develop apps, create custom list views, and build reports.

Make Sharing Filters Available to Apex and Visualforce

When it makes sense to do so, enable Force.com to consider sharing filters for Apex (and Visualforce) by declaring classes and controllers using the “with sharing” keywords, like so …

public with sharing class invoiceController {
…

Note: Force.com automatically considers sharing filters for reports and list views.

Consider More Restrictive Sharing Architectures

More restrictive record sharing models will undoubtedly yield more selective sharing filters that the query optimizer can consider. However, tread carefully as you build restrictive sharing architectures in an effort to improve query performance. It’s important to strike a balance between the benefits of query optimization with the tradeoff of sharing recalculations that take time to complete. Consult a qualified Salesforce technical architect to help build a successful sharing architecture.

Related Resources

About the Author

Steve Bobrowski is an Architect Evangelist within the Technical Enablement team of the salesforce.com Customer-Centric Engineering group. The team’s mission is to help customers understand how to implement technically sound Salesforce solutions. Check out all of the resources that this team maintains on the Architect Core Resources page of Developer Force.

 

tagged Bookmark the permalink. Trackbacks are closed, but you can post a comment.