You are planning a Force.com implementation with large volumes of data. Your data model is in place, all your code is written and has been tested, and now it’s time to load the objects, some of which have tens of millions of records.
What is the most efficient way to get all those records into the system?
The Force.com Extreme Data Loading Series
This is the last entry in a six-part series of blog posts covering many aspects of data loading for very large enterprise deployments.
Here are the topics in this series.
- Designing the data model for performance
- Loading data into a lean configuration
- Suspending events that fire on insert
- Sequencing load operations
- Loading and extracting data
- Taking advantage of deferred sharing calculations
This post explains how you can use the Defer Sharing Calculation feature to minimize the time it takes to load records into your Salesforce organization.
Calculating Sharing in Organizations with Large Data Volumes
This series’ previous posts have focused on how you can load data into your Salesforce organization more efficiently. Of course, having that data in your organization doesn’t do your users much good if they cannot read and edit the records that they need for their work. At some point during your loading project, you must configure your sharing settings so that your users have the appropriate level of access to the appropriate records.
If your organization has large data volumes, you might find that the calculations you need to complete this configuration and implement your record access model add a substantial amount of time to your data loading.
The calculations can increase your total data loading time for several reasons.
- To determine if users who try to access records should actually be able to access those records, Salesforce stores data that specifies the individuals, roles, and groups that should have record access, as well as data that specifies which group members actually belong to a group.
- When new data is added or sharing settings are changed, the stored record sharing and group membership data must be updated accordingly.
- There are multiple ways to grant any user or group access to a record.
- Changes to the role and territory hierarchies can affect a large number of users in the organization, and require updating the sharing data for a large number of records.
- Changes to an object’s organization-wide default and sharing rules can affect your users’ access to some or all of that object’s records.
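To make the first two points concrete, here is a minimal sketch of the two kinds of stored data: who has been granted access to a record, and who belongs to each group. It is a toy model only. The class, method, record, and group names are all hypothetical, and it does not reflect how Salesforce implements its share and group membership tables.

```python
from dataclasses import dataclass, field


@dataclass
class ToySharingStore:
    """Toy stand-in for stored record sharing and group membership data."""

    # record id -> set of grantees (user or group names) with access
    record_shares: dict = field(default_factory=dict)
    # group name -> set of user names that belong to the group
    group_members: dict = field(default_factory=dict)

    def grant(self, record_id, grantee):
        """Store that a user or group should have access to a record."""
        self.record_shares.setdefault(record_id, set()).add(grantee)

    def add_member(self, group, user):
        """Store that a user belongs to a group."""
        self.group_members.setdefault(group, set()).add(user)

    def can_access(self, user, record_id):
        """A user has access if any grantee on the record is the user,
        or is a group that the user belongs to."""
        for grantee in self.record_shares.get(record_id, set()):
            if grantee == user:
                return True
            if user in self.group_members.get(grantee, set()):
                return True
        return False


store = ToySharingStore()
store.add_member("WestSales", "alice")   # hypothetical group and user
store.grant("acct-001", "WestSales")     # access granted through a group
store.grant("acct-002", "bob")           # access granted directly

print(store.can_access("alice", "acct-001"))  # True
print(store.can_access("alice", "acct-002"))  # False
```

Even in this toy version, every new record and every membership change means more stored data to keep current, which is where the extra load time comes from.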
Timing Your Sharing Configuration Steps
The sequencing load operations post explained how to best sequence various sharing configuration steps, and the next two sections focus on some of those steps, explaining both when and why you should follow them. We recommend using the Defer Sharing Calculation feature for all of these steps, except those that involve setting up or changing your objects’ organization-wide defaults.
Pre-Data Loading Configuration Steps
- Create the role hierarchy.
- Assign users to roles.
- Set organization-wide defaults to Public Read/Write.
In the role hierarchy, record access is inherited based on record ownership, and managers automatically gain access to all of the records owned by people assigned to subordinate roles. If you load a lot of data, and then make significant changes to the role hierarchy or the users assigned to roles within that hierarchy, those changes will trigger a recalculation of record access so that the managers above the affected roles and role members have the appropriate, adjusted record access.
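The following sketch illustrates why a role change after loading can be expensive. It assumes a small, hypothetical role hierarchy and made-up record counts, and it simply counts the records owned at or below a changed role, since those are the records whose inherited access would need to be revisited. It is an illustration of the idea, not the platform's actual algorithm.

```python
# Hypothetical role hierarchy: role -> parent role (None at the top).
ROLE_PARENT = {
    "CEO": None,
    "VP Sales": "CEO",
    "West Rep": "VP Sales",
    "East Rep": "VP Sales",
}

# Hypothetical number of loaded records owned by users in each role.
RECORDS_BY_OWNER_ROLE = {
    "VP Sales": 10_000,
    "West Rep": 4_000_000,
    "East Rep": 6_000_000,
}


def roles_above(role):
    """Return every role above `role` in the hierarchy, nearest first."""
    above = []
    parent = ROLE_PARENT[role]
    while parent is not None:
        above.append(parent)
        parent = ROLE_PARENT[parent]
    return above


def is_at_or_below(role, ancestor):
    """True if `role` is `ancestor` itself or sits beneath it."""
    return role == ancestor or ancestor in roles_above(role)


def records_affected(record_counts, changed_role):
    """Count records owned in `changed_role` or in any role beneath it."""
    return sum(
        count
        for role, count in record_counts.items()
        if is_at_or_below(role, changed_role)
    )


print(roles_above("West Rep"))                               # ['VP Sales', 'CEO']
print(records_affected(RECORDS_BY_OWNER_ROLE, "West Rep"))   # 4000000
print(records_affected(RECORDS_BY_OWNER_ROLE, "VP Sales"))   # 10010000
```

With an empty organization, the same change touches no records at all, which is the argument for settling the hierarchy and role assignments before the load.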
If you set an object’s organization-wide default to Public Read/Write, the system will not use or maintain an object share table for that object—users will already have an unrestricted baseline level of access to all of that object’s records. Because the system is not maintaining an object share table for that object, it does not need to spend time calculating who should have access to its records when you load data.
Note: The Public Read/Write organization-wide default can speed up your data loading, but at some point, you might need to use a different sharing setting to meet your business requirements.
Post-Data Loading Configuration Steps
- Set organization-wide defaults to Public Read Only or Private.
- Create public groups and queues.
- Create sharing rules.
Once you have completed your data load and changed your objects’ organization-wide defaults to Public Read Only or Private, the system must perform a sharing calculation, which will take a substantial amount of time. As always, if you plan to load very large volumes of data, test your data load in a sandbox organization so that you can plan for the time that it will take to complete in production. Because sandbox and production environments are different, what you see in your sandbox organization might not line up perfectly with what you see in your production organization, but it should indicate the general scope and impact of your changes. Again, you cannot use Defer Sharing Calculation when changing your organization-wide defaults, but you can speed up these changes by asking salesforce.com Customer Support to enable the parallel sharing rule processing feature for your organization.
Because public groups and queues are not part of the role hierarchy or the territory hierarchy, creating them does not slow sharing performance significantly, and there isn’t a strong reason for creating them at a specific time. Configuring them later just allows you to focus on loading your records as quickly and efficiently as possible.
Finally, if you create a sharing rule before your data is loaded into an object, the platform must check every record that you insert during your data loading against that rule, and then adjust the object share table if necessary. This process does not take much time for each insert, but the aggregation of all of your inserts can slow your data loading noticeably. To avoid this drag on loading performance, you can create your sharing rules after all the data has been loaded.
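A back-of-the-envelope estimate shows how a small per-insert cost adds up. The function below is a planning sketch, and every input is an assumption: the record count, the number of existing rules, and the per-check cost are illustrative values, not measured or documented Salesforce figures. Measure your own numbers in a sandbox.

```python
from datetime import timedelta


def rule_check_overhead(record_count, rule_count, check_microseconds):
    """Estimate the total time spent checking inserted records against
    sharing rules that already exist while the load runs.

    All three inputs are hypothetical planning numbers.
    """
    total_checks = record_count * rule_count
    return timedelta(microseconds=total_checks * check_microseconds)


# Assumed example: 50 million records, 5 existing rules, 200 microseconds per check.
overhead = rule_check_overhead(50_000_000, 5, 200)
print(overhead.total_seconds() / 3600)  # hours of added load time
```

Under these assumed inputs, the checks add roughly 13.9 hours across the whole load, even though each individual check is negligible.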
There are two options for creating your sharing rules after the data load.
- Create each rule and allow it to recalculate before creating the next rule.
- Use Defer Sharing Calculation to suspend the processing of sharing rules, create all of the rules at once, and then recalculate all of the rules at once.
The second option might be more efficient, but it might also require a very long time to recalculate all the rules. If you have limited ability to set aside maintenance windows for these activities, the first option might be more appropriate for your organization.
Tip: For more tips for tuning your sharing configuration for optimal performance, see Designing Record Access for Enterprise Scale.
Configuring Record Access with the Defer Sharing Calculation Feature
By default, when you create or modify a sharing rule, a role or a group—or change who belongs to a group—the system updates the object share tables and the group membership tables immediately. If these sharing calculations take longer than you expect, they can throw off your planned loading schedule. With Defer Sharing Calculation, you can turn off these recalculations while you submit additional sharing changes, and then let them all execute together later. And if you test these submissions and calculations in a sandbox organization, you can better predict how long they will take in your production organization, allowing you to negotiate more reasonable maintenance windows with your customers.
Configuring Defer Sharing Calculation
Contact salesforce.com Customer Support to enable Defer Sharing Calculation.
- Once the feature is enabled, users with “View Setup and Configuration” permission will see the “Defer Sharing Calculations” link in Setup. This link takes you to the Defer Sharing Calculations page.
- To see the Suspend, Resume, and Recalculate buttons on this page, users must also have the “Manage Sharing Calculation Deferral” permission. Creating a permission set allows you to easily assign this permission to all users who require access to these buttons.
Note: This permission also grants the users who receive it the “Manage Users,” “View Setup and Configuration,” and “Reset User Passwords and Unlock Users” permissions, so it is a powerful permission that should be restricted to a few senior administrators.
Using the Defer Sharing Calculation Feature
The controls on the Defer Sharing Calculations page allow you to:
- View the current state of group membership and sharing calculations
- Suspend group membership calculations, which will also suspend sharing rule calculations, or suspend only sharing rule calculations
- Perform all of your planned changes to roles, groups, queues, and sharing rules quickly while calculations are suspended
- Resume the calculation of group membership or sharing rules
When you resume group membership and sharing calculations after making many changes in an organization with large data volumes, those calculations might take a long time to complete. Changes to group membership are calculated automatically when you resume the calculations, but changes to sharing rules are not. You can use the Recalculate button to ensure that all changes to sharing rules have taken effect. Until you click this button, users might not have the access that you have specified in your sharing rules, and they might continue to have access that you intended to remove.
Note: When you suspend group membership calculations, the system must recalculate all sharing rules, even if you did not add, delete, or modify any sharing rules. This is because the changes that you make to groups might affect some or all of the “Owned by members of” or “Share with” groups specified in your sharing rules.
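The suspend, resume, and recalculate behavior described above can be sketched as a small state model. This is a toy illustration with hypothetical class and method names. It is not a Salesforce API, and the real feature is driven from the buttons on the Defer Sharing Calculations page. The one behavior it aims to capture is that group membership changes take effect on resume, while sharing rule changes wait for an explicit recalculation.

```python
class ToyDeferral:
    """Toy model of suspend / resume / recalculate (not a Salesforce API)."""

    def __init__(self):
        self.group_suspended = False
        self.rules_suspended = False
        self.pending_group_changes = []
        self.pending_rule_changes = []
        self.rules_need_recalc = False
        self.applied = []  # changes that have actually taken effect

    def suspend_group_membership(self):
        # Suspending group membership also suspends sharing rule calculations,
        # and all sharing rules must be recalculated afterward.
        self.group_suspended = True
        self.rules_suspended = True
        self.rules_need_recalc = True

    def suspend_sharing_rules(self):
        self.rules_suspended = True

    def change_group(self, change):
        if self.group_suspended:
            self.pending_group_changes.append(change)
        else:
            self.applied.append(change)

    def change_rule(self, change):
        if self.rules_suspended:
            self.pending_rule_changes.append(change)
            self.rules_need_recalc = True
        else:
            self.applied.append(change)

    def resume(self):
        # Group membership changes are calculated automatically on resume.
        self.applied.extend(self.pending_group_changes)
        self.pending_group_changes.clear()
        self.group_suspended = False
        self.rules_suspended = False
        # Sharing rule changes are not applied here; they wait for recalculate().

    def recalculate(self):
        if self.rules_suspended:
            raise RuntimeError("resume calculations before recalculating")
        self.applied.extend(self.pending_rule_changes)
        self.pending_rule_changes.clear()
        self.rules_need_recalc = False


d = ToyDeferral()
d.suspend_group_membership()
d.change_group("add alice to WestSales")                # hypothetical change
d.change_rule("share WestSales accounts with EastSales")  # hypothetical change
d.resume()
print(d.applied)            # ['add alice to WestSales']
print(d.rules_need_recalc)  # True
d.recalculate()
print(d.applied)            # both changes have now taken effect
```

The gap between `resume()` and `recalculate()` in the model corresponds to the window in which users might not yet have the access that your sharing rules specify.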
Remember that when group membership or sharing rule calculations are suspended, any administrators performing operations on roles, groups, queues, or sharing rules will discover that their changes have not taken effect. To develop good coordination between administrators—and realistic estimates of the maintenance windows you will need to make large scale changes to your sharing configuration—we recommend thoroughly testing deferred group membership and sharing calculations in a sandbox organization with the data volumes that you anticipate having in production.
When Not to Use Defer Sharing Calculation
Although using Defer Sharing Calculation is a best practice for organizations with large volumes of data, some customers might have so much data that attempting to recalculate all sharing changes at once is not feasible or takes an impractical amount of time. These customers might find that allowing sharing calculations to proceed normally while they configure and load data into their organizations provides the best overall throughput for their organization.
When you have to make a large number of group membership or sharing rule changes during a data loading project, you can use the Defer Sharing Calculation feature to make this maintenance more efficient and predictable. As always, use careful testing in a sandbox organization to understand the benefits of Defer Sharing Calculation and whether it is the right tool for your organization.
- Extreme Force.com Data Loading, Part 1: Tuning Your Data Model
- Extreme Force.com Data Loading, Part 2: Loading into a Lean Salesforce Configuration
- Extreme Force.com Data Loading, Part 3: Suspending Events That Fire on Insert
- Extreme Force.com Data Loading, Part 4: Sequencing Load Operations
- Extreme Force.com Data Loading, Part 5: Loading and Extracting Data
- Best Practices for Deployments with Large Data Volumes
- Designing Record Access for Enterprise Scale
- Record-Level Access: Under the Hood
- Architect Core Resources
About the Author
Bud Vieira is an Architect Evangelist within the Technical Enablement team of the salesforce.com Customer-Centric Engineering group. The team’s mission is to help customers understand how to implement technically sound salesforce.com solutions. Check out all of the resources that this team maintains on the Architect Core Resources page of Developer Force.