General Guidelines for Data Loads
The following tips help you plan your data loads for optimal processing time. Always
test your data loads in a sandbox organization first. Note that processing times may be
different in a production organization.
- Use Parallel Mode Whenever Possible
- You get the most benefit from the Bulk API by processing batches in parallel, which is the default mode and enables faster loading of data. However, parallel processing can sometimes cause lock contention on records. The alternative is serial mode. Don't process data in serial mode unless you know that parallel processing would result in lock timeouts and you can't reorganize your batches to avoid the locks.
- You set the processing mode at the job level. All batches in a job are processed in parallel or serial mode.
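- As a minimal sketch of setting the mode at the job level, the following Python builds a job-creation payload. It assumes the job is described by a `jobInfo` XML document in the `http://www.force.com/2009/06/asyncapi/dataload` namespace with a `concurrencyMode` element; the function name `build_job_info`, the element order, and the CSV content type are illustrative assumptions, so check them against the job reference for your API version.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the Bulk API job XML; verify for your API version.
ASYNC_NS = "http://www.force.com/2009/06/asyncapi/dataload"


def build_job_info(operation: str, sobject: str, serial: bool = False) -> str:
    """Return a jobInfo XML payload (hypothetical helper).

    The mode is set once here and applies to every batch in the job.
    """
    ET.register_namespace("", ASYNC_NS)
    job = ET.Element(f"{{{ASYNC_NS}}}jobInfo")
    ET.SubElement(job, f"{{{ASYNC_NS}}}operation").text = operation
    ET.SubElement(job, f"{{{ASYNC_NS}}}object").text = sobject
    # Parallel is the default mode, so the element is added only for serial jobs.
    if serial:
        ET.SubElement(job, f"{{{ASYNC_NS}}}concurrencyMode").text = "Serial"
    ET.SubElement(job, f"{{{ASYNC_NS}}}contentType").text = "CSV"
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_job_info("insert", "AccountTeamMember", serial=True))
```

- Because the mode belongs to the job, switching a problematic load to serial mode means creating a new job rather than changing individual batches.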
- Organize Batches to Minimize Lock Contention
- For example, when an AccountTeamMember record is created or updated, the account for this record is locked during the transaction. If you load many batches of AccountTeamMember records and they all contain references to the same account, they all try to lock the same account and it's likely that you'll experience a lock timeout. You can sometimes avoid lock timeouts by choosing which records go into each batch. If you organize AccountTeamMember records by AccountId so that all records referencing the same account are in a single batch, you minimize the risk of lock contention between batches.
- The Bulk API doesn't generate an error immediately when it encounters a lock. It waits a few seconds for the lock to be released and, if the lock isn't released, marks the record as failed. If there are problems acquiring locks for more than 100 records in a batch, the Bulk API places the remainder of the batch back in the queue for later processing. When the Bulk API processes the batch again later, records marked as failed are not retried. To process these records, you must submit them again in a separate batch.
- If the Bulk API continues to encounter problems processing a batch, it's placed back in the queue and reprocessed up to 10 times before the batch is permanently marked as failed. Even if the batch failed, some records could have completed successfully. If errors persist, create a separate job to process the data in serial mode, which ensures that only one batch is processed at a time.
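- The AccountId grouping described above can be sketched in Python as follows. This is an illustrative approach, not a prescribed algorithm: records are assumed to be dictionaries with an `AccountId` key, and the helper name `batches_by_account` and the `max_batch_size` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, str]


def batches_by_account(records: List[Record],
                       max_batch_size: int = 5000) -> List[List[Record]]:
    """Pack records into batches without splitting any AccountId across batches."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["AccountId"]].append(rec)

    batches: List[List[Record]] = []
    current: List[Record] = []
    for account_id in sorted(groups):
        group = groups[account_id]
        # Close the current batch instead of splitting a group across two batches.
        if current and len(current) + len(group) > max_batch_size:
            batches.append(current)
            current = []
        # A single group larger than max_batch_size still stays together,
        # so that batch exceeds the target size.
        current.extend(group)
    if current:
        batches.append(current)
    return batches
```

- With this packing, any given account is referenced by only one batch, so two batches running in parallel don't compete for the same account lock.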
- Be Aware of Operations that Increase Lock Contention
- The following operations are likely to cause lock contention and necessitate using serial mode:
- Creating new users
- Updating ownership for records with private sharing
- Updating user roles
- Updating territory hierarchies
- If you encounter errors related to these operations, create a separate job to process the data in serial mode.
- Minimize Number of Fields
- Processing is faster when fewer fields are loaded for each record. Foreign key, lookup relationship, and roll-up summary fields are more likely to increase processing time. It's not always possible to reduce the number of fields in your records, but where it is possible, loading times improve.
- Minimize Number of Workflow Actions
- Workflow actions increase processing time.
- Minimize Number of Triggers
- You can use parallel mode with objects that have associated triggers if the triggers don't cause side-effects that interfere with other parallel transactions. However, Salesforce doesn't recommend loading large batches for objects with complex triggers. Instead, you should rewrite the trigger logic as a batch Apex job that is executed after all the data has loaded.
- Optimize Batch Size
- Salesforce shares processing resources among all its customers. To ensure that each organization doesn't have to wait too long to process its batches, any batch that takes more than 10 minutes is suspended and returned to the queue for later processing. The best course of action is to submit batches that process in less than 10 minutes. For more information on monitoring timing for batch processing, see Monitor a Batch.
- Adjust batch sizes based on processing time. Start with 5,000 records. If a batch takes more than five minutes to process, it may be beneficial to reduce the batch size. If it takes only a few seconds, increase the batch size. If you get a timeout error when processing a batch, split it into smaller batches and try again. For more information, see Bulk API Limits.
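- One way to apply this tuning rule is sketched below. The five-minute threshold comes from the guidance above; the 10-second cutoff for "a few seconds", the halving and doubling steps, and the `max_size` cap are assumptions chosen for illustration, and `next_batch_size` is a hypothetical helper.

```python
def next_batch_size(current_size: int,
                    elapsed_seconds: float,
                    slow_after: float = 300.0,  # five minutes, per the guidance
                    fast_under: float = 10.0,   # assumed meaning of "a few seconds"
                    max_size: int = 10_000) -> int:  # assumed cap; see Bulk API Limits
    """Suggest a size for the next batch from the last batch's processing time."""
    if elapsed_seconds > slow_after:
        # Too slow: halve the batch, but never go below one record.
        return max(1, current_size // 2)
    if elapsed_seconds < fast_under:
        # Very fast: double the batch, up to the assumed cap.
        return min(max_size, current_size * 2)
    return current_size


if __name__ == "__main__":
    size = 5000  # suggested starting point
    for elapsed in (400.0, 120.0, 4.0):
        size = next_batch_size(size, elapsed)
        print(size)
```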
- Minimize Number of Batches in the Asynchronous Queue
- Salesforce uses a queue-based framework to handle asynchronous processes from such sources as future and batch Apex, as well as Bulk API batches. This queue is used to balance request workload across organizations. If more than 2,000 unprocessed requests from a single organization are in the queue, any additional requests from the same organization will be delayed while the queue handles requests from other organizations. Minimize the number of batches submitted at one time to ensure that your batches are not delayed in the queue.
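- A simple way to stay under this threshold is to submit batches in waves. The sketch below only computes how many batches to submit next; how you obtain the count of your organization's unprocessed requests is left open, and the helper name and its parameters are hypothetical. Because the queue is shared with other asynchronous work such as future and batch Apex, you may want a `queue_limit` lower than 2,000.

```python
def batches_to_submit_now(pending: int,
                          already_queued: int,
                          queue_limit: int = 2000) -> int:
    """Return how many pending batches fit without exceeding queue_limit."""
    headroom = max(0, queue_limit - already_queued)
    return min(pending, headroom)


if __name__ == "__main__":
    # 500 batches waiting locally, 1,800 requests already unprocessed.
    print(batches_to_submit_now(pending=500, already_queued=1800))
```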