Avoid Account Data Skew for Peak Performance

When a very large number of child records are associated with the same account in Salesforce, we call that “data skew”. It can seem like the obviously right thing to do. For example, when you have a whole bunch of unassigned contacts and need some place to “park” them, it’s tempting to put them all under one account named ‘Unassigned’. Unfortunately, this can cause issues with record locking and sharing performance.

Let’s say you are updating a large number of contacts under the same account in multiple threads. For each update, the system locks both the contact being changed and its parent account to maintain integrity in the database. Even though each lock is held for a very short time, all the updates are trying to lock the same account, so there is a high risk that an update will fail because a previous one is still holding the lock on the account.

There’s a similar dynamic in sharing. Depending on how you have sharing configured, when you do something that seems simple, like changing the owner of that account, we may need to examine every one of those child records and adjust their sharing as well. That may include recalculating the role hierarchy and sharing rules. This can take a while if we are talking about hundreds of thousands of child records.
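To make the locking problem concrete, here is a toy model in Python. It is only an illustration and not how the platform actually implements locking: the function name `count_lock_conflicts`, the `lock_hold` parameter, and the time units are all assumptions made for this sketch, and a real system would typically wait for the lock for a while before failing.

```python
def count_lock_conflicts(updates, lock_hold=1):
    """Count updates that find their parent's lock already held.

    updates: iterable of (start_time, parent_id) pairs, one per child update.
    lock_hold: how long each successful update holds the parent lock
               (an arbitrary unit, assumed for this toy model).
    """
    locked_until = {}  # parent_id -> time at which its lock is released
    conflicts = 0
    for start, parent in sorted(updates):
        if locked_until.get(parent, float("-inf")) > start:
            # Parent is still locked by an earlier update: count a conflict.
            conflicts += 1
        else:
            # Lock is free: take it and hold it for lock_hold time units.
            locked_until[parent] = start + lock_hold
    return conflicts


# Four simultaneous updates to children of the same account:
skewed = [(0, "A"), (0, "A"), (0, "A"), (0, "A")]
# The same four updates spread over four different accounts:
spread = [(0, "A"), (0, "B"), (0, "C"), (0, "D")]

print(count_lock_conflicts(skewed))  # 3: only the first update gets the lock
print(count_lock_conflicts(spread))  # 0: no two updates share a parent
```

The point of the sketch is that the number of conflicts depends on how many concurrent updates share a parent, not on how many updates there are in total.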

So how do you keep a lid on data skew to avoid these issues? To start, we recommend that you avoid associating more than about 10,000 child records with a single parent record. If you need to do something like the “parking lot” example above, you can create a larger number of accounts and then either modify your integration code or use a trigger on the child object to distribute the child records across this collection of “parking” accounts. This will help you avoid the locking and performance issues you might experience with a highly skewed account.
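As a rough sketch of the distribution idea, the Python below assigns each child record to one of several “parking” accounts by hashing a stable key such as an email address. Everything named here is hypothetical: the account IDs, the count of 50 accounts, and the helper names are assumptions for illustration, and on the platform itself the same logic would live in your integration code or in a trigger.

```python
import hashlib

# Hypothetical IDs of pre-created "parking" accounts (assumed for this sketch).
PARKING_ACCOUNT_IDS = [f"parking-{i:03d}" for i in range(50)]


def pick_parking_account(child_key, account_ids=PARKING_ACCOUNT_IDS):
    """Deterministically map a child record key to one parking account."""
    # SHA-256 gives a stable, evenly spread value; Python's built-in hash()
    # for strings changes between runs, so it is not used here.
    digest = hashlib.sha256(child_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(account_ids)
    return account_ids[bucket]


def distribute(child_keys, account_ids=PARKING_ACCOUNT_IDS):
    """Group child keys by the parking account each one is assigned to."""
    assignment = {}
    for key in child_keys:
        account = pick_parking_account(key, account_ids)
        assignment.setdefault(account, []).append(key)
    return assignment


groups = distribute(f"contact-{i}@example.com" for i in range(100_000))
largest = max(len(keys) for keys in groups.values())
print(len(groups), largest)  # number of accounts used, size of the fullest one
```

Hashing keeps the assignment stateless, so concurrent inserts do not need to coordinate on a shared counter. The number of parking accounts should be chosen so that the expected children per account stays well under the recommended limit, with room to grow.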

  • First

  • mohit chhabra

    Hello, we are following a data model where one account (A) has multiple contacts (C1, C2, etc.), and we have more than 10,000 contacts on the same account. We also have a Contact Us form hosted on a Force.com site (used for the case generation process). The issue we are facing is that some cases come into the system without any contact attached (about 25 out of 400). This is probably due to concurrent transactions from multiple end users, which carry a high risk of ending in a lock error.

    Basically, under the hood:

    1) In the first step we create a dummy case and then set all the parameters (the contact lookup [for the contact we create in the second step], case fields, etc.) in a case update DML call.

    2) We create a contact based on the email address captured from a form field. In this process we query the same account (A) and associate it with all the contacts (C1, C2, etc.) coming into the system from multiple users, and then attach that contact to the case. When I debugged one case, I got an "unable to lock row" error during contact insertion, in the contact before trigger.

    So could you suggest whether this is because we have an account data skew issue in the system (because of the many contacts on the same account)? Is that what causes the record lock contention, so that there is a small probability that concurrent transactions from multiple users get a record lock error and some of them fail?