Migrating Salesforce data between sandboxes can be a daunting task; this is especially true if you have a complex data model. So how do you make migrating data from one Salesforce org to another org in a breeze? Here within Salesforce, with dozens of sandboxes shared among multiple teams, we were always running in challenges with keeping those sandboxes up to date with correct product data. Though many teams have started to use scratch orgs, we still need the setup data in scratch orgs. Moving data manually is frustrating, time-consuming, error-prone, and not an ideal approach.
My team wrote a Java-based tool using Salesforce SOAP APIs to migrate the data from one Salesforce org to another Salesforce org. The now open source Org to Org Data Migration Tool lets you migrate the data from one Salesforce org to another Salesforce org. This is a standalone tool that one can run from a local machine.
Sound interesting? Let’s get to it.
The data migration tool connects to a source org and a target org using WSC (Web Service Connector) and partner APIs. It selects the records from source org based on JSON object mapping (discussed in the following sections) and inserts/upserts into the target org.
This tool provides the following features:
The tool connects to the source org only to retrieve the records. During a particular run, your source org is the source of truth and data in that org shouldn’t be changed. The tool never updates anything in the source org — it does insert/update/upsert or delete in the target org only.
High level architecture:
In addition to data migration, the tool provides a few other services.
At the heart of the tool is one SforceObjectPair pair class. An instance of this class holds the references to source and target records. A pairing is established between source and target records for parent-child object relationships. For example, if Approval_Process__c is parent object and Approval_Step__c is child object, then with that structure (below diagram) in place, it becomes easy to replicate it for any number of related objects. Child object Approval_Step__c can in turn be parent of another child object and above structure will still preserve that relationship too.
Let’s consider the below object relationship between three objects. Account and Asset are parent-child objects, Asset has a lookup to Product object. (Note: While our example utilizes standard objects the tool works equally well with custom objects.)
The JSON object mapping for above related objects will look like this:
[ { "parent": "Account", "unmappedFields": [ "PhotoUrl", "IsPartner", "IsCustomerPortal", "IsPersonAccount" ], "children": [ { "parentMappedField": "AccountId", "childObject": "Asset", "sequence": 1, "lookups": [ { "lookupMappedField": "Product2Id", "lookupObject": "Product2", "keys": [ "Name" ] } ] } ] } ]
The data migrator tool doesn’t store your password or logs it. All the data is kept in memory only. The data communication is between your machine and Salesforce using Salesforce apis.
The tool uses standard Salesforce SOAP APIs, which enforces CRUD/FLS restrictions. The source org user should have read access to objects/fields in the source org. The target org user should have proper CRUD and FLS in the target org. In the absence of CRUD/FLS, the migration will fail.
1. Input your source and target orgs credentials in build.properties file. As a best security practice, never store your username and passwords in source repositories.
# SOURCE source.sfdc.login=username@salesforce.com source.sfdc.password=****** source.env.endpoint=https://test.salesforce.com/services/Soap/u/41.0/ # TARGET target.sfdc.login=username@salesforce.com.sandboxorg target.sfdc.password=****** target.env.endpoint=https://test.salesforce.com/services/Soap/u/41.0/ target.threads.count=1
2. Map and relate your objects and fields in a JSON.
You can filter records, avoid certain records from migration etc. using the JSON mappings. By default, each field from the source org object is mapped to a target org object. You can exclude certain fields.
A lookup record can be identified using more than one field (in the form of an array keys). Use “keys” array to uniquely identify the lookup record. More details on JSON mapping can be found under GitHub wiki page.
[ { "parent": "Pricebook2", "where": " IsStandard = false ", "unmappedFields": ["IsStandard", "IsArchived"], "children": [ { "parentMappedField": "Pricebook2Id", "childObject": "PricebookEntry", "where": " Pricebook2.IsActive=true And Pricebook2Id != null And Product2Id != null ", "sequence": 1, "unmappedFields": [ "ProductCode", "Name" ], "lookups": [ { "lookupMappedField": "Product2Id", "lookupObject": "Product2", "where": "Name != null", "keys": [ "Name" ] } ] } ] } ]
3. Run using any of existing main classes provided or create your own class to trigger the migration.
mvn exec:java -Dexec.mainClass="migrator.module.client.MigrateProducts" -pl datamigrator-module
4. Alternatively, create your own main class to trigger the migration.
public class MigrateProducts { public void process() { List<CommandVO> list = new ArrayList<CommandVO>(); list.add(new CommandVO("create", "Products.json")); list.add(new CommandVO("create", "PricebookEntry_standard.json")); list.add(new CommandVO("create", "PricebookEntry_nonstandard.json")); MigrableObjectFactory.migrate(list); } static public void main(String[] args) { MigrateProducts migrate = new MigrateProducts(); migrate.process(); } }
Additionally, the Github wiki has more details around how you can compare the metadata/data in two orgs, create sample JSON object mappings, etc.
When migrating data for — let’s say — 100 objects, it can be challenging to manage the number of JSON mappings. As a best practice, try to group related set of objects in one JSON mapping. More details on JSON mapping can be found under the GitHub wiki page. To speed up execution, you can configure the number of threads in build.properties. (Note that the maximum value is 10)
This tool holds the data in memory and no other intermediate storage is used. Since the data is stored in memory only, there is a limit of how much data you can migrate in a single run. On a machine with 32GB RAM, we have tested the migration of 1 million records in a single run.
We have been adding more features in this migrator tool. Here’s what you can expect in a future release:
In the absence of such a tool, developers were spending lots of time in setting up their development environments. Some teams have moved on to use scratch orgs, but we still need to setup each scratch org with setup products data. With this tool, we are able to load the orgs with setup data with ease. Internally, we have automated this process as part of our CI/CP pipeline which has resulted in huge productivity gain.
We would love to hear the feedback from you. I would encourage you to download this tool and give it a shot!
Please let us know how you may want to use this tool in your CI/CD pipeline or in day-to-day development activities. Feel free to reach out to me on Twitter at @anoop_76