The Salesforce CLI is the cornerstone of Salesforce developer tools. It’s best known for building and deploying apps, but there’s a use case that’s often overlooked: the CLI can also manage data as part of your DevOps pipeline. In this post, we’ll look at the Salesforce CLI data commands that let you interact with data, moving from individual records to small record batches, and finally to large bulk operations.

Working with individual records

Sometimes, you only need to get a single record and modify it as part of a script. In that case, you may be tempted to use an API client (a cURL command, for example) to run a REST API request, but this means that you have to deal with authentication yourself. Handling authentication for multiple orgs can be a burden, and a security risk if you mishandle access tokens.

The good news is that you can do this simply and safely with the Salesforce CLI by running the data get record command. For example, this gets you a contact record from a given ID on the default org (you could also specify a target org with the --target-org flag to work on another org):
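The original command block wasn’t preserved, so here’s a sketch of what it could look like; the record ID is a placeholder that you’d replace with a real contact ID:

```shell
# Retrieve a single contact by ID from the default org
# (record ID below is illustrative)
sf data get record --sobject Contact --record-id 003XXXXXXXXXXXXXXX --json
```

Add `--target-org my-sandbox` to run the same command against another authenticated org.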

Thanks to the --json flag, the above command outputs the following JSON.
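With illustrative field values, the output has roughly this shape — note that the CLI wraps the record inside a result object:

```json
{
  "status": 0,
  "result": {
    "attributes": {
      "type": "Contact",
      "url": "/services/data/v60.0/sobjects/Contact/003XXXXXXXXXXXXXXX"
    },
    "Id": "003XXXXXXXXXXXXXXX",
    "FirstName": "Astro",
    "LastName": "Naut",
    "Email": "astro@example.com"
  },
  "warnings": []
}
```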

You can then chain this call with a tool like jq to extract specific field values from the JSON output, so that you can process them programmatically.
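For example, a pipeline along these lines (the Email field and record ID are illustrative, and jq must be installed) extracts a single field value into a shell variable:

```shell
# Extract the contact's email from the CLI's JSON output
EMAIL=$(sf data get record --sobject Contact --record-id 003XXXXXXXXXXXXXXX --json \
  | jq -r '.result.Email')
echo "$EMAIL"
```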

Tip: Note the intermediate result object that we traverse to get to our field value.

After you’re done processing the data, you can update the record with the data update record command. For example, you could rename our contact to “Astro Nomical” with this command:
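A sketch of that update command, with the same placeholder record ID:

```shell
# Rename the contact; note the single-quoted values inside the double-quoted --values string
sf data update record --sobject Contact --record-id 003XXXXXXXXXXXXXXX \
  --values "FirstName='Astro' LastName='Nomical'"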

Tip: Be careful with the mix of single and double quotes in the command.

What’s also interesting from a DevOps perspective is that the two commands that we covered above accept a --use-tooling-api flag to use the Tooling API instead of the REST API. This lets you interact with metadata, such as FlexiPages, Apex classes, Lightning web components, and more.

There is also a set of dedicated cmdt commands that let you manage custom metadata types. This is useful when configuring your apps for your different environments (production, sandboxes, etc.).

The commands that we have covered so far are fine for working with individual records, but you may want to work with larger data sets.

Working with small record batches

When dealing with small batches ranging from tens to a couple of thousand records, you can use SOQL queries and the tree import/export commands.

SOQL queries

The Salesforce CLI lets you run SOQL queries via the REST API thanks to the data query command. For example, you can retrieve the five most recently modified contacts with this command:
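A sketch of that query command (the selected fields are illustrative):

```shell
# Retrieve the five most recently modified contacts
sf data query --query "SELECT Id, Name, Email FROM Contact ORDER BY LastModifiedDate DESC LIMIT 5"
```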

Just like for data get record and data update record, you can add a --json flag to retrieve your data in JSON for programmatic use.

For reference, the JSON output of the previous query looks like this:
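With illustrative values, and showing only the first of the five records:

```json
{
  "status": 0,
  "result": {
    "records": [
      {
        "attributes": {
          "type": "Contact",
          "url": "/services/data/v60.0/sobjects/Contact/003XXXXXXXXXXXXXXX"
        },
        "Id": "003XXXXXXXXXXXXXXX",
        "Name": "Astro Nomical",
        "Email": "astro@example.com"
      }
    ],
    "totalSize": 5,
    "done": true
  },
  "warnings": []
}
```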

You can also add a --use-tooling-api flag to switch to the Tooling API to query metadata. For example, you could retrieve the code of an OrderController Apex class with a command like this:
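A sketch of that Tooling API query; the ApexClass object exposes the class source in its Body field:

```shell
# Fetch the source code of an Apex class via the Tooling API
sf data query --query "SELECT Id, Name, Body FROM ApexClass WHERE Name='OrderController'" \
  --use-tooling-api --json
```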

If you query for records and the result holds more than 10,000 records, you’ll have to switch the query to bulk mode. We’ll share more details on this in the section below dedicated to bulk operations.

Tree import/export

When preparing a test environment with some sample data, you can use the data export tree and data import tree commands. These two commands rely on the Composite sObject Tree API to manipulate records. This API is convenient for working with small batches of hundreds of records across multiple objects while maintaining record relationships.

You start by exporting data into a plan (or standalone JSON files) by passing a SOQL query to the data export tree command. For example, you can retrieve accounts with their related contacts with this command:
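A sketch of that export command; the queried fields are illustrative, and the nested sub-query captures the account-to-contact relationship:

```shell
# Export accounts with their related contacts into a plan in the data directory
sf data export tree \
  --query "SELECT Id, Name, (SELECT Id, FirstName, LastName FROM Contacts) FROM Account" \
  --plan --output-dir data
```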

You’ll want to use nested queries to capture the relationships between the records. You can go as far as five levels deep in the relationship.

The data export tree command in the example above creates three files in a data directory: a plan file and two data files. The file names and content match the ones used in the schema below.

Schema showing the relationship between files and records in a data tree plan

The data tree commands rely on a set of JSON files that can be grouped together in a plan. Each JSON file holds the records for one object. Each record can expose a reference ID and use reference IDs from other records. These reference IDs are free-text values that can either be generated by the export command or edited manually.

The export operation is powerful, but you need to be aware of some important limits: the SOQL query that you run for the export can return up to 2000 records, but each data file can only hold up to 200 records when importing the data back. This means that if you have more than 200 records for a given object, you need to split the generated data file into multiple files to create smaller batches of up to 200 records. If you do this, don’t forget to add the new data files to the plan file.

Once your files are ready, you can import the data with the data import tree command like so:
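A sketch of the import command; the plan file name is illustrative and should match the one generated by your export:

```shell
# Import the records described in the plan file
sf data import tree --plan data/Account-Contact-plan.json
```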

The Composite sObject Tree API that this command uses is powerful, but it has some limitations. It deals well with parent-child relationships, but it doesn’t handle junction objects or complex relationships. A simple rule of thumb: if you can’t retrieve what you need with a single SOQL query, then the data tree commands are not what you need. Also, if you need to work with more than a couple hundred records, you’ll need to use the bulk commands.

Working at scale with bulk operations

Bulk SOQL queries

When a SOQL query run with the data query command returns more than 10,000 records, specify the --bulk flag. The command then runs the query using Bulk API 2.0, which has higher limits than the default API used by the command.

When you run a query in bulk mode, it’s important to remember that the operation becomes asynchronous. This means that you have to wait for the results as the query and result retrieval are in fact split into multiple Bulk API calls behind the scenes.

There are three possible scenarios when using the bulk query command:

  • Wait for the result up to a predetermined amount of time, thanks to the --wait flag. If the results are available before the time expires, the command returns the result without waiting further.
  • Don’t specify a wait flag. In this case, the CLI waits for up to three minutes by default.
  • Don’t wait for the result, thanks to the --async flag and continue the execution of your script.

In any of these three cases, if the results are not available before the wait time expires (or if you don’t wait), then it’s up to you to query for the results, thanks to the query ID that the query command returned. To do so, you’ll need to run a data query resume command using the --bulk-query-id flag followed by the query ID.

For example, we can bulk query contacts like this:
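A sketch of that bulk query; the --async flag makes the command return immediately without waiting for results:

```shell
# Run the query with Bulk API 2.0 and return immediately with a query ID
sf data query --query "SELECT Id, Name FROM Contact" --bulk --async --json
```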

This immediately returns the following JSON with no records but with a query ID.
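With illustrative values (bulk query IDs start with the 750 key prefix), the output looks something like this — note the empty records array and the done field set to false:

```json
{
  "status": 0,
  "result": {
    "done": false,
    "id": "750XXXXXXXXXXXXXXX",
    "totalSize": 0,
    "records": []
  },
  "warnings": []
}
```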

Later on, you can fetch the query results with a data query resume command. Wrapping these commands together in a script looks like this:
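A sketch of such a script, assuming jq is installed to extract the query ID from the JSON output:

```shell
# Start a bulk query without waiting and capture the query ID
QUERY_ID=$(sf data query --query "SELECT Id, Name FROM Contact" --bulk --async --json \
  | jq -r '.result.id')

# ...do other work here while the query runs...

# Later, fetch the query results
sf data query resume --bulk-query-id "$QUERY_ID" --json
```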

Tip: Note the done boolean field that is present both in the initial query result and the resume command output. This indicates whether the query operation is finished or not.

Bulk edit operations

Just like bulk queries, you can run bulk upsert and delete operations with the data upsert bulk and data delete bulk commands. These two commands require a CSV file as input, and since they run asynchronously, you may need to run a resume command as a follow-up.

Let’s say that you want to create some contacts with a bulk operation. You start by preparing a contacts.csv CSV file like this:
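For example, a minimal contacts.csv with two illustrative records could look like this:

```csv
FirstName,LastName,Email
Astro,Nomical,astro@example.com
Codey,Bear,codey@example.com
```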

You can then run the upsert “synchronously” (waiting for up to three minutes) with a wait flag like this:
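A sketch of that synchronous-style upsert, waiting up to three minutes for the job to complete:

```shell
# Upsert contacts from the CSV file, waiting up to 3 minutes for completion
sf data upsert bulk --sobject Contact --file contacts.csv --external-id Id --wait 3
```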

Note the mandatory --external-id flag with its value set to Id. This flag defines the key that controls whether a record is updated or inserted. Here we use Id as the identifier, but you could use any field that you’d like, provided that its values are unique. Because the Id field is not present in our CSV file, all of the records from the file will be created.

You can also work asynchronously by upserting your contacts in a script that runs some operations while the upsert is in progress, then retrieve the upsert results.
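A sketch of the asynchronous variant; note that the exact JSON path to the job ID in the --async output is an assumption that may vary with your CLI version, so check your own output before relying on it:

```shell
# Start the upsert without waiting and capture the job ID (JSON path is illustrative)
JOB_ID=$(sf data upsert bulk --sobject Contact --file contacts.csv --external-id Id --async --json \
  | jq -r '.result.jobInfo.id')

# ...run other operations while the upsert is in progress...

# Retrieve the upsert results
sf data upsert resume --job-id "$JOB_ID" --json
```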

In both scenarios, you’ll get the following JSON with your two newly created contacts and their IDs.

Tip: Refer to the Bulk API Developer Guide for details on formatting your CSV file for bulk operations.

Closing words

This concludes our tour of the data CLI commands. We reviewed the different commands that let you work with individual records and deal with small record batches, and we saw how you can run bulk operations. You’re now aware of the benefits and limitations of the different commands. With this knowledge, you can make the best use of the CLI in your DevOps workflows.

If you need more than the native CLI features, you can also turn to plugins like SFDX Data Move Utility (SFDMU) or build your own custom plugin to manage data.


About the author

Philippe Ozil is a Principal Developer Advocate at Salesforce where he focuses on the Salesforce Platform. He writes technical content and speaks frequently at conferences. He is a full-stack developer and enjoys working with APIs, DevOps, robotics, and VR projects. Follow him on X @PhilippeOzil, on LinkedIn @PhilippeOzil, or check his GitHub projects @pozil.
