Introducing Events and Partial Downloads in Bulk API 2.0

Salesforce’s REST-based Bulk API 2.0 provides a programmatic way to asynchronously insert, upsert, query, or delete large datasets in your Salesforce org. The Bulk API 2.0 Query jobs are built to process SOQL queries asynchronously and handle large data volumes (2,000 records or more) efficiently.

To further enhance performance of Bulk API 2.0 Query, we’re introducing two new capabilities to reduce the time it takes to retrieve query results:

Parallel downloads (GA in Winter ’25 release):
- Today, retrieving query results involves a single, sequential HTTP download. You create a job and poll for its completion status while the system processes the query and extracts records from the database. You then download the results in a single stream.
- With the Winter ’25 release, parallel downloads became generally available. By splitting the download process across multiple streams, the time needed to fetch data is significantly reduced.
Events and partial downloads (Beta in Spring ’25 release):
- With the new event-driven mechanism, platform events will be published every time there’s a job status update, such as when the job is complete or partial results are ready.
- Additionally, partial downloads mean that as soon as a portion of the query results is read from the database and written to the file system, it becomes available for download, even while the query is still running. Each result event contains a URL to access the available results, so you can begin processing data sooner.
- Use Salesforce Pub/Sub API to subscribe to these platform events, enabling real-time updates and a more efficient workflow.

These features empower you to process large datasets faster and integrate data querying into event-driven architectures, ensuring smoother, more responsive applications. Let’s explore these features in detail.

The current Bulk API 2.0 Query capabilities support serial data downloads, where results are fetched one page at a time. Before starting the download, you must repeatedly poll the job’s status using the API that retrieves information about a Bulk V2 Query Job (see docs) to determine if the query processing is complete. Once the job reaches a JobComplete state, results can be downloaded page by page, using the following endpoint.

You can also pass the maxRecords parameter to define the number of records to be downloaded per page. If omitted, the API returns the maximum number of records it can send in a single page and a locator to traverse through subsequent pages.
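
The page-by-page flow described above can be sketched roughly as follows. This is a minimal sketch rather than an official client, and it rests on a few assumptions: `http_get` is a hypothetical transport you would supply (it takes a path and returns the response headers and CSV body), `API_VERSION` is a placeholder, and the results path and the `Sforce-Locator` response header reflect our reading of the Bulk API 2.0 documentation, so verify them against your org before relying on them.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple
from urllib.parse import urlencode

# Hypothetical transport: takes a URL path, returns (response headers, CSV body).
HttpGet = Callable[[str], Tuple[Dict[str, str], str]]

API_VERSION = "v62.0"  # assumption: substitute the API version your org uses


def results_path(job_id: str, locator: Optional[str] = None,
                 max_records: Optional[int] = None) -> str:
    """Build the (assumed) results path for one page of a query job."""
    params = {}
    if locator:
        params["locator"] = locator
    if max_records:
        params["maxRecords"] = str(max_records)
    path = f"/services/data/{API_VERSION}/jobs/query/{job_id}/results"
    return f"{path}?{urlencode(params)}" if params else path


def download_serial(http_get: HttpGet, job_id: str,
                    max_records: Optional[int] = None) -> Iterator[str]:
    """Yield CSV pages one at a time, following the locator until exhausted."""
    locator = None
    while True:
        headers, body = http_get(results_path(job_id, locator, max_records))
        yield body
        # Assumed: the next locator arrives in the Sforce-Locator header,
        # and the literal string "null" marks the last page.
        locator = headers.get("Sforce-Locator")
        if not locator or locator == "null":
            break
```

Because each request needs the locator returned by the previous one, the pages can only be fetched in sequence, which is the limitation the rest of this post addresses.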

There are two drawbacks of this approach:

Delayed downloads: You must wait for the entire query job to complete before initiating downloads.
Excessive polling: Continuous polling of the job’s status is inefficient and consumes a lot of resources.

These limitations slow down workflows and make handling large datasets cumbersome. Fortunately, new capabilities in Bulk API 2.0 address these pain points.

Speed up with parallel downloads

With the Winter ’25 release, parallel downloads optimize how query results are retrieved. Unlike serial downloads, parallel downloads allow you to retrieve multiple pages simultaneously. When the job status is JobComplete, the API allows you to retrieve resultUrl values in groups of five using the endpoint below.

This is what the API response looks like for a sample Bulk API 2.0 Job with Job ID – 750Z9000000Cc9fIAC, where the job has more than five result pages:

The response above points to the first five pages which can be downloaded in parallel. The API response provides a nextRecordsUrl field which points to the next group of five pages, continuing the process until all results are retrieved. By fetching multiple pages at once, parallel downloads reduce the overall time required to download large datasets. You can integrate this capability into existing workflows with minimal changes.
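
To illustrate the loop described above, here is a minimal sketch of a client that downloads each group of pages concurrently and then follows `nextRecordsUrl`. Several things here are assumptions: `get_json` and `get_text` are hypothetical transports you would supply, `first_pages_url` stands in for the endpoint that returns the first group, and the response shape (a `resultPages` list whose entries carry `resultUrl`) is inferred from the field names used in this post, so check it against a real response.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

# Hypothetical transports, injected so the sketch runs without a network.
GetJson = Callable[[str], Dict]  # GET a URL and return the parsed JSON body
GetText = Callable[[str], str]   # GET a URL and return the CSV body


def download_parallel(get_json: GetJson, get_text: GetText,
                      first_pages_url: str, workers: int = 5) -> List[str]:
    """Download every result page, one group of result URLs at a time."""
    pages: List[str] = []
    next_url: Optional[str] = first_pages_url
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while next_url:
            group = get_json(next_url)
            # Assumed shape: entries carrying resultUrl, plus nextRecordsUrl
            # pointing at the following group (missing or null when done).
            urls = [entry["resultUrl"] for entry in group.get("resultPages", [])]
            # pool.map returns results in input order, so pages stay in sequence.
            pages.extend(pool.map(get_text, urls))
            next_url = group.get("nextRecordsUrl")
    return pages
```

Collecting every page in memory keeps the sketch short; for large result sets you would more likely stream each page to disk or to a downstream system as it arrives.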

Below is the request to get the next five pages for download.

While this feature optimizes data retrieval, it still requires waiting for the job to complete. The event-driven capabilities discussed below eliminate this final bottleneck.

In the Spring ’25 release, we are introducing a new platform event named BulkApi2JobEvent as part of the Salesforce Beta Program. To access this feature, you will need to turn on the org preference named PartialDownloadAndJobEvent. Once the org preference is turned on, you will be able to receive real-time updates at various stages of the query job lifecycle. This feature enables partial downloads while the query job is still in progress. We leverage platform events to notify you of key milestones during job processing and query result availability. This eliminates the need for continuous polling.

BulkApi2JobEvent is published in the following scenarios:

Job status changes: Notifications are triggered for state changes of the job to OPEN, UPLOAD_COMPLETE, IN_PROGRESS, JOB_COMPLETE, ABORTED, and FAILED.
Partial results are available: Events inform you when a portion of the query results is ready for download, even while the job is still running.
Job completion: A final event signals that all results are available for retrieval.

Below is the shape of the BulkApi2JobEvent.

Field	Description
`Type`	Indicates the type of event being notified. It can have one of the two values: `JOB_STATE` or `RESULT`
`JobIdentifier`	ID of the job which is being processed
`JobState`	The current state of the job
`ResultType`	Indicates the type of results available for retrieval in the event: `PARTIAL` or `FINAL`
`ResultUrl`	The URL to retrieve the job result from

For partial results, ResultType is set as PARTIAL, allowing developers to access subsets of the data immediately. When the job is complete, ResultType is set as FINAL, providing access to the full dataset.

Examples of BulkApi2JobEvent:

An event having Type as JOB_STATE notifies you of the job state change. An event having Type as RESULT notifies you of the partial or final results available for your download.

Note that the ResultType and ResultUrl fields will always be null in events having Type as JOB_STATE.
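
As a rough illustration of how a subscriber might branch on these fields, here is a minimal sketch that treats an event as a plain dictionary keyed by the field names in the table above. The payloads are invented for illustration (the job ID and URL are placeholders, not real values), and `describe_event` is a hypothetical helper, not part of any Salesforce library.

```python
from typing import Dict, Optional

Event = Dict[str, Optional[str]]


def describe_event(event: Event) -> str:
    """Summarize a BulkApi2JobEvent payload given as a plain dict."""
    job_id = event["JobIdentifier"]
    if event["Type"] == "JOB_STATE":
        # ResultType and ResultUrl are null on state-change events.
        return f"job {job_id} is now {event['JobState']}"
    if event["Type"] == "RESULT":
        kind = "final" if event["ResultType"] == "FINAL" else "partial"
        return f"job {job_id} has {kind} results at {event['ResultUrl']}"
    raise ValueError(f"unexpected event Type: {event['Type']!r}")


# Illustrative payloads only; every value below is a placeholder.
state_event: Event = {
    "Type": "JOB_STATE",
    "JobIdentifier": "750EXAMPLE",
    "JobState": "IN_PROGRESS",
    "ResultType": None,
    "ResultUrl": None,
}
result_event: Event = {
    "Type": "RESULT",
    "JobIdentifier": "750EXAMPLE",
    "JobState": "IN_PROGRESS",
    "ResultType": "PARTIAL",
    "ResultUrl": "/placeholder/result/url",
}

print(describe_event(state_event))
print(describe_event(result_event))
```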

Download data in real-time with Pub/Sub API

To leverage this capability, developers can subscribe to BulkApi2JobEvent using the Salesforce gRPC-based Pub/Sub API. This event-driven approach solves both problems we discussed in the serial data download section of this post.

No more polling: It eliminates the need to poll the API that retrieves information about a Bulk V2 Query job (see docs) continuously, as developers are automatically notified of job progress.

Immediate access to data: Partial downloads enable developers to start processing data as soon as it becomes available, even before the job completes. This improves efficiency and responsiveness. For instance, imagine a scenario where contacts extracted from Salesforce are required to be uploaded into a billing system. With partial downloads, you can start uploading the contacts while the records extraction is still in progress in Salesforce.

Pub/Sub API is the recommended way to subscribe to this event, but the legacy Streaming API (CometD) also works. You can also subscribe to this event from the Salesforce Platform with Apex and Flows.

Handling errors and access rights

Implementing event-driven downloads requires handling scenarios where a job fails after you have already downloaded some of its records. For example, imagine you have a Bulk API 2.0 query job with a total of 10 partial result download URLs. The job publishes three events with Type set to RESULT, and you download records from those three result URLs. The job then fails, so you cannot download the remaining seven result pages. Your client application should handle this case gracefully and decide explicitly whether to keep or discard the data it has already retrieved.
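
One way to make that decision explicit is sketched below. This is a minimal sketch under stated assumptions: events arrive as plain dictionaries keyed by the BulkApi2JobEvent field names, and `PartialDownloadTracker` with its `keep_partial_on_failure` policy flag is a hypothetical client-side helper, not something the API provides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PartialDownloadTracker:
    """Tracks result URLs seen for one job and the job's terminal outcome."""
    job_id: str
    keep_partial_on_failure: bool = False  # hypothetical client policy
    result_urls: List[str] = field(default_factory=list)
    outcome: Optional[str] = None          # "complete", "failed", or None

    def on_event(self, event: Dict[str, Optional[str]]) -> None:
        if event.get("JobIdentifier") != self.job_id:
            return  # the event belongs to another job
        if event.get("Type") == "RESULT":
            url = event.get("ResultUrl")
            # Skip duplicates so a redelivered event is not downloaded twice.
            if url and url not in self.result_urls:
                self.result_urls.append(url)
        elif event.get("Type") == "JOB_STATE":
            if event.get("JobState") in ("FAILED", "ABORTED"):
                self.outcome = "failed"
            elif event.get("JobState") == "JOB_COMPLETE":
                self.outcome = "complete"

    def usable_urls(self) -> List[str]:
        """URLs whose data the client should keep, given the outcome so far."""
        if self.outcome == "failed" and not self.keep_partial_on_failure:
            return []
        return list(self.result_urls)
```

In the billing example earlier in this post, a client that needs all contacts or none would leave `keep_partial_on_failure` off and roll back anything uploaded before the failure, while a client that can resume later might keep the three pages it already has.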

Any user in the org can subscribe to the topic as long as platform events are enabled. Only the user who created the job can download the results.

Sample app showcasing all three download mechanisms in action

Landing page of the sample Node.js app

We’ve developed a sample Node.js app integrating the three different download mechanisms. Feel free to clone the app and explore these new capabilities.

For the serial and parallel data download modes, you create the job, poll the job info, and start downloading the results after the job is complete. For the event-driven data download mode, you create the job (the app is already subscribed to BulkApi2JobEvent through Pub/Sub API) and start receiving events in the app's text box. Below is a screenshot from the app showcasing events published by the Bulk 2.0 Query Job.

Screenshot of a page in the app showing events published by the Bulk 2.0 Query Job

Conclusion

Bulk API 2.0’s new features — Parallel Downloads and Event-Driven Updates — revolutionize the way you handle large datasets. They save you time, improve efficiency, and enable real-time data processing. These tools empower you to build responsive, streamlined workflows that keep your applications running smoothly.

Have questions? Join the conversation in the Bulk API 2.0 Trailblazer Community or ask on the Salesforce Stack Exchange (SFSE) using the Bulk API 2.0 tag. Your feedback and insights are invaluable as we continue to refine these capabilities and prepare for GA.

About the author

Shireen Nagdive is a Senior Software Engineer on Salesforce’s Enterprise API team, where she builds reliable, high-volume APIs that handle billions of transactions monthly. She empowers aspiring engineers and students by sharing real-world advice on social media, having built a community of over 50K members.