With the Summer ’10 release, we have added the ability to retrieve the original data that you sent in a Bulk API batch. We keep this data for 7 days, just as we keep the job info, batch info, and batch result files.
Because you can now directly retrieve the original data, it has become easier to do things like building a CSV file of all records with errors. You can do this without keeping track of the mappings between the original data set and the batches you submitted to the job.
Here is a very simple Java sample showing how you can do this:
You’ll need wsc-19 and a v19 partner jar to run it. Currently the WSC site doesn’t have the partner jar, but you can compile it yourself using WSC and a downloaded partner WSDL file from your org.
- Download a partner WSDL file by logging into your org and going to Setup -> Develop -> API -> Partner WSDL.
- Run something like: java -cp wsc-19.jar com.sforce.ws.tools.wsdlc `pwd`/partner.wsdl `pwd`/partner-19.jar
- Compile the class above and execute it from the directory containing the compiled class with: java -cp .:wsc-19.jar:partner-19.jar GetErrors [sfdc_username] [sfdc_password] [jobId]
- The errors will be downloaded into errors.csv. The last column in the file will contain the error message for each row.
Disclaimer: I hacked this together pretty quickly and I haven’t done much testing. This is just meant as an example. If you put this into use, make sure you review and test your code thoroughly.
Also, note that the records will not be ordered the same way as in your original data set.
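To make the core idea concrete, here is a minimal sketch of just the merge step, using only the standard library. It is not the GetErrors class, and it makes no calls to WSC or the Bulk API. It assumes you have already downloaded one batch's request CSV and result CSV as lists of lines, that the result file has "Success" and "Error" columns, that result rows come back in the same order as the request rows within that batch, and that no field contains an embedded line break. The names ErrorRowMerger, parseLine, quote, and mergeErrors are hypothetical, and the rows in main are made-up sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ErrorRowMerger {

    // Splits one CSV line into fields, handling double-quoted fields and "" escapes.
    // Assumption: no field contains an embedded line break.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // escaped quote inside a quoted field
                    i++;
                } else if (c == '"') {
                    inQuotes = false; // closing quote
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    // Wraps a value in double quotes, doubling any quotes it contains.
    static String quote(String s) {
        return "\"" + s.replace("\"", "\"\"") + "\"";
    }

    // Pairs request row i with result row i (assumed to line up within one batch)
    // and returns the request header plus every failed row, each with the
    // error message appended as the last column.
    static List<String> mergeErrors(List<String> requestLines, List<String> resultLines) {
        List<String> out = new ArrayList<>();
        if (requestLines.isEmpty() || resultLines.isEmpty()) {
            return out;
        }
        List<String> resultHeader = parseLine(resultLines.get(0));
        int successIdx = resultHeader.indexOf("Success");
        int errorIdx = resultHeader.indexOf("Error");
        if (successIdx < 0 || errorIdx < 0) {
            throw new IllegalArgumentException("Result CSV lacks Success/Error columns");
        }
        out.add(requestLines.get(0) + "," + quote("Error"));
        int rows = Math.min(requestLines.size(), resultLines.size());
        for (int i = 1; i < rows; i++) {
            List<String> result = parseLine(resultLines.get(i));
            if (!"true".equalsIgnoreCase(result.get(successIdx))) {
                out.add(requestLines.get(i) + "," + quote(result.get(errorIdx)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for one batch's request and result files.
        List<String> request = Arrays.asList("FirstName,LastName", "Ann,Lee", "Bob,");
        List<String> result = Arrays.asList(
            "\"Id\",\"Success\",\"Created\",\"Error\"",
            "\"003000000000001AAA\",\"true\",\"true\",\"\"",
            "\"\",\"false\",\"false\",\"REQUIRED_FIELD_MISSING:Required fields are missing: [LastName]\"");
        for (String line : mergeErrors(request, result)) {
            System.out.println(line);
        }
    }
}
```

Running this merge once per batch and appending the output to a single file (writing the header only once) would give you something like the errors.csv described above. Because each batch is processed on its own, this also shows why the combined file need not follow the order of your original data set.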