How many of you have felt the pain of database-driven apps? You design and build what you think is a great new app, test it with a minimal amount of data to confirm that it doesn’t have any obvious bugs, deploy it, and then slog through challenging problem investigations and architecture redesigns because your application is not as usable as you originally thought, and it doesn’t perform well with production data volumes. If only you had tested the application with a data set representing what you expect in production, you could have identified these issues before deploying the app to the world and avoided this embarrassing mess.
In this post, I’ll preview a sample app and related techniques that you can use to quickly create all the test data you need for a Force.com application or Salesforce implementation, and help you avoid disasters that stem from a lack of adequate testing.
Note: Just to be clear, this post does not necessarily refer to test data for running unit tests. It pertains to mock or fake test data, perhaps large volumes of fake test data, that you can use to confirm the usability, performance, and architectural design of a Force.com app or Salesforce implementation as your volume of production data ramps up after deployment.
Factories Should Create Representative Mock Data
Over the years, I’ve had many discussions with both new and experienced Force.com developers and architects, as well as Salesforce implementation experts, who struggle to build a process that can quickly create hundreds, thousands, perhaps even hundreds of thousands of representative fake data records to test a new application. I’ve seen some data “factory” solutions that are elegant. Others? Not so much.
In my opinion, a good data factory generates representative mock data. Here are some specific examples of what I mean by representative:
- If, in production, a field stores names with an average length of 20 characters, then your factory should produce names of similar length.
- If a field stores phone numbers, your factory should generate realistically formatted phone numbers.
- If the production database contains parent and child records with a particular average parent-to-child ratio, your factory should generate parent and related child records in a similar ratio.
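The three examples above can be sketched as a tiny generator. This is a minimal illustration in Python, not forcefactory's own Ruby code: the function names, the Gaussian length distribution, the 555-01xx phone pattern, and the default of three Opportunities per Account are all assumptions chosen for the example.

```python
import random
import string


def fake_name(rng, mean_len=20, spread=4, min_len=3):
    """Return a capitalized pseudo-name whose length clusters around mean_len."""
    length = max(min_len, round(rng.gauss(mean_len, spread)))
    letters = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return letters.capitalize()


def fake_phone(rng):
    """Return a US-style number that uses the fictional 555-01xx range."""
    area = rng.randint(200, 999)
    return f"({area}) 555-01{rng.randint(0, 99):02d}"


def fake_accounts(count, avg_children=3, seed=42):
    """Build parent records, each with a varying number of child records."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    accounts = []
    for i in range(count):
        # Uniform between 0 and 2 * avg_children, so the mean is avg_children.
        n_children = rng.randint(0, 2 * avg_children)
        accounts.append({
            "Name": fake_name(rng),
            "Phone": fake_phone(rng),
            "Opportunities": [
                {"Name": f"Opportunity {i}-{j}"} for j in range(n_children)
            ],
        })
    return accounts
```

Calling `fake_accounts(10000)` returns 10,000 parent dictionaries whose name lengths and child counts vary around the chosen averages. If your production data follows a different shape (for example, a few parents with very many children), swap in a distribution that matches it.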
Why am I stressing this point about representative fake test data? It’s important to test what you expect in production to get accurate results. For example, the average number of bytes that a field consumes per record helps determine how many disk I/Os a full scan of the underlying data structure requires. If the field sizes in your mock data differ from those in production, your test results can be far less accurate.
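To make the byte-count argument concrete, here is a rough back-of-the-envelope estimate in Python. It is a generic model, not a description of how Force.com stores data: the function name, the 8 KB block size, and the two record sizes are assumptions picked only to show the effect.

```python
import math


def estimated_scan_reads(record_count, avg_record_bytes, block_bytes=8192):
    """Estimate block reads for a full scan, ignoring block and row overhead."""
    records_per_block = max(1, block_bytes // avg_record_bytes)
    return math.ceil(record_count / records_per_block)


# Same number of records, different average record sizes.
undersized = estimated_scan_reads(1_000_000, avg_record_bytes=50)    # 6135
realistic = estimated_scan_reads(1_000_000, avg_record_bytes=400)    # 50000
```

Under these assumptions, test data with undersized records would need roughly one-eighth of the reads that production-sized records need, so a scan that looks fast in testing could be much slower after deployment.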
The parent-child and record-ownership data distributions that you test with can also affect many aspects of Force.com applications, including sharing performance. To better understand the effects of group maintenance on sharing performance, dive into Technical Enablement’s Best Practices for Deployments with Large Data Volumes paper.
To help you generate the mock test data you need for Force.com application testing, check out this blog post’s companion article, Generating and Loading Representative Test Data for Salesforce and Force.com Orgs. This article demonstrates how to build and use forcefactory, a simple Ruby on Rails app for generating test data for Account and Opportunity standard objects.
For example, at the command line, you can answer a single prompt, shown below, to create 10,000 new representative Account records.
How many fake Accounts do you want? 10000
forcefactory creates the mock data records in a local database in a matter of seconds. Once you create the records you need in your local database, the article teaches you how to export them to a CSV file and use a data-loading utility such as the free Jitterbit Data Loader for Salesforce to import those records into your org using both SOAP API and Bulk API calls under the hood.
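The export step can be pictured with a short Python sketch. This is not the procedure from the companion article, which works from forcefactory's local database. The function name is hypothetical, and the sketch assumes the records are already in memory as dictionaries keyed by the standard Account fields Name and Phone.

```python
import csv
import io


def accounts_to_csv(accounts, fieldnames=("Name", "Phone")):
    """Serialize account dictionaries to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fieldnames.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(accounts)
    return buffer.getvalue()
```

Writing the returned text to a file opened with `open("accounts.csv", "w", newline="")` gives you a file whose header row a data-loading utility can map to the fields of the target object. The `csv` module takes care of quoting values that contain commas, such as company names.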
Along the way, the article provides practical examples of several best practices related to Force.com data management, such as where to do your testing, and how to practice and tune data loads.
You can easily clone and customize forcefactory to suit your needs. Supplement the existing factories with additional fields. Create entirely new factories for other objects in your org using the sample factories as examples. And if you’d like to contribute your work to the community, please send me a pull request to have your work considered.
Related Resources
- Architect Core Resources
- Best Practices for Deployments with Large Data Volumes
- Designing Record Access for Enterprise Scale
- Jitterbit Data Loader for Salesforce
- forcefactory (on GitHub)
About the Author
Steve Bobrowski is an Architect Evangelist within the Technical Enablement team of the salesforce.com Customer-Centric Engineering group. The team’s mission is to help customers understand how to implement technically sound Salesforce solutions. Check out all of the resources that this team maintains on the Architect Core Resources page of Developer Force.