Today, we sit down and talk with Aaron Crosman, a Specialist Leader at Attain, about a variety of topics including his experience with PHP and some projects they’re working on in Open Source Commons. We take a deep dive into how Snowfakery fits into the Data Generation Toolkit and some challenges in integrating Drupal and Salesforce. 

Aaron initially got into Drupal as part of his first job out of college working at a nonprofit. Drupal, a content management platform built on PHP, is one of the older and more established platforms. But as the web has gotten more powerful and more sophisticated over the years, so has Drupal.

Show Highlights:

  • How he became a web developer and got involved in Drupal
  • Working with PHP as a modern language
  • Challenges in integrating Drupal and Salesforce
  • How they wrap the API from Drupal to Salesforce
  • How he got involved with the Open Source Commons
  • His first impression of the Open Source Sprints
  • How much time and work it would take to do data generation
  • Some projects they’re working on with the Sprint
  • How Snowfakery fits into the Data Generation Toolkit

Episode Transcript

Aaron Crosman:
Certainly by the time I got to college and I took a course or two, I really enjoyed the patterns of thinking it encourages.

Josh Birk:
That is Aaron Crosman, a specialist lead over at Attain. I’m Josh Birk, your host for the Salesforce Developer podcast. And here on the podcast, you’ll hear stories and insights from developers, for developers. Today, we sit down and talk with Aaron about a variety of topics. We’re going to talk a little PHP. We’re going to get back into some of the Open Source Commons that we’ve been talking about recently, but we’re really going to do a deep dive into one of my favorite projects, Snowfakery and data generation in general.

Josh Birk:
For now, however, we are going to start where we left off with that open quote and talking about how economics fits into conversations that Aaron has with his clients.

Aaron Crosman:
It certainly has when I work with clients who… I work with a lot of nonprofits. And so when I’m working with clients who have some kind of economic development aim, being able to kind of talk about some of those basic, you know, understand the basic economic principles they’re talking about. And again, to be able to be part of that, like, yeah, yeah, yeah. Economists say those things all the time, but yes, I agree. There’s this fundamental problem in their thinking about these assumptions that are built into the field and very useful for certain things, but not necessarily actually guiding the way humanity operates.

Josh Birk:
Fascinating. [crosstalk 00:01:28]

Aaron Crosman:
So it is actually, yeah. I find it useful in those contexts to do it. And I just find it interesting. I listen to a lot of Freakonomics podcasts and Planet Money out of NPR and a bunch of that kind of stuff. [crosstalk 00:01:38]

Josh Birk:
Nice. It’s interesting. What you’re describing kind of reminds me of a mental map of somebody who’s like doing consulting work, like sussing out the, yeah, those statements sound really good, but by the way, we’re not building that. So that’s a fascinating overlap. And you have a lot of experience with Drupal. How did you first start getting involved with that?

Aaron Crosman:
I got into Drupal as part of my first job out of college. I was working at a nonprofit and they kind of looked at the new kid and said, “Hey, there’s this web server over there that none of us want to deal with, go deal with it.” And so I became a web developer. And from there it was a flat file website and that was driving me crazy. And there was this exciting new world of WordPress, Drupal and other neat tools that would do these things for you much more gracefully.

Aaron Crosman:
And so I started… There was good reasons to be picking Drupal at the time. It’s a different set of good reasons to pick it now, but there were good reasons that aligned with our work at the time at that organization. And it was free because I had a budget of my time for the very first project.

Josh Birk:
Oh. Nice.

Aaron Crosman:
So yes. What can Aaron get done by Friday? And so I needed tools that could do that. And those versions of Drupal were well positioned for setting that up. And so I rolled out a couple of projects that way and then built from there.

Josh Birk:
And I suppose we should level set what precisely is Drupal?

Aaron Crosman:
Drupal is a content management platform built on PHP. It is one of the older and more established platforms, but it has evolved with the web over the last 20 years. And so as the web’s gotten more powerful and more sophisticated, so has Drupal. And it is kind of an enterprise-class content management system these days that can handle most of your needs directly and cross-connect to your other systems when it can’t.

Aaron Crosman:
So in my current work, I end up doing Drupal to Salesforce integrations.

Josh Birk:
Gotcha.

Aaron Crosman:
So you can have your web portal for your membership organization be part of your website with Salesforce backing you up. And then the two systems are somewhat divorced. If you’re ready to make major changes to one, you don’t have to make major changes to the other.

Josh Birk:
Gotcha.

Aaron Crosman:
And some other nice advantages like that.

Josh Birk:
So I want to poke your brain a little bit about that, but first I kind of want to challenge, I think, presumptions that people might have. So when people think enterprise class web application, I don’t think a lot of people jump to PHP these days. But how do you find the language in its modern form, working with an application like that?

Aaron Crosman:
I mean, PHP has grown up a lot and it is now a modern language. Over the last 20 years, it’s gone from kind of being a toy hobby language that encouraged people to do really terrible things. And I did some of those really terrible things and there’s no shortage of examples of terrible things done in PHP, into a language where you can really do kind of whatever you need to do and handle modern object-oriented programming and event-driven architectures and, and, and-

Josh Birk:
Gotcha.

Aaron Crosman:
… without the kind of security weaknesses that came with it. It still has… Every language has its weird issues that make it somewhat challenging to work with. It’s a little less consistent than I would like. But Python has features that aren’t included because they never wanted to make the language too complex. JavaScript has every feature you’ve ever thought of in 16 versions of it. So you can do the same thing 9,000 ways and everybody does.

Josh Birk:
Right. Yeah. I was just going to say like, no JavaScript developer should like feel too proud of themselves because everything you just said about PHP is kind of like also true of JavaScript and any language that’s been around that long. And what I had found, I remember distinctly having a project that I was working on that was pure PHP, but it was pure PHP in two modes and one was kind of like the server/API. And then the other part was this. The company that we were working with called it their modular plugin subsystem, something like that.

Josh Birk:
It was literally the most random selection of PHP script I have ever seen in my life. And it just taught me. It’s like PHP is the language that you get out of it, what you put into it. Like it’s so easy to write bad PHP, but you don’t have to. And then when you see those examples, it’s like, nah, that’s just, yeah.

Aaron Crosman:
And I think PHP got maligned because for the first decade of its life, so much of the examples out there on the web to teach it were bad examples.

Josh Birk:
Right.

Aaron Crosman:
Were insecure, were lazy, were using every feature the fastest way to teach somebody, not the best way to make it effective.

Josh Birk:
Yeah. Yeah.

Aaron Crosman:
And so lots of people started doing bad things.

Josh Birk:
Completely agree. Any particular challenges in integrating between Drupal and Salesforce?

Aaron Crosman:
There are some pretty good tools out there that you can use across the API, but they’re different architectures, right? I mean, one is from accessing it over the API it’s, to some extent, a black box on the far side, on the Salesforce side. I mean, if you’re working in both, you do have a lot of flexibility. But once you’re passing through the API, you’ve got that abstraction layer and it doesn’t always align quite to the same object structure that Drupal wants to encourage.

Aaron Crosman:
And so there are some places where you spend time trying to realign. To say like, “How do I get this contact actually broken into four different pieces? How do I take my one event page and break it into a campaign and the six other ancillary things I need to make the campaign drive?” Those kinds of-

Josh Birk:
Gotcha.

Aaron Crosman:
… one to manys in both directions can be a little bit challenging. From my perspective, as somebody who does it a bunch, they’re exciting challenges, but they can be a little bit burdensome at times just because they’re different architectures. But there are a bunch of places where they kind of fundamentally… For all that one is a proprietary platform as a service with a large company behind it. And the other is an open source community without a large company, with only a bunch of small companies driving it.
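The one-to-many split Aaron describes, where one event page on the website becomes a campaign plus several child records on the CRM side, can be sketched roughly in Python. This is a minimal illustration, not how the Drupal Salesforce Suite module maps records; the field names and the `split_event_page` helper are hypothetical.

```python
# Hypothetical shapes: none of these field names come from Drupal or Salesforce.
def split_event_page(page: dict) -> dict:
    """Split one flat 'event page' into a parent campaign plus child records."""
    campaign = {"Name": page["title"], "StartDate": page["date"]}
    # Each ticket tier listed on the page becomes its own child record
    # that points back at the parent campaign by name.
    children = [
        {"Campaign": page["title"], "Name": tier, "Kind": "ticket_tier"}
        for tier in page.get("ticket_tiers", [])
    ]
    return {"campaign": campaign, "children": children}


page = {
    "title": "Spring Gala",
    "date": "2021-05-01",
    "ticket_tiers": ["Member", "Guest"],
}
records = split_event_page(page)
```

The reverse direction, collapsing several CRM records into one page, would need a similar function going the other way, which is part of why the mapping takes time.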

Josh Birk:
Right.

Aaron Crosman:
They actually have a lot of kind of overlap in terms of very collaborative communities that support each other and work together, that have both platforms internally share some concepts in terms of allowing you to customize and adjust and do what you need to do inside of it. So there are some places they actually really do align and play together well, it’s just a matter of making sure you’re paying attention to those and playing both platforms to their strength.

Josh Birk:
Got it. Out of curiosity, what are you using to wrap the API from Drupal to Salesforce? Do you have custom connectors or is there a library?

Aaron Crosman:
There’s a library. So Drupal has a large community of modules that are supported by the Drupal community. One of which is the Salesforce suite, which is actually maintained by a former employer of mine. Have a good relationship with the lead maintainer on that. So now and then I find issues with it, I can just shoot a note over to him and say… There’s a process to do it through the community and I put it there, but I think mine get a little more attention than others just because he has some appropriate level set of my ability, and that I’m not just whining that I’ve probably banged my head against this [crosstalk 00:09:05] all week-

Josh Birk:
Got it. Nice.

Aaron Crosman:
… before I declare it a bug.

Josh Birk:
Nice. Okay. So you work with nonprofits and higher education clients. Is that how you first got involved with the Open Source Commons?

Aaron Crosman:
Yeah, so I came in to Attain Partners to work on a particular project actually, that was a Drupal to Salesforce integration. And then was immediately encouraged by my colleagues to start working on attending the Open Source Sprints just as Open Source Commons was kind of getting set up and branded that way and becoming what it is today.

Aaron Crosman:
And so I attended my first in-person Sprint in Philadelphia, which happens to be where I’m from. And that was the last in-person Sprint pre-COVID. So I had just gotten started in it when we went to all virtual for everybody’s safety.

Josh Birk:
Nice, nice.

Aaron Crosman:
Yeah, that was that connection into that community and my colleagues who are very engaged with those projects that brought me in.

Josh Birk:
So my impression is that the Open Source Sprints have evolved pretty quickly over time. What was that one in Philadelphia like?

Aaron Crosman:
The one in Philadelphia was, I mean, again, it was my first one. So it was a first impression of how that community was operating, but it was an excellent opportunity to be working with people from across the both end users and admins and developers and other consultants that would normally be rival consultancies. We all work very hard when we’re at Sprints to be collaborative and working together, that we share freely on what we’re working with and we acknowledge the challenges that are in front of us. And we don’t pretend that they don’t exist, that we’re not sitting claiming to be smarter than everybody else in the room. So it keeps that friendly vibe and it was a great opportunity to see new projects and new ideas getting spun up and to have challenges validated.

Aaron Crosman:
So when I put the question on the wall that has become the Data Generation Toolkit project, we put these large newsprint pieces up with the question you wanted answered and people would go and label and kind of say either if they had answers, like, “Here’s the answer to your question.” And other people with the kind of concurring, “Oh, yeah. I’d like this problem solved too.” And I had fully expected my question was going to be like, there’s just these 16 tools out there, you just don’t know them. I wanted [crosstalk 00:11:32] data generation for a variety of scenarios. And I fully expected everybody to come, “No, no, no, you fool just go-

Josh Birk:
Just use this.

Aaron Crosman:
… just use this.” And instead from big, well known consultancies, I got other people going, “Yeah, that’s exactly the problem. I’m having that problem right now.” Like, “The client is calling me, bugging me to get this problem solved.”

Josh Birk:
So give me more detail on that. So first of all, I had not known that before that’s one of the… Because I know that the Sprints are like jam boards of sticky notes and all this kind of stuff. I never thought of the noun there being, so you framed it as a question. What was that question?

Aaron Crosman:
I don’t remember exactly the way I phrased it, but the question I was looking to answer was how do I generate good test data for a specific org?

Josh Birk:
Interesting.

Aaron Crosman:
I had just come off of a project with a client where we had hundreds of scenarios that needed to be tested in this web portal, driven by data from Salesforce, which means I needed hundreds of records in Salesforce that I could clone down to Drupal and verify that all the various edge cases worked. And we had no good way to do that at the time.

Aaron Crosman:
And so we had spent a whole miserable evening working. We have a spreadsheet to work out what all the scenarios were. And then we’re just sitting there like on a group call trying to keep each other entertained and awake, making up people and using the entire Marvel Universe at the time, which was fairly big enough at the time. It’s gotten big, unfortunately. But we’re putting every Marvel character and we’re looking up their relationships so we can build the right households of who’s married to who and who’s with who and all that stuff at the Marvel Universe, so that we would have enough data to test this thing.

Josh Birk:
Wow.

Aaron Crosman:
And that was the problem I wanted to solve is I never wanted to spend a whole [inaudible 00:13:30] making up people, trying to do all this stuff and be able to just say like, “Here I’ll outline what I want and let the machine go do the boring, repetitive part, because that’s what machines are good at.” Humans aren’t great at boring repetitive tasks, but machines are awesome at it.

Josh Birk:
And I’m with you because it’s like, that seems like such a common starting point for getting to development into a sandbox before you go to production, right? How can you get production worthy data to test your application? And I think I would’ve been exactly where you were like, surely somebody’s already solved this. People have been doing development on Salesforce for years at this point. But there was really nothing… Like, did anybody have a, this is how I’m currently doing it, or was everybody just stuck in rooms, creating the Marvel Universe and putting them into a database?

Aaron Crosman:
More of us than wanted to admit it, particularly [crosstalk 00:14:33] consultancies were doing collaborative pick your fantasy universe and grab everybody from it kinds of projects. There were tools, but they’re usually pretty, they’re flat. You can’t get the depth of data that you need. So if you need just a bunch of contacts, you can go grab [inaudible 00:14:53] and you can generate out a bunch of flat contacts and maybe a little bit of detail around them.

Aaron Crosman:
But to handle a four person household with all of their relationships and three different giving types and membership and a bunch of campaigns, and, and, and, there just wasn’t… nobody had a good answer. Everybody was on the there.
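The gap between flat contacts and a household with depth can be sketched in Python. This is only an illustration using the standard library, not Snowfakery or any tool mentioned here; the name lists, the giving-type labels, and the `make_household` helper are all made up for the example.

```python
import random

FIRST_NAMES = ["Jane", "Luis", "Amara", "Wei", "Priya", "Tom"]  # illustrative only
LAST_NAMES = ["Smith", "Garcia", "Okafor", "Chen", "Patel"]     # illustrative only
GIVING_TYPES = ["one-time", "recurring", "in-kind"]             # hypothetical labels


def make_household(rng: random.Random, size: int = 4) -> dict:
    """Build one household whose members share a last name and link to each other."""
    last = rng.choice(LAST_NAMES)
    members = [
        {"first": rng.choice(FIRST_NAMES), "last": last, "gifts": []}
        for _ in range(size)
    ]
    for member in members:
        # Every member gets between zero and three gifts of a random type.
        for _ in range(rng.randint(0, 3)):
            member["gifts"].append(
                {"type": rng.choice(GIVING_TYPES), "amount": rng.randint(5, 500)}
            )
    # Relationships are index pairs: each member is related to every other member.
    relationships = [(i, j) for i in range(size) for j in range(i + 1, size)]
    return {"name": f"{last} Household", "members": members, "relationships": relationships}


household = make_household(random.Random(42))
```

Even this toy version has three levels (household, member, gift) plus cross-links, which is the kind of depth a flat contact generator cannot produce.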

Aaron Crosman:
There were some people who had talked about some very complex stuff they built, but it was all very specialized. For this one client who had a bottomless pit of money, we built up a whole giant Apex suite that generated all the data for, kinds of answers.

Josh Birk:
Wow.

Aaron Crosman:
And I’ve seen others of those. Like when I’ve talked about this in other contexts, people are like, “Well, I just did it in Apex once.” You look at Apex, you’re like, “Wow, that does work on this org.”

Josh Birk:
Right? Exactly.

Aaron Crosman:
No other org in existence has exactly this collection of things. And your Apex is now… And there are some Apex-based solutions, but also you then hit your governor limits, you can only run so much.

Josh Birk:
Right. And it’s not a trivial amount of Apex, I would think.

Aaron Crosman:
It’s rarely a trivial amount of Apex. There’s some pretty large packages out there that do it. And then yeah, you talk about large Apex and then on the scale of data, you might want small amounts of data.

Josh Birk:
Yeah. Because a lot of times when I think of like that kind of data generation, I think of like test factories and things like that, but test factories are on the opposite end of that spectrum. Right? Like a test factory is like, oh, you unit test. In order for you to operate, you need these 10 accounts to work. But what you’re describing is like truly complex and also realistic data, like data that has values to it, that your application’s going to work against.

Aaron Crosman:
Right. I mean, for those 10 accounts in my test factory, I don’t care if they’re named account one through 10. That’s perfectly good, because no human’s ever going to interact with them.

Josh Birk:
Right.

Aaron Crosman:
But for test data-

Josh Birk:
Maybe not even see it.

Aaron Crosman:
Right. But for test data that you want to actually use to test with a client and use in a demo, and even just for yourself, it’s reassuring to see that these things are real-ish. That Jane Smith is actually going to fit in the field on the page layout. And you could put people in with long names and the short names and all that kind of variety so that you can see it.

Josh Birk:
Yeah. Okay. So then that moment where you’re putting the sticky on the board, that’s the moment Paul Prescod’s referring to when he was like, “At this Sprint, Aaron Crosman basically came up with the concept for Snowfakery.” And you kind of put all of this into motion.

Josh Birk:
What’s it like to go from that moment with the Commons and the Sprints, to go from that moment… And maybe this is way too broad of a question, but it’s like, to a repo that people can just install and start generating data? Like what’s that timeline look like to go from the challenge statement to something people can actually use?

Aaron Crosman:
So that first week, that first Sprint, we just kind of got the proof of concept and I love that by the next Sprint, two major things had happened. One is Paul decided that we were crazy and it couldn’t be done and then proved himself wrong. I love that frame that he puts on that because it’s just one of those things that just, it tickles me just the right way that-

Josh Birk:
Right, right.

Aaron Crosman:
… something about that is just wonderful that he set out to prove us wrong. And it was like, “Oh, no, this can be done. Oh, look at this. I now own an open source project.”

Josh Birk:
Congratulations.

Aaron Crosman:
Congratulations. And showed up at the next Sprint with this tool that could do the vast majority of what we had kind of outlined as wanting done.

Josh Birk:
Really?

Aaron Crosman:
And he’s continued to improve it and improve its flexibility, but really even that very first version at the next Sprint, which was the first virtual Sprint, they’d canceled the one in Atlanta, which is where I would have met him. I’ve still never met Paul in real life, just at virtual events-

Josh Birk:
Oh man.

Aaron Crosman:
… and other similar settings. It was both amazing to see it done, incredibly flattering that they had kind of risen to that level of importance that kind of all the various… Because there’s lots of great ideas that get floated at Sprints. And the community does a lot of really excellent work, but not everything becomes a tool that Salesforce is putting resources behind.

Josh Birk:
Right.

Aaron Crosman:
A lot of it continues to be supported by the community, which is great and is a wonderful thing, but it was a little bit shocking to go from in that context and then to, “Well actually, we have one of our developers working on this.” And I’m going, “Oh wow.”

Josh Birk:
Oh wow.

Aaron Crosman:
“This is really cool.”

Josh Birk:
Yeah.

Aaron Crosman:
So we then kind of moved to the next layer of challenge, which was we had this tool, but the documentation was thin in the first version and the examples were nonexistent because Paul was doing what he could do, but he was, at that point, a one man band.

Josh Birk:
Right.

Aaron Crosman:
And hadn’t brought anybody else along yet. They were using it for dot org products. But hadn’t really gotten very far beyond that, which is, you know, they’d gone a huge distance in six months.

Josh Birk:
Yeah. Right, right.

Aaron Crosman:
But they weren’t all the way done. And so we started to look at some other questions that the data gen group has been working on in terms of helping people move data around. And so there was some work doing that, and then starting to write the sample recipes so that people could not have to read the documentation and simply figure it out from documentation, but be able to look at actual tutorials and samples of, this is an example that does it in a nonprofit, in NPSP, in EDA.

Josh Birk:
Gotcha.

Aaron Crosman:
It can generate actual data that you might actually use for testing.

Josh Birk:
Got it.

Aaron Crosman:
And so we’ve been steadily now building up this collection of resources and sample recipes, and they now live in their own repository. There’s a whole Snowfakery recipe templates repository, that’s its own collection of things. And this most recent Sprint, that was most of where the time went, was into expanding that. And that also cycles back around to giving Paul feedback, because that group will hit conditions that he hasn’t encountered yet or challenges he hasn’t encountered just because they’re not part of the product development life cycle in house, or the kind of the consulting scenarios where we have demos for clients and we want demo data or we want kind of one off… Or an org, not necessarily one off, but an org that is very specific to the client and customizing to them, which aren’t his scenarios and aren’t the Salesforce staff scenarios. And that’s absolutely fine.

Josh Birk:
Right.

Aaron Crosman:
But the tool can do it all, and so we need to have examples of it.

Josh Birk:
Yeah. And I think that’s one of the, like a lot of the projects in the Commons are very nuanced and obviously great if you are an animal shelter and you want an animal shelter CRM. But this has the advantage of a problem, which is almost universal. And as you were just describing, has its own people have variations of that original problem. How else are you using it professionally? You described a couple of scenarios, are there any others?

Aaron Crosman:
At the moment, I have mostly used it in Salesforce contexts.

Josh Birk:
Okay.

Aaron Crosman:
So things like demoing NPSP, but I did it recently for an organization that is very geolocated. They’re very concerned about their local community. And so we made sure all the people we generated for their example donors, all live nearby. So we could tailor it so that everybody was generated to live in cities and in the state where the organization was based because that’s their audience.

Aaron Crosman:
So when we did the demo, it looked like people, not just real-ish people, but people they could actually… towns they knew and the addresses were dropping pins into places they recognized and stuff like that. So it gave that level of assurance that this system can really look like your audience.

Aaron Crosman:
I’ve also looked at how to wire it into Drupal. And I have a couple of pull requests out, open on Snowfakery itself to get a couple of features added that I have to go back to. Paul’s made some great revisions and feedback and I have to get back and do my part on the open source project. But to be able to feed into Drupal, because Drupal has a built in data generator, but it’s also, it’s very good at the depth, it’s very bad at the real-ish part, just dumps-

Josh Birk:
Gotcha.

Aaron Crosman:
You ask for a string and it gives you a string of letters.

Josh Birk:
Stuff.

Aaron Crosman:
And you want a range of lengths and it’s evenly distributed across the range. And you’re like, “Well, my first names range from three characters to 256 characters. Evenly distribute it across that.” So most of them are in the 100 range and you’re like, “Well, most people’s names are really in like the five to 10 range.”

Josh Birk:
Five.

Aaron Crosman:
The outliers are there, but I don’t need a whole database of outliers.
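The arithmetic behind that complaint is easy to check with a short Python sketch. It does not reproduce Drupal's generator; the bell-curve alternative and its parameters (centered on 7 characters with a spread of 2) are assumptions chosen only to illustrate the five to 10 range Aaron mentions.

```python
import random
import statistics

rng = random.Random(0)

# Uniform lengths across the full allowed range, as a naive generator might pick them.
uniform_lengths = [rng.randint(3, 256) for _ in range(10_000)]


def realistic_length(rng: random.Random) -> int:
    """Hypothetical alternative: bell curve near typical name lengths, clamped to 3..256."""
    return max(3, min(256, round(rng.gauss(7, 2))))


realistic_lengths = [realistic_length(rng) for _ in range(10_000)]

mean_uniform = statistics.mean(uniform_lengths)      # lands near the midpoint of 3..256
mean_realistic = statistics.mean(realistic_lengths)  # lands near 7
print(mean_uniform, mean_realistic)
```

With uniform sampling the average length sits near 129.5, the midpoint of 3 and 256, so most generated names are far longer than any real first name.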

Josh Birk:
Right. So compare that, like going through the exercise of putting together a realistic set of data, to be able to demo, compare that to the long night with Marvel characters. How much easier is it now?

Aaron Crosman:
It’s so much easier. It can reasonably be a one-person project as opposed to a team keeping each other awake. And to really customize all the way down to all the nuance for a specific org, it takes me about the same amount of time, but it’s just me, not four or five people. I’m getting faster so that time’s coming down as the tool gets better and I get smarter, and it’s repeatable. That we can pump it into one org and then say, “Okay, full copy sandbox. We want 10,000 example people in it and for the developer org, we just need 50.”

Josh Birk:
Gotcha.

Aaron Crosman:
And we can take the same recipe and run it over and over and over again to each of the sandboxes at different volumes. We had a client where we needed to test something. They needed some test data to just kick the tires and poke the features. And then the same client, we built a feature and we’re looking at it going, “We need to volume test this at your full capacity.”

Aaron Crosman:
And so we took the same recipe and went from creating 50 or 100 sample contacts, to 1.5 million.

Josh Birk:
Oh, wow.

Aaron Crosman:
And every one of those was a donor because that was what we were testing, was the volume against their opportunities. And so we could fire hose that much data, but at that point, it was a matter of changing the parameters in the call once I built the recipe.

Josh Birk:
Right, right.

Aaron Crosman:
In a matter of 30 seconds, I could trigger again, vastly larger sizes.
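The idea of one recipe run at different volumes can be sketched in plain Python. This is not Snowfakery's recipe format or command line; `generate_contacts` and its fields are hypothetical, and the point is only that the volume is a parameter rather than part of the definition.

```python
import random
from typing import Iterator


def generate_contacts(count: int, seed: int = 0) -> Iterator[dict]:
    """Yield `count` fake contacts lazily so large runs need not fit in memory."""
    rng = random.Random(seed)
    for i in range(count):
        yield {
            "id": i,
            "last_name": rng.choice(["Smith", "Garcia", "Chen"]),
            "is_donor": True,  # every generated contact is a donor in this sketch
        }


# A small run, the size a developer org might need.
dev_org = list(generate_contacts(50))

# The same definition at a larger volume; only the parameter changes.
big_run_size = sum(1 for _ in generate_contacts(100_000))
```

Because the generator is seeded, rerunning it with the same arguments yields the same records, which is what makes the process repeatable across sandboxes.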

Josh Birk:
And then go get a cup of coffee.

Aaron Crosman:
And go get a cup of coffee, although only one. It wasn’t like [inaudible 00:26:04] go to dinner and put your feet up and watch six hours of television. It was, “Okay, it’s running. It’s going to take it a couple of minutes.” But it’s pumping across the Bulk API and doesn’t take that much time to generate that much data. So it’s not bad.

Josh Birk:
Nice.

Aaron Crosman:
Even at those million plus volumes.

Josh Birk:
Mm-hmm (affirmative). Nice. Tell me a little bit more as we’re taping this, you have just gotten “back from an open source Sprint,” which was virtual again, correct?

Aaron Crosman:
That’s correct.

Josh Birk:
Yeah. Tell me a little bit more about this week’s Sprints. Like what projects went on there and what did you work on?

Aaron Crosman:
So I was working with the Data Generation Toolkit team. I’m project lead there. I tend to hang out with that team.

Josh Birk:
Gotcha.

Aaron Crosman:
I play a role of kind of making sure everybody’s able to keep moving and drop in where I’m most useful. So I don’t do as much direct work as much as supporting everybody else. We did a lot of recipe writing and got new people set up to write recipes and expand out the recipe toolkit, or recipe templates. We’re also recognizing places where there are missing features. And not actually just in Snowfakery, but actually the library it uses to generate names and to generate name… For the education data, there’s nothing that generates realistic major names or college names.

Josh Birk:
Gotcha.

Aaron Crosman:
And so we had to go, we’ve figured out how to get that handled. And the diversity on the name generation in that tool is thinner than we’d like. It’s pretty biased to people of European descent and trying to figure out how do we get a more diverse set of names so they look more reflective of the communities that people work in and are the real communities of the world.

Aaron Crosman:
And so we’ve started to figure out how we’re going to solve that problem. We didn’t get those problems solved, but we’ve recognized them and figured out how to address them.

Josh Birk:
Gotcha.

Aaron Crosman:
The larger Sprint also includes a wide range of projects that are everything from NPSP videography, in terms of training videos, to a diversity initiative to help monitor and improve the diversity of the Sprints themselves and the open source and nonprofit community that’s engaging with Salesforce. There’s projects that have their own managed packages coming out. So there’s now an events package called Summit Events that has just cleared security review and is going to be a managed package available through the community that was entirely built by folks at the Sprints engaging between Sprints so that they can keep it moving.

Aaron Crosman:
But that’s, I think, one of the first big packages that was entirely built there, that is really coming to full fruition without having become an in-house product or connected to an in-house team. I mean, many things have stayed open source, but are now supported directly. And Summit Events is currently going to come out supported by the community.

Josh Birk:
Nice.

Aaron Crosman:
And DLRS, the Declarative Lookup Roll-up Summaries engine has moved into the Commons. It’s always been an open source project, but it has now been absorbed in. And so there’s a group trying to figure out how to take on a large existing code base, improve it, move it forward, deal with all of the other challenges that come with understanding what is a very good code base, but large and complicated. And it takes a while to learn it.

Josh Birk:
Right. Right. And that’s the project from Andy Fawcett, who I’m still trying to get on the mic. He’s been a very busy man to talk more about that in depth. But when I was researching for Andy’s interview, that was one of the first things that became very obvious, is DLRS, or Dolores, is big. It’s a big, big application. And it’s just a great story that he’s able to make it [inaudible 00:30:01] easier and faster by being able to pull that into the Commons.

Josh Birk:
Now you mentioned a couple times the Data Generation Toolkit, what precisely is that? And how does Snowfakery fit into it?

Aaron Crosman:
So the Data Generation Toolkit project got named that way because we recognized pretty quickly that it was going to be more than a tool. And Snowfakery is a key piece of the overall work that we do in that group.

Aaron Crosman:
But recognizing that not only is there that tool that needs to be there, but then there are those samples that need to be around to support it. There’s also just documentation and training of the kind of, how else can I move data around, fake data? How do I sample data out? And we have started a little of that. And there’s a couple of articles we wrote that are fairly substantive about kind of laying out scenarios about how can you do it? Just kind of before we had Snowfakery, of how do you do it using Excel and VLOOKUPs to sample your data down.

Aaron Crosman:
We want to do some more of those that are using some of the other tools that are out there, particularly for that kind of sampling scenario of, I have data in my production org, how do I get that data properly sampled? And the tools for that keep improving all the time. And that’s been actually one of our challenges, keeping up. Every time somebody sits down to write one, they’re like, “Oh, but I found six more. And Salesforce just added this and Gearset added that. [inaudible 00:31:26] write this other one.”

Josh Birk:
Right.

Aaron Crosman:
And those are improving. But those all require that you have a production org with a lot of data, which for many people is very useful. For consultants, in particular, the SIs we tend to be much more, I have an empty org that I’m about to put your historic data into just as soon as the migration team gets it there.

Josh Birk:
Because I’m going to break these things over here where you’re not.

Aaron Crosman:
Right. And so I need to get this other data in and I need something to test on while the migration team is still getting their data worked on so we can [crosstalk 00:32:05] parallel.

Josh Birk:
Got it.

Aaron Crosman:
And so, yeah. How do you stage all those things? And so Snowfakery is just an excellent piece of the puzzle for that piece.

Josh Birk:
Right. So it’s not just the creation of data at that point, it’s the creation, migration, utilization, all the tools you will need for the lifecycle of data in general. Is that a fair description?

Aaron Crosman:
Yeah. Yeah. And so it is this whole toolkit of how am I handling data, and particularly generated data so that I can do all the things I need to do as an admin, as a consultant, as a developer. And as I’ve recognized that those demo scenarios more also… As sales engineers, how can they be part of it and learn from that?

Josh Birk:
Got it. And that’s our show. Now I do want to point out Aaron’s got a great blog. It’s very active, talks about a variety of topics, including what we’ve been talking about today, but also some other stuff like professional career growth and other things. We will definitely have a link to that in the show notes. Highly recommend you check it out.

Josh Birk:
Now, before we go, I did ask after Aaron’s favorite non-technical hobby and well, he’s got a yarn to spin ya.

Aaron Crosman:
I spin fiber. I make-

Josh Birk:
You what?

Aaron Crosman:
… thread and yarn.

Josh Birk:
Really?

Aaron Crosman:
But not on the bicycle, on a spinning wheel. Picked it up from my mother-in-law when I was first married and-

Josh Birk:
No way. That’s one of the most unique I’ve ever heard. What’s involved in spinning? Like how… I’m just so curious now.

Aaron Crosman:
I have started as much as having raw fleece that was freshly sheared off of a friend’s sheep and washed the wool and carded it, and then spun that into yarn that I eventually knitted into hats and scarves. So I have done the full-

Josh Birk:
Wow.

Aaron Crosman:
It’s referred to as sheep to shawl because that’s the competition, but I do sheep to hat.

Josh Birk:
I want to thank Aaron for the great conversation and information and as always, I want to thank you for listening. Now, if you want to learn more about this show, head on over to developer.salesforce.com/podcast, where you can hear more episodes, see the show notes, and find links to your favorite podcast service. Thanks again, everybody. And I’ll talk to you next week.
