Combine Data from Multiple Data Streams with cogroup
Example - Inner cogroup
Suppose that you want to understand how much time your reps spend meeting with each account. Is there a relationship between spending more time and winning an account? Are some reps spending much more or much less time than average? To answer these questions, first combine meeting data with account data using cogroup.
Suppose that you have a dataset of meeting information from the Salesforce Event object. In this example, your reps have had six meetings with four different companies. The Meetings dataset has a MeetingDuration column, which contains the meeting duration in hours.

The account data comes from the Salesforce Opportunity object. The Ops dataset has Account, Won, and Amount columns. The Amount column contains the dollar value of the opportunity, in millions.

To see the effect of meeting duration on opportunities, you start by combining these two datasets into a single data stream using cogroup.
Internally, the resulting cogrouped data stream holds the rows from both datasets, rolled up on the dimensions that you cogroup by. Because this is an inner cogroup, only values that appear in both data streams are kept. You cannot see these results yet.
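To make the rollup concrete, here is a minimal Python sketch of what an inner cogroup does conceptually. It is not SAQL and not how the platform implements cogroup. The Company column name, the helper functions, and every row value are hypothetical, chosen only to match the shape of the example (six meetings, four companies).

```python
from collections import defaultdict

# Hypothetical sample rows. "Company" and all values are illustrative
# assumptions, not the data from the original example.
meetings = [
    {"Company": "Acme", "MeetingDuration": 1.0},
    {"Company": "Acme", "MeetingDuration": 2.0},
    {"Company": "Globex", "MeetingDuration": 0.5},
    {"Company": "Globex", "MeetingDuration": 1.5},
    {"Company": "Initech", "MeetingDuration": 3.0},
    {"Company": "Umbrella", "MeetingDuration": 1.0},
]
ops = [
    {"Account": "Acme", "Won": True, "Amount": 2.0},
    {"Account": "Acme", "Won": False, "Amount": 1.5},
    {"Account": "Globex", "Won": True, "Amount": 4.0},
    {"Account": "Initech", "Won": False, "Amount": 1.0},
    {"Account": "Hooli", "Won": True, "Amount": 2.5},
]


def group_by(rows, key):
    """Roll up rows into lists that share the same value for `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return groups


def inner_cogroup(left, left_key, right, right_key):
    """Group both streams, keeping only keys that occur in both."""
    left_groups = group_by(left, left_key)
    right_groups = group_by(right, right_key)
    shared = sorted(left_groups.keys() & right_groups.keys())
    return {k: (left_groups[k], right_groups[k]) for k in shared}


cogrouped = inner_cogroup(ops, "Account", meetings, "Company")
for name, (op_rows, meeting_rows) in cogrouped.items():
    print(name, len(op_rows), "opportunities,", len(meeting_rows), "meetings")
```

In this sketch, Umbrella has meetings but no opportunity and Hooli has an opportunity but no meetings, so neither appears in the cogrouped result.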
Now the datasets are combined. To see the data, you create a projection using foreach.
The resulting data stream contains, for each company, the sum of Amount and the total meeting time. The sum of Amount is the combined dollar value of all of that company's opportunities.
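The projection step can be sketched the same way. The Python below is only an illustration of the aggregation, not SAQL. The Company column, the output field names sum_Amount and TimeSpent, the sum_by helper, and all row values are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows (amounts in millions, durations in hours).
ops = [
    {"Account": "Acme", "Amount": 2.0},
    {"Account": "Acme", "Amount": 1.5},
    {"Account": "Globex", "Amount": 4.0},
    {"Account": "Initech", "Amount": 1.0},
    {"Account": "Hooli", "Amount": 2.5},
]
meetings = [
    {"Company": "Acme", "MeetingDuration": 1.0},
    {"Company": "Acme", "MeetingDuration": 2.0},
    {"Company": "Globex", "MeetingDuration": 0.5},
    {"Company": "Globex", "MeetingDuration": 1.5},
    {"Company": "Initech", "MeetingDuration": 3.0},
    {"Company": "Umbrella", "MeetingDuration": 1.0},
]


def sum_by(rows, key, value):
    """Total the `value` column for each distinct `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return totals


amount_by_account = sum_by(ops, "Account", "Amount")
hours_by_company = sum_by(meetings, "Company", "MeetingDuration")

# Inner semantics: emit one row per name present in both streams.
projection = [
    {
        "Account": name,
        "sum_Amount": amount_by_account[name],
        "TimeSpent": hours_by_company[name],
    }
    for name in sorted(amount_by_account.keys() & hours_by_company.keys())
]

for row in projection:
    print(row)
```

Each output row pairs one company's total opportunity value with its total meeting hours, which is the shape you need for comparing the two.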

Now that you have combined the data into a single data stream, you can analyze the effects that total meeting time has on your opportunities.