Combine Data from Multiple Data Streams with cogroup

You can combine data from two or more data streams into a single data stream using cogroup. The data streams must have at least one common field.

Example - Inner cogroup

Suppose that you want to understand how much time your reps spend meeting with each account. Is there a relationship between spending more time and winning an account? Are some reps spending much more or much less time than average? To answer these questions, first combine meeting data with account data using cogroup.

Suppose that you have a dataset of meeting information from the Salesforce Event object. In this example, your reps have had six meetings with four different companies. The Meetings dataset has a MeetingDuration column, which contains the meeting duration in hours.

Diagram showing the meeting dataset.

The account data exists in the Salesforce Opportunity object. The Ops dataset has Account, Won, and Amount columns. The Amount column contains the dollar value of the opportunity, in millions.

Diagram showing the Ops dataset.

To see the effect of meeting duration on opportunities, you start by combining these two datasets into a single data stream using cogroup.
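
A minimal sketch of this step, assuming the datasets are named Ops and Meetings as described above, and that the Meetings dataset stores the company name in a field called Company (a hypothetical field name; only MeetingDuration is named in this example):

```
-- Load each dataset into its own data stream.
ops = load "Ops";
meetings = load "Meetings";

-- Inner cogroup: group both streams on the common field.
-- 'Company' is an assumed field name for the Meetings dataset.
q = cogroup ops by 'Account', meetings by 'Company';
```

Because this is an inner cogroup, only values that appear in both streams produce a group in the result.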

Internally (you cannot see these results yet), the resulting cogrouped data stream contains the following data. Note how the data streams are rolled up on one or more dimensions.

Now the datasets are combined. To see the data, you create a projection using foreach:
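
As a sketch of the complete query, under the same assumptions (datasets named Ops and Meetings, and a hypothetical Company field in Meetings), the projection might look like the following. The aliases sum_Amount and TimeSpent are illustrative names, not taken from the datasets:

```
-- Load and cogroup the two data streams.
ops = load "Ops";
meetings = load "Meetings";
q = cogroup ops by 'Account', meetings by 'Company';

-- Project one row per group: the account, the total opportunity
-- amount, and the total meeting time.
q = foreach q generate ops.'Account' as 'Account', sum(ops.'Amount') as 'sum_Amount', sum(meetings.'MeetingDuration') as 'TimeSpent';
```

Each field in the projection is prefixed with the stream it comes from (ops or meetings), because after a cogroup both streams are available inside the foreach statement.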

The resulting data stream contains the sum of amount and the total meeting time for each company. The sum of amount is the combined dollar value of all the opportunities for that company.

Diagram showing the combined dataset.

Now that you have combined the data into a single data stream, you can analyze the effects that total meeting time has on your opportunities.