Newer Version Available

This content describes an older version of this product. View Latest

group and cogroup

Groups matched records. The group and cogroup statements are interchangeable. However, cogroup is typically used to operate on more than 1 input stream.

Syntax

Simple Grouping

Adds one or more columns to a group. If data is grouped by a value that’s null in a row, that whole row is removed from the result.

Syntax:

or

The order of the fields matters for limit queries, but not for top queries.

Note

Group by 1 dimension:

a = group a by year;
Group by multiple dimensions:

Inner Cogrouping

Cogrouping means that two input streams, called left and right are grouped independently and arranged side by side. Only data that exists in both groups appears in the results.

Syntax:

This example is a simple cogroup operation on 2 datasets:
You can cogroup more than 2 datasets:
This example performs a cogroup operation:
You can’t have the same stream on both sides of a cogroup operation. To perform a cogroup operation on 1 dataset, load the dataset twice so you have 2 streams.
You can also load 1 dataset and filter it into 2 different streams:
This code throws an error because it performs a cogroup operation on a single stream, a:
To use aggregate functions when cogrouping, specify which input side to use in the aggregate function. For example, if you have an a side and a b side, and each contains a particular measure, use one of these syntaxes:
This query is valid because it uses the third syntax form to specify that miles comes from the a side.
This query isn’t valid because miles doesn't specify which side it is coming from:
If a lens or dashboard has a cogroup query, specify the input stream for projections and for count() aggregations on cogroup queries, as in this example:

Outer Cogrouping

Outer cogrouping combines groups as an outer join. For the half-matches, null rows are added. The grouping keys are taken from the input that provides the value.

Syntax:

Specify left, right, or full to indicate whether to perform a left outer join, a right outer join, or a full join.

Example: z = cogroup x by (day,origin) left, y by (day,airport);

You can apply an outer cogrouping across more than 2 sets of data. This example does a left outer join from a to b, with a right join to c:

result = cogroup a by keya left, b by keyb right, c by keyc;

Outer joins return null when there is no match, instead of defaulting to zero.

Note