Newer Version Available
Null Operators
Null operators return true or false.
| Operator | Description |
|---|---|
| is null | True when the value is null. |
| is not null | True when the value is not null. |
Use is null and is not null in projections and in post-projection filters. You can't use them in pre-projection filters.
For example, display all the accounts that your reps have met with at least once.
1q = load "Meetings";
2q = group q by 'Company';
3q = foreach q generate 'Company' as 'Company', sum('MeetingDuration') as 'TotalMeetings';
4
5--filter out fields with no meetings
6q = filter q by 'TotalMeetings' is not null;Or, you can use case to replace null values with a value of your choice.
1q = load "dataset";
2q = foreach q generate (case when Name is null then "john doe" else Name end) as Name;This example is not valid because you can't use is not null or is null before a projection:
1a = load "dataset";
2a = filter a by Year is not null;
3a = foreach a generate Name as Name, Year as Year;Use is null with cogroup
A left outer cogroup combines the right data stream with the left data stream. If a record on the left side does not have a match on the right, the missing right value is null in the resulting data stream.
For example, suppose that you have a Meeting data set containing information about your rep's meetings with each account. You want to see all accounts that reps have not met with. Use a left outer cogroup between Ops and Meetings, then use is null to filter results.
1ops = load "Ops";
2meetings = load "Meetings";
3q = cogroup ops by 'Account' left, meetings by 'Company' ;
4q = foreach q generate ops.'Account' as 'Account', sum(ops.'Amount') as 'sum_Amount', sum(meetings.'MeetingDuration') as 'TimeSpent';
5
6--use is null to get records with no time time spent
7q = filter q by 'TimeSpent' is null;