A Deeper look at SOQL and Relationship Queries on Force.com

Introduction

SOQL is the object query language for querying data in the Force.com platform. You can think of it as an object oriented cousin of SQL, where the data model and the way to utilize joins differs from SQL. The data model in Force.com is based on objects that are related to each other by relationships. The relationship navigation is a fundamental building block in constructing queries that allow you to utilize data from different objects, thus utilizing implicit joins. In other words, the relationship syntax in SOQL helps joining data using path traversal.

This article will revisit some of the capabilities and patterns in SOQL that allow developers to conceptualize familiar concepts like joins in SQL, and puts them in perspective with respect to SOQL's syntax and capabilities. In doing so, we will also examine some of the features made available in the Spring '10 release of the platform.

A Review of Relationships in Force.com

Let's start by examining the kind of relationships that are available for building an application on the platform. There are two kinds of relationships. Both of them are 1-to-many relationships, but they differ with respect to the kind of support the platform provides for the developer. Both relationships are defined from the many-to-1 side, namely from a child to a parent.

1. Lookup Relationships:

They are very flexible. They can be used to create more complex relationships (such as m-n) as a building block.

  • Orphans are allowed in this relationship. Namely, a relationship does not have to exist between individual records even though the data model allows a relationship to occur between a child and a parent object type.
  • Sharing rules do not apply to children. As a result, the existence of a relationship does not dictate usage and visibility constraints of related records. In this manner, flexible security models can be implemented by using lookup relationships.
  • Behaviors are not implied by the relationship. The lifecycle of the parent does not affect the lifecycle of the child records. If you need to rollup a value from child records, you would have to implement it on your own.

2. Master-Detail Relationships:

These are a special case of lookup and thus come with built in semantics.

  • Orphans are not allowed in this relationship. Namely, a child record can not exist on its own.
  • Sharing rules apply to children. Thus whatever visibility is set for the parent, is also true for the children by default.
  • Special behavior is provided by the platform. This also implies a strict composition that affect the lifecycle of the children. If the parent record is deleted, the children are deleted (cascade-delete). Rollup of values and aggregation from children can also be made available on the parent declaratively.

Spring10.png As of the Spring '10 release, Master-Detail relationships allow up to 3 levels of Master-Detail relationships. Note that for containment relationships, this allows the parent to rollup aggregates from great grand children. See the release note on this topic.

In the subsequent sections, we will look at how some of the SOQL patterns can be used in detail. These become particularly handy when you need to roll out behaviors with respect to records in Lookup relationships.

SOQL Results and Data Binding in Apex Code

Programmatic access to a result set of a SOQL query naturally reflects the object oriented nature of the underlying data model. SOQL utilizes the navigation with relationships and is anchored to a particular object where the result set is obtained from, by using the FROM Clause. Consequently, the returned result set is a List of Objects that reflect the object that constitutes the domain of the SOQL query. The set also contain the list of fields that are included in the SELECT clause, including those that are traversed via the relationships. SELECT clauses can include nested SELECT statements using relationships. As a result, the result set can include nested objects within a result set of objects. This is illustrated with the running example patterns that will be covered below.

Running Example

In this article, we will use a very simplified subset of the Recruitment Application. For more information on this application, please refer to the Force.com Fundamental's book.

The application has a m-n relationship between two objects, Position and Candidate. In order to materialize the relationship, another object Job Application is used to establish the relationship as Force.com allows you create only 1-m relationships.

  • A Position may have several Job Applications from several Candidates.
  • A Candidate may have applied to several Positions.
  • Note that a Candidate may have applied to a company but not necessarily to a specific position. Therefore, there would be an application for that Candidate. Later during the course of the candidate's evaluation, an application may be related to an open position and can be processed as appropriately for that position, such as having a set of interviews, etc.

Pictorially, this is described as follows:

Relationships and Sample Data

As can be seen from the sample data, certain Positions such as Administrator has no applicants and certain Candidates, such as C-0001 do not have any assigned Positions.

SOQL Joins

Let's look at what we can do with this data with some familiar JOIN patterns borrowed from SQL. This time however, we will utilize them in SOQL and also use the power of relationship traversal in path expressions that allow us to do joins implicitly.

In each case below, we will use the pattern with respect to a query problem that we would like to express with SOQL. For Position, we will use Position__c, for Application we will use Job_Application__c, and for Candidate we will use Candidate__c in the corresponding code snippets.

Right Outer Join

Problem: Find job Applications and related Departments (from Applications).

Note that for this example, we would like to list applications including those which are not related to a Position yet. This is accomplished by an outer join.

SOQL Query:

SELECT Name, Position__r.Department__c FROM Job_Application__c

Results:

RightOuterJoin.jpg

Discussion: Note that we have utilized a path expression Position__r.Deparment__c to create a RIGHT JOIN between Job_Application__c and Position__c. The path expression allows us both to do the implicit join as well as extract the value of the Department__c field.


Left Outer Join

Problem: Find all positions with their related list of applications.

Again, we would like to list the positions regardless of whether there is a related application or not.

SOQL Query:

SELECT Name, (SELECT Name FROM Job_Applications__r) FROM Position__c

Results:

LeftOuterJoin.jpg

Discussion:

Note that we have utilized a NESTED select to obtain the records of related children from the parent. This kind of expression is very powerful in obtaining related records traversing from the 1 side in a 1-m relationship. In complex data modeling, especially with containment relationships, it is a useful pattern to obtain related information on a parent and all its children via traversing path expressions with relationship fields. In this example, we are utilizing a path expression from Position__c using the relationship, Job_Applications__r to obtain the list of applicants to create this join. This pattern will also be very useful in obtaining data from feeds in Chatter which will be discussed elsewhere. Please note that the result set is a list of positions where each position contains a list of application, reflecting the nature of the hierarchical representation of the data that is selected by this query.


Left Inner Join

Problem: Find names of all positions for which there are associated applications

SOQL:

SELECT Name FROM Position_c WHERE Id IN

(SELECT Position__c FROM Job_Application__c)

Results:

LeftInnerJoin.jpg

Discussion:

Spring10.png This is a very interesting case that illustrates a new Spring '10 feature. Previously, you would have to generate the list of Ids from Job_Application__c programmatically and test whether the Position Id existed in this list. Now, a SELECT statement in conjunction with an IN clause allows Force.com to generate the Ids directly from the Job_Application__c object in order to perform the join.


Right Inner Join

Problem: Find all applications for positions that are in the 'Sales' department.

SOQL:

SELECT Name,Position__r.Name,

FROM Job_Application__c

WHERE Position__r.Department__c = ‘Sales’

Results:

RightInnerJoin.jpg

Discussion:

In this case, we are again utilizing path expressions and relationships to allow us to get the results of a join. In addition, by constraining the value of Department__c field in Position__c (via Position__r relationship navigation) we are achieving an inner join.


Left Anti Join

Problem: Find the name of all the positions which have no job applications.

SOQL:

SELECT Name FROM Position_c WHERE Id NOT IN

(SELECT Position__c FROM Job_Application__c)

Results:

LeftAntiJoin.jpg

Discussion:

Spring10.png This use case is again utilizing a feature of the Spring '10 release, the ability to use a SELECT statement in conjunction with a NOT IN clause. See the section on Left Inner Join for the reverse example.


Right Anti Join

Problem: Find the name of all applications for which there are no associated positions.

SOQL:

SELECT Name FROM Job_Application__c WHERE Position__c = null


Results:

RightAntiJoin.jpg

Discussion:

If you look closely, we are exploiting the way relationships are expressed with objects and field values here. Since this is a join from the m-1 side, the "join" is actually obtained by testing the value of the relationship field value, Position__c, with null, thus simulating a join without traversing the relationship.


Cheat Sheet on Using Join Patterns

Below is a cheat sheet for utilizing the Join patterns using a 1-m relationship. In this case, we are assuming that the parent object is on the left and the child object is on the right as in our example above.

CheatSheet.jpg

SOQL Join Summary

The take away here is that the outer join patterns are useful when you don't want to be constrained by relationships strictly, and use anti-patterns to find records which are not included in the relationships, such as orphaned or childless records with respect to a relationship. Lookup relationships are handy when you need to include additional behaviors, thus utilizing outer joins becomes important when you need to find childless or orphaned records.

Date Functions:

The Spring '10 release introduces new capabilities to deal with data in SOQL queries by utilizing a set of new functions that deal with dates. A noteworthy list, that will allow you to select and (as needed) aggregate data:

  • CALENDER functions: CALENDAR_MONTH, CALENDAR_QUARTER, CALENDAR_YEAR
  • DAY : DAY_IN_MONTH, DAY_IN_WEEK, DAY_IN_YEAR, DAY_ONLY
  • FISCAL: FISCAL_MONTH, FISCAL_QUARTER, FISCAL_YEAR
  • HOUR function: HOUR_IN_DAY
  • WEEK function: WEEK_IN_MONTH, WEEK_IN_YEAR

Example: Following our Recruitment App, the following SOQL snippet finds all Positions that were closed in the month of February.

SELECT Title FROM Position__c WHERE CALENDAR_MONTH(Date_Closed__c) = 2

Note that by using these date functions, the possibilities for generating custom reports and aggregated results are endless.

Aggregate Results with GROUP BY and HAVING

Spring10.pngSpring '10 introduces several familiar constructs from SQL: These are GROUP BY and HAVING clauses. These additions are particularly useful in obtaining on the fly analysis and reports about the data at hand. In addition, they come in especially handy with Lookup relationships to generate rollup-like summaries programmatically using Apex.

Let's see some of these in action with respect to our Recruitment example.

GROUP BY Example

Problem: Find total number of job applications in each department as well as total number of applications in all of the departments.

SOQL:

SELECT Position__r.Department__c deptname, COUNT(id) total

FROM Job_Application__c

GROUP BY Position__r.Department__c

Results:

GroupBy.jpg

Discussion: There are two aspects of data binding we need to comment on with this example. As grouping is a new concept introduced in Spring '10, it also introduces a new way of dealing with the result set in Apex. Notice that we are selecting two different items using aliasing in the SELECT clause. The first one is a field, utilizing a relationship traversal Position__r.Department__c aliased as deptname, and the second is the total number of applications (from id field in Job_Application__c object). This aliasing allows us to use these results in accessing the result set.

The result is provided in Apex by using a new data type, AggregateResult. Note that since there will be different "fields" in this object depending on what is being returned as a result of the query, one would need ways to access each field returned by a row of the result set. The names in the aliasing comes in handy for this purpose. The AggregateResult will have two fields, deptname and total. They are accessed by the familiar accessor pattern, using a getter. This is illustrated below:

List <AggregateResult> aggrs =

[SELECT Position__r.Department__c deptname, COUNT(id) total
FROM Job_Application__c
GROUP BY Position__r.Department__c];

for (AggregateResult ja : aggrs)

{System.debug(ja.get('deptname') + ' | ' + ja.get('total'));}

SOQL now also has the capability to include HAVING clauses along with GROUP BY clause to further constrain the results of a grouped result set.


GROUP BY ROLLUP Example

Let's look at a more advanced use of a grouping clause. There are several new constructs in SOQL in Spring '10 that allow you to return aggregated results. One example is a GROUP BY ROLLUP clause. This clause not only allows us to get aggregated results that are partitioned by the GROUP BY clause; it also provides a total rollup for all the groups that the evaluated by the query.

For example, the following query not only displays the total number of applications per Department but also provides a total number applications aggregated that are subject to query constraints.

SOQL:

SELECT Position__r.Department__c deptname, COUNT(id) total

FROM Job_Application__c

GROUP BY ROLLUP(Position__r.Department__c)

Results:

GroupByRollup.jpg

Discussion: Although the query results appear to be similar to the GROUP BY example we used above, look at the last row of the results. The total number of all grouped applications is given in the result set as the values of the last row.


Other Grouping Functions

Spring '10 contains several other forms of aggregate functions for grouping that are also available in SQL, such as GROUP BY CUBE and GROUPING. These allow new ways to obtain and partition the results that was not possible before. They are especially useful for generating reports or analyzing the data in nested relationships by aggregating over the whole set or aggregating results with subsets. See the following Sections in the API documentation to learn more:

Summary

This article covers common patterns that SQL developers utilize, such as outer joins, inner joins and anti joins, as well as aggregate functions and how the same concepts apply to SOQL queries with a running example. It also covers several features introduced in the Spring '10.

References

  • Different aspects of the Spring '10 release, including the release notes, can be found on the release page.

About the Author

Dr. Umit Yalcinalp is a Developer Evangelist at salesforce.com. She is a software architect, author, editor of several Web Services and SOA standards, specification lead in Java, contributor to XML and Java specifications and frequent speaker at conferences. As one of the moms of EJB QL in the EJB specification, query languages have always been dear to her heart.