Data Cloud SQL Syntax

SQL statements are represented as strings. A SQL statement string consists of tokens. A token can be a keyword, an unquoted or quoted identifier representing a column name or a table name, a literal (or constant), or a special character symbol. Tokens are normally separated by whitespace.

For example, this statement is a valid SQL input.

In this example:

Tokens such as SELECT and WHERE in the preceding example are keywords, that is, words that have a special meaning in SQL. Keywords provide the structure of the SQL commands. For example, the preceding SELECT example contains three parts.

  1. The list of selected columns: the SELECT clause (in this case *)
  2. The list of tables: the FROM clause
  3. The applied filters: the WHERE clause

These FROM and WHERE keywords are examples of clauses. For more information about clauses, see Clauses reference. SELECT is an example of a SQL command.

Keywords share syntax as unquoted identifiers. One common convention is to write keywords in upper case and identifiers in lower case. This example shows the keywords in upper-case and the identifier my_table in lowercase.

The tokens Customers and "Discount" in the provided example are identifiers. These tokens identify names of tables, columns, or other database objects. These tokens are called “names”. In Data Cloud, tables are the objects that you can query, including data model objects (DMOs), data lake objects (DLOs), and calculated insight objects (CIOs). To learn about which objects are queryable in Data Cloud, see Query Data in Data Cloud. For more information on object in Data Cloud, see Object Model in Data Cloud.

For table and column names, use API names. In Data Cloud, table and column names correspond to the object and field API names. Also, column names end with the __c suffix.

There are two types of identifiers: unquoted identifiers, such as Customers, and quoted or delimited identifiers, such as "Discount".

Quoted identifiers keep their case, while unquoted identifiers are lowercase.

Because Data Cloud names usually include uppercase characters in table and column names, quote all Data Cloud names. We recommend that you always quote your table names and column names.

To quote an identifier, enclose table names and field names in double quotes. For example, in this query, the ssot__Individual__dlm data model object (DMO) name and field names are enclosed within double quotes.

Quoted identifiers are enclosed in double quotes (") and can contain any character. To include a double quote, write two double quotes. You can use special characters like tabs, protected whitespace, or newline characters without escaping them.

Unquoted identifiers start with a letter (non-Latin letters are also accepted) or an underscore. Subsequent characters in an identifier can be letters, underscores, or digits (0–9).

Unquoted identifiers are always folded to lowercase. For example, the identifiers MyCustomers, myCustomers, and "mycustomers" all represent the lowercase name mycustomers.

These examples show the differences in case sensitivity.

Unquoted identifiers can easily collide with keywords. For example, if you name your column “column”, use a quoted identifier. Data Cloud SQL interprets the unquoted column identifier as a keyword.

A string constant in SQL is an arbitrary sequence of characters bounded by single quotes ('). For example, 'This is a string'. To include a single quote character within a string constant, write two adjacent single quotes. For example, 'Dianne''s horse'.

Two string constants that are only separated by whitespace with at least one newline are concatenated and treated as if the string was written as one constant.

These two examples are equivalent.

This example isn’t valid syntax.

Data Cloud SQL and Tableau Hyper API also accept string constants with escaped characters, which are an extension to the SQL standard. An escaped string constant is specified by writing the letter E in uppercase or lowercase before the opening single quote. For example, E'foo'. Within an escape string, a backslash character (\) begins a C-like backslash escape sequence. The sequence uses a combination of a backslash and the characters in this table to represent a special byte value.

Backslash Escape SequenceInterpretation
\bbackspace
\fform feed
\nnewline
\rcarriage return
\ttab
\o, \oo, \ooo (o = 0–7)octal byte value
\xh, \xhh (h = 0–9, A–F)hexadecimal byte value
\uxxxx, \Uxxxxxxxx (x = 0–9, A–F)16 or 32-bit hexadecimal Unicode character value

To include a backslash, escape it with a backslash (\\). Also, a single quote can be included in an escape string by writing \' or ''.

A dollar-quoted string constant consists of a dollar sign ($), an optional “tag”, another dollar sign, the string content, a dollar sign, the same tag that began this dollar quote, and a dollar sign. For example, here are two different ways to specify the string “Dianne's horse” by using dollar quoting.

Numeric constants are accepted in these formats.

digits is one or more decimal digits (0 through 9). Always include at least one digit before or after the decimal point, if one is used. Always include at least one digit after the exponent marker (e), if one is present. There can’t be any spaces or other characters embedded in the constant. Any leading plus or minus sign isn’t considered part of the constant. It’s an operator applied to the constant.

These are some examples of valid numeric constants.

A constant of an arbitrary type can be entered using any one of these notations.

The :: and CAST() syntaxes can be used to specify run-time type conversions of arbitrary expressions, as described in Type Conversions.

The string constant can be written by using either regular SQL notation or dollar quoting.

An operator name is a sequence of characters from this list.

-- and /* can’t appear anywhere in an operator’s name because they’re taken as the start of a comment.

A comment is a sequence of characters beginning with double dashes and extending to the end of the line, for example:

A comment is removed from the input stream before further syntax analysis and is effectively replaced by whitespace.