Reading External Data in SQL
Applies to: ❌ Data Cloud SQL ✅ Tableau Hyper API
Data Cloud SQL provides three options for using external data in SQL.
-
Copied into a table with the COPY FROM SQL command.
-
Read directly in a SQL query by using the set returning function external.
-
Exposed as if it was a table using the CREATE EXTERNAL TABLE SQL command. It can then subsequently be queried by using the name of the external table.
Usage examples of the individual alternatives can be found in the documentation of the respective statements. The choice among these ways to access external data depends on the use case.
COPY
creates a copy of the data in a table. It’s functionally equivalent to an INSERT command that reads from the external
function. For example, these two statements have the same effect, assuming that 'products.parquet'
has the same columns as the existing table products
.
The COPY
statement automatically selects only the columns that are in the target table. The insert statement names the columns to insert explicitly. For formats that don’t carry schema information, such as CSV, COPY
assumes that the file has the table’s schema, while external
requires the schema to be given explicitly.
In general, using INSERT
with external
is more flexible because it can transform and filter the data before it’s inserted into the table. For example, this query inserts products with a price greater than 100 and transforms the product name to upper case.
external
can also be used in CREATE TABLE AS
statements to create and fill a new table. For example, this example creates and fills a products table.
The set returning function external
and the CREATE TEMPORARY EXTERNAL TABLE
statement can be used to query external data without inserting it into a table.
With external
, you can write a fully self-contained query that doesn’t rely on external tables being set up before. CREATE TEMPORARY EXTERNAL TABLE
enables reading external data as if it was a table. As a result, SQL queries can operate on a table or on an external table without changing their syntax.
CREATE TEMPORARY EXTERNAL TABLE
infers the schema of the source on creation. If the source file then gets replaced by a file with a different schema, subsequent queries fail. In contrast, external
re-infers the schema whenever the query containing it’s executed.