Reading External Data in SQL

Applies to: ❌ Data Cloud SQL ✅ Tableau Hyper API

Data Cloud SQL provides three options for using external data in SQL.

  • Copied into a table with the COPY FROM SQL command.

  • Read directly in a SQL query by using the set returning function external.

  • Exposed as if it was a table using the CREATE EXTERNAL TABLE SQL command. It can then subsequently be queried by using the name of the external table.

Usage examples of the individual alternatives can be found in the documentation of the respective statements. The choice among these ways to access external data depends on the use case.

COPY creates a copy of the data in a table. It’s functionally equivalent to an INSERT command that reads from the external function. For example, these two statements have the same effect, assuming that 'products.parquet' has the same columns as the existing table products.

The COPY statement automatically selects only the columns that are in the target table. The insert statement names the columns to insert explicitly. For formats that don’t carry schema information, such as CSV, COPY assumes that the file has the table’s schema, while external requires the schema to be given explicitly.

In general, using INSERT with external is more flexible because it can transform and filter the data before it’s inserted into the table. For example, this query inserts products with a price greater than 100 and transforms the product name to upper case.

external can also be used in CREATE TABLE AS statements to create and fill a new table. For example, this example creates and fills a products table.

The set returning function external and the CREATE TEMPORARY EXTERNAL TABLE statement can be used to query external data without inserting it into a table.

With external, you can write a fully self-contained query that doesn’t rely on external tables being set up before. CREATE TEMPORARY EXTERNAL TABLE enables reading external data as if it was a table. As a result, SQL queries can operate on a table or on an external table without changing their syntax.

CREATE TEMPORARY EXTERNAL TABLE infers the schema of the source on creation. If the source file then gets replaced by a file with a different schema, subsequent queries fail. In contrast, external re-infers the schema whenever the query containing it’s executed.