External Locations
Applies to: ❌ Data Cloud SQL ✅ Tableau Hyper API
To specify the location from which external data is read, use one of the following:
- a plain location string: a file system path or an Amazon S3 URI
- s3_location(...) for Amazon S3 locations with credentials and an optional bucket region
- azure_location(...) for Microsoft Azure Blob Storage locations
- ARRAY[ <source_location> [, ...] ] to read from a list of source locations
A plain location string can be a path in the file system of the Hyper server (not the client!) or an Amazon S3 URI, for example s3://mybucket/path/to/myfile.csv.
Using a file system path instructs the server to read directly from a file. The file must reside on or be accessible to the database server machine, not the client. It must be accessible to and readable by the user under which the server process runs, and the path must be specified from the viewpoint of the server.
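For example, a read of a server-local file through the external function could look like this (the path is illustrative):

```sql
-- Read a Parquet file that resides on the Hyper server's file system.
-- The path is interpreted on the server, not the client.
SELECT * FROM external('/data/products.parquet');
```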
In the case of an Amazon S3 URI, you can use the extended s3_location(...) syntax to specify credentials and, optionally, a bucket region. The specified credentials can be empty for buckets with anonymous access. If no bucket region is specified, Hyper infers the bucket region; in this case, the specified credentials must have permission for the HeadBucket S3 request.
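For illustration, a read from S3 with explicit credentials could look like the following sketch; the bucket, object key, and credential values are placeholders, and the parameter names access_key_id, secret_access_key, and region are assumed to match the s3_location syntax:

```sql
-- Read a CSV file from S3 with explicit credentials and bucket region.
SELECT * FROM external(
    s3_location(
        's3://mybucket/path/to/myfile.csv',
        access_key_id     => 'ACCESS_KEY_ID',     -- placeholder credential
        secret_access_key => 'SECRET_ACCESS_KEY', -- placeholder credential
        region            => 'us-east-1'          -- optional; inferred via HeadBucket if omitted
    ),
    format => 'csv'
);
```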
Hyper's read capabilities from Amazon S3 are highly optimized, using techniques such as concurrent requests, request hedging, and prefetching. For maximum performance, ensure high network bandwidth to Amazon S3, for example by running Hyper on an AWS EC2 instance.
To access data stored on Microsoft Azure Blob Storage, use the azure_location syntax. Hyper supports the DFS and Blob endpoints and accepts these Azure URL formats:
- abfss://container@account.dfs.core.windows.net/...
- https://account.dfs.core.windows.net/container/...
- https://account.blob.core.windows.net/container/...
- Hyper also supports the non-SSL versions of these URLs (http:// and abfs://), but it still establishes SSL-encrypted connections.
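For illustration, a read from Azure Blob Storage could look like the following sketch; the account, container, path, and token value are placeholders, and sas_token is assumed here as the name of the credential parameter:

```sql
-- Read a CSV file from Azure Blob Storage via the DFS endpoint,
-- authenticating with a shared access signature (SAS) token.
SELECT * FROM external(
    azure_location(
        'abfss://mycontainer@myaccount.dfs.core.windows.net/path/to/myfile.csv',
        sas_token => 'SAS_TOKEN'  -- placeholder SAS token
    ),
    format => 'csv'
);
```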
The ARRAY[ <source_location> [, ...] ] syntax can be used to read from a list of source locations. All source locations in the list must share the same file format. The list can contain S3 locations, local files, or any other file locations.
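For instance, a single scan could combine a server-local file with an S3 object in the same format (both locations are illustrative, and anonymous access to the bucket is assumed):

```sql
-- Scan two Parquet files, one local and one on S3, as a single source.
SELECT * FROM external(
    ARRAY[
        '/data/sales_2022.parquet',         -- server-local file
        's3://mybucket/sales_2023.parquet'  -- S3 object; anonymous access assumed
    ],
    format => 'parquet'
);
```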
To scan multiple files, don't mix different external formats in one invocation of external or in one external table. To scan data from different formats at the same time, or to scan data in the same format but with different format options, use one external function or external table per format and combine the results with UNION ALL. The following example combines two parquet files with two CSV files.
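A sketch of such a query follows; aside from 'products4.csv' and its discount column, the file names, the selected columns name and price, and the delimiter values are assumptions, and the sketch relies on Hyper inferring the CSV schemas:

```sql
-- The two parquet files share a schema: one external invocation suffices.
SELECT name, price FROM external(
    ARRAY['products1.parquet', 'products2.parquet'],
    format => 'parquet'
)
UNION ALL
-- The CSV files use different delimiters, so they are scanned separately.
SELECT name, price FROM external(
    'products3.csv',
    format => 'csv', delimiter => ';'
)
UNION ALL
-- 'products4.csv' has an extra discount column; that is fine because
-- the column is not selected.
SELECT name, price FROM external(
    'products4.csv',
    format => 'csv', delimiter => '|'
);
```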
The two parquet files can be scanned with one invocation of external. The CSV files have a different schema and different delimiters, so they're scanned separately. The example combines 'products4.csv' with the rest, even though it has an additional column discount, because this column isn't selected.