File-based Foreign Table

Cloud platforms typically support general object storage systems, offering scalable file read and write capabilities. These platforms are commonly used for data storage in lake scenarios. This article introduces the configuration method for file-based external tables.

Install the Extension

Install the extension once. If it is already installed, you can skip this step.

CREATE EXTENSION file_fdw;

Create a Foreign Server for File Storage

Create a Foreign Server for AWS S3

CREATE SERVER <server_name> FOREIGN DATA WRAPPER file_fdw
OPTIONS (
 REGION 'region-id'
);

REGION: The region ID of the cloud storage service. For specific IDs, see AWS S3 Endpoint.

-- example
CREATE SERVER s3_server FOREIGN DATA WRAPPER file_fdw
OPTIONS (
 REGION 'us-west-1'
);

Create a User Mapping

Use 'access_id' and 'access_key' to grant the local user account access rights to the designated object storage.

CREATE USER MAPPING FOR <local_user> SERVER s3_server
OPTIONS (
 ACCESS_ID 'access-id',
 ACCESS_KEY 'access-key'
);

Create a Foreign Table

CREATE FOREIGN TABLE <table_name> (
    <column_name> <data_type>,
     ...
    )
SERVER <server_name>
OPTIONS (
   DIR '<path_to_s3>',
   FORMAT '<parquet|orc>'
);

DIR: The directory path for the specified external object storage in OPTIONS.
The specified file format in OPTIONS, and FORMAT can be parquet, or orc.

-- create parquet table
CREATE FOREIGN TABLE f_t1(
  col1 INT,
  col2 TEXT  -- Manually define the schema
)
SERVER s3_server
OPTIONS
(
 FORMAT 'parquet',  -- OR orc
 DIR 's3://demo/data/'
);
 
-- Import into Tacnode managed table from foreign table
INSERT INTO local_t1 SELECT * FROM f_t1;