Data Import and Export
Tacnode facilitates efficient data import and export within a table. For importing large datasets, the INSERT
or COPY FROM
methods are typically used. The INSERT
operation is suitable for smaller datasets or when incrementally adding to an existing table. In contrast, the COPY FROM
command is ideal for large initial imports, as it reads data directly from a file and writes it into the storage engine's underlying file format, reducing the overhead associated with SQL transaction processing linked with INSERT
. This significantly enhances the import experience. Additionally, the COPY TO
command allows for batch exporting of a table or query results to a file for subsequent imports or processing.
For example, consider the following table for students:
We have a prepared file (stu.txt) containing data conforming to the above table schema. Its content is:
In this file, fields in each row are separated by a single tab character \t
, and an empty field is indicated by \N
.
Data Import
COPY FROM
Once connected to Tacnode's default PostgreSQL database using psql, run these commands to import data and view the results:
The syntax for the detailed batch import command COPY FROM
is as follows:
Supported options include:
DELIMITER 'delimiter_character'
: Specifies the column separator for each data row in the file, defaulting to the tab character\t
. Since Tacnode currently only supports text format file input, it must be escaped if this delimiter appears in normal field content. Future versions will support CSV and binary format file input.NULL 'null_string'
: Indicates a string representing a NULL value, defaulting to the two-character text\N
.ON CONFLICT DO NOTHING
: If a primary key from the copied file already exists in the existing data, that row will be ignored.
For additional syntax for COPY
, refer to COPY
.
Data can be imported via the JDBC driver using the CopyManager in PostgreSQL.
INSERT
The INSERT
statement adds new records to a table. You can insert individual rows or multiple rows simultaneously. The syntax is as follows:
For additional syntax for INSERT
, refer to INSERT
.
The test table's definition is outlined below. Unless otherwise stated, the above table definition will be used for testing.
- Clearly define the column names, field, and value columns in a one-to-one correspondence. You may provide only a selection of column names and their corresponding values, while the values for the other columns will default to their standard values.
Example:
- If you don't specify column names, you must provide values for all fields in
VALUES
.
Example:
- Batch insertion allows you to add several records simultaneously.
Example:
- Utilize query results as input for the
INSERT INTO
statement.
Example:
- When a primary key conflicts during data insertion, you can use the
INSERT INTO ON CONFLICT
syntax to determine the desired action. There are two mainCONFLICT_ACTION
options:
- NOTHING: When a primary key conflict occurs, no action is taken, and the conflicting data to be inserted is discarded.
- UPDATE: The specified column is updated after a primary key conflict (Note: While technically permissible, updating primary key columns is not recommended).
It is advisable to implement ON CONFLICT
based on the primary key or to set up a unique index; otherwise, a full table scan may be initiated, which can diminish performance.
Example:
Performance Optimization
For large volumes of data, batch submission statements allow for the simultaneous submission of multiple SQL statements to Tacnode, enhancing import performance through batching. Here’s an example:
For incremental updates to a large data table, such as updating a value of 3 million in a base table of 3 billion, you cannot use COPY
with ON CONFLICT DO UPDATE
. Instead, use COPY FROM
to a temporary table followed by INSERT INTO SELECT FROM (temporary table) ON CONFLICT
for optimal import performance. Here are the steps:
Data Export
COPY TO
The syntax for the COPY TO
command is detailed below:
Options available with WITH
include:
- DELIMITER 'delimiter_character'
- NULL 'null_string'
These have the same meaning as those in COPY FROM
, as shown in the example:
Example:
- To export all student tables, run this command:
- To export the first 200 students, run this command:
You can use the CopyManager with PostgreSQL to export data through the JDBC driver.
Streaming Data Export
Tacnode enables data export through the streaming replication protocol; for more details, refer to Change Data Capture.