Tacnode facilitates efficient data import and export within a table. For importing large datasets, the INSERT or COPY FROM methods are typically used. The INSERT operation is suitable for smaller datasets or when incrementally adding to an existing table. In contrast, the COPY FROM command is ideal for large initial imports, as it reads data directly from a file and writes it into the storage engine's underlying file format, reducing the overhead associated with SQL transaction processing linked with INSERT. This significantly enhances the import experience. Additionally, the COPY TO command allows for batch exporting of a table or query results to a file for subsequent imports or processing.
For example, consider the following table for students:
We have a prepared file (stu.txt) containing data conforming to the above table schema. Its content is:
In this file, fields in each row are separated by a single tab character \t, and an empty field is indicated by \N.
Once connected to Tacnode's default PostgreSQL database using psql, run these commands to import data and view the results:
The syntax for the detailed batch import command COPY FROM is as follows:
Supported options include:
DELIMITER 'delimiter_character': Specifies the column separator for each data row in the file, defaulting to the tab character \t. Since Tacnode currently only supports text format file input, it must be escaped if this delimiter appears in normal field content. Future versions will support CSV and binary format file input.
NULL 'null_string': Indicates a string representing a NULL value, defaulting to the two-character text \N.
ON CONFLICT DO NOTHING: If a primary key from the copied file already exists in the existing data, that row will be ignored.
The INSERT statement adds new records to a table. You can insert individual rows or multiple rows simultaneously. The syntax is as follows:
For additional syntax for INSERT, refer to INSERT.
The test table's definition is outlined below. Unless otherwise stated, the above table definition will be used for testing.
Clearly define the column names, field, and value columns in a one-to-one correspondence. You may provide only a selection of column names and their corresponding values, while the values for the other columns will default to their standard values.
Example:
If you don't specify column names, you must provide values for all fields in VALUES.
Example:
Batch insertion allows you to add several records simultaneously.
Example:
Utilize query results as input for the INSERT INTO statement.
Example:
When a primary key conflicts during data insertion, you can use the INSERT INTO ON CONFLICT syntax to determine the desired action. There are two main CONFLICT_ACTION options:
NOTHING: When a primary key conflict occurs, no action is taken, and the conflicting data to be inserted is discarded.
UPDATE: The specified column is updated after a primary key conflict (Note: While technically permissible, updating primary key columns is not recommended).
It is advisable to implement ON CONFLICT based on the primary key or to set up a unique index; otherwise, a full table scan may be initiated, which can diminish performance.
For large volumes of data, batch submission statements allow for the simultaneous submission of multiple SQL statements to Tacnode, enhancing import performance through batching. Here’s an example:
For incremental updates to a large data table, such as updating a value of 3 million in a base table of 3 billion, you cannot use COPY with ON CONFLICT DO UPDATE. Instead, use COPY FROM to a temporary table followed by INSERT INTO SELECT FROM (temporary table) ON CONFLICT for optimal import performance. Here are the steps: