Data Import and Export
Master efficient data transfer with Tacnode's comprehensive import and export capabilities. Learn COPY commands, INSERT strategies, and performance optimization techniques.
Efficient data import and export operations are fundamental to modern database workflows. This guide covers Tacnode's data transfer capabilities, from basic operations to advanced performance optimization strategies.
Overview
Tacnode offers multiple approaches for data import and export, each optimized for different scenarios:
Import Methods:
- COPY FROM: Best for large bulk imports, highest performance, direct storage engine writes
- INSERT: Ideal for small batches and complex logic, flexible SQL syntax, transaction control
Export Methods:
- COPY TO: Best for bulk exports, supports text and CSV formats
- Streaming: Real-time sync through CDC mechanisms
Performance Comparison:
-- Illustrative timings for 1 million rows (varies by hardware and network)
-- COPY FROM: ~30 seconds
-- INSERT (single): ~15 minutes
-- INSERT (batch): ~2 minutes
COPY FROM bypasses SQL transaction processing overhead by writing directly to the storage engine, while INSERT provides greater flexibility for complex data transformations and conditional logic.
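To reproduce comparisons like the one above on your own data, psql's \timing meta-command reports elapsed time per statement. A rough sketch (file paths are hypothetical; results vary with hardware and network):
-- Enable per-statement timing in psql
\timing on
-- Bulk load via COPY
\COPY customers FROM '/path/to/customers_bulk.txt';
-- Compare against a script of batched INSERT statements
\i /path/to/insert_batches.sql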
Setup and Sample Data
Let’s establish a sample schema and data files for our demonstrations:
-- Sample tables
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
phone VARCHAR(20),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER REFERENCES customers(id),
order_number VARCHAR(50) UNIQUE NOT NULL,
total_amount DECIMAL(12,2) NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
order_date DATE DEFAULT CURRENT_DATE
);
CREATE TABLE products (
id SERIAL PRIMARY KEY,
sku VARCHAR(50) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock_quantity INTEGER DEFAULT 0
);
Sample data files:
customers.txt (tab-delimited):
1 john.doe@email.com John Doe +1-555-0101 2024-01-15 10:30:00
2 jane.smith@email.com Jane Smith +1-555-0102 2024-01-16 09:15:00
3 bob.johnson@email.com Bob Johnson \N 2024-01-17 14:20:00
products.csv (comma-delimited):
id,sku,name,price,stock_quantity
1,LAPTOP-001,"Gaming Laptop",1299.99,50
2,MOUSE-001,"Wireless Mouse",29.99,200
Use \N to represent NULL values in text files, and ensure consistent field delimiters throughout your data files.
Data Import Methods
COPY FROM - High-Performance Bulk Import
The COPY FROM command offers the best performance for bulk operations:
-- Basic syntax
COPY table_name [ ( column_name [, ...] ) ]
FROM { 'filename' | STDIN | PROGRAM 'command' }
[ [ WITH ] ( option [, ...] ) ]
-- Essential parameters
-- DELIMITER: Field separator (default: \t)
-- NULL: NULL value representation (default: \N)
-- FORMAT: Input format (text, csv, binary)
-- HEADER: Skip first row for CSV
Import Examples:
Note: psql's \COPY meta-command must be entered on a single line; longer commands below are wrapped only for readability.
-- Basic tab-delimited import (the explicit column list matches the six fields
-- in customers.txt; updated_at is filled by its default)
\COPY customers (id, email, first_name, last_name, phone, created_at) FROM '/path/to/customers.txt';
-- CSV import with headers
\COPY products FROM '/path/to/products.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ',');
-- Custom delimiter and NULL handling
\COPY orders FROM '/path/to/orders.txt'
WITH (DELIMITER '|', NULL 'N/A');
-- Handling conflicts with staging table
CREATE TEMP TABLE customers_staging (LIKE customers INCLUDING ALL);
\COPY customers_staging (id, email, first_name, last_name, phone, created_at) FROM '/path/to/customers.txt';
INSERT INTO customers
SELECT * FROM customers_staging
ON CONFLICT (email) DO UPDATE SET
first_name = EXCLUDED.first_name,
last_name = EXCLUDED.last_name,
updated_at = CURRENT_TIMESTAMP;
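Before merging staged rows, it often pays to validate the batch. A minimal sketch against the staging table above, flagging duplicate emails and discarding obviously malformed ones before the merge:
-- Duplicate emails within the staged batch (these would make the UPSERT
-- fail with "cannot affect row a second time")
SELECT email, COUNT(*)
FROM customers_staging
GROUP BY email
HAVING COUNT(*) > 1;
-- Drop rows with obviously malformed emails (rough pattern check)
DELETE FROM customers_staging
WHERE email NOT LIKE '%_@_%.__%';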
INSERT - Flexible Data Insertion
INSERT provides maximum flexibility for complex operations:
-- Single row insert
INSERT INTO customers (email, first_name, last_name)
VALUES ('alice.wilson@email.com', 'Alice', 'Wilson');
-- Multi-row batch insert
INSERT INTO products (sku, name, price, stock_quantity) VALUES
('TABLET-001', 'Android Tablet', 299.99, 30),
('SPEAKER-001', 'Bluetooth Speaker', 79.99, 100),
('HEADPHONES-001', 'Noise-Canceling Headphones', 199.99, 45);
-- Insert from query results (assumes a vip_customers table with matching columns)
INSERT INTO vip_customers (customer_id, total_spent, tier)
SELECT
c.id,
SUM(o.total_amount),
CASE
WHEN SUM(o.total_amount) > 10000 THEN 'PLATINUM'
WHEN SUM(o.total_amount) > 5000 THEN 'GOLD'
ELSE 'SILVER'
END
FROM customers c
JOIN orders o ON c.id = o.customer_id
GROUP BY c.id
HAVING SUM(o.total_amount) > 1000;
-- Conflict resolution (UPSERT)
INSERT INTO customers (email, first_name, last_name, phone)
VALUES ('john.doe@email.com', 'John', 'Doe', '+1-555-0999')
ON CONFLICT (email) DO UPDATE SET
first_name = EXCLUDED.first_name,
last_name = EXCLUDED.last_name,
phone = EXCLUDED.phone,
updated_at = CURRENT_TIMESTAMP;
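When duplicates should simply be skipped rather than merged, ON CONFLICT ... DO NOTHING leaves existing rows untouched:
-- Skip rows whose email already exists instead of updating them
INSERT INTO customers (email, first_name, last_name)
VALUES ('john.doe@email.com', 'John', 'Doe')
ON CONFLICT (email) DO NOTHING;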
Java Programming Examples
Tacnode is compatible with PostgreSQL JDBC drivers, enabling high-performance data operations through Java applications:
import java.sql.*;
import java.io.*;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;
public class TacnodeDataImportExport {
private static final String JDBC_URL = "jdbc:postgresql://tacnode-host:5432/dbname";
private static final String USERNAME = "your-username";
private static final String PASSWORD = "your-password";
// High-performance bulk import using CopyManager
public void bulkImportFromFile(String tableName, String filePath) throws Exception {
try (Connection conn = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD)) {
CopyManager copyManager = new CopyManager((BaseConnection) conn);
// Import CSV file with headers. tableName is interpolated directly, so it
// must come from trusted input (String.format gives no SQL-injection protection)
String sql = String.format("COPY %s FROM STDIN WITH (FORMAT CSV, HEADER true)", tableName);
try (FileInputStream fileInputStream = new FileInputStream(filePath)) {
long rowsImported = copyManager.copyIn(sql, fileInputStream);
System.out.println("Imported " + rowsImported + " rows successfully");
}
}
}
// Bulk export to file
public void bulkExportToFile(String query, String filePath) throws Exception {
try (Connection conn = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD)) {
CopyManager copyManager = new CopyManager((BaseConnection) conn);
// Export query results to CSV
String sql = String.format("COPY (%s) TO STDOUT WITH (FORMAT CSV, HEADER true)", query);
try (FileOutputStream fileOutputStream = new FileOutputStream(filePath)) {
long rowsExported = copyManager.copyOut(sql, fileOutputStream);
System.out.println("Exported " + rowsExported + " rows successfully");
}
}
}
// Batch insert with prepared statements
public void batchInsert() throws Exception {
try (Connection conn = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD)) {
conn.setAutoCommit(false);
String sql = "INSERT INTO customers (email, first_name, last_name, phone) VALUES (?, ?, ?, ?)";
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
// Add multiple rows to batch
pstmt.setString(1, "user1@example.com");
pstmt.setString(2, "John");
pstmt.setString(3, "Doe");
pstmt.setString(4, "+1-555-0001");
pstmt.addBatch();
pstmt.setString(1, "user2@example.com");
pstmt.setString(2, "Jane");
pstmt.setString(3, "Smith");
pstmt.setString(4, "+1-555-0002");
pstmt.addBatch();
// Execute batch
int[] updateCounts = pstmt.executeBatch();
conn.commit();
System.out.println("Batch insert completed: " + updateCounts.length + " rows affected");
}
}
}
// Import from InputStream (useful for web uploads)
public void importFromInputStream(String tableName, InputStream inputStream) throws Exception {
try (Connection conn = DriverManager.getConnection(JDBC_URL, USERNAME, PASSWORD)) {
CopyManager copyManager = new CopyManager((BaseConnection) conn);
String sql = String.format("COPY %s FROM STDIN WITH (FORMAT CSV, HEADER true)", tableName);
long rowsImported = copyManager.copyIn(sql, inputStream);
System.out.println("Imported " + rowsImported + " rows from input stream");
}
}
}
Maven Dependencies:
<dependencies>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.6.0</version>
</dependency>
</dependencies>
Data Export Methods
COPY TO - High-Performance Bulk Export
-- Basic syntax
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
TO { 'filename' | STDOUT | PROGRAM 'command' }
[ [ WITH ] ( option [, ...] ) ]
-- Export examples
-- Complete table export
\COPY customers TO '/exports/customers_backup.txt';
-- Export to CSV format
\COPY customers TO '/exports/customers.csv'
WITH (FORMAT CSV, HEADER true, DELIMITER ',');
-- Export specific columns
\COPY customers (id, email, first_name, last_name)
TO '/exports/customer_contacts.csv'
WITH (FORMAT CSV, HEADER true);
-- Conditional data export
\COPY (
SELECT
o.id,
o.order_number,
c.email,
o.total_amount,
o.order_date
FROM orders o
JOIN customers c ON o.customer_id = c.id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY o.order_date DESC
) TO '/exports/recent_orders.csv'
WITH (FORMAT CSV, HEADER true);
-- Export with custom delimiter and NULL marker (text format;
-- the QUOTE option is only valid with FORMAT csv)
\COPY products TO '/exports/products.txt'
WITH (DELIMITER '|', NULL 'N/A');
-- Export to a compressed file (PROGRAM runs on the database server,
-- which typically requires elevated server-side privileges)
COPY customers TO PROGRAM 'gzip > /exports/customers.csv.gz'
WITH (FORMAT CSV, HEADER true);
Streaming Export for Real-Time Sync
For real-time data synchronization, Tacnode supports streaming export through CDC mechanisms.
-- Incremental export based on timestamps
\COPY (
SELECT *
FROM customers
WHERE updated_at > '2024-01-01 00:00:00'::timestamp
ORDER BY updated_at
) TO '/exports/incremental_customers.csv'
WITH (FORMAT CSV, HEADER true);
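To make incremental exports repeatable, a common pattern is to persist a high-water mark and advance it after each successful run. A sketch (the export_watermarks table is hypothetical):
-- Hypothetical watermark table: one row per export target
CREATE TABLE IF NOT EXISTS export_watermarks (
target_name TEXT PRIMARY KEY,
last_exported_at TIMESTAMP NOT NULL
);
-- After a successful export, advance the watermark
INSERT INTO export_watermarks (target_name, last_exported_at)
VALUES ('customers_csv', CURRENT_TIMESTAMP)
ON CONFLICT (target_name) DO UPDATE
SET last_exported_at = EXCLUDED.last_exported_at;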
For comprehensive streaming export capabilities, refer to the Change Data Capture (CDC) documentation.
Best Practices
Import Guidelines:
- Use COPY FROM for bulk imports (>10,000 rows)
- Use INSERT for smaller datasets or complex logic
- Consider hybrid approaches for very large datasets
- Validate data before import and use staging tables for transformations
Export Guidelines:
- Use indexes effectively for filtered exports
- Limit result sets appropriately
- Consider parallel exports for large tables (see the sketch after this list)
- Use compression for large exports
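One way to parallelize a large export is to split the table on its primary key and run one \COPY per psql session. A sketch (assumes a dense numeric id; the ranges are illustrative):
-- Session 1
\COPY (SELECT * FROM orders WHERE id BETWEEN 1 AND 1000000) TO '/exports/orders_part1.csv' WITH (FORMAT CSV, HEADER true)
-- Session 2
\COPY (SELECT * FROM orders WHERE id BETWEEN 1000001 AND 2000000) TO '/exports/orders_part2.csv' WITH (FORMAT CSV, HEADER true)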
Java Development Best Practices:
- Use connection pooling for production applications
- Implement proper error handling and logging
- Validate data before bulk operations
- Use transactions for data integrity
- Monitor memory usage for large datasets
// Example with error handling and connection pooling
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.SQLException;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
public class TacnodeConnectionManager {
private HikariDataSource dataSource;
public TacnodeConnectionManager() {
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://tacnode-host:5432/dbname");
config.setUsername("your-username");
config.setPassword("your-password");
config.setMaximumPoolSize(20);
config.setMinimumIdle(5);
config.setConnectionTimeout(30000);
config.setIdleTimeout(600000);
config.setMaxLifetime(1800000);
this.dataSource = new HikariDataSource(config);
}
public void safeImport(String tableName, String filePath) {
try (Connection conn = dataSource.getConnection()) {
// Validate file exists and is readable
File file = new File(filePath);
if (!file.exists() || !file.canRead()) {
throw new IOException("File not accessible: " + filePath);
}
CopyManager copyManager = new CopyManager((BaseConnection) conn);
String sql = String.format("COPY %s FROM STDIN WITH (FORMAT CSV, HEADER true)", tableName);
try (FileInputStream fis = new FileInputStream(file)) {
long rows = copyManager.copyIn(sql, fis);
System.out.println("Successfully imported " + rows + " rows");
}
} catch (SQLException e) {
System.err.println("Database error: " + e.getMessage());
// Log error details, potentially retry
} catch (IOException e) {
System.err.println("File I/O error: " + e.getMessage());
} catch (Exception e) {
System.err.println("Unexpected error: " + e.getMessage());
}
}
public void close() {
if (dataSource != null) {
dataSource.close();
}
}
}
Security:
- Use dedicated database users for import/export operations
- Implement least-privilege access principles (see the sketch after this list)
- Encrypt sensitive data in export files
- Use secure file transfer methods
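A minimal sketch of a least-privilege role for import/export work (role name and grants are illustrative; adjust to your schema):
-- Dedicated ETL role that can read and load data but not alter schemas
CREATE ROLE etl_user LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA public TO etl_user;
GRANT SELECT, INSERT ON customers, orders, products TO etl_user;
-- SERIAL columns need sequence access for INSERTs
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO etl_user;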
For related topics, see the Change Data Capture (CDC) documentation for streaming export and real-time synchronization.