GuidesData Sync

Example - Sync from PostgreSQL to Tacnode

This guide provides detailed instructions for setting up data synchronization from PostgreSQL to Tacnode, including prerequisites, permissions, and configuration requirements.

Prerequisites

Network Configuration

Ensure network connectivity is properly configured. For detailed network setup instructions, see Data Sync Network Configuration.

Required Permissions

Permission requirements depend on the type of synchronization job:

Full Synchronization Only

For jobs that only perform full data synchronization, grant the following permissions to your synchronization user:

-- Grant SELECT permissions on all tables in specified schemas
postgres=# GRANT SELECT ON ALL TABLES IN SCHEMA ${schema1}, ${schema2} TO ${user};
 
-- Grant USAGE permissions on schemas
postgres=# GRANT USAGE ON SCHEMA ${schema1}, ${schema2} TO ${user};

Where:

  • ${schema1}, ${schema2}: Schema names that require synchronization
  • ${user}: The username for the synchronization job

Full + Incremental Synchronization

Incremental synchronization requires a dedicated user with replication permissions. Cloud providers typically don't allow granting replication permissions to existing users, so you must create a new user.

1. Connect with high-privilege account:

psql postgres://${admin_user}:${admin_password}@${host}:5432/${database_name}

2. Create CDC synchronization user:

CREATE USER ${cdc_user} WITH REPLICATION ENCRYPTED PASSWORD '${cdc_password}';

3. Grant necessary permissions to the CDC user:

-- Grant SELECT permissions on all tables
postgres=# GRANT SELECT ON ALL TABLES IN SCHEMA ${schema1}, ${schema2} TO ${cdc_user};
 
-- Grant USAGE permissions on schemas
postgres=# GRANT USAGE ON SCHEMA ${schema1}, ${schema2} TO ${cdc_user};

Incremental Synchronization Setup

Incremental synchronization requires additional PostgreSQL configuration:

1. Configure WAL Level

Set the Write-Ahead Logging (WAL) level to support logical replication:

Self-hosted PostgreSQL:

  1. Edit postgresql.conf:
    wal_level = logical
  2. Restart the PostgreSQL service

Cloud PostgreSQL (AWS RDS, GCP Cloud SQL, Azure Database):

  1. Access your cloud provider's database console
  2. Modify the database parameter group
  3. Set wal_level = logical
  4. Apply changes and restart the database instance

2. Create Publication

Publications define which tables are available for replication.

Connect with replication privileges:

psql -h ${host} -p 5432 -U ${admin_user} "dbname=${database_name} replication=database" -W

Create publication for all tables:

CREATE PUBLICATION datasync_pub FOR ALL TABLES;

Or create publication for specific tables:

CREATE PUBLICATION datasync_pub FOR TABLE ${schema1}.${table1}, ${schema2}.${table2};

Verify publication:

-- List all publications
SELECT * FROM pg_publication;
 
-- View tables included in publication
SELECT * FROM pg_publication_tables WHERE pubname = 'datasync_pub';

Data Source Configuration

Configure the PostgreSQL connection using these required parameters:

ParameterDescriptionExample
HostPostgreSQL server addresspostgres.example.com
PortPostgreSQL service port5432 (default)
DatabaseTarget database nameproduction_db
UsernameUser with required permissionssync_user or cdc_user
PasswordPassword for the specified usersecure_password

Connection String Format

For programmatic access, you can use connection strings:

postgresql://${username}:${password}@${host}:${port}/${database}

Default Object Mapping

Tacnode uses the following default mapping rule for PostgreSQL objects:

concat(SOURCE_DATABASE, '.', SOURCE_SCHEMA, '.', SOURCE_OBJECT)

Mapping Logic:

  • Source: database.schema.table (PostgreSQL format)
  • Target: database.schema.table (Tacnode format, preserving schema structure)

Example:

  • PostgreSQL table: ecommerce.public.users
  • Tacnode table: ecommerce.public.users

Data Type Mapping

PostgreSQL types map directly to Tacnode types in most cases. Array types are indicated with a leading underscore (_).

Integer Types

PostgreSQL TypeTacnode TypeDescription
INT2INT216-bit integer
_INT2_INT2Array of 16-bit integers
INT4INT432-bit integer
_INT4_INT4Array of 32-bit integers
INT8INT864-bit integer
_INT8_INT8Array of 64-bit integers

Character Types

PostgreSQL TypeTacnode TypeDescription
CHARCHARFixed-length character
_CHAR_CHARArray of fixed-length characters
BPCHARBPCHARBlank-padded character
_BPCHAR_BPCHARArray of blank-padded characters
VARCHARVARCHARVariable-length character
_VARCHAR_VARCHARArray of variable-length characters
TEXTTEXTVariable-length text
_TEXT_TEXTArray of text values

Numeric Types

PostgreSQL TypeTacnode TypeDescription
NUMERICNUMERICExact numeric
_NUMERIC_NUMERICArray of numeric values
FLOAT4FLOAT4Single precision floating point
_FLOAT4_FLOAT4Array of single precision floats
FLOAT8FLOAT8Double precision floating point
_FLOAT8_FLOAT8Array of double precision floats

Boolean Type

PostgreSQL TypeTacnode TypeDescription
BOOLBOOLBoolean value
_BOOL_BOOLArray of boolean values

Date and Time Types

PostgreSQL TypeTacnode TypeDescription
DATEDATEDate without time
_DATE_DATEArray of dates
TIMETIMETime without timezone
_TIME_TIMEArray of times
TIMETZTIMETZTime with timezone
_TIMETZ_TIMETZArray of times with timezone
TIMESTAMPTIMESTAMPDate and time without timezone
_TIMESTAMP_TIMESTAMPArray of timestamps
TIMESTAMPTZTIMESTAMPTZDate and time with timezone
_TIMESTAMPTZ_TIMESTAMPTZArray of timestamps with timezone
INTERVALINTERVALTime interval
_INTERVAL_INTERVALArray of intervals

Binary and Bit Types

PostgreSQL TypeTacnode TypeDescription
BYTEABYTEABinary data
_BYTEA_BYTEAArray of binary data
BITBITFixed-length bit string
_BIT_BITArray of bit strings
VARBITVARBITVariable-length bit string
_VARBIT_VARBITArray of variable-length bit strings

Special Types

PostgreSQL TypeTacnode TypeDescription
UUIDUUIDUniversally unique identifier
_UUID_UUIDArray of UUIDs
JSONJSONJSON data
_JSON_JSONArray of JSON values
JSONBJSONBBinary JSON data
_JSONB_JSONBArray of binary JSON values
XMLXMLXML data
_XML_XMLArray of XML values
NAMENAMEPostgreSQL name type
NAME_ARRAYNAME_ARRAYArray of names

Best Practices

Security

  1. Dedicated User: Use a dedicated user for synchronization with minimal required permissions
  2. Strong Passwords: Use strong, unique passwords for synchronization users
  3. Network Security: Use SSL/TLS encryption for connections when possible

Performance

  1. Publication Scope: Create publications for specific tables rather than all tables when possible
  2. Monitor WAL: Monitor WAL disk usage, especially during extended synchronization jobs
  3. Connection Pooling: Consider connection pooling for high-throughput scenarios

Maintenance

  1. Regular Monitoring: Monitor replication lag and synchronization health
  2. Publication Management: Keep publications up to date as schema changes occur
  3. User Management: Regularly review and rotate synchronization user credentials

Next Steps

After configuring your PostgreSQL data source:

  1. Test Connection: Verify connectivity using the Test Connection feature
  2. Select Objects: Choose which schemas and tables to synchronize
  3. Configure Mapping: Customize object and column mappings if needed
  4. Set Policies: Configure synchronization behavior and conflict resolution
  5. Monitor Job: Track synchronization progress and performance

For detailed steps on these configuration phases, refer to the Data Sync Quick Start guide.