# Data Sync Capacity Planning
Learn how to plan and select appropriate DataSync specifications based on data volume, sync frequency, and performance requirements to ensure efficient and cost-effective data synchronization.
Before starting data synchronization tasks, proper capacity planning is crucial to ensure that sync operations run efficiently and reliably. Different synchronization tasks have different resource requirements, and selecting the right specification both guarantees performance and keeps costs under control. By understanding the performance limits of each specification, you can choose the most suitable configuration for your actual data volume, sync frequency, and performance requirements, avoiding both wasted resources and performance bottlenecks.
Tacnode DataSync defines three specifications based on data synchronization pipeline performance limits: Small, Medium, and Large. The performance limits for each DataSync specification are shown in the table below (RPS = Records Per Second):
| DataSync Specification | Performance Limit (RPS) Reference |
|---|---|
| Small | 2,000 |
| Medium | 5,000 |
| Large | 7,000 |
**Note:** Actual production performance is influenced by factors such as the network environment, source instance performance, target instance performance, and latency, so real-world throughput may differ from these reference limits. When evaluating, make sure the source instance can sustain a read load at or above the specification's performance limit, and that the target instance's write performance is not a bottleneck.
## Specification Selection Guidelines
### Small Specification (2,000 RPS)

**Ideal for:**
- Small to medium-sized databases (< 100GB)
- Low-frequency batch synchronization
- Development and testing environments
- Periodic data migrations with relaxed latency requirements
**Use Cases:**
- Daily batch synchronization of transactional data
- Synchronizing lookup tables and reference data
- One-time migration of historical data
- Development environment data refresh
**Resource Characteristics:**
- CPU: 2 cores
- Memory: 4GB
- Network: Standard bandwidth
- Concurrent connections: Limited
### Medium Specification (5,000 RPS)

**Ideal for:**
- Medium to large-sized databases (100GB - 1TB)
- Real-time or near real-time synchronization
- Production environments with moderate load
- CDC (Change Data Capture) scenarios
**Use Cases:**
- Real-time synchronization for reporting databases
- Cross-region data replication
- ETL pipelines with moderate throughput requirements
- Multi-source data consolidation
**Resource Characteristics:**
- CPU: 4 cores
- Memory: 8GB
- Network: Enhanced bandwidth
- Concurrent connections: Moderate
### Large Specification (7,000 RPS)

**Ideal for:**
- Large-scale databases (> 1TB)
- High-throughput real-time synchronization
- Mission-critical production environments
- Complex multi-table synchronization with high transaction volumes
**Use Cases:**
- High-volume transactional system replication
- Real-time analytics and data warehousing
- Cross-cloud data synchronization
- Large-scale database migrations
**Resource Characteristics:**
- CPU: 8+ cores
- Memory: 16GB+
- Network: Premium bandwidth
- Concurrent connections: High
## Performance Factors and Optimization

### Source Database Considerations

**Database Performance Impact:**
- CPU Utilization: High CPU usage on source database can limit read performance
- Memory Available: Insufficient memory may cause slower query execution
- Storage I/O: Disk performance directly affects data retrieval speed
- Connection Pool: Limited connections can become a bottleneck
**Optimization Strategies:**

```sql
-- Monitor source database performance
SELECT
    datname,
    numbackends,
    xact_commit,
    xact_rollback,
    blks_read,
    blks_hit,
    tup_returned,
    tup_fetched
FROM pg_stat_database
WHERE datname = 'your_source_db';

-- Check for long-running queries that might impact sync
SELECT
    pid,
    state,
    query_start,
    query,
    wait_event_type,
    wait_event
FROM pg_stat_activity
WHERE state = 'active'
  AND query_start < NOW() - INTERVAL '5 minutes';
```
### Target Database Considerations

**Write Performance Factors:**
- Insert/Update throughput: Target database must handle incoming write operations
- Index maintenance: Extensive indexing can slow down write operations
- Transaction log processing: WAL write performance impacts overall throughput
- Lock contention: Concurrent access patterns affect write performance
**Optimization Strategies:**

```sql
-- Monitor target database write performance
-- (pg_stat_user_tables exposes the table name as relname)
SELECT
    schemaname,
    relname,
    n_tup_ins,
    n_tup_upd,
    n_tup_del,
    n_tup_hot_upd
FROM pg_stat_user_tables
ORDER BY (n_tup_ins + n_tup_upd + n_tup_del) DESC;

-- Check for write bottlenecks
SELECT
    wait_event_type,
    wait_event,
    COUNT(*) AS wait_count
FROM pg_stat_activity
WHERE state = 'active'
GROUP BY wait_event_type, wait_event
ORDER BY wait_count DESC;
```
### Network and Connectivity

**Network Performance Factors:**
- Bandwidth: Available network capacity between source and target
- Latency: Round-trip time affects synchronization speed
- Packet Loss: Network instability can cause retransmissions
- Security Overhead: SSL/TLS encryption adds processing overhead
**Monitoring Network Performance:**

```bash
# Test network latency
ping -c 10 target-database-host

# Test network bandwidth
iperf3 -c target-database-host -t 30

# Monitor network interface statistics
netstat -i

# Monitor system I/O utilization
iostat -x 1
```
## Sizing Calculator and Estimation

### Data Volume Assessment

**Calculate Daily Data Growth:**
```sql
-- Estimate change volume (these counters are cumulative since the last
-- statistics reset; sample them twice and take the difference for a daily rate)
SELECT
    schemaname,
    relname,
    n_tup_ins AS inserts,
    n_tup_upd AS updates,
    n_tup_del AS deletes,
    (n_tup_ins + n_tup_upd + n_tup_del) AS total_changes
FROM pg_stat_user_tables
WHERE schemaname = 'your_schema'
ORDER BY total_changes DESC;

-- Calculate average row size
SELECT
    schemaname,
    relname,
    pg_total_relation_size(schemaname||'.'||relname) AS table_size_bytes,
    n_live_tup AS row_count,
    pg_total_relation_size(schemaname||'.'||relname) / n_live_tup AS avg_row_size_bytes
FROM pg_stat_user_tables
WHERE schemaname = 'your_schema'
  AND n_live_tup > 0;
```
**Capacity Planning Formula:**

```text
Required RPS = (Peak Daily Changes × Safety Factor) / (24 × 3600)
```

Where:
- Peak Daily Changes = maximum daily insert/update/delete operations
- Safety Factor = 1.5–3.0 (depending on growth expectations)
- Compare the result against the specification limits in the table above
### Specification Selection Matrix
| Data Characteristics | Recommended Specification | Rationale |
|---|---|---|
| < 1,000 RPS required | Small | Cost-effective for low-volume scenarios |
| 1,000 - 3,000 RPS required | Medium | Balanced performance and cost |
| 3,000 - 5,000 RPS required | Medium (with monitoring) | Close monitoring recommended |
| 5,000 - 7,000 RPS required | Large | Optimal performance zone |
| > 7,000 RPS required | Large (with optimization) | May require additional tuning |
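The formula and the matrix above can be combined into a small sizing helper. This is an illustrative sketch, not part of DataSync itself; the `safety_factor` default of 2.0 is simply the midpoint of the suggested 1.5–3.0 range.

```python
def required_rps(peak_daily_changes: int, safety_factor: float = 2.0) -> float:
    """Average records/second needed, per the capacity planning formula."""
    return (peak_daily_changes * safety_factor) / (24 * 3600)

def recommend_spec(rps: float) -> str:
    """Map a required RPS onto the specification selection matrix."""
    if rps < 1_000:
        return "Small"
    if rps < 3_000:
        return "Medium"
    if rps < 5_000:
        return "Medium (with monitoring)"
    if rps <= 7_000:
        return "Large"
    return "Large (with optimization)"

rps = required_rps(50_000_000)          # 50M changes/day
print(round(rps), recommend_spec(rps))  # ~1157 RPS -> Medium
```

For example, 50 million daily changes with a 2.0 safety factor works out to roughly 1,157 RPS, which lands in the Medium tier.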
## Monitoring and Performance Tuning

### Key Metrics to Monitor

**DataSync Performance Metrics:**
- Synchronization Lag: Time delay between source and target
- Records Per Second: Actual throughput achieved
- Error Rate: Percentage of failed synchronization attempts
- Memory Usage: DataSync process memory consumption
- CPU Utilization: Processing overhead
**Database-Specific Metrics:**

```sql
-- Monitor replication lag (for CDC scenarios)
SELECT
    slot_name,
    active,
    restart_lsn,
    confirmed_flush_lsn,
    pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots;

-- Check table-level synchronization status
SELECT
    schemaname,
    relname,
    last_analyze,
    last_autoanalyze,
    n_mod_since_analyze
FROM pg_stat_user_tables
WHERE n_mod_since_analyze > 1000;
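Building on the `lag_bytes` value from the replication-slot query, alerting logic can be as simple as a threshold classifier. The thresholds below are placeholders for illustration, not DataSync defaults; tune them to your freshness requirements.

```python
def lag_severity(lag_bytes: int,
                 warn_bytes: int = 64 * 1024 * 1024,    # 64 MB, illustrative
                 crit_bytes: int = 512 * 1024 * 1024    # 512 MB, illustrative
                 ) -> str:
    """Classify replication lag into ok / warning / critical."""
    if lag_bytes >= crit_bytes:
        return "critical"
    if lag_bytes >= warn_bytes:
        return "warning"
    return "ok"
```

Polling this periodically and alerting on sustained "warning" or any "critical" readings catches growing sync lag before it becomes a backlog.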
### Performance Optimization Strategies

**Source Database Optimization:**
- Index Optimization: Ensure proper indexing for sync queries
- Query Tuning: Optimize DataSync extraction queries
- Connection Pooling: Use connection pooling to reduce overhead
- Batch Size Tuning: Adjust batch sizes for optimal throughput
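As an illustration of batch size tuning, records can be grouped into fixed-size batches before writing, so the batch size becomes a single tunable knob. This is a generic sketch, not DataSync's internal implementation:

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

def batched(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive fixed-size batches; the last batch may be smaller."""
    it = iter(records)
    while batch := list(islice(it, batch_size)):
        yield batch

# Example: 2,500 records written in batches of 1,000 -> 3 round trips,
# with the final batch holding the remaining 500 records.
batches = list(batched(range(2500), 1000))
print(len(batches), len(batches[-1]))  # 3 500
```

Larger batches amortize per-round-trip overhead but increase memory use and lock hold times on the target, so measure throughput at several batch sizes rather than assuming bigger is better.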
**Target Database Optimization:**
- Bulk Loading: Use bulk insert operations when possible
- Constraint Checking: Temporarily disable non-critical constraints during large loads
- Parallel Processing: Configure parallel workers for write operations
- Maintenance Windows: Schedule intensive operations during low-traffic periods
**Network Optimization:**
- Compression: Enable data compression for network transfer
- Connection Reuse: Maintain persistent connections
- Regional Placement: Place DataSync instances close to data sources
- SSL Optimization: Use efficient SSL/TLS configurations
## Scaling and Upgrade Considerations

### When to Scale Up

**Indicators for Specification Upgrade:**
- Consistently hitting RPS limits
- Increasing synchronization lag
- High CPU or memory utilization (> 80%)
- Growing error rates
- Business requirements for faster synchronization
### Scaling Strategies

**Vertical Scaling (Specification Upgrade):**

```text
Small → Medium → Large
```

**Horizontal Scaling (Multiple DataSync Instances):**
- Partition data by schema or table
- Use multiple DataSync jobs for different data sets
- Implement round-robin or hash-based distribution
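The hash-based distribution mentioned above can be sketched as follows; the job count and table names are illustrative, and the point is that the assignment is deterministic, so a table always lands in the same DataSync job across restarts:

```python
import hashlib
from collections import defaultdict

def assign_job(table: str, num_jobs: int) -> int:
    """Deterministically map a table to one of num_jobs DataSync jobs."""
    digest = hashlib.md5(table.encode()).hexdigest()
    return int(digest, 16) % num_jobs

tables = ["orders", "order_items", "customers", "payments", "inventory"]
jobs = defaultdict(list)
for t in tables:
    jobs[assign_job(t, 3)].append(t)
# Every table lands in exactly one job; the split is stable across runs.
```

Note that changing `num_jobs` reshuffles most assignments, so plan job-count changes as a coordinated re-partitioning rather than a rolling tweak.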
**Hybrid Approaches:**
- Combine different specifications for different data types
- Use Large specifications for high-volume tables
- Use Small specifications for infrequent reference data
## Cost Optimization

### Cost-Performance Analysis

**Specification Cost Comparison:**
- Small: Lowest cost, suitable for basic requirements
- Medium: Balanced cost-performance ratio
- Large: Highest cost, maximum performance
**Cost Optimization Strategies:**
- Right-sizing: Avoid over-provisioning for current needs
- Scheduling: Use lower specifications during off-peak hours
- Data Filtering: Sync only necessary data to reduce volume
- Incremental Sync: Prefer incremental over full synchronization
### Resource Efficiency Tips

**Minimize Data Transfer:**
```sql
-- Use column filtering to reduce bandwidth
SELECT essential_column1, essential_column2
FROM large_table
WHERE last_modified > :last_sync_time;

-- Implement efficient change detection
CREATE INDEX idx_last_modified ON large_table (last_modified);
```
**Optimize Sync Frequency:**
- Balance between data freshness requirements and resource costs
- Use different sync frequencies for different data types
- Implement event-driven synchronization where possible
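One way to express per-dataset sync frequencies is a simple schedule table; the dataset names and intervals below are illustrative placeholders:

```python
from datetime import timedelta

# Illustrative tiers: freshness requirement drives sync interval.
SYNC_SCHEDULE = {
    "transactions": timedelta(minutes=1),    # near real-time
    "inventory": timedelta(minutes=15),
    "reference_data": timedelta(hours=24),   # slow-changing lookups
}

def due_for_sync(dataset: str, seconds_since_last: float) -> bool:
    """True once the dataset's configured interval has elapsed."""
    return seconds_since_last >= SYNC_SCHEDULE[dataset].total_seconds()
```

Keeping the schedule in data rather than code makes it easy to review each dataset's freshness/cost trade-off in one place.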
## Troubleshooting Performance Issues

### Common Performance Problems

**1. High Latency Issues:**
```sql
-- Find blocked sessions and the sessions blocking them (PostgreSQL 9.6+);
-- pg_blocking_pids avoids the error-prone manual join on pg_locks
SELECT
    blocked.pid       AS blocked_pid,
    blocked.usename   AS blocked_user,
    blocking.pid      AS blocking_pid,
    blocking.usename  AS blocking_user,
    blocked.query     AS blocked_statement,
    blocking.query    AS blocking_statement
FROM pg_stat_activity blocked
JOIN pg_stat_activity blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid));
```
**2. Memory Issues:**

```sql
-- Inspect shared_buffers (the raw setting is in 8 kB pages)
SELECT
    setting AS shared_buffers_pages,
    (setting::bigint * 8192) / 1024 / 1024 AS shared_buffers_mb
FROM pg_settings
WHERE name = 'shared_buffers';
```
**3. Connection Pool Exhaustion:**

```sql
-- Check connection usage
SELECT
    count(*) AS total_connections,
    count(*) FILTER (WHERE state = 'active') AS active_connections,
    count(*) FILTER (WHERE state = 'idle') AS idle_connections
FROM pg_stat_activity;
```
### Resolution Strategies

**Performance Tuning Checklist:**
- Verify specification selection matches workload
- Check database statistics are up to date
- Monitor system resources (CPU, memory, I/O)
- Analyze slow queries and optimize
- Review network performance and connectivity
- Validate DataSync configuration parameters
- Consider data partitioning strategies
- Implement proper monitoring and alerting
By following these capacity planning guidelines, you can ensure that your Tacnode DataSync implementation delivers optimal performance while maintaining cost efficiency and reliability.