Data Sync Job Operations

Once a synchronization job is configured and created, you can manage it through various operations on the Tacnode platform. This guide covers the complete job lifecycle and monitoring capabilities.

Job Lifecycle Management

Data Sync jobs support the following operations:

  • Start: Initialize and begin the synchronization process
  • Pause: Temporarily halt synchronization while preserving state
  • Resume: Continue synchronization from where it was paused
  • Restart: Reset job state and begin synchronization from the beginning
  • Delete: Remove the job and all associated data

Job States

Understanding job states helps you monitor and manage synchronization effectively:

Initializing

  • Description: Job is starting up and establishing connections
  • Available Actions: Pause
  • Next State: Running (on success) or Failed (on error)

Running

  • Description: Job is actively synchronizing data
  • Available Actions: Pause
  • Behavior: Continues until manually paused or an error occurs

Paused

  • Description: Job is temporarily stopped but retains its position
  • Available Actions: Resume, Restart
  • Important: For PostgreSQL sources, extended pause periods may cause write-ahead log (WAL) accumulation and increased disk usage on the source

Succeeded

  • Description: A full synchronization job has completed successfully
  • Note: Only applies to full synchronization jobs; incremental jobs remain in "Running" state

Failed

  • Description: Job encountered an error and stopped
  • Available Actions: Restart (after addressing the issue)

Managing Jobs

Resume vs. Restart:

  • Resume: Continues from the last synchronized position, preserving progress
  • Restart: Resets all progress and begins synchronization from the beginning

Deletion Requirements: A job can be deleted only when it is not active, that is, when it has never been started or is in the Paused, Succeeded, or Failed state. Deletion permanently removes all job status and history information.
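
The state and action rules above can be summarized as a lookup table. The following is a minimal sketch, not the platform's API: the `State` and `Action` enums, the `NOT_STARTED` name for a job that has never been started, and the `is_allowed` helper are all hypothetical names introduced for illustration. It encodes only which actions the guide lists for each state, and does not model the state an action leads to.

```python
from enum import Enum


class State(Enum):
    NOT_STARTED = "never started"  # hypothetical name for a job that was never started
    INITIALIZING = "Initializing"
    RUNNING = "Running"
    PAUSED = "Paused"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


class Action(Enum):
    START = "Start"
    PAUSE = "Pause"
    RESUME = "Resume"
    RESTART = "Restart"
    DELETE = "Delete"


# Actions listed in this guide for each state. Delete is added for every
# non-active state, following the deletion requirements.
ALLOWED = {
    State.NOT_STARTED: {Action.START, Action.DELETE},
    State.INITIALIZING: {Action.PAUSE},
    State.RUNNING: {Action.PAUSE},
    State.PAUSED: {Action.RESUME, Action.RESTART, Action.DELETE},
    State.SUCCEEDED: {Action.DELETE},
    State.FAILED: {Action.RESTART, Action.DELETE},
}


def is_allowed(state: State, action: Action) -> bool:
    """Return True if the guide lists `action` as available in `state`."""
    return action in ALLOWED[state]
```

For example, `is_allowed(State.PAUSED, Action.RESUME)` is true, while `is_allowed(State.RUNNING, Action.DELETE)` is false because a running job must be paused before it can be deleted.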

Job History and Instances

Each time a job is started (including restarts), a new job instance is created. This provides detailed tracking of job execution over time.

Viewing Job History

The job history shows:

  • Instance ID: Unique identifier for each execution
  • Start Time: When the instance began
  • Duration: How long the instance ran
  • Status: Final state of the instance
  • Records Processed: Total number of records synchronized
  • Data Volume: Total bytes transferred
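
If you export or copy history entries for your own analysis, a small record type keeps the fields together. This is an illustrative sketch only: the `JobInstance` class, its field names, and the field types are assumptions, not a documented export format. The `average_rps` helper derives an average rate from the Records Processed and Duration fields.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class JobInstance:
    """One row of the job history (field names and types are assumed)."""
    instance_id: str
    start_time: datetime
    duration: timedelta
    status: str
    records_processed: int
    data_volume_bytes: int

    def average_rps(self) -> float:
        """Average records per second over the whole instance."""
        seconds = self.duration.total_seconds()
        return self.records_processed / seconds if seconds > 0 else 0.0


# Example with made-up values.
instance = JobInstance(
    instance_id="inst-1",
    start_time=datetime(2024, 1, 1, 0, 0, 0),
    duration=timedelta(minutes=10),
    status="Succeeded",
    records_processed=120_000,
    data_volume_bytes=48_000_000,
)
print(instance.average_rps())
```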

Job Monitoring and Metrics

The Data Sync dashboard provides six key performance indicators to help you monitor job health and performance:

Performance Metrics

Latency Indicators

  • Idle Time: Milliseconds elapsed between the current system time and the time the last data entered synchronization

    • Low Value: No significant delay
    • High Value: Check whether the source has new data or whether the job is delayed
  • Emit Event Time: Milliseconds elapsed between the last data processing time and the most recent data entry

    • Low and Stable: Normal operation, or no new source data
    • High and Increasing: Significant job delay requiring attention

Throughput Metrics

  • Instance Records: Total count of records processed by the job
  • Instance Bytes: Total number of bytes handled by the job
  • Instance RPS: Records processed per second (current rate)
  • Instance BPS: Bytes processed per second (current rate)
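
The rate metrics relate to the cumulative counters in the usual way: a rate is the change in a counter divided by the elapsed time. The sketch below shows that arithmetic for two readings of Instance Records and Instance Bytes. The `Sample` type and `rates` function are hypothetical, and the window the dashboard actually uses for RPS and BPS is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One reading of the cumulative counters (hypothetical structure)."""
    timestamp_s: float  # time of the reading, in seconds
    records: int        # Instance Records at that time
    bytes_: int         # Instance Bytes at that time


def rates(prev: Sample, curr: Sample) -> tuple[float, float]:
    """Return (records per second, bytes per second) between two readings."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("curr must be later than prev")
    rps = (curr.records - prev.records) / elapsed
    bps = (curr.bytes_ - prev.bytes_) / elapsed
    return rps, bps


# 500 records and 20480 bytes over 10 seconds.
print(rates(Sample(0.0, 0, 0), Sample(10.0, 500, 20480)))
```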

Interpreting Metrics

Normal Operation:

  • Idle Time: Minimal (< 1000 ms)
  • Emit Event Time: Low and stable
  • RPS/BPS: Consistent with data arrival patterns

No Source Data:

  • Idle Time: High but stable
  • Emit Event Time: Low and unchanged
  • RPS/BPS: Zero or very low

Performance Issues:

  • Idle Time: High and increasing
  • Emit Event Time: High and increasing (approximately equal to Idle Time)
  • RPS/BPS: Below expected levels
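
The three patterns above can be turned into a rough classifier over two consecutive readings. This is a sketch under stated assumptions: only the 1000 ms Idle Time bound comes from this guide. Reusing it as the "high" threshold for Emit Event Time, the `low_rps` cutoff, and the `Reading` and `classify` names are all assumptions made for illustration, so tune them to your own workload.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    idle_ms: float  # Idle Time
    emit_ms: float  # Emit Event Time
    rps: float      # Instance RPS


def classify(prev: Reading, curr: Reading,
             high_ms: float = 1000.0, low_rps: float = 1.0) -> str:
    """Map two consecutive readings onto the patterns described above.

    high_ms: latency at or above this counts as "high" (assumed for emit_ms).
    low_rps: RPS at or below this counts as "zero or very low" (assumed).
    """
    idle_high = curr.idle_ms >= high_ms
    emit_high = curr.emit_ms >= high_ms
    idle_rising = curr.idle_ms > prev.idle_ms
    emit_rising = curr.emit_ms > prev.emit_ms

    if not idle_high and not emit_high:
        return "normal"
    if idle_high and not emit_high and curr.rps <= low_rps:
        return "no_source_data"
    if idle_high and emit_high and idle_rising and emit_rising:
        return "performance_issue"
    return "unclear"  # does not match any documented pattern


print(classify(Reading(5000, 4800, 20), Reading(9000, 8700, 15)))
```

Readings that match none of the documented patterns are reported as "unclear" rather than forced into a category, since the guide does not describe them.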

Best Practices

Monitoring

  1. Regular Health Checks: Monitor Idle Time and Emit Event Time regularly
  2. Set Alerts: Configure alerts for prolonged high latency values
  3. Track Trends: Monitor RPS and BPS trends to identify performance degradation
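
An alert on "prolonged" high latency should fire only when the value stays above a threshold for some minimum time, not on a single spike. The sketch below shows one way to express that check over sampled values. The `prolonged_high` function, the sample format, and the threshold and duration values are assumptions for illustration, not a built-in alerting feature.

```python
def prolonged_high(samples: list[tuple[float, float]],
                   threshold_ms: float,
                   min_duration_s: float) -> bool:
    """Return True if latency has stayed above threshold_ms for at least
    min_duration_s up to the most recent sample.

    samples: (timestamp in seconds, latency in ms), sorted by timestamp.
    """
    breach_start = None
    for timestamp_s, latency_ms in samples:
        if latency_ms > threshold_ms:
            if breach_start is None:
                breach_start = timestamp_s  # start of the current breach
        else:
            breach_start = None  # latency recovered, reset the window
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= min_duration_s


# Idle Time sampled once a minute; above 1000 ms since t=60.
history = [(0, 200), (60, 1500), (120, 1800), (180, 2500), (360, 3000)]
print(prolonged_high(history, threshold_ms=1000, min_duration_s=300))
```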

Operational Management

  1. Planned Maintenance: Use Pause/Resume for scheduled maintenance windows
  2. Error Recovery: Investigate failed jobs before restarting
  3. Resource Planning: Monitor throughput metrics to plan for capacity changes

PostgreSQL Specific

  1. WAL Management: Avoid extended pause periods to prevent disk space issues
  2. Connection Monitoring: Ensure replication connections remain stable
  3. Performance Tuning: Monitor database performance impact during synchronization
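
To watch WAL accumulation on the source, you can compare the server's current WAL position with the position a replication slot still needs. The sketch below assumes the sync job holds a replication slot on the source PostgreSQL database, which this guide does not state explicitly. The query uses the standard `pg_replication_slots` view and `pg_current_wal_lsn()` function (PostgreSQL 10 and later); the Python helper names are hypothetical and only show how two LSN strings translate to a byte count.

```python
# Run this on the source database with any PostgreSQL client.
SLOT_QUERY = """
SELECT slot_name, active, restart_lsn, pg_current_wal_lsn() AS current_lsn
FROM pg_replication_slots;
"""


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN such as '16/B374D848' to a byte position.

    An LSN is printed as two hexadecimal numbers: the high 32 bits
    and the low 32 bits of a 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart point and the current position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Example with made-up LSN values: 16 MiB of WAL retained.
print(retained_wal_bytes("0/2000000", "0/1000000"))
```

A retained size that keeps growing while a job is paused is a sign that the pause should be shortened or the job resumed.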