Elasticsearch to Tacnode Migration Guide

This guide provides a step-by-step approach for migrating data and applications from Elasticsearch (ES) to Tacnode. It covers environment preparation, data export and import, schema design, application refactoring, best practices, and troubleshooting.

Overview

Elasticsearch is a distributed, RESTful search engine widely used for full-text search, log analytics, and real-time data analysis. Tacnode is a distributed database platform that handles both relational and semi-structured (JSONB) data. Migrating from ES to Tacnode lets you keep document-style storage while gaining full SQL, flexible row/columnar/hybrid storage, and robust security controls.

Preparation

Environment Requirements

  • Elasticsearch 6.x or newer
  • Tacnode database instance
  • Node.js environment (for export tools)
  • Network connectivity between ES and Tacnode (preferably internal)

Permissions

  • Access to the Elasticsearch cluster
  • Privileges to create databases and tables in Tacnode
  • Permission to import data into Tacnode

Network Configuration

Ensure the host running the export and import tools can reach both the source ES cluster (port 9200 by default) and the target Tacnode instance (port 5432 in the examples below).

Migration Steps

1. Assess Source Data

Before migrating, evaluate your ES data:

  • Check cluster health:
    curl -X GET "localhost:9200/_cluster/health?pretty"
  • List all indices:
    curl -X GET "localhost:9200/_cat/indices?v"
  • View index mappings:
    curl -X GET "localhost:9200/index_name/_mapping?pretty"
  • Get index stats:
    curl -X GET "localhost:9200/index_name/_stats?pretty"

2. Prepare Target Environment

Create the target database and user in Tacnode:

CREATE DATABASE search_engine;
CREATE USER "es_migration" WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE search_engine TO "es_migration";
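
Before moving any data, confirm that the new account can actually connect. A minimal sketch with psycopg2 (tacnode-host is a placeholder for your instance's endpoint):

import psycopg2

conn = psycopg2.connect('host=tacnode-host port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    cursor.execute('SELECT 1')
    print('Tacnode connection OK:', cursor.fetchone()[0] == 1)
conn.close()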

3. Export Data from Elasticsearch

Install Node.js and the elasticdump tool:

npm install -g elasticdump

Export all data:

elasticdump --input='http://localhost:9200/' --output='dump.json' --timeout=600000

Export a specific index:

elasticdump --input='http://localhost:9200/INDEX_NAME' --output='dump.json' --timeout=600000
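
elasticdump writes newline-delimited JSON, one document per line, with _index, _id, and _source fields. Before importing, it is worth previewing a few lines to confirm the dump matches the layout the import script in step 5 expects:

import itertools
import json

# Peek at the first three exported documents.
with open('dump.json', 'r', encoding='utf8') as fd:
    for line in itertools.islice(fd, 3):
        payload = json.loads(line)
        print(payload['_index'], payload['_id'], list(payload['_source'])[:5])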

4. Create Target Tables in Tacnode

Design tables based on your ES data structure. For semi-structured documents, use JSONB:

CREATE TABLE gharchive (
  id TEXT NOT NULL PRIMARY KEY,
  doc JSONB
) USING hybrid; -- Choose row, columnar, or hybrid storage as needed
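
As a rule of thumb, row storage favors point lookups and frequent single-document writes, columnar storage favors large scans and aggregations, and hybrid aims to serve both; choose based on your dominant query pattern.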

5. ETL: Import Data into Tacnode

Use Python to import data:

import json
import psycopg2

BATCH_SIZE = 1000  # commit in batches; one commit per row is far slower

def make_connection():
    return psycopg2.connect('host=... port=5432 dbname=search_engine user=es_migration password=secure_password')

def main():
    inited_tables = set()
    pending = 0
    with open('dump.json', 'r', encoding='utf8') as fd, make_connection() as conn:
        with conn.cursor() as cursor:
            for line in fd:
                line = line.strip()
                if not line:
                    continue
                # elasticdump writes one JSON document per line.
                payload = json.loads(line)
                index_name = payload['_index']
                document_id = payload['_id']
                document = payload['_source']
                if index_name not in inited_tables:
                    # Quote the identifier: ES index names often contain
                    # characters such as '-' or '.' that are invalid in
                    # unquoted SQL identifiers.
                    init_sql = f"""
                    CREATE TABLE IF NOT EXISTS "{index_name}" (
                        id TEXT NOT NULL PRIMARY KEY,
                        doc JSONB
                    ) USING hybrid;
                    """
                    cursor.execute(init_sql)
                    conn.commit()
                    inited_tables.add(index_name)
                insert_sql = f'INSERT INTO "{index_name}" (id, doc) VALUES (%s, %s)'
                cursor.execute(insert_sql, (document_id, json.dumps(document)))
                pending += 1
                if pending >= BATCH_SIZE:
                    conn.commit()
                    pending = 0
            conn.commit()  # flush the final partial batch

if __name__ == '__main__':
    main()
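
After the import finishes, spot-check that document counts match. A minimal sketch that compares Elasticsearch's _count API against COUNT(*) on the corresponding Tacnode table (the index list is illustrative):

import json
import urllib.request
import psycopg2

def es_count(index_name):
    # The _count API returns {"count": N, ...}.
    url = f'http://localhost:9200/{index_name}/_count'
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)['count']

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    for index_name in ['gharchive']:  # list your migrated indices here
        cursor.execute(f'SELECT COUNT(*) FROM "{index_name}"')
        target = cursor.fetchone()[0]
        source = es_count(index_name)
        print(f'{index_name}: ES={source} Tacnode={target}',
              'OK' if source == target else 'MISMATCH')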

Application Refactoring

Update Connection Configuration

Replace the ES connection settings with Tacnode's JDBC configuration (Tacnode speaks the PostgreSQL wire protocol, so the standard PostgreSQL driver applies):

# Old ES config
elasticsearch.host=localhost
elasticsearch.port=9200
 
# New Tacnode config
jdbc.url=jdbc:postgresql://tacnode-host:5432/search_engine
jdbc.username=es_migration
jdbc.password=secure_password

CRUD Operations

Insert Document

  • ES:
    curl -X POST "localhost:9200/index_name/_doc/123" -H 'Content-Type: application/json' -d '{"key": "val"}'
  • Tacnode:
    INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB);
    -- Upsert
    INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB)
    ON CONFLICT(id) DO UPDATE SET doc=EXCLUDED.doc;
    -- Partial update
    UPDATE index_name SET doc=jsonb_set(doc, '{flag}', 'false') WHERE id='123';
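
From application code, the same upsert is normally issued with bound parameters rather than literal SQL. A sketch with psycopg2, whose Json adapter converts a Python dict to a JSONB parameter:

import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    # Json() handles serialization and escaping; no manual json.dumps().
    cursor.execute(
        'INSERT INTO index_name (id, doc) VALUES (%s, %s) '
        'ON CONFLICT (id) DO UPDATE SET doc = EXCLUDED.doc',
        ('123', Json({'key': 'val'})),
    )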

Retrieve Document

  • ES:
    curl http://localhost:9200/index_name/_doc/123
  • Tacnode:
    SELECT doc FROM index_name WHERE id='123';
    SELECT doc->>'id' AS "id", doc->>'type' AS "type" FROM index_name WHERE id='123';

Delete Document

  • ES:
    curl -XDELETE http://localhost:9200/index_name/_doc/123
  • Tacnode:
    DELETE FROM index_name WHERE id='123';

Search Operations

Full-Text Search

  • Create split_gin indexes to speed up full-text and fuzzy search (the trigram operator class requires the pg_trgm extension; see the sketch after this list):
    CREATE INDEX ON index_name USING split_gin (to_tsvector('english', doc->>'key') tsvector_ops);
    CREATE INDEX ON index_name USING split_gin ((doc->>'key') gin_trgm_ops);
  • Exact match:
    SELECT doc FROM index_name WHERE (doc->>'key') = 'val' LIMIT 100;
  • Fuzzy match:
    SELECT doc FROM index_name WHERE (doc->>'key') LIKE '%val%' LIMIT 100;
  • Full-text search:
    SELECT doc FROM index_name WHERE to_tsvector('english', (doc->>'key')) @@ to_tsquery('english', 'val') LIMIT 100;
  • Trigram similarity:
    SELECT doc FROM index_name WHERE (doc->>'key') % 'val' LIMIT 100;
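
The gin_trgm_ops operator class and the % similarity operator come from the PostgreSQL pg_trgm extension; assuming Tacnode supports it, enable it once per database before creating the trigram index above. A sketch that also ranks fuzzy matches with similarity() (note that a literal % must be doubled to %% inside a psycopg2 query string):

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    cursor.execute('CREATE EXTENSION IF NOT EXISTS pg_trgm')
    # similarity() scores 0..1; order candidates by closeness to the term.
    cursor.execute(
        "SELECT doc, similarity(doc->>'key', %s) AS score "
        "FROM index_name WHERE (doc->>'key') %% %s "
        "ORDER BY score DESC LIMIT 10",
        ('val', 'val'),
    )
    for doc, score in cursor.fetchall():
        print(score, doc)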

Aggregation Queries

  • NDV (distinct count):
    SELECT COUNT(DISTINCT (doc->>'fees')) AS ndv FROM index_name;
  • Max/Min (cast the JSONB text value to numeric first):
    SELECT max((doc->>'fees')::numeric) FROM index_name;
    SELECT min((doc->>'fees')::numeric) FROM index_name;
  • Sum:
    SELECT sum((doc->>'fees')::numeric) FROM index_name;
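
ES bucket aggregations (for example, a terms aggregation) generally map to SQL GROUP BY. A sketch of a top-10 bucket count over the type field used earlier:

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    # Equivalent of an ES terms aggregation on "type" with size=10.
    cursor.execute(
        "SELECT doc->>'type' AS type, COUNT(*) AS cnt "
        "FROM index_name GROUP BY 1 ORDER BY cnt DESC LIMIT 10"
    )
    for bucket, cnt in cursor.fetchall():
        print(bucket, cnt)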

Pagination

  • ES uses the from and size parameters.
  • Tacnode uses OFFSET and LIMIT; add an ORDER BY so page boundaries are deterministic:
    SELECT * FROM index_name ORDER BY id OFFSET 20 LIMIT 10;
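
Large OFFSET values force the database to scan and discard every skipped row, much like deep from values in ES (where search_after is the usual remedy). Keyset pagination on the primary key avoids that cost. A sketch (process is a hypothetical row handler):

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    last_id = ''  # the empty string sorts before every non-empty TEXT id
    while True:
        cursor.execute(
            'SELECT id, doc FROM index_name WHERE id > %s ORDER BY id LIMIT 100',
            (last_id,),
        )
        rows = cursor.fetchall()
        if not rows:
            break
        process(rows)  # hypothetical handler for one page of results
        last_id = rows[-1][0]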

Best Practices

Pre-Migration

  • Assess ES cluster size and index count
  • Analyze document structure and field types
  • Identify large or complex documents
  • Plan migration during off-peak hours
  • Prepare rollback and testing plans
  • Train your team on Tacnode

During Migration

  • Migrate non-critical data first
  • Gradually migrate core business data
  • Run old and new systems in parallel for validation
  • Monitor export/import progress and system performance
  • Document issues and solutions

Post-Migration

  • Create appropriate indexes based on query patterns
  • Optimize table storage (row/columnar/hybrid)
  • Tune system parameters for workload
  • Harden security: IP whitelists, granular permissions, audit logs

Troubleshooting & FAQ

Data Type Mapping Issues

  • ES's dynamically mapped field types may not line up with Tacnode's strongly typed columns
  • Use JSONB for dynamic data
  • Validate and convert types at the application layer, or project frequently queried fields into typed columns (see the sketch after this list)
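
One pattern, assuming Tacnode supports PostgreSQL-style generated columns: project frequently queried fields out of the JSONB document into typed columns, so the conversion happens once at write time instead of in every query. The fees field follows the aggregation examples above:

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    # A stored generated column keeps a typed copy of doc->>'fees' in sync.
    cursor.execute(
        "ALTER TABLE index_name ADD COLUMN IF NOT EXISTS fees NUMERIC "
        "GENERATED ALWAYS AS ((doc->>'fees')::numeric) STORED"
    )
    # Queries can now use the typed column directly.
    cursor.execute('SELECT sum(fees) FROM index_name')
    print(cursor.fetchone()[0])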

Full-Text Search Performance

  • Design indexes carefully
  • Use proper text search configuration
  • Consider materialized views for complex queries (see the sketch after this list)
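
For the materialized-view suggestion, a sketch that precomputes the tsvector for the key field used earlier and refreshes it on demand (assuming PostgreSQL-style materialized views):

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    cursor.execute(
        "CREATE MATERIALIZED VIEW IF NOT EXISTS index_name_fts AS "
        "SELECT id, to_tsvector('english', doc->>'key') AS tsv "
        "FROM index_name"
    )
    # The view is not maintained automatically; re-run after bulk changes.
    cursor.execute('REFRESH MATERIALIZED VIEW index_name_fts')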

Aggregation Differences

  • ES and SQL aggregation functions differ
  • Adapt aggregation logic in your application
  • Use window functions for advanced needs (see the sketch after this list)
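
As an example of the window-function escape hatch, the three most recent documents per type bucket, using a hypothetical created_at field:

import psycopg2

conn = psycopg2.connect('host=... port=5432 dbname=search_engine '
                        'user=es_migration password=secure_password')
with conn, conn.cursor() as cursor:
    # ROW_NUMBER() ranks documents inside each type bucket; the outer
    # query keeps the three newest per bucket.
    cursor.execute(
        "SELECT id, doc FROM ("
        "  SELECT id, doc, ROW_NUMBER() OVER ("
        "    PARTITION BY doc->>'type' ORDER BY doc->>'created_at' DESC"
        "  ) AS rn FROM index_name"
        ") ranked WHERE rn <= 3"
    )
    for row_id, doc in cursor.fetchall():
        print(row_id)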

Pagination Differences

  • ES: from and size parameters
  • Tacnode: OFFSET and LIMIT