Elasticsearch to Tacnode Migration Guide
Detailed migration guide for transferring data and applications from Elasticsearch to Tacnode
This guide provides a comprehensive, step-by-step approach for migrating data and applications from Elasticsearch (ES) to Tacnode. It covers environment preparation, data export and import, schema design, application refactoring, best practices, and troubleshooting.
Overview
Elasticsearch is a distributed, RESTful search engine widely used for full-text search, log analytics, and real-time data analysis. Tacnode is a distributed database platform that handles both relational and semi-structured data. Migrating from ES to Tacnode lets you leverage advanced SQL, flexible storage, and robust security features.
Preparation
Environment Requirements
- Elasticsearch 6.x or newer
- Tacnode database instance
- Node.js environment (for export tools)
- Network connectivity between ES and Tacnode (preferably internal)
Permissions
- Access to the Elasticsearch cluster
- Privileges to create databases and tables in Tacnode
- Permission to import data into Tacnode
Network Configuration
Ensure network connectivity between the source ES cluster and the target Tacnode instance.
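Before starting, you can verify that both endpoints are reachable. A minimal sketch in Python; the hostnames below are placeholders for your environment:
import socket

def check_reachable(host, port, timeout=3):
    # Open a TCP connection and close it immediately; unreachable
    # endpoints raise OSError, which we report as False.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoints: replace with your actual hosts.
for host, port in [('es-host', 9200), ('tacnode-host', 5432)]:
    status = 'reachable' if check_reachable(host, port) else 'unreachable'
    print(f'{host}:{port} is {status}')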
Migration Steps
1. Assess Source Data
Before migrating, evaluate your ES data:
- Check cluster health:
curl -X GET "localhost:9200/_cluster/health?pretty" - List all indices:
curl -X GET "localhost:9200/_cat/indices?v" - View index mappings:
curl -X GET "localhost:9200/index_name/_mapping?pretty" - Get index stats:
curl -X GET "localhost:9200/index_name/_stats?pretty"
2. Prepare Target Environment
Create the target database and user in Tacnode:
CREATE DATABASE search_engine;
CREATE USER "es_migration" WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE search_engine TO "es_migration";
3. Export Data from Elasticsearch
Install Node.js and the elasticdump tool:
npm install -g elasticdump
Export all data:
elasticdump --input='http://localhost:9200/' --output='dump.json' --timeout=600000
Export a specific index:
elasticdump --input='http://localhost:9200/INDEX_NAME' --output='dump.json' --timeout=600000
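If installing elasticdump is not an option, the same newline-delimited output can be produced with Elasticsearch's scroll API. A sketch using the requests library, assuming the default ES port and a placeholder index name; each written line carries the _index, _id, and _source fields that the import script in step 5 expects:
import json
import requests

ES = 'http://localhost:9200'

def export_index(index_name, out_path, page_size=1000):
    # Open a scroll context, then page through all hits, writing one
    # JSON object per line to mirror elasticdump's output format.
    resp = requests.post(
        f'{ES}/{index_name}/_search',
        params={'scroll': '2m'},
        json={'size': page_size, 'query': {'match_all': {}}},
    ).json()
    with open(out_path, 'w', encoding='utf8') as out:
        while True:
            hits = resp['hits']['hits']
            if not hits:
                break
            for hit in hits:
                out.write(json.dumps(hit) + '\n')
            # Fetch the next page with the scroll id from the last response.
            resp = requests.post(
                f'{ES}/_search/scroll',
                json={'scroll': '2m', 'scroll_id': resp['_scroll_id']},
            ).json()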
4. Create Target Tables in Tacnode
Design tables based on your ES data structure. For semi-structured documents, use JSONB:
CREATE TABLE gharchive (
id TEXT NOT NULL PRIMARY KEY,
doc JSONB
) USING hybrid; -- Choose row, columnar, or hybrid storage as needed
5. ETL: Import Data into Tacnode
Use Python to import data:
import json

import psycopg2


def make_connection():
    # Fill in the host for your Tacnode instance.
    return psycopg2.connect('host=... port=5432 dbname=search_engine user=es_migration password=secure_password')


def main():
    inited_tables = set()
    with open('dump.json', 'r', encoding='utf8') as fd, make_connection() as conn:
        with conn.cursor() as cursor:
            for line in fd:
                line = line.strip()
                if not line:
                    continue
                # Each dump line is one document with _index, _id, and _source.
                payload = json.loads(line)
                index_name = payload['_index']
                document_id = payload['_id']
                document = payload['_source']
                # Create one table per ES index the first time we see it.
                if index_name not in inited_tables:
                    init_sql = f"""
                        CREATE TABLE IF NOT EXISTS {index_name} (
                            id TEXT NOT NULL PRIMARY KEY,
                            doc JSONB
                        ) USING hybrid;
                    """
                    cursor.execute(init_sql)
                    conn.commit()
                    inited_tables.add(index_name)
                insert_sql = f"INSERT INTO {index_name} (id, doc) VALUES (%s, %s)"
                cursor.execute(insert_sql, (document_id, json.dumps(document)))
                conn.commit()


if __name__ == '__main__':
    main()
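The loop above commits one row per round trip, which is slow for large dumps. A minimal batching variant, assuming psycopg2's execute_values helper and the same table layout; BATCH_SIZE is a placeholder to tune:
from psycopg2.extras import execute_values

BATCH_SIZE = 1000  # placeholder: tune to your row size and memory budget

def flush(cursor, conn, table, rows):
    # One round trip for the whole batch; ON CONFLICT makes reruns idempotent.
    execute_values(
        cursor,
        f'INSERT INTO {table} (id, doc) VALUES %s ON CONFLICT (id) DO NOTHING',
        rows,
    )
    conn.commit()
    rows.clear()
Accumulate (document_id, json.dumps(document)) tuples per table in the main loop, call flush whenever a list reaches BATCH_SIZE, and call it once more per table after the file is exhausted.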
Application Refactoring
Update Connection Configuration
Replace ES connection settings with Tacnode’s JDBC configuration:
# Old ES config
elasticsearch.host=localhost
elasticsearch.port=9200
# New Tacnode config
jdbc.url=jdbc:postgresql://tacnode-host:5432/search_engine
jdbc.username=es_migration
jdbc.password=secure_password
CRUD Operations
Insert Document
- ES:
curl -X POST "localhost:9200/index_name/_doc/123" -H 'Content-Type: application/json' -d '{"key": "val"}' - Tacnode:
INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB); -- Upsert INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB) ON CONFLICT(id) DO UPDATE SET doc=EXCLUDED.doc; -- Partial update UPDATE index_name SET doc=jsonb_set(doc, '{flag}', 'false') WHERE id='123';
Retrieve Document
- ES:
curl http://localhost:9200/index_name/_doc/123
- Tacnode:
SELECT doc FROM index_name WHERE id = '123';
SELECT doc->>'id' AS "id", doc->>'type' AS "type" FROM index_name WHERE id = '123';
Delete Document
- ES:
curl -X DELETE http://localhost:9200/index_name/_doc/123
- Tacnode:
DELETE FROM index_name WHERE id='123';
Search Operations
Full-Text Search
- Create split_gin index for full-text search:
CREATE INDEX ON index_name USING split_gin(to_tsvector('english', doc->>'key'), tsvector_ops);
CREATE INDEX ON index_name USING split_gin(doc->>'key', gin_trgm_ops);
- Exact match:
SELECT doc FROM index_name WHERE (doc->>'key') = 'val' LIMIT 100;
- Fuzzy match:
SELECT doc FROM index_name WHERE (doc->>'key') LIKE '%val%' LIMIT 100;
- Full-text search:
SELECT doc FROM index_name WHERE to_tsvector('english', (doc->>'key')) @@ to_tsquery('english', 'val') LIMIT 100;
- Trigram similarity:
SELECT doc FROM index_name WHERE (doc->>'key') % 'val' LIMIT 100;
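When the search term comes from user input, building to_tsquery strings by hand is fragile. A sketch using plainto_tsquery (standard PostgreSQL, with which Tacnode's SQL layer is compatible) and a bind parameter; table and field names follow the examples above:
def search_documents(conn, term, limit=100):
    # plainto_tsquery treats the user's term as plain text, so input like
    # 'val & rogue' cannot inject tsquery operators; the term is passed
    # as a bind parameter, never interpolated into the SQL string.
    sql = """
        SELECT doc
        FROM index_name
        WHERE to_tsvector('english', doc->>'key')
              @@ plainto_tsquery('english', %s)
        LIMIT %s
    """
    with conn.cursor() as cursor:
        cursor.execute(sql, (term, limit))
        return [row[0] for row in cursor.fetchall()]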
Aggregation Queries
- NDV (distinct count):
SELECT COUNT(DISTINCT (doc->>'fees')) AS ndv FROM index_name;
- Max/Min:
SELECT max((doc->>'fees')::numeric) FROM index_name;
SELECT min((doc->>'fees')::numeric) FROM index_name;
- Sum:
SELECT sum((doc->>'fees')::numeric) FROM index_name;
Pagination
- ES uses the from and size parameters.
- Tacnode uses OFFSET and LIMIT:
SELECT * FROM index_name OFFSET 20 LIMIT 10;
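To make the mapping concrete, a hypothetical Python helper that translates ES-style from/size paging into the equivalent Tacnode query; the ORDER BY keeps pages deterministic:
def fetch_page(conn, table, from_=0, size=10):
    # ES "from" maps to OFFSET, "size" maps to LIMIT. The table name is
    # interpolated, so it must come from trusted code, not user input.
    sql = f"SELECT doc FROM {table} ORDER BY id OFFSET %s LIMIT %s"
    with conn.cursor() as cursor:
        cursor.execute(sql, (from_, size))
        return [row[0] for row in cursor.fetchall()]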
Best Practices
Pre-Migration
- Assess ES cluster size and index count
- Analyze document structure and field types
- Identify large or complex documents
- Plan migration during off-peak hours
- Prepare rollback and testing plans
- Train your team on Tacnode
During Migration
- Migrate non-critical data first
- Gradually migrate core business data
- Run old and new systems in parallel for validation
- Monitor export/import progress and system performance
- Document issues and solutions
Post-Migration
- Create appropriate indexes based on query patterns
- Optimize table storage (row/columnar/hybrid)
- Tune system parameters for workload
- Harden security: IP whitelists, granular permissions, audit logs
Troubleshooting & FAQ
Data Type Mapping Issues
- ES’s dynamic types may not match Tacnode’s strong types
- Use JSONB for dynamic data
- Validate and convert types at the application layer
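For example, a minimal application-layer coercion sketch; the FIELD_TYPES mapping is hypothetical and should be derived from your own index mappings:
# Hypothetical mapping from ES field names to the types Tacnode expects.
FIELD_TYPES = {'fees': float, 'count': int, 'created_at': str}

def coerce(document):
    # Coerce dynamic ES values to stable types before insertion; null out
    # values that cannot be converted rather than failing the whole batch.
    out = {}
    for key, value in document.items():
        caster = FIELD_TYPES.get(key)
        try:
            out[key] = caster(value) if caster and value is not None else value
        except (TypeError, ValueError):
            out[key] = None
    return out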
Full-Text Search Performance
- Design indexes carefully
- Use proper text search configuration
- Consider materialized views for complex queries
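A sketch of the materialized-view idea, precomputing the tsvector for the gharchive table from step 4. This uses standard PostgreSQL DDL and a hypothetical view name; verify materialized-view support against Tacnode's documentation:
def create_search_view(conn):
    # Precompute the tsvector once so complex full-text queries read from
    # the materialized view instead of re-parsing every document.
    with conn.cursor() as cursor:
        cursor.execute("""
            CREATE MATERIALIZED VIEW IF NOT EXISTS gharchive_search AS
            SELECT id, to_tsvector('english', doc->>'key') AS key_tsv
            FROM gharchive
        """)
    conn.commit()

def refresh_search_view(conn):
    # Re-run after bulk imports; the view does not update automatically.
    with conn.cursor() as cursor:
        cursor.execute("REFRESH MATERIALIZED VIEW gharchive_search")
    conn.commit()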
Aggregation Differences
- ES and SQL aggregation functions differ
- Adapt aggregation logic in your application
- Use window functions for advanced needs
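For instance, an ES terms aggregation with a top_hits sub-aggregation can be approximated with a window function. A sketch over the JSONB layout used above, with hypothetical field names:
def top_fees_per_type(conn, per_group=3):
    # rank() partitions by the "type" field and orders by fees, which
    # loosely mirrors a terms aggregation with a top_hits sub-aggregation.
    sql = """
        SELECT id, doc
        FROM (
            SELECT id, doc,
                   rank() OVER (
                       PARTITION BY doc->>'type'
                       ORDER BY (doc->>'fees')::numeric DESC
                   ) AS rnk
            FROM index_name
        ) ranked
        WHERE rnk <= %s
    """
    with conn.cursor() as cursor:
        cursor.execute(sql, (per_group,))
        return cursor.fetchall()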
Pagination Differences
- ES: from and size parameters
- Tacnode: OFFSET and LIMIT