Elasticsearch to Tacnode Migration Guide
Detailed migration guide for transferring data and applications from Elasticsearch to Tacnode
This guide provides a comprehensive, step-by-step approach for migrating data and applications from Elasticsearch (ES) to Tacnode. It covers environment preparation, data export and import, schema design, application refactoring, best practices, and troubleshooting.
Overview
Elasticsearch is a distributed, RESTful search engine widely used for full-text search, log analytics, and real-time data analysis. Tacnode is a distributed database platform that handles both relational and semi-structured data. Migrating from ES to Tacnode lets you leverage advanced SQL, flexible storage, and robust security features.
Preparation
Environment Requirements
- Elasticsearch 6.x or newer
- Tacnode database instance
- Node.js environment (for export tools)
- Network connectivity between ES and Tacnode (preferably internal)
Permissions
- Access to the Elasticsearch cluster
- Privileges to create databases and tables in Tacnode
- Permission to import data into Tacnode
Network Configuration
Ensure network connectivity between the source ES cluster and the target Tacnode instance.
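Before starting, you can verify that both endpoints are reachable. A minimal sketch in Python; the hostnames below are placeholders for your environment:
import socket

def check_reachable(host, port, timeout=3):
    # Open a TCP connection and close it immediately; unreachable
    # endpoints raise OSError, which we report as False.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoints: replace with your actual hosts.
for host, port in [('es-host', 9200), ('tacnode-host', 5432)]:
    status = 'reachable' if check_reachable(host, port) else 'unreachable'
    print(f'{host}:{port} is {status}')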
Migration Steps
1. Assess Source Data
Before migrating, evaluate your ES data:
- Check cluster health:
curl -X GET "localhost:9200/_cluster/health?pretty" - List all indices:
curl -X GET "localhost:9200/_cat/indices?v" - View index mappings:
curl -X GET "localhost:9200/index_name/_mapping?pretty" - Get index stats:
curl -X GET "localhost:9200/index_name/_stats?pretty"
2. Prepare Target Environment
Create the target database and user in Tacnode:
CREATE DATABASE search_engine;
CREATE USER "es_migration" WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON DATABASE search_engine TO "es_migration";
3. Export Data from Elasticsearch
Install Node.js and the elasticdump tool:
npm install -g elasticdump
Export all data:
elasticdump --input='http://localhost:9200/' --output='dump.json' --timeout=600000
Export a specific index:
elasticdump --input='http://localhost:9200/INDEX_NAME' --output='dump.json' --timeout=600000
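If installing elasticdump is not an option, the same newline-delimited output can be produced with Elasticsearch's scroll API. A sketch using the requests library, assuming the default ES port and a placeholder index name; each written line carries the _index, _id, and _source fields that the import script in step 5 expects:
import json
import requests

ES = 'http://localhost:9200'

def export_index(index_name, out_path, page_size=1000):
    # Open a scroll context, then page through all hits, writing one
    # JSON object per line to mirror elasticdump's output format.
    resp = requests.post(
        f'{ES}/{index_name}/_search',
        params={'scroll': '2m'},
        json={'size': page_size, 'query': {'match_all': {}}},
    ).json()
    with open(out_path, 'w', encoding='utf8') as out:
        while True:
            hits = resp['hits']['hits']
            if not hits:
                break
            for hit in hits:
                out.write(json.dumps(hit) + '\n')
            # Fetch the next page with the scroll id from the last response.
            resp = requests.post(
                f'{ES}/_search/scroll',
                json={'scroll': '2m', 'scroll_id': resp['_scroll_id']},
            ).json()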
4. Create Target Tables in Tacnode
Design tables based on your ES data structure. For semi-structured documents, use JSONB:
CREATE TABLE gharchive (
id TEXT NOT NULL PRIMARY KEY,
doc JSONB
) USING hybrid; -- Choose row, columnar, or hybrid storage as needed
5. ETL: Import Data into Tacnode
Use Python to import data:
import json

import psycopg2


def make_connection():
    # Fill in the host for your Tacnode instance.
    return psycopg2.connect('host=... port=5432 dbname=search_engine user=es_migration password=secure_password')


def main():
    inited_tables = set()
    with open('dump.json', 'r', encoding='utf8') as fd, make_connection() as conn:
        with conn.cursor() as cursor:
            for line in fd:
                line = line.strip()
                if not line:
                    continue
                # Each dump line is one document with _index, _id, and _source.
                payload = json.loads(line)
                index_name = payload['_index']
                document_id = payload['_id']
                document = payload['_source']
                # Create one table per ES index the first time we see it.
                if index_name not in inited_tables:
                    init_sql = f"""
                        CREATE TABLE IF NOT EXISTS {index_name} (
                            id TEXT NOT NULL PRIMARY KEY,
                            doc JSONB
                        ) USING hybrid;
                    """
                    cursor.execute(init_sql)
                    conn.commit()
                    inited_tables.add(index_name)
                insert_sql = f"INSERT INTO {index_name} (id, doc) VALUES (%s, %s)"
                cursor.execute(insert_sql, (document_id, json.dumps(document)))
                conn.commit()


if __name__ == '__main__':
    main()
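The loop above commits one row per round trip, which is slow for large dumps. A minimal batching variant, assuming psycopg2's execute_values helper and the same table layout; BATCH_SIZE is a placeholder to tune:
from psycopg2.extras import execute_values

BATCH_SIZE = 1000  # placeholder: tune to your row size and memory budget

def flush(cursor, conn, table, rows):
    # One round trip for the whole batch; ON CONFLICT makes reruns idempotent.
    execute_values(
        cursor,
        f'INSERT INTO {table} (id, doc) VALUES %s ON CONFLICT (id) DO NOTHING',
        rows,
    )
    conn.commit()
    rows.clear()
Accumulate (document_id, json.dumps(document)) tuples per table in the main loop, call flush whenever a list reaches BATCH_SIZE, and call it once more per table after the file is exhausted.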
Application Refactoring
Update Connection Configuration
Replace ES connection settings with Tacnode’s JDBC configuration:
# Old ES config
elasticsearch.host=localhost
elasticsearch.port=9200
# New Tacnode config
jdbc.url=jdbc:postgresql://tacnode-host:5432/search_engine
jdbc.username=es_migration
jdbc.password=secure_password
CRUD Operations
Insert Document
- ES:
curl -X POST "localhost:9200/index_name/_doc/123" -H 'Content-Type: application/json' -d '{"key": "val"}' - Tacnode:
INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB); -- Upsert INSERT INTO index_name (id, doc) VALUES ('123', '{"key": "val"}'::JSONB) ON CONFLICT(id) DO UPDATE SET doc=EXCLUDED.doc; -- Partial update UPDATE index_name SET doc=jsonb_set(doc, '{flag}', 'false') WHERE id='123';
Retrieve Document
- ES:
curl http://localhost:9200/index_name/_doc/123
- Tacnode:
SELECT doc FROM index_name WHERE id = '123';
SELECT doc->>'id' AS "id", doc->>'type' AS "type" FROM index_name WHERE id = '123';
Delete Document
- ES:
curl -X DELETE http://localhost:9200/index_name/_doc/123
- Tacnode:
DELETE FROM index_name WHERE id='123';
Search Operations
Full-Text Search
- Create split_gin index for full-text search:
CREATE INDEX ON index_name USING split_gin(to_tsvector('english', doc->>'key'), tsvector_ops);
CREATE INDEX ON index_name USING split_gin(doc->>'key', gin_trgm_ops);
- Exact match:
SELECT doc FROM index_name WHERE (doc->>'key') = 'val' LIMIT 100;
- Fuzzy match:
SELECT doc FROM index_name WHERE (doc->>'key') LIKE '%val%' LIMIT 100;
- Full-text search:
SELECT doc FROM index_name WHERE to_tsvector('english', (doc->>'key')) @@ to_tsquery('english', 'val') LIMIT 100;
- Trigram similarity:
SELECT doc FROM index_name WHERE (doc->>'key') % 'val' LIMIT 100;
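When the search term comes from user input, building to_tsquery strings by hand is fragile. A sketch using plainto_tsquery (standard PostgreSQL, with which Tacnode's SQL layer is compatible) and a bind parameter; table and field names follow the examples above:
def search_documents(conn, term, limit=100):
    # plainto_tsquery treats the user's term as plain text, so input like
    # 'val & rogue' cannot inject tsquery operators; the term is passed
    # as a bind parameter, never interpolated into the SQL string.
    sql = """
        SELECT doc
        FROM index_name
        WHERE to_tsvector('english', doc->>'key')
              @@ plainto_tsquery('english', %s)
        LIMIT %s
    """
    with conn.cursor() as cursor:
        cursor.execute(sql, (term, limit))
        return [row[0] for row in cursor.fetchall()]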
Aggregation Queries
- NDV (distinct count):
SELECT COUNT(DISTINCT (doc->>'fees')) AS ndv FROM index_name;
- Max/Min:
SELECT max((doc->>'fees')::numeric) FROM index_name;
SELECT min((doc->>'fees')::numeric) FROM index_name;
- Sum:
SELECT sum((doc->>'fees')::numeric) FROM index_name;
Pagination
- ES uses the from and size parameters.
- Tacnode uses OFFSET and LIMIT:
SELECT * FROM index_name OFFSET 20 LIMIT 10;
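To make the mapping concrete, a hypothetical Python helper that translates ES-style from/size paging into the equivalent Tacnode query; the ORDER BY keeps pages deterministic:
def fetch_page(conn, table, from_=0, size=10):
    # ES "from" maps to OFFSET, "size" maps to LIMIT. The table name is
    # interpolated, so it must come from trusted code, not user input.
    sql = f"SELECT doc FROM {table} ORDER BY id OFFSET %s LIMIT %s"
    with conn.cursor() as cursor:
        cursor.execute(sql, (from_, size))
        return [row[0] for row in cursor.fetchall()]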
Best Practices
Pre-Migration
- Assess ES cluster size and index count
- Analyze document structure and field types
- Identify large or complex documents
- Plan migration during off-peak hours
- Prepare rollback and testing plans
- Train your team on Tacnode
During Migration
- Migrate non-critical data first
- Gradually migrate core business data
- Run old and new systems in parallel for validation
- Monitor export/import progress and system performance
- Document issues and solutions
Post-Migration
- Create appropriate indexes based on query patterns
- Optimize table storage (row/columnar/hybrid)
- Tune system parameters for workload
- Harden security: IP whitelists, granular permissions, audit logs
Troubleshooting & FAQ
Data Type Mapping Issues
- ES’s dynamic types may not match Tacnode’s strong types
- Use JSONB for dynamic data
- Validate and convert types at the application layer
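For example, a minimal application-layer coercion sketch; the FIELD_TYPES mapping is hypothetical and should be derived from your own index mappings:
# Hypothetical mapping from ES field names to the types Tacnode expects.
FIELD_TYPES = {'fees': float, 'count': int, 'created_at': str}

def coerce(document):
    # Coerce dynamic ES values to stable types before insertion; null out
    # values that cannot be converted rather than failing the whole batch.
    out = {}
    for key, value in document.items():
        caster = FIELD_TYPES.get(key)
        try:
            out[key] = caster(value) if caster and value is not None else value
        except (TypeError, ValueError):
            out[key] = None
    return out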
Full-Text Search Performance
- Design indexes carefully
- Use proper text search configuration
- Consider materialized views for complex queries
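A sketch of the materialized-view idea, precomputing the tsvector for the gharchive table from step 4. This uses standard PostgreSQL DDL and a hypothetical view name; verify materialized-view support against Tacnode's documentation:
def create_search_view(conn):
    # Precompute the tsvector once so complex full-text queries read from
    # the materialized view instead of re-parsing every document.
    with conn.cursor() as cursor:
        cursor.execute("""
            CREATE MATERIALIZED VIEW IF NOT EXISTS gharchive_search AS
            SELECT id, to_tsvector('english', doc->>'key') AS key_tsv
            FROM gharchive
        """)
    conn.commit()

def refresh_search_view(conn):
    # Re-run after bulk imports; the view does not update automatically.
    with conn.cursor() as cursor:
        cursor.execute("REFRESH MATERIALIZED VIEW gharchive_search")
    conn.commit()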
Aggregation Differences
- ES and SQL aggregation functions differ
- Adapt aggregation logic in your application
- Use window functions for advanced needs
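For instance, an ES terms aggregation with a top_hits sub-aggregation can be approximated with a window function. A sketch over the JSONB layout used above, with hypothetical field names:
def top_fees_per_type(conn, per_group=3):
    # rank() partitions by the "type" field and orders by fees, which
    # loosely mirrors a terms aggregation with a top_hits sub-aggregation.
    sql = """
        SELECT id, doc
        FROM (
            SELECT id, doc,
                   rank() OVER (
                       PARTITION BY doc->>'type'
                       ORDER BY (doc->>'fees')::numeric DESC
                   ) AS rnk
            FROM index_name
        ) ranked
        WHERE rnk <= %s
    """
    with conn.cursor() as cursor:
        cursor.execute(sql, (per_group,))
        return cursor.fetchall()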
Pagination Differences
- ES: from and size parameters
- Tacnode: OFFSET and LIMIT