Capacity Planning
Effective capacity planning in Tacnode involves understanding both storage and computing resource requirements for your specific use cases.
Storage Resources Planning
Data Compression Benefits
Tacnode uses an LSM (Log-Structured Merge-tree) architecture, which provides:
- Efficient Storage: Data typically compresses to 1/3 of original CSV size
- Optimized Writes: Random writes converted to sequential writes for better performance
- Adaptive Compression: Dictionary encoding provides additional compression for low-cardinality columns
Estimating Storage Requirements
Historical Data
- Start with your current data size in CSV format
- Apply compression factor:
Original Size × 0.33 = Estimated Tacnode Size
- Columnar Benefits: Low-cardinality columns achieve even better compression rates
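As a quick sanity check, the historical-data estimate can be expressed as a few lines of Python. This is a minimal sketch assuming the 0.33 factor above; the function name and units are illustrative, not part of any Tacnode tooling.

```python
def estimate_storage_gb(csv_size_gb: float, compression_factor: float = 0.33) -> float:
    """Estimate compressed Tacnode storage from raw CSV size using the ~1/3 rule above."""
    return csv_size_gb * compression_factor

# 3 TB of raw CSV -> roughly 990 GB after compression
print(estimate_storage_gb(3000))  # ~990
```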
Incremental Data Growth
Calculate daily storage growth using:
Daily Rows × Bytes per Row × 0.33 = Daily Compressed Growth
Example Calculation:
- 10M daily rows × 100 bytes/row × 0.33 = ~330MB daily growth
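The same arithmetic applies to incremental growth; the sketch below simply mirrors the example calculation above (function name and decimal-MB convention are assumptions).

```python
def daily_growth_mb(rows_per_day: int, bytes_per_row: int,
                    compression_factor: float = 0.33) -> float:
    """Estimated daily compressed growth in MB (decimal: 1 MB = 1,000,000 bytes)."""
    return rows_per_day * bytes_per_row * compression_factor / 1_000_000

# 10M rows/day at 100 bytes/row -> ~330 MB/day, matching the example calculation
print(daily_growth_mb(10_000_000, 100))  # ~330
```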
Storage Expansion Considerations
Compaction Process
- LSM architecture performs regular data compaction
- Temporary Expansion: Data may temporarily grow during compaction
- Final Size: Stabilizes after compaction completes
- Plan for 1.5-2x the steady-state data size as temporary storage headroom during compaction periods
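To fold that headroom into provisioning, one simple approach is to multiply the steady-state size by a compaction factor. The sketch below uses the conservative end of the 1.5-2x range; the helper is illustrative only.

```python
def provisioned_storage_gb(steady_state_gb: float, compaction_headroom: float = 2.0) -> float:
    """Storage to provision so the temporary 1.5-2x expansion during compaction still fits."""
    return steady_state_gb * compaction_headroom

# ~990 GB steady state -> provision ~1,980 GB to absorb compaction spikes
print(provisioned_storage_gb(990))  # 1980.0
```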
Computing Resources Planning
Computing capacity in Tacnode is measured in Units, which can be dynamically scaled based on workload requirements.
Sizing Guidelines
Transactional Workloads (High Write Activity)
- Ratio: 1 Unit per ~500GB of compressed data
- Best for: Real-time applications with frequent updates
- Characteristics: Consistent read/write operations, low latency requirements
Analytical Workloads (Read-Heavy, Cold Data)
- Ratio: 1 Unit per 1TB-2TB of compressed data
- Maximum: Do not exceed 2TB per Unit for optimal performance
- Best for: Batch processing, historical analysis, infrequent access patterns
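The ratios above can be condensed into a rough sizing helper. This is a sketch of the guideline, not a Tacnode API; the analytical entry uses the 2TB-per-Unit upper bound, so treat the result as a floor.

```python
import math

# GB of compressed data per Unit, taken from the sizing guidelines above.
GB_PER_UNIT = {
    "transactional": 500,   # ~500 GB per Unit for write-heavy, low-latency workloads
    "analytical": 2000,     # 1-2 TB per Unit; 2 TB is the stated upper bound
}

def recommended_units(compressed_gb: float, workload: str) -> int:
    """Minimum Unit count implied by the ratio; add headroom for peaks and growth."""
    return max(1, math.ceil(compressed_gb / GB_PER_UNIT[workload]))

print(recommended_units(750, "transactional"))  # 2
```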
Performance Considerations
Data Access Patterns
- Hot Data: Recent data (daily/weekly) requires more computing power
- Warm Data: Monthly data can use moderate Unit allocation
- Cold Data: Historical data can operate with fewer Units per TB
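One way to reason about mixed datasets is to weight each tier separately. The per-tier ratios below are assumptions interpolated from the transactional and analytical guidelines (hot ≈ transactional, cold ≈ the analytical upper bound, warm in between), not documented values.

```python
import math

# Assumed GB of compressed data per Unit for each tier -- tune against real monitoring data.
TIER_GB_PER_UNIT = {"hot": 500, "warm": 1000, "cold": 2000}

def units_for_tiers(tier_sizes_gb: dict) -> int:
    """Sum per-tier Unit demand and round up to a whole Unit count."""
    demand = sum(size / TIER_GB_PER_UNIT[tier] for tier, size in tier_sizes_gb.items())
    return max(1, math.ceil(demand))

# 200 GB hot + 800 GB warm + 5 TB cold -> 0.4 + 0.8 + 2.5 = 3.7 -> 4 Units
print(units_for_tiers({"hot": 200, "warm": 800, "cold": 5000}))  # 4
```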
Query Complexity
- Complex aggregations and joins require additional computing capacity
- Simple queries and scans can operate efficiently with baseline Units
Real-World Examples
Example 1: E-commerce Supply Chain
Scenario: Real-time inventory and order processing
- Write QPS: 5K average, 28K peak
- Data Volume: 1.5TB total
- Daily Growth: 43GB raw → 14.3GB compressed
- Recommended: 4 Units (transactional workload pattern)
Example 2: Industrial IoT Analytics
Scenario: Sensor data with scheduled batch processing
- Data Volume: 20TB existing, 300GB daily raw → 100GB daily compressed
- Access Pattern: Mostly cold data, periodic analysis
- Recommended: 16 Units (analytical workload pattern, cold data optimization)
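As a cross-check, both scenarios can be run through the recommended_units sketch from the sizing guidelines above, treating each data volume as already compressed. The ratio alone gives a floor; the published recommendations add headroom for peak QPS and daily growth.

```python
# Reusing recommended_units() from the sizing sketch above.
print(recommended_units(1_500, "transactional"))  # 3 by ratio; rounded up to 4 Units for 28K peak QPS
print(recommended_units(20_000, "analytical"))    # 10 at 2 TB/Unit; 16 Units lands between the
                                                  # 1 TB and 2 TB bounds and covers ~100 GB/day growth
```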
Optimization Best Practices
Start Conservative
- Begin with lower Unit allocation
- Monitor performance metrics
- Scale up based on actual usage patterns
Monitor Key Metrics
- Query response times
- CPU and memory utilization
- Storage I/O patterns
- Concurrent query capacity
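As a loose illustration of how these metrics might drive a scaling decision, the heuristic below flags when to add Units. The function, SLO, and thresholds are hypothetical placeholders, not Tacnode defaults or APIs.

```python
def should_scale_up(p95_latency_ms: float, cpu_util: float, mem_util: float,
                    latency_slo_ms: float = 200.0, util_ceiling: float = 0.75) -> bool:
    """Flag a Unit increase when latency misses the SLO or utilization runs hot.

    Thresholds are illustrative; derive real ones from your own monitoring history.
    """
    return (p95_latency_ms > latency_slo_ms
            or cpu_util > util_ceiling
            or mem_util > util_ceiling)

# 320 ms p95 latency with 60% CPU and 55% memory -> scale up (latency SLO missed)
print(should_scale_up(320, 0.60, 0.55))  # True
```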
Cost Optimization
- Use pause/resume functionality for non-production environments
- Right-size Units based on actual workload patterns
- Consider time-based scaling for predictable workloads