Features, embeddings, and AI signals — one system
A Context Lake is semantic — features, aggregates, vector search, and LLM-derived signals are all first-class context, computed and served within the same system as raw state.
Features, embeddings, and LLM signals each require different infrastructure to compute and query. When they live in separate systems, combining them at decision time means stitching results together across different APIs — with no guarantee that they agree.
No unified retrieval
3 separate APIs. 3 separate consistency models. Cross-system queries impossible at decision time.
Feature Store
Structured / aggregates
Vector Database
Embeddings / similarity
LLM Pipeline
Semantic signals
Three Kinds of Semantic Context
Derived context is not monolithic. Structured features, vector embeddings, and LLM-inferred signals have different computational models and different query semantics — but they all describe the same underlying state and should be consistent with it.
Structured
Features & Aggregates
Deterministic, SQL-derived context: rolling windows, ratios, velocity counts. Precise and reproducible. Filterable with exact predicates.
avg_order_value_7d
fraud_velocity_1h
demand_score
user_ltv
Embedding
Vector Representations
Dense vector encodings of objects or queries. Enable similarity search and nearest-neighbor retrieval — matching by semantic proximity rather than exact predicates.
product_embedding
user_preference_vec
doc_embedding
query_vec
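As a sketch, retrieval over a stored embedding column might look like the following — assuming a pgvector-style `<->` distance operator; the table and column names are illustrative:

```sql
-- Nearest-neighbor retrieval over stored embeddings.
-- '<->' is a pgvector-style distance operator; schema is illustrative.
SELECT product_id, name
FROM products
ORDER BY product_embedding <-> :query_vec
LIMIT 20;
```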
LLM-Derived
Model-Inferred Signals
Probabilistic interpretations that can't be expressed as SQL: classifications, sentiment, intent labels, entity extraction. Computed by a model, stored as context.
intent_label
sentiment_score
topic_cluster
entity_tags
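Once computed by a model, these signals are just columns — filterable with ordinary SQL predicates. A minimal sketch, with illustrative table and column names:

```sql
-- Model-inferred signals stored as ordinary columns, filtered with SQL.
-- Table and column names are illustrative.
SELECT conversation_id
FROM conversations
WHERE intent_label = 'high_purchase'
  AND sentiment_score < -0.5
  AND topic_cluster IN ('billing', 'refund');
```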
Unified Derivation: All Signals, One Boundary
When derivation happens inside the same transactional boundary as raw state, derived signals are always consistent with the state they were computed from. No separate systems. No sync pipelines. No cross-system coordination.
SELECT avg(order_value) FROM orders WHERE user_id = ? AND ts > now() - interval '7d'
Aggregates and features computed directly over transactional state — no separate feature store.
One system. One consistency model. All signal types queryable and filterable together.
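A minimal sketch of what "inside the same transactional boundary" means in practice — the raw write and the derived feature commit or roll back together (schema and feature names are illustrative):

```sql
-- Raw state and derived feature updated in one transaction:
-- no sync pipeline, no window where they disagree.
BEGIN;

INSERT INTO orders (user_id, order_value, ts)
VALUES (:user_id, :amount, now());

UPDATE user_features
SET avg_order_value_7d = (
  SELECT avg(order_value)
  FROM orders
  WHERE user_id = :user_id
    AND ts > now() - interval '7d'
)
WHERE user_id = :user_id;

COMMIT;
```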
Selecting the Right Context at Query Time
Semantic context isn't only about how signals are constructed — it's also about how agents select and filter what they need. The three signal types have different selection models:
Structured — demand_score > 0.7, category = 'electronics', ts > now() - interval '1h'
Embedding — vec <-> query_vec < 0.3, ORDER BY vec <-> query_vec LIMIT 20
LLM-derived — intent_label = 'high_purchase', sentiment = 'negative', topic IN ('billing', 'refund')
When all three live in one system, agents can compose these selection models in a single query — structured filters narrow the candidate set, vector similarity ranks it, LLM-derived labels apply semantic conditions. The result is exact enough to be actionable, and flexible enough to express meaning that no individual signal type can capture alone.
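A minimal sketch of such a composed query — structured filters narrow, vector similarity ranks, an LLM-derived label applies a semantic condition. It assumes a pgvector-style `<->` operator; the tables and columns are illustrative:

```sql
-- One query composing all three selection models.
-- Schema is illustrative; '<->' assumes pgvector-style vector distance.
SELECT p.product_id
FROM products p
JOIN product_signals s USING (product_id)
WHERE s.demand_score > 0.7               -- structured feature filter
  AND p.category = 'electronics'         -- exact predicate
  AND s.intent_label = 'high_purchase'   -- LLM-derived label
ORDER BY p.product_embedding <-> :query_vec  -- vector similarity ranking
LIMIT 20;
```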
What Semantic Context Actually Requires
Unifying derived context isn't a query routing problem. It requires derivation and serving to share a transactional boundary.
Consistent Derivation
On-Demand or Pre-Computed
Unified Query Surface
Semantic Atomicity
See how Tacnode unifies all derived context
Structured queries, vector search, and LLM-derived signals — all inside one transactional boundary, computed from the same state.