What Is a Feature Store?

Infrastructure that computes, stores, and serves the input signals ML models use to make predictions — bridging raw data and real-time decisions.

What the model sees (batch snapshot, refreshed 4h 12m ago)

  txn_count_30m: 31
  avg_order_val: $84.20
  risk_score: 0.23

Batch-refreshed. Frozen since the last pipeline run.

What's actually happening (live)

  txn_count_30m: 47
  avg_order_val: $84.20
  risk_score: 0.79

Continuously computed. Fresh at decision time.

The Problem: Feature Drift

Without shared infrastructure, teams compute features ad-hoc — one version in training, another in production, a third in a dashboard. The values diverge. Models degrade silently.

Actual value right now: user_risk_score = 47

  Training notebook: 31 (drift: +16), computed last Tuesday
  Production API: 38 (drift: +9), cache from 2h ago
  Batch dashboard: 44 (drift: +3), last night's run

Three systems. One feature. Three different answers. This is drift.
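The failure mode is easy to reproduce: even one shared definition, read at three different times, yields three different answers. A runnable sketch with made-up event data (the timestamps and counts are illustrative, not the figures above):

```python
from datetime import datetime, timedelta

# Illustrative event log: when (in minutes before "now") each transaction happened.
minutes_ago = list(range(0, 30, 2)) + list(range(121, 148, 3)) + [725, 735, 745]
now = datetime(2024, 1, 10, 12, 0)
events = [now - timedelta(minutes=m) for m in minutes_ago]

def txn_count_30m(events, as_of):
    """One shared definition: transactions in the 30 minutes before `as_of`."""
    window_start = as_of - timedelta(minutes=30)
    return sum(1 for t in events if window_start <= t <= as_of)

# Same definition, three different "as of" times, three different answers.
live = txn_count_30m(events, now)                         # fresh at decision time
cache = txn_count_30m(events, now - timedelta(hours=2))   # production cache
batch = txn_count_30m(events, now - timedelta(hours=12))  # last night's run
print(live, cache, batch)
```

Here the divergence comes purely from staleness; in practice it compounds with diverging logic across independently written pipelines.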

What a Feature Store Does

A feature store centralizes the full lifecycle: define features as named transformations, compute them consistently via batch or streaming pipelines, store historical values for training and current values for serving, and serve them to models at inference time with low-latency APIs.

One definition. One computation path. Every consumer — training jobs, production models, analytics dashboards — reads from the same source. Drift eliminated by design.
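To make the "one definition, every consumer" idea concrete, here is a toy in-memory sketch. `FeatureStore`, `materialize`, `get_online`, and `get_historical` are illustrative names for this sketch, not any specific product's API:

```python
from datetime import datetime

class FeatureStore:
    """Toy single-definition store: one transformation, shared by every reader."""
    def __init__(self):
        self._definitions = {}  # feature name -> transformation function
        self._history = {}      # (name, entity) -> [(timestamp, value)]

    def define(self, name, fn):
        self._definitions[name] = fn

    def materialize(self, name, entity, raw_rows, ts):
        """Compute the feature from raw data and record it with a timestamp."""
        value = self._definitions[name](raw_rows)
        self._history.setdefault((name, entity), []).append((ts, value))

    def get_online(self, name, entity):
        """Serving path: the latest value, for low-latency inference."""
        return self._history[(name, entity)][-1][1]

    def get_historical(self, name, entity, as_of):
        """Training path: the point-in-time value, avoiding future leakage."""
        rows = [(ts, v) for ts, v in self._history[(name, entity)] if ts <= as_of]
        return rows[-1][1] if rows else None

store = FeatureStore()
store.define("avg_order_val", lambda orders: sum(orders) / len(orders))

t1, t2 = datetime(2024, 1, 1), datetime(2024, 1, 8)
store.materialize("avg_order_val", "user_42", [80.0, 88.4], ts=t1)
store.materialize("avg_order_val", "user_42", [80.0, 88.4, 90.0], ts=t2)

serving_value = store.get_online("avg_order_val", "user_42")
training_value = store.get_historical("avg_order_val", "user_42", as_of=t1)
```

Both paths read values produced by the same `define` call, which is the property that eliminates drift by design.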

But most feature stores stop there. They serve pre-computed values from a cache — values that were fresh when the pipeline ran, but stale by the time the model reads them.

The Freshness Spectrum

For fraud, personalization, and dynamic pricing, the gap between when a feature was computed and when the model reads it is where value leaks.

From stale to fresh at decision time:

  Batch Pipeline: hours
  Streaming Pipeline: seconds
  Unified (Continuous): instant
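The spectrum is just arithmetic on timestamps: staleness is the gap between when a value was computed and when a model reads it. A minimal sketch, with illustrative cadences:

```python
from datetime import datetime, timedelta

def staleness(computed_at, read_at):
    """How old a feature value is at the moment a model reads it."""
    return read_at - computed_at

read_at = datetime(2024, 1, 10, 12, 0)

# Illustrative points on the spectrum: when each pipeline last computed the value.
batch_gap = staleness(read_at - timedelta(hours=4), read_at)        # hours stale
streaming_gap = staleness(read_at - timedelta(seconds=5), read_at)  # seconds stale
continuous_gap = staleness(read_at, read_at)                        # fresh at decision time
```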

Where It Matters Most

Fraud Detection (< 50ms decision window)

Evaluate transaction risk against behavioral features from the last few minutes, not last night's batch.

Personalization (1000s of features per request)

Recommendations from live session signals, purchase history, and embeddings, updated as the user browses.

Dynamic Pricing (live market signals)

Prices that track demand, competitor data, and inventory, continuously rather than hourly.

Risk Scoring (10k+ QPS at millisecond latency)

Credit and insurance features that reflect the applicant's most recent activity.

What to Evaluate

Freshness

  Continuous: features update as data changes
  Batch-refreshed: hours old when models read them

Consistency

  Transactional: guarantees across all readers
  Eventual: consistency lags between stores

Training-Serving Parity

  Unified: same definitions, same system, no drift
  Split: separate pipelines with diverging logic

Semantic Operations

  Native: vector operations in feature definitions
  Bolt-on: scalar aggregations only, vectors added separately
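Training-serving parity is checkable: run the same raw rows through both paths and compare. A hedged sketch in which the two functions stand in for independently maintained pipelines (the "drop refunds" rule is a made-up example of silently diverging logic):

```python
# Two independently maintained implementations of "avg_order_val":
def training_avg_order_val(orders):
    # Notebook version: plain mean over all orders.
    return sum(orders) / len(orders)

def serving_avg_order_val(orders):
    # Production version: someone added a "drop refunds" rule only here.
    kept = [o for o in orders if o > 0]
    return sum(kept) / len(kept)

def parity_check(raw_rows, tol=1e-9):
    """True when both paths agree on identical raw input."""
    return abs(training_avg_order_val(raw_rows) - serving_avg_order_val(raw_rows)) <= tol

clean_ok = parity_check([80.0, 88.4])          # no refunds: the paths agree
refund_ok = parity_check([80.0, 88.4, -20.0])  # the hidden divergence surfaces
```

A shared-definition store makes this check unnecessary by construction; with separate pipelines, it is the minimum safeguard.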

Traditional vs. Unified

When workloads demand continuous freshness or semantic reasoning, the architecture needs to collapse compute and serving into one boundary.

Traditional

  Architecture: separate offline + online stores
  Freshness: minutes to hours
  Consistency: eventual
  Semantic ops: bolt-on vector DB

Unified Context Layer

  Architecture: single system for compute and serve
  Freshness: continuous
  Consistency: transactional
  Semantic ops: native embeddings

See how Tacnode approaches feature serving

Features computed directly from live data, inside a single transactional boundary. No batch pipelines. No sync. No stale snapshots.