OLTP + analytical replica creates replica-lag during decisions

A common production pattern: writes go to the primary, analytical reads go to a logical or streaming replica. For dashboards and reporting, the design is correct — the lag is measured in seconds and the workload tolerates it. For decision systems, the same pattern fails. A fraud-detection service reads the customer profile from the primary (current to commit) and reads recent-merchant velocity from a query against the replica (seconds behind under load). During a card-testing burst, the replica lag spikes; the fraud rule reads a stale velocity counter; the decision approves. The schema design isn’t wrong. The decision to compose two reads from two replicas with different freshness profiles is what creates the gap. Schema-level tuning can’t fix it without folding the analytical read back into the primary, which the workload was designed to offload. Real-time fraud detection architecture walks the full cross-channel version of this failure.

OLTP + separate feature store creates stream-serving skew

A modern ML-driven fraud architecture: source events land in Postgres. A Flink job consumes the CDC stream, maintains a windowed velocity counter, and promotes the value into an online feature store (Redis or DynamoDB) every checkpoint interval. The model server reads the feature at inference time. Schema design on the source side is textbook. Schema design on the feature store side optimizes for low-latency key-value lookups. Each is correct in isolation. The composed read at decision time isn’t. The model reads a feature that reflects events processed by Flink up to its last checkpoint — not events the source database has committed since. During a burst, the gap between the source’s committed state and the feature store’s served value widens. The model is correct against stale inputs; the decision is wrong. Schema-level tuning can shorten the gap (smaller checkpoint intervals, more aggressive promotion cadences) but can’t close it. The architecture itself — derived state computed in a separate pipeline, promoted on a cadence — creates the skew. See stateful stream processing for decisions for the full mechanism.

OLTP + separate vector DB + FTS cluster creates cross-store retrieval gap

A more elaborate composition: source data in Postgres, behavioral embeddings in pgvector or a separate vector database, support-history search in Elasticsearch. A customer-service AI agent reads all three to answer a single user query. Each store has its own consistency guarantee. The vector index rebuilds nightly; the Elasticsearch index refreshes every five minutes; the source database is current. The agent composes a response from three values reflecting three different points in time. The composed answer describes a customer state that never existed in the source of truth. The schema design in each store is correct for the store’s workload. The decision to spread retrieval across stores is what creates the gap. Generic Postgres schema design has no opinion on this composition; the failure happens above the database boundary. Polyglot persistence and the retrieval gap covers this failure mode in depth.

Schema-per-tenant security doesn’t extend through derivation

Multi-tenant schemas with strict tenant isolation enforced via RLS or schema-per-tenant are the standard pattern. They protect the source rows. They don’t protect the derived state. When a feature engineering pipeline computes per-customer aggregates in Flink and writes them to a shared feature store, the tenant boundary at the source isn’t replicated automatically. If the pipeline’s compute logic forgets to partition by tenant — or if the feature store doesn’t expose RLS — the protection ends at the source boundary. This isn’t a schema design bug per se. It’s a category error: the security boundary was designed for source rows but applied to a workload that requires the boundary to extend through every layer of derivation. Schema-level RLS can’t reach across a pipeline that crosses database boundaries.

What is the difference between Postgres schema design for OLTP and OLAP?

OLTP schemas normalize to eliminate redundancy and optimize for point lookups and small transactional reads. OLAP schemas denormalize into star or snowflake patterns and optimize for analytical scans, often using columnar storage and aggregations over time. For a single workload, picking the right one is a clean decision. Real-time decision systems compose both at once — they need normalized source rows for transactional integrity AND denormalized aggregates for fast feature lookup. The standard answer is to maintain both schemas and synchronize them via ETL or CDC; the result is the cross-store consistency gap that breaks decisions under concurrent load.

How do I normalize for analytical workloads?

You typically don’t. Analytical workloads benefit from denormalization — star schemas, wide tables, pre-aggregated rollups — because join cost dominates query time on large datasets. For decision systems where analytical aggregations must stay current with source events, the better answer is to keep source rows normalized AND maintain denormalized derived state as incrementally-maintained views, rather than running ETL into a separate analytical schema.

What’s the best Postgres partitioning strategy?

There isn’t one. RANGE partitioning works for time-series and audit logs where queries usually scope to a recent window. LIST partitioning works for region or tenant where the partition key has a small enumerable set. HASH partitioning works for sharding-style write distribution. The choice depends on query patterns — partitioning that doesn’t match the WHERE clauses your queries actually use gives no pruning benefit and adds operational overhead.

How do I design multi-tenant schemas in Postgres?

Three common patterns: schema-per-tenant (one schema per tenant, strict isolation at the namespace level), table-per-tenant with row-level security (shared tables, RLS policies filter rows by tenant ID), or database-per-tenant (full isolation, highest operational cost). The choice depends on tenant count, isolation requirements, and operations budget. Schema-per-tenant is the strongest isolation per dollar of operational complexity; RLS is the right answer when storage efficiency matters more than per-tenant operational simplicity. For decision systems, neither pattern extends automatically to derived state computed outside the database.

Should derived state live in the same database as source state?

For decision systems where derived state has to reflect events committed to the source within the decision’s validity window, yes — but “same database” here means a serving substrate, not a migration off Postgres. Separating source and derived state into different stores creates the freshness and consistency gaps that break composed reads. The conventional pattern (Postgres source + Redis serving cache + warehouse for analytics) is correct for batch and dashboard workloads; for decisions inside tight validity windows, a context lake that ingests source events via CDC and maintains derived state continuously inside its own transactional boundary closes the gap — the source database stays where it is, and the context lake serves the retrieval patterns the decision composes.

How is schema design for AI agents different from schema design for OLTP applications?

AI agents read composed context — customer profile + recent behavior + similar-case retrieval + relevant document chunks — and act inside a session. The schema has to serve point lookups (profile), windowed aggregations (recent behavior), vector similarity (similar cases), and full-text search (documents) inside the same agent loop. Conventional OLTP design optimizes for the first; AI agent context infrastructure requires the schema to natively support all four. This is why agent stacks frequently pair Postgres with a vector DB and a search cluster — and why those bolt-ons create the cross-store retrieval gap that erodes decision quality.

Can a Postgres schema design support full-text search and vector similarity natively?

Yes — Postgres ships with full-text search (tsvector + GIN indexes), JSONB containment queries, and supports pgvector for vector similarity via extension. A single schema can include all of these on the same table. The limitation is scale and consistency under high concurrent load: separate vector databases and search clusters often appear because pgvector’s IVFFlat or HNSW index doesn’t scale to the same volumes as purpose-built vector engines under mixed workload. For substantial multi-modal retrieval volumes, the schema-level question becomes whether the database substrate can serve all index types coherently at decision-time scale.

How do schema design principles differ between Postgres and Oracle, SQL Server, or Snowflake?

The underlying relational principles (normalization, indexing, partitioning, access control) are the same. Implementation details differ: Oracle has materialized view query rewrite, Snowflake separates compute from storage, SQL Server has built-in temporal tables and indexed views with stronger optimizer integration. Postgres’s schema design philosophy is closer to “compose the primitives” — schema, index, partition, RLS, materialized view — into the design that fits the workload. This gives flexibility at the cost of some out-of-the-box features other databases include. For decision systems, the gap is similar across all of them: none was originally designed to serve four retrieval patterns from one coherent snapshot, so the architectural answer involves a serving layer above the database, not a different database.

Can a Postgres schema evolve without downtime?

Most schema changes can, with discipline. Additive changes (new columns with constant defaults, new tables, new indexes via CREATE INDEX CONCURRENTLY) are cheap. Changes to existing column types or constraints require either online tooling (pg_repack, pgloader, gh-ost-style patterns adapted for Postgres) or the dual-write + backfill + flip pattern. The risk increases when the workload has tight validity windows — schema migrations under live decision traffic require either taking the workload offline or running migrations against a substrate that supports schema evolution natively at scale.

How does row-level security extend to derived state?

It doesn’t, in conventional Postgres. RLS applies to rows in tables that have an RLS policy enabled — typically source tables. Derived state computed in external pipelines (Flink jobs, ETL into a warehouse, embeddings into a vector store) doesn’t inherit the policy. Each downstream system has to re-implement tenant isolation, and most don’t expose anything as fine-grained as RLS. For multi-tenant decision systems where derived state matters, the architectural answer is to maintain derived state inside the same transactional and security boundary as source state — which a context lake does natively.

Back to Blog

Postgres

Postgres Schema Design Principles for OLTP, Analytics, and Real-Time Decisions

A practical guide to Postgres schema design principles — database/schema/table hierarchy, normalization, indexing, partitioning, access control, schema evolution — and where the conventional rules need revision for systems that commit decisions inside small validity windows.

Alex Kimball

Product Marketing

May 29, 2026

14 min read

TL;DR: Postgres schema design principles — normalize to 3NF, index by query patterns, partition by RANGE/LIST/HASH, RLS for tenant isolation, additive schema evolution — produce a correct OLTP database. They’re insufficient for real-time decision systems where one read has to compose point lookups, aggregations, full-text search, and vector similarity inside a tight validity window. The bolt-on architecture (OLTP source + analytical replica + feature store + vector DB + search cluster) creates freshness and consistency gaps that schema-level tuning can’t close. Decision-time workloads need hybrid storage on the same row, derived state maintained inline, multi-modal indexes coexisting, and snapshot reads across all retrieval patterns — what a context lake provides on top of Postgres-compatible primitives. :::

Postgres schema design principles are well-codified for traditional database workloads — normalize for transactional integrity, index by query pattern, partition by time or tenant for scale, layer access control by schema. Following them produces a database that behaves correctly under transactional load and remains operable as it grows.

The conventional advice optimizes for one workload at a time. OLTP schemas serve point lookups quickly; OLAP schemas serve analytical scans; feature stores serve ML inference; vector databases serve similarity search. Real-time decision systems break that assumption: a single fraud check or credit decision composes a point lookup, an analytical aggregation, sometimes a full-text search and a vector similarity — all returning a coherent view of the same moment, inside a validity window measured in hundreds of milliseconds. Schema design that assumes one retrieval pattern at a time guarantees freshness and consistency gaps when the others are bolted on as separate stores.

This guide covers the canonical Postgres schema design principles — database/schema/table hierarchy, normalization, indexing, partitioning, access control, schema evolution — and the workloads where each applies correctly. It also covers where these principles need to be revised for systems that commit automated decisions inside tight validity windows, and what the structural alternatives look like when one schema and one read path have to serve four retrieval patterns at once.

What Are Postgres Schema Design Principles?

Postgres schema design principles are the rules and conventions that govern how data is organized inside a PostgreSQL database — the hierarchy of databases, schemas, and tables; the normalization of rows to eliminate redundancy; the indexing strategy that matches query patterns; the partitioning approach for large tables; the role-based access control that limits exposure; and the schema evolution discipline that keeps the system operable as requirements change. Together they determine whether the database serves its workload reliably under load.

A schema in Postgres is a named container of database objects — tables, views, indexes, functions — inside a single database. Schemas exist for organization (grouping related objects), for security (granting different roles different access patterns), and for evolution (separating stable from in-flux objects). Design principles are how teams decide which objects belong in which schema, how rows relate across them, and how the resulting database behaves over time.

There’s no single “correct” Postgres schema. The right design depends on the workload — what reads the database has to serve, what writes it has to absorb, what consistency guarantees downstream consumers expect, and what consistency the database can offer. Principles encoded in textbooks (Codd’s normal forms, the relational model, BCNF) optimize for one set of workload assumptions. Modern decision systems sometimes share those assumptions and sometimes break them.

Conventional Principles That Work

The widely-taught Postgres schema design principles cover most production workloads and apply correctly within their intended scope. Before discussing where they need revision, it’s worth naming them clearly.

Normalize for transactional integrity. Eliminate redundant data by decomposing tables into entities and relationships, typically to third normal form (3NF). Use foreign keys to enforce referential integrity. The result is a schema that prevents update anomalies, keeps storage compact, and supports the transactional invariants Postgres can enforce at commit.

Index by actual query patterns. Create indexes for the columns that filter, join, and order rows in the queries the application actually runs. Match index type to query shape — B-tree for equality and range, GIN for full-text and JSONB, BRIN for time-series scan ranges, partial indexes for skewed distributions. Indexes that don’t match query patterns waste storage and slow writes.

Partition by time, tenant, or geography for scale. When a table grows past the point where queries against the whole table degrade performance, partition it. RANGE for time-series and audit logs, LIST for tenant or region, HASH for even distribution across shards. Partition pruning lets the planner scan only the relevant partitions, restoring access times to acceptable levels.

Schema-per-tenant for isolation. In multi-tenant systems where tenants need strict isolation but share a single database, use a schema per tenant. Each tenant’s tables live in their own namespace; role-based access control restricts each tenant role to its own schema. Cross-tenant queries become impossible at the engine level.

Additive schema evolution. Schema changes should be backwards-compatible by default. Add new columns and tables; avoid renaming or dropping in-use ones. Use views to abstract evolving table shapes from consuming code. Tools like `pg_repack` and the `CREATE INDEX CONCURRENTLY` pattern let you change physical layout without taking the database offline.

These principles are correct, they compose well, and they produce databases that behave reliably under transactional load. The next sections cover them in detail; the second half of the post covers where they don’t suffice.

Database, Schema, and Table Hierarchy

Postgres organizes objects in a three-tier hierarchy: a cluster contains one or more databases; each database contains one or more schemas; each schema contains tables, views, indexes, sequences, functions, and other objects. Connections are scoped to one database at a time — to query a different database, applications open a new connection. Cross-database queries require foreign data wrappers; they don’t happen natively.

Inside a database, schemas are inexpensive namespaces. A schema-per-business-domain or schema-per-tenant approach is common and adds negligible overhead. Cross-schema references inside the same database work natively — a table in schema `inventory` can reference a table in schema `users` via foreign key, subject to the calling role’s permissions.

The default schema is `public`. Most teams treat `public` as a staging area or strip it on production databases by revoking permissions, on the principle that nothing should accidentally land there. Explicit schemas force the design decision of where each object belongs.

Schema vs. database. A common point of confusion: a Postgres database is a top-level isolation boundary — separate databases cannot query each other natively, and each connection is scoped to exactly one. A schema is a namespace inside a database — a way to group related objects (tables, views, functions) and apply access control at the group level. Cross-schema references inside a database are cheap; cross-database references require foreign data wrappers. If you need strong isolation, use separate databases. If you need namespaced organization, use separate schemas.

sql

-- Three databases for clean separation
CREATE DATABASE ecommerce_transactions;
CREATE DATABASE analytics_warehouse;
CREATE DATABASE user_management;

-- Inside ecommerce_transactions, schemas by business domain
\c ecommerce_transactions;
CREATE SCHEMA inventory;
CREATE SCHEMA orders;
CREATE SCHEMA customers;
CREATE SCHEMA reporting;

Normalization for Transactional Integrity

The relational normalization rules — eliminating redundancy, enforcing primary key dependencies, isolating transitive dependencies — were developed to prevent integrity problems that arise when the same fact is stored in multiple places. Update anomalies, insertion anomalies, and deletion anomalies all derive from denormalized storage.

For OLTP workloads, normalization to 3NF (or BCNF in stricter applications) is the right default. Each fact is stored in exactly one place. Updates touch one row. Foreign keys enforce referential integrity at the database level, so the application doesn’t have to.

sql

-- Denormalized (wrong for OLTP)
CREATE TABLE order_items (
    order_id BIGINT,
    customer_id BIGINT,
    customer_name TEXT,      -- duplicated across every order
    customer_email TEXT,     -- duplicated across every order
    product_id BIGINT,
    product_name TEXT,       -- duplicated across every line item
    product_price NUMERIC,   -- price-at-time-of-order should be captured separately
    quantity INTEGER
);

-- Normalized (3NF)
CREATE TABLE customers (
    customer_id BIGINT PRIMARY KEY,
    customer_name TEXT NOT NULL,
    customer_email TEXT UNIQUE NOT NULL
);

CREATE TABLE products (
    product_id BIGINT PRIMARY KEY,
    product_name TEXT NOT NULL,
    current_price NUMERIC NOT NULL
);

CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customers,
    order_timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE order_items (
    order_id BIGINT NOT NULL REFERENCES orders,
    product_id BIGINT NOT NULL REFERENCES products,
    quantity INTEGER NOT NULL,
    unit_price NUMERIC NOT NULL,  -- price at time of order, captured explicitly
    PRIMARY KEY (order_id, product_id)
);

Denormalization is a deliberate trade — accepting redundancy to avoid a join. It’s appropriate in analytical schemas where reads dominate writes and join cost dominates query time. In transactional schemas, normalize first and denormalize only with evidence that join cost is the bottleneck.

Indexing by Actual Query Patterns

The single biggest determinant of query performance is whether the right indexes exist for the queries the application actually runs. Postgres supports multiple index types, each tuned for different access patterns.

B-tree — the default. Equality, range, sort. The right choice for primary keys, foreign keys, and any column queried with `=` or comparison operators.
GIN — generalized inverted index. The right choice for full-text search (`tsvector`), JSONB containment queries, and array membership.
BRIN — block range index. Compact, fast for time-series and other naturally-clustered data scanned in large ranges.
GIST — generalized search tree. Geometric, custom data types, and ranges.
Hash — equality only, smaller than B-tree for that case but with fewer features. Rarely the right default since Postgres 10’s improved hash index made it crash-safe but didn’t extend its query coverage.
Partial indexes — an index with a `WHERE` clause that filters which rows it covers. Excellent for skewed distributions where queries always touch a hot subset.

sql

-- Composite index for a known query pattern
CREATE INDEX idx_orders_customer_recent
ON orders (customer_id, order_timestamp DESC);

-- Partial index for high-value orders only
CREATE INDEX idx_orders_high_value
ON orders (customer_id, order_timestamp DESC)
WHERE total_amount > 1000;

-- GIN index for full-text search on product descriptions
CREATE INDEX idx_products_search
ON products USING GIN (to_tsvector('english', product_name || ' ' || product_description));

Match index design to query patterns, not to columns. A column queried only in combination with another column belongs in a composite index, not a single-column index. Indexes that match no query waste write throughput.

Partitioning Strategies for Large Tables

Once a table grows past a few tens of millions of rows, queries against the whole table degrade unless they hit a specific small index range. Partitioning splits the table physically into smaller partitions, each with its own statistics and indexes. The planner uses partition pruning to scan only the relevant partitions.

Postgres supports three partitioning strategies:

RANGE partitioning — partition by a column whose values fall into ordered ranges. Most common use is time: one partition per month or per day for an events or audit table. Old partitions become read-mostly; queries on recent data hit small, hot partitions.

sql

CREATE TABLE transactions (
    transaction_id BIGSERIAL,
    customer_id BIGINT NOT NULL,
    amount NUMERIC NOT NULL,
    occurred_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (occurred_at);

CREATE TABLE transactions_2026_05 PARTITION OF transactions
    FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');

CREATE TABLE transactions_2026_06 PARTITION OF transactions
    FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');

LIST partitioning — partition by a column whose values fall into a finite enumerable set. Common use is region or tenant.

sql

CREATE TABLE customers (
    customer_id BIGSERIAL,
    region TEXT NOT NULL,
    -- ...
) PARTITION BY LIST (region);

CREATE TABLE customers_us PARTITION OF customers FOR VALUES IN ('US');
CREATE TABLE customers_eu PARTITION OF customers FOR VALUES IN ('EU');
CREATE TABLE customers_apac PARTITION OF customers FOR VALUES IN ('APAC');

HASH partitioning — partition by a hash of the partition key. Even distribution; supports neither pruning by range nor by list, only by exact partition-key equality. Used for sharding-style distribution of writes across partitions.

Partition pruning at query time requires the WHERE clause to constrain the partition key. A query that doesn’t constrain `occurred_at` will scan all monthly partitions of `transactions`. Partition design has to match query patterns, like indexes.

Role-Based Access Control and Row-Level Security

Postgres has a layered access control model: object-level privileges (`GRANT` and `REVOKE` on tables, schemas, databases), role hierarchy (roles inherit from group roles), and row-level security (RLS) policies that further restrict which rows a role can see or modify.

For multi-tenant systems, the principle of least privilege says each tenant role should be granted access only to its tenant’s data. The simplest implementation is schema-per-tenant: a `tenant_alpha` schema contains tenant Alpha’s tables, and the `tenant_alpha_app` role is granted `USAGE` on `tenant_alpha` only. Cross-tenant queries become impossible at the privilege check.

When tables must be shared across tenants for storage efficiency, RLS provides the same guarantee at the row level. A policy on the shared table restricts each role to its tenant’s rows:

sql

ALTER TABLE customer_records ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON customer_records
    USING (tenant_id = current_setting('app.current_tenant_id')::BIGINT);

Every query against `customer_records` from a role subject to that policy gets the implicit filter `WHERE tenant_id = ...`. The application sets `app.current_tenant_id` per session; the database enforces isolation at every read.

RLS is correct for protecting source rows. Its scope ends at the database boundary — derived state computed in external pipelines, cached in Redis, or replicated to a warehouse is not covered. This is a clean fit for OLTP but becomes the failure mode discussed later when derived state matters.

Schema Evolution Without Downtime

Schema changes are inevitable. Disciplined design minimizes their cost — additive changes are cheap; destructive changes are expensive and risk-prone.

The rules:

Add new columns with defaults; don’t change existing column types in place.
Use views to abstract changing table shapes from the queries the application runs. The view definition can change while the underlying table evolves.
Use `CREATE INDEX CONCURRENTLY` to build new indexes without blocking writes.
Use `ALTER TABLE ... ADD COLUMN` with a constant default (Postgres 11+ supports this without rewriting the table).
Drop unused columns and indexes only after confirming they’re truly unused — `pg_stat_user_indexes` and `pg_stat_user_tables` make this auditable.

For more invasive changes — splitting tables, renaming columns under traffic, changing a primary key — the pattern is dual-write + read-from-old + backfill + flip-reads + drop-old. Slow but safe.

These rules are correct for OLTP databases under steady-state operation. They become harder to apply when the workload is decision-time inference under continuous concurrent load — discussed below.

Comparison: OLTP, Analytical, and Decision-System Schemas

A side-by-side comparison clarifies what conventional schema design assumes — and what changes when the workload is real-time decision composition.

OLTP and OLAP are the two workloads conventional Postgres schema design targets cleanly. Decision systems require a third design profile that the standard principles don’t address — and that’s where the conventional rules need revision.

Dimension	OLTP-optimized	OLAP-optimized	Decision-system
Storage layout	Row-oriented	Columnar	Hybrid (row + columnar on same table)
Primary index	B-tree on PK	BRIN or none	All types coexisting per query pattern
Freshness	Up to commit	Refresh-interval lag	Continuous, incremental
Read pattern	Single entity by key	Aggregation over range	Composed: lookup + aggregation + search + similarity
Consistency boundary	Transactional snapshot	Per-load snapshot	Single snapshot across all retrieval patterns
Tenant isolation	Schema-per-tenant or RLS	Often absent at warehouse	Must extend through derivation chain
Schema evolution	Additive + view-mediated	Rebuild-on-load	Must support live evolution under decision load

Where Standard Principles Break for Decision Systems

The standard principles assume the schema serves one workload. Real-time decision systems break that assumption — a single decision composes multiple retrieval patterns over data that must reflect a coherent view of the same moment. Four failure modes recur.

The composed read at decision time: four stores anchored to the same “now,” each serving from a different point in the past. The schema design in each store is correct in isolation. The composition isn’t.

A decision reads from four stores at the same moment; each store reflects a different point in time — Postgres primary current to commit, replica 3 seconds behind, feature store 30 seconds behind, vector index 24 hours behind

Alternative Architectures

The failures above aren’t problems with Postgres schema design itself — they’re problems with composing multiple data systems to serve one decision. Several architectural responses exist.

Read replicas with lag accounting. Use Postgres streaming or logical replication, but instrument replica lag and apply it to decision SLAs. Decisions that require sub-second freshness can’t tolerate replica lag; force those reads to the primary. Doesn’t address the multi-store retrieval gap.

CDC + stream processor + serving cache. The canonical real-time ML pattern. Postgres source → Debezium CDC → Flink → Redis serving. Schema design on the source remains conventional. The serving-side schema (Redis key shape, TTL, refresh cadence) becomes its own design problem and a new source of stream-serving skew.

Lambda or Kappa architectures. Lambda runs a batch layer for accuracy and a speed layer for freshness, reconciling at read time. Kappa runs only the speed layer, treating batch as a special case of streaming. Both attempt to make composed reads coherent by giving the speed layer enough freshness to serve decisions standalone.

Feature stores. Tecton, Feast, Hopsworks, Databricks Feature Store. Purpose-built for ML inference: train-serve consistency, point-in-time correctness, low-latency online serving. Solves the training-serving skew problem; doesn’t solve the stream-serving skew problem because the online store is still a cache.

HTAP databases. SingleStore, TiDB, Snowflake Unistore. Co-locate row and columnar storage so transactional and analytical workloads share the same primary database. Reduces the OLTP + analytical replica gap. Doesn’t always integrate native vector or full-text search, so vector-DB-plus-FTS-cluster bolt-ons can still appear above them.

Streaming databases. Materialize, RisingWave, Decodable. Treat incremental view maintenance as the core primitive. Source data flows in via CDC; views are maintained continuously; reads reflect the most recent ingested events. Excellent for the derived-state-freshness gap; less complete on multi-modal retrieval.

Context lakes. A context lake (Tacnode is one example) extends Postgres-compatible schema design with hybrid storage (row + columnar on the same table), incremental materialized views maintained inline as part of the schema definition, multi-modal indexes (B-tree, GIN, BRIN, vector, FTS) coexisting on the same physical row, RLS that extends through derivation, and snapshot reads across all four retrieval patterns under one MVCC transaction. Source data is ingested via CDC from existing Postgres clusters — the context lake doesn’t replace the source-of-truth database; it serves the derived state and multi-pattern retrieval the source can’t deliver inside the decision’s validity window. The conventional Postgres design transfers; the design space expands to cover derived state, multi-pattern composition, and snapshot-coherent reads natively.

The right architecture depends on the workload shape. For single-pattern workloads, conventional Postgres plus a single serving cache works. For composed-retrieval workloads where derived state must be coherent with source state at decision time, a substrate that maintains derivation inside the same transactional boundary is more direct than composing four systems and trying to keep them aligned through pipeline discipline.

Schema Design for Decisions

For OLTP and reporting workloads, the canonical Postgres schema design principles produce correct, operable databases. For real-time decision systems, four structural shifts in how the schema is designed start to matter.

Hybrid storage on the row. Decision systems read both single rows (customer profile) and ranges (recent transactions). A schema where every row exists once — accessible via row-oriented index for point lookups and columnar layout for analytical scans — collapses the OLTP/OLAP split that drives the analytical-replica architecture. Tables in this design profile aren’t OLTP tables or OLAP tables; they’re hybrid tables.

Derived state inline. Schemas that include derived state as first-class objects — incrementally-maintained views, computed aggregates, generated embeddings — keep the data the decision actually reads inside the same transactional boundary as the source. The feature store goes from a separate system the schema doesn’t touch to a maintained-view declaration that lives next to the source tables. (See Postgres materialized views for where the built-in implementation runs out of room and what extends it.)

Multi-modal indexes coexisting. The same physical row can be served as a key-value lookup (B-tree), an aggregation member (BRIN or columnar scan), a full-text search hit (GIN/tsvector), and a vector similarity result (IVFFlat or HNSW) — all maintained as part of the schema definition, not as separate retrieval systems. Index design becomes multi-modal by default.

Snapshot reads across all retrieval patterns. When a decision reads four retrieval patterns under one MVCC transaction, the composition reflects one point in time by construction. The retrieval gap that defines the third failure mode above isn’t tunable; it’s eliminated by the schema-level guarantee that all reads see the same snapshot.

These shifts aren’t a replacement for Postgres schema design — they’re an extension of it for workloads conventional design doesn’t address. Postgres remains the source of truth for transactional rows; the same schema, in a context lake substrate, also serves the derived state and multi-modal retrieval that decisions require.

Frequently Asked Questions

Key Takeaways

Postgres schema design principles — database/schema/table hierarchy, normalization, indexing, partitioning, role-based access control, additive schema evolution — produce correct OLTP databases and adequate analytical schemas, but were not designed for systems that compose multiple retrieval patterns inside small validity windows.
Normalize to 3NF for transactional integrity; denormalize deliberately when join cost dominates query time.
Match index design to actual query patterns. Use B-tree for keys, GIN for full-text and JSONB, BRIN for time-series scans, partial indexes for skewed distributions.
Partition large tables by RANGE (time), LIST (region or tenant), or HASH (sharding) when query patterns can prune to specific partitions.
For multi-tenant systems, schema-per-tenant offers the strongest isolation per unit of operational cost; RLS is the right answer when shared storage matters more.
The conventional principles assume one workload at a time. Real-time decision systems compose multiple retrieval patterns over coherent state — and the standard architecture (OLTP source + analytical replica + feature store + vector DB + FTS cluster) creates freshness and consistency gaps that schema-level tuning can’t close.
For decision systems where derived state must reflect source events within the decision’s validity window, schema design extends to include hybrid storage on the same row, derived state maintained inline as incremental materialized views, multi-modal indexes coexisting per query pattern, and snapshot reads across all retrieval patterns — what a context lake provides on top of Postgres-compatible primitives.

PostgreSQLSchema DesignDatabase DesignMulti-TenantReal-Time DecisionsData Architecture

Written by Alex Kimball

Former Cockroach Labs. Tells stories about infrastructure that actually make sense.

View profile LinkedIn

Continue Reading

AI Infrastructure

Ready to see Tacnode Context Lake in action?

Book a demo and discover how Tacnode can power your AI-native applications.

Book a Demo