Xiaowei Jiang

CEO & Chief Architect at Tacnode

Xiaowei Jiang is CEO and Chief Architect at Tacnode, where he designed the Context Lake architecture from first principles. He previously built distributed query engines at Meta and Microsoft, working at petabyte scale across some of the largest data systems in production. His formal analysis of decision coherence — the Composition Impossibility Theorem — is published on arXiv (2601.17019) and provides the theoretical foundation for the Context Lake as a system category. He writes about database architecture, AI agent infrastructure, and the structural limitations of composed data stacks.

Database Architecture | Distributed Systems | AI Infrastructure | Decision Coherence

Posts by Xiaowei (19)

Data Engineering

OLTP vs OLAP: The False Choice for the Agentic Era

Every architecture guide frames OLTP vs OLAP as a choice: optimize for transactions or optimize for analytics. But automated decision systems — fraud checks, credit approvals, agent actions — need both transactional consistency and analytical power at the same moment. The Composition Impossibility Theorem proves you can't stitch separate OLTP and OLAP systems together to get there. Here's what comes after the tradeoff.

Xiaowei Jiang|Mar 17, 2026

Data Engineering

Apache Kafka vs Apache Flink: The Real Comparison Is Flink vs Kafka Streams

Most people comparing Kafka and Flink are actually asking, "Which stream processing layer do I need?" The real architectural choice is Apache Flink vs the Kafka Streams API, and understanding the difference changes how you build.

Xiaowei Jiang|Mar 2, 2026

AI & Machine Learning

What Retrieval Really Means for AI Agents

AI retrieval is not one operation. Production decisions require exact and semantic retrieval patterns used together: point lookups, range scans, filters, joins, aggregations, and similarity search.

Xiaowei Jiang|Feb 18, 2026

Architecture

What Is Derived Context?

Why data freshness matters for AI decisions: derived context is state computed from events that must be current at decision time. When feature freshness degrades, decisions fail, not because of bad models but because of stale context.

Xiaowei Jiang|Feb 13, 2026

Architecture

Context Silos: When the System Knows But the Decision-Maker Doesn't

Why AI agent memory fails even when data exists: context silos prevent agents from accessing knowledge computed elsewhere. The fraud pattern was detected—but the checkout agent couldn't see it. Stale context isn't always old. Sometimes it's just unreachable.

Xiaowei Jiang|Feb 6, 2026

AI & Machine Learning

What Is Context Engineering? The Discipline Behind Effective AI Agents

Context engineering is the discipline of designing how AI agents receive, manage, and act on information. It goes far beyond prompt engineering — covering context windows, tool calls, memory architecture, and the retrieval systems that determine whether an agent makes good decisions or bad ones.

Xiaowei Jiang|Feb 3, 2026

Data Engineering

ClickHouse JOINs Are Slow: Here's Why (And What To Do About It)

If your ClickHouse JOINs are killing query performance, you're not alone. Here's why columnar databases struggle with JOINs, what join algorithms are available, how to read the query plan, and when it's time to consider alternatives.

Xiaowei Jiang|Feb 5, 2026

AI Infrastructure

AI Agent Memory Architecture: The Three Layers Production Systems Need

AI agents need more than a vector database. Production systems require three distinct memory layers — episodic, semantic, and state. Here's what each layer does and why it matters.

Xiaowei Jiang|Feb 4, 2026

AI Infrastructure

Semantic Operators: Run LLM Queries Directly in SQL

Classify, summarize, and extract data using LLM reasoning inside your database. No external pipelines, no data movement — just SQL.

Xiaowei Jiang|Jan 28, 2026

Company News

Join Tacnode at Current 2025: Putting Context in Motion

Context Lake comes to the Big Easy.

Xiaowei Jiang|Oct 14, 2025

AI & Agentic Systems

Context Lake: The Infrastructure Imperative for Real-Time AI

The next evolution from Data Lake to Context Lake.

Xiaowei Jiang|Aug 16, 2025

Company News

Tacnode Context Lake is now available in the new AWS Marketplace AI Agents and Tools category

Helping usher in a new category of real-time AI solutions.

Xiaowei Jiang|Jul 16, 2025

AI & Machine Learning

The Decision-Time System Model

Kafka + ClickHouse solves streaming analytics—but not AI decision-making. Here’s why teams searching for Kafka alternatives or a streaming database still hit walls: split state, temporal misalignment, and consistency gaps that break automated decisions.

Xiaowei Jiang|Feb 14, 2026

Real-Time Architecture

Context Under Concurrency: Why Your Cache Collapses Under Load

Context under concurrency is the production failure mode where cached derived state goes stale faster than the system can refresh it, and parallel decisions commit against divergent snapshots. This post covers why high-velocity state plus concurrent decisions break the caching pattern, how the preparation gap and the retrieval gap compound under load, and what a serving layer has to do differently to keep decisions coherent when every millisecond of staleness has a business consequence.

Xiaowei Jiang|Apr 21, 2026

Fraud Detection

Real-Time Fraud Detection Architecture: Where Coherence Breaks

Fraud detection architectures converge on the same canonical stack (Kafka → Flink → feature store → model serving → rules engine) and fail at three predictable seams under concurrent load: velocity counter staleness, feature-store / rules-engine divergence, and the cross-channel retrieval gap. Sub-50ms p99 on each component doesn’t fix any of these.

Xiaowei Jiang|Apr 23, 2026

Financial Services

Real-Time Credit Decisioning Architecture

Real-time credit decisioning is not batch underwriting with a faster SLA. Every transaction reads three derived signals — exposure, velocity, and risk — from separate pipelines that drift under concurrent load. The composite a decision reads is a chimera, correct only in the sense that each part was correct against its own snapshot.

Xiaowei Jiang|Apr 23, 2026

Stream Processing

Stateful Stream Processing for Decisions: Where Flink Stops Being Enough

Flink gives you stateful stream processing. It does not give you a decision-coherent serving layer. The gap is what teams discover when they put Redis or Postgres in front of Flink to serve decisions — and hit the same split-state problem Flink was supposed to have solved.

Xiaowei Jiang|Apr 24, 2026

Real-Time Architecture

The Thundering Herd Problem in Real-Time Decision Systems

The thundering herd problem looks like a load problem, and single-flight or jittered TTLs treat it as one. In a real-time decision system it's a correctness problem — and the real fix is removing the cache, not tuning it.

Xiaowei Jiang|May 22, 2026

Architecture

The Modern Data Stack Has a Coherence Problem

The modern data stack is good at making individual tables fresh. It’s bad at ensuring coherence across tables and systems when a decision reads all of them simultaneously. Three failure modes — preparation delay, cross-system retrieval inconsistency, snapshot incoherence under concurrency — get diagnosed as model or feature problems when they’re architectural problems.

Xiaowei Jiang|Mar 9, 2026