
Xiaowei Jiang
CEO & Chief Architect at Tacnode
Xiaowei Jiang is CEO and Chief Architect at Tacnode, where he designed the Context Lake architecture from first principles. He previously built distributed query engines at Meta and Microsoft, working at petabyte scale across some of the largest data systems in production. His formal analysis of decision coherence — the Composition Impossibility Theorem — is published on arXiv (2601.17019) and provides the theoretical foundation for the Context Lake as a system category. He writes about database architecture, AI agent infrastructure, and the structural limitations of composed data stacks.
Posts by Xiaowei (20)
ACID for Agents: Why Database Consistency Is the Bottleneck for Production AI
Oracle just validated what production agent teams already know: the agent data layer is broken. Here's why ACID compliance across retrieval patterns is the fix.
Xiaowei Jiang|Mar 27, 2026
OLTP vs OLAP: The False Choice for the Agentic Era
Every architecture guide frames OLTP vs OLAP as a choice: optimize for transactions or optimize for analytics. But automated decision systems — fraud checks, credit approvals, agent actions — need both transactional consistency and analytical power at the same moment. The Composition Impossibility Theorem proves you can't stitch separate OLTP and OLAP systems together to get there. Here's what comes after the tradeoff.
Xiaowei Jiang|Mar 17, 2026
Apache Kafka vs Apache Flink: The Real Comparison Is Flink vs Kafka Streams
Most people comparing Kafka and Flink are actually asking a different question: which stream processing layer do I need? The real architectural choice is Apache Flink vs the Kafka Streams API, and understanding the difference changes how you build.
Xiaowei Jiang|Mar 2, 2026
What Retrieval Really Means for AI Agents
AI retrieval is not one operation. Production decisions require exact and semantic retrieval patterns used together: point lookups, range scans, filters, joins, aggregations, and similarity search.
Xiaowei Jiang|Feb 18, 2026
What Is Derived Context?
Why data freshness matters for AI decisions: derived context is state computed from events that must be current at decision time. When feature freshness degrades, decisions fail, not because the model is bad but because the context is stale.
Xiaowei Jiang|Feb 13, 2026
Context Silos: When the System Knows But the Decision-Maker Doesn't
Why AI agent memory fails even when data exists: context silos prevent agents from accessing knowledge computed elsewhere. The fraud pattern was detected—but the checkout agent couldn't see it. Stale context isn't always old. Sometimes it's just unreachable.
Xiaowei Jiang|Feb 6, 2026
What Is Context Engineering? The Discipline Behind Effective AI Agents
Context engineering is the discipline of designing how AI agents receive, manage, and act on information. It goes far beyond prompt engineering — covering context windows, tool calls, memory architecture, and the retrieval systems that determine whether an agent makes good decisions or bad ones.
Xiaowei Jiang|Feb 3, 2026
ClickHouse JOINs Are Slow: Here's Why (And What To Do About It)
If your ClickHouse JOINs are killing query performance, you're not alone. Here's why columnar databases struggle with JOINs, what join algorithms are available, how to read the query plan, and when it's time to consider alternatives.
Xiaowei Jiang|Feb 5, 2026
AI Agent Memory Architecture: The Three Layers Production Systems Need
AI agents need more than a vector database. Production systems require three distinct memory layers — episodic, semantic, and state. Here's what each layer does and why it matters.
Xiaowei Jiang|Feb 4, 2026
Semantic Operators: Run LLM Queries Directly in SQL
Classify, summarize, and extract data using LLM reasoning inside your database. No external pipelines, no data movement — just SQL.
Xiaowei Jiang|Jan 28, 2026
Join Tacnode at Current 2025: Putting Context in Motion
Context Lake comes to the Big Easy.
Xiaowei Jiang|Oct 14, 2025
Context Lake: The Infrastructure Imperative for Real-Time AI
The next evolution from Data Lake to Context Lake.
Xiaowei Jiang|Aug 16, 2025
Tacnode Context Lake is now available in the new AWS Marketplace AI Agents and Tools category
Helping usher in a new category of real-time AI solutions.
Xiaowei Jiang|Jul 16, 2025
The Decision-Time System Model
Kafka + ClickHouse solves streaming analytics—but not AI decision-making. Here’s why teams searching for Kafka alternatives or a streaming database still hit walls: split state, temporal misalignment, and consistency gaps that break automated decisions.
Xiaowei Jiang|Feb 14, 2026
Agent Drift and AI Drift: Why Production AI Models Quietly Get Worse
AI drift is the umbrella term for the gradual degradation of a machine learning model’s performance in production as data, relationships, or context diverge from training. Classical ML recognizes three types — data drift (covariate shift), concept drift, and label drift — detectable with statistical tests like the Kolmogorov-Smirnov test, Population Stability Index, and KL divergence. Agent systems introduce a fourth the classical toolkit misses — agent drift, where the model is unchanged but the derived context the agent reads at decision time has gone stale. This guide covers all four types, how to detect model drift, and how to prevent agent drift with the right context infrastructure.
Xiaowei Jiang|Apr 22, 2026
Context Under Concurrency: Why Your Cache Collapses Under Load
Context under concurrency is the production failure mode where cached derived state goes stale faster than the system can refresh it, and parallel decisions commit against divergent snapshots. This post covers why high-velocity state plus concurrent decisions break the caching pattern, how the preparation gap and the retrieval gap compound under load, and what a serving layer has to do differently to keep decisions coherent when every millisecond of staleness has a business consequence.
Xiaowei Jiang|Apr 21, 2026
Real-Time Fraud Detection Architecture: Where Coherence Breaks
Fraud detection architectures converge on the same canonical stack — Kafka → Flink → feature store → model serving → rules engine — and fail at three predictable seams under concurrent load: velocity counter staleness, feature-store / rules-engine divergence, and cross-channel retrieval gap. Sub-50ms p99 on each component doesn’t fix any of these.
Xiaowei Jiang|Apr 23, 2026
Real-Time Credit Decisioning Architecture
Real-time credit decisioning is not batch underwriting with a faster SLA. Every transaction reads three derived signals — exposure, velocity, and risk — from separate pipelines that drift under concurrent load. The composite a decision reads is a chimera, correct only in the sense that each part was correct against its own snapshot.
Xiaowei Jiang|Apr 23, 2026
Stateful Stream Processing for Decisions: Where Flink Stops Being Enough
Flink gives you stateful stream processing. It does not give you a decision-coherent serving layer. The gap is what teams discover when they put Redis or Postgres in front of Flink to serve decisions — and hit the same split-state problem Flink was supposed to have solved.
Xiaowei Jiang|Apr 24, 2026
Real-Time ML: Architecture, Feature Freshness, and Where ML Models Make Bad Decisions
Real-time ML — the architecture that runs ML models against live requests for instant decisions — is bottlenecked by feature freshness, not model latency. The model serves in 8 milliseconds; the features it scored are 40 seconds old. For real-time machine learning systems committing against fresh state, the freshness budget is the binding constraint, and most stacks never measure it.
Xiaowei Jiang|Apr 24, 2026