AI & Machine Learning

Retrieval Is More Than Vector Search

RAG architecture needs more than embeddings. Real AI agent memory requires hybrid search: point lookups, aggregations, filters, joins—plus semantic search. When retrieval is vector-only, agents miss the structured context that determines correctness.

Tacnode Team
8 min read
Six retrieval patterns beyond vector search: point lookups, aggregations, filters, temporal queries, joins, and semantic search

An agent needs to answer: "Can I return this order?"

It retrieves the return policy from a vector store. The most relevant passage says returns are accepted within 30 days. The order is 25 days old. The agent approves the return.

But the customer has already returned six items in the past 90 days. The policy caps returns at five. That fact lives in a database, not the vector store. The agent never saw it.

The return goes through. It shouldn't have.

This is what happens when retrieval means only vector search.

The conflation is everywhere. RAG tutorials start with embeddings. Context engineering guides jump to vector databases. The mental model got flattened: retrieval = embeddings.

It's not.

What Retrieval Actually Means

Go back to the return question. To answer it correctly, the agent needs six different retrievals:

A point lookup to get the order record—order ID, purchase date, items, amounts. Exact match, exact answer.

A range scan to get all returns this customer has made in the past 90 days. Not similar returns. All of them.

An aggregation to calculate the total return count and value in that window.

A filter to check which items in the order are in returnable product categories.

A join to pull the customer's membership tier and its associated return policy—different tiers, different rules.

And a semantic search to find the relevant passages in the return policy documentation.

Six retrieval patterns. Only the last is approximate. The first five require exact answers—no room for "close enough."
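The five exact patterns map directly onto ordinary relational queries. Here is a minimal runnable sketch using SQLite, with hypothetical table and column names (the semantic search over policy text, pattern six, is elided since it needs an embedding model):

```python
import sqlite3

# Toy schema for the return-eligibility question. All names and
# numbers here are illustrative, not from any real system.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE orders(id INTEGER PRIMARY KEY, customer_id INTEGER,
                    purchase_date TEXT, amount REAL);
CREATE TABLE returns(id INTEGER PRIMARY KEY, customer_id INTEGER,
                     returned_at TEXT, amount REAL);
CREATE TABLE customers(id INTEGER PRIMARY KEY, tier TEXT);
CREATE TABLE tier_policies(tier TEXT, max_returns_90d INTEGER);
INSERT INTO orders VALUES (1001, 7, '2024-05-20', 89.00);
INSERT INTO customers VALUES (7, 'standard');
INSERT INTO tier_policies VALUES ('standard', 5), ('gold', 10);
""")
# Six prior returns for this customer, one per month.
db.executemany(
    "INSERT INTO returns(customer_id, returned_at, amount) VALUES (7, ?, ?)",
    [(f'2024-0{m}-01', 20.0) for m in range(3, 9)])

# 1. Point lookup: the order record. Exact match, exact answer.
order = db.execute("SELECT * FROM orders WHERE id = 1001").fetchone()

# 2. Range scan + 3. aggregation: ALL returns in the 90-day window,
#    counted and summed. Not similar returns; all of them.
cnt, total = db.execute(
    "SELECT COUNT(*), SUM(amount) FROM returns "
    "WHERE customer_id = 7 AND returned_at >= '2024-03-01'").fetchone()

# 5. Join: the customer's membership tier and its return cap.
#    (4, the category filter, would be another WHERE clause.)
cap, = db.execute(
    "SELECT p.max_returns_90d FROM customers c "
    "JOIN tier_policies p ON p.tier = c.tier WHERE c.id = 7").fetchone()

# The exact answers decide eligibility; semantic search over the
# policy text only supplies supporting passages.
eligible = cnt < cap
print(cnt, cap, eligible)  # 6 returns against a cap of 5: not eligible
```

No amount of similarity ranking recovers `cnt` or `cap`; they have to be computed, not retrieved by nearest neighbor.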

Skip any of them and the agent reasons over an incomplete picture. The return that shouldn't have gone through is the result.

Three Ways This Breaks

The reshape problem. When vector search is all you have, every question becomes a similarity problem. Need the return count? Embed the question, hope a chunk contains the number. Need the policy cap? Hope it got embedded near the query. This works sometimes. It fails unpredictably. The failure mode isn't a wrong answer—it's an approximate answer to a question that needed an exact one.

The consistency problem. When retrieval spans systems, you get temporal mismatches. The database updated two seconds ago when the customer changed their shipping address. The vector store is indexed on a lag. The agent queries both and combines the results. Now it's reasoning over a state that never existed—and no audit trail will reproduce how it got there.

The filtering problem. When queries need to span structured and semantic dimensions, you're forced to choose. Say an agent is handling a support case and needs to find similar past cases—but only from enterprise customers, in the last 90 days, tagged as billing-related.

Filter first, then search: query the database for matching case IDs, pass them to the vector store, search only within that set. But the vector index was built over all cases. Constraining it to a sparse subset degrades similarity matching—the index can't find meaningful neighbors in a fragmented slice.

Search first, then filter: retrieve the top 50 similar cases, then apply the structured constraints. But what if only 3 pass? You've missed relevant cases that ranked below the cutoff. Increase retrieval size and you're pulling thousands of results to discard most of them.

Precision dies at the seam—and recall dies with it.
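The search-first recall loss is easy to see on toy data. In this sketch the similarity scores are random stand-ins for an embedding index, and the 10% enterprise rate is an illustrative assumption:

```python
import random

# 1,000 past cases; 10% belong to enterprise customers.
# "score" stands in for similarity to the current case.
random.seed(0)
cases = [{"id": i,
          "enterprise": i % 10 == 0,
          "score": random.random()} for i in range(1000)]

# Ground truth: every enterprise case is a candidate the agent
# should have been able to rank.
candidates = [c for c in cases if c["enterprise"]]

# Search first, then filter: take the global top-50 by similarity,
# then apply the structured constraint.
top50 = sorted(cases, key=lambda c: c["score"], reverse=True)[:50]
survivors = [c for c in top50 if c["enterprise"]]

# Only a handful of the 100 enterprise cases survive the cutoff;
# everything that ranked 51st or lower is silently lost.
recall = len(survivors) / len(candidates)
print(len(survivors), recall)
```

Widening the cutoff just moves the problem: to guarantee all 100 enterprise cases are considered, the retrieval size has to approach the full corpus.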

What's Actually Needed

A context layer needs to support all retrieval patterns: point lookups, range scans, aggregations, filters, joins, semantic search.

One system. One snapshot. One query plan across structured and semantic. So predicates can be pushed into semantic search, and semantic results can be joined and aggregated like any other data.
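What "one query plan" looks like in practice: structured predicates and similarity ranking in the same statement, so the engine filters rows before ranking only the survivors. This sketch uses SQLite with a scalar cosine function standing in for a real vector-capable engine; the schema and two-dimensional embeddings are illustrative:

```python
import sqlite3, math, json

# A cosine-similarity UDF over JSON-encoded vectors; a real engine
# would use a native vector type and index instead.
def cosine(a_json, b_json):
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

db = sqlite3.connect(":memory:")
db.create_function("cosine", 2, cosine)
db.executescript("""
CREATE TABLE cases(id INTEGER PRIMARY KEY, segment TEXT,
                   opened TEXT, tag TEXT, embedding TEXT);
INSERT INTO cases VALUES
 (1, 'enterprise', '2024-08-01', 'billing', '[1.0, 0.0]'),
 (2, 'smb',        '2024-08-02', 'billing', '[0.9, 0.1]'),
 (3, 'enterprise', '2024-08-03', 'billing', '[0.2, 0.9]'),
 (4, 'enterprise', '2023-01-01', 'billing', '[1.0, 0.0]');
""")

# One plan: the WHERE clause prunes rows (cases 2 and 4 never
# reach the ranking step), then similarity orders what remains.
rows = db.execute("""
 SELECT id, cosine(embedding, ?) AS sim FROM cases
 WHERE segment = 'enterprise'
   AND tag = 'billing'
   AND opened >= '2024-06-01'
 ORDER BY sim DESC LIMIT 10
""", ('[1.0, 0.0]',)).fetchall()
print(rows)  # [(1, 1.0), (3, ...)]: only eligible cases, fully ranked
```

Because filter and ranking share one snapshot, there is no seam: no sparse-subset degradation, no cutoff that drops eligible cases, and the result can be joined or aggregated like any other rows.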

Vector search is a feature. Retrieval is the architecture.

Get the architecture wrong, and your agent will keep approving returns that should have been denied—confident, fast, and wrong for reasons you can't reproduce. A Context Lake provides this unified retrieval layer—supporting all six patterns against a single coherent state.


Written by Tacnode Team

Building the infrastructure layer for AI-native applications. We write about Decision Coherence, Tacnode Context Lake, and the future of data systems.

