There's a specific failure mode that experienced data engineers recognize immediately: a support routing agent classifying enterprise customers by a column that was semantically reclassified eighteen months ago. The column still exists, the values still look correct, but the business rule changed and nobody wrote the change down anywhere an agent could find it. The agent routes enterprise accounts to the wrong queue. The model isn't broken — it's missing the constraint that four people on the data team carry in their heads.
This is a data sovereignty problem. Not sovereignty in the compliance sense, but sovereignty in the sense that business context — the rules that make data mean something — is concentrated in a small number of people and has no formal representation that agents can query. Every question from an agent that touches business semantics goes back to the data engineer. They become the bottleneck, and adding more agents multiplies the load proportionally.
A context store is the structural answer: a versioned, governed, freshness-monitored layer that encodes what data means and what constraints apply — so agents can query it directly and data engineers own it systematically rather than answer it reactively. This article covers what goes in one, how to structure the schema, how to keep it current at scale, and the ownership inversion that makes it a force multiplier rather than another maintenance surface.
Key Takeaways
- A context store is distinct from a RAG knowledge base — it encodes business rules, explicit constraints, and freshness guarantees, not just documents.
- The schema's most important field is constraints[]: what agents must NOT do with this data, not just what the data means.
- Freshness has four operational states — fresh, stale, expired, conflicted — each requiring different agent behavior.
- The ownership inversion: data engineers publish semantic schemas as governed context products; agents self-serve rather than route questions back to humans.
- One well-maintained context document can serve dozens of agents consistently — the ROI scales with agent count, not with human headcount.
The Bottleneck Is Business Semantics
Why data engineers become the blockers in agentic systems — and why more agents makes it worse
Research on NL2SQL agents confirms what practitioners already know from painful experience: data agents fail not from retrieval errors or model limitations, but because they form misconceptions about what data actually represents.[1] They query the right column for the wrong semantic concept. They join on an ID field that was deprecated. They ignore the business rule that inventory counts below 5 are unreliable due to sync lag from a legacy ERP. The model did nothing wrong — nobody gave it the rules.
Meta ran into this at scale when building AI tools over their data pipelines. Only 5% of their 4,100+ code files across three repositories had any form of codified tribal knowledge — the institutional rules that experienced engineers applied automatically but had never written down. To address it, they built a pre-compute engine using 50+ specialized AI agents that systematically read code files and produced context documents encoding what the data meant, what to watch for, and how to navigate edge cases correctly.[2] The output: structured guidance covering 100% of their code modules, up from 5%.
Most teams can't run 50 agents to extract the initial knowledge. But they face the same structural problem: business knowledge accumulates informally, agents query data without it, and the data team answers the same questions repeatedly rather than systematically. A data engineer who spends 30% of their week answering semantic questions from analysts won't survive an agentic system at any meaningful scale — agents ask questions continuously and at volume. The math doesn't work unless the questions stop routing to humans.
| Without a context store | With a context store |
| --- | --- |
| Agents query data with no semantic context | Agents query versioned context documents alongside data |
| Business rule questions route back to data engineers per-agent | Data engineers publish rules once; agents query them at runtime |
| Each new agent repeats the same knowledge gaps | New agents inherit the full governed semantic layer immediately |
| Stale rules cause silent failures with no observable staleness signal | Freshness states (fresh / stale / expired) are observable and alertable |
| Data engineers are a per-question bottleneck | Data engineers are per-schema owners — one schema serves many agents |
What a Context Store Actually Is
A definition that distinguishes it from RAG, wikis, and data catalogs
A context store is a versioned, governed semantic layer that gives AI agents the business rules, constraints, and freshness metadata they need to interpret data correctly — as a queryable, owned artifact rather than informal institutional knowledge.
It's not a RAG knowledge base. RAG retrieves document chunks based on semantic similarity. A context store serves governed business rules with explicit ownership, version history, and freshness SLAs. The difference matters operationally: a document in a RAG index becomes stale silently, and retrieval quality degrades as rules drift from reality. A rule in a context store has an owner who is responsible for keeping it current and a staleness state that agents observe at query time. RAG and a context store are complementary — RAG handles unstructured document retrieval; the context store handles governed business rule delivery.
It's also not a data catalog. A data catalog documents what data exists and where. A context store encodes what data means and what constraints apply to its use — the semantic layer on top of the catalog, containing the rules that can't be derived from column descriptions alone.
The Atlan context engineering framework describes this as treating every piece of context as a versioned, auditable data product with the same ownership and lifecycle management you'd apply to a core dbt model.[4] The a16z framing is simpler: agents need a business knowledge layer that answers the questions raw data can't.[5]
Context Kubernetes, a research architecture for managing enterprise context at scale, defines the components precisely: a Context Registry managing document identity, a Freshness Manager tracking four states (fresh, stale, expired, conflicted), a Permission Engine enforcing RBAC, and a Trust Policy determining which documents agents can act on without human verification.[6]
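As a rough mental model of those four components, here is a sketch in Python; the class names, method names, and signatures are illustrative stand-ins, not the paper's actual interfaces:

```python
# Sketch of the four Context Kubernetes components as Python protocols.
# All names and signatures are illustrative assumptions, not the paper's API.
from dataclasses import dataclass
from enum import Enum
from typing import Protocol


class Freshness(Enum):
    FRESH = "fresh"
    STALE = "stale"
    EXPIRED = "expired"
    CONFLICTED = "conflicted"


@dataclass
class ContextDocument:
    doc_id: str
    version: str
    owner: str
    content: str
    constraints: list[str]
    freshness: Freshness


class ContextRegistry(Protocol):
    """Manages document identity and version history."""
    def get(self, doc_id: str, version: str | None = None) -> ContextDocument: ...


class FreshnessManager(Protocol):
    """Tracks the four states against each document's SLA."""
    def state_of(self, doc: ContextDocument) -> Freshness: ...


class PermissionEngine(Protocol):
    """Enforces RBAC: not every agent sees every document."""
    def can_read(self, agent_id: str, doc: ContextDocument) -> bool: ...


class TrustPolicy(Protocol):
    """Decides which documents an agent may act on without human sign-off."""
    def can_act_without_human(self, agent_id: str, doc: ContextDocument) -> bool: ...
```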
- Only 5% of 4,100+ code files had any written institutional knowledge; the rest existed only in engineers' heads (Engineering at Meta, Apr 2026)
- Four freshness states: fresh, stale, expired, conflicted — each requires different agent behavior at query time (Context Kubernetes, arXiv 2604.11623)
- Inventory context needs seconds-level freshness; financial reporting context can tolerate daily. Same store, different SLAs per document type (Streamkap)
Anatomy of a Context Store Schema
The fields that separate a governed context document from a knowledge base chunk
Most teams that try to build a context store start by porting their existing data documentation — column descriptions, table readmes, data dictionary entries. That produces a searchable knowledge base. It does not produce a governed context store, because it's missing the three fields that make a context document agent-safe: constraints, freshness, and owner.
The constraints field is the one almost nobody covers. It encodes what agents must not do with this data — the corrective knowledge that lives in experienced engineers' heads and routinely causes failures when absent. A rule that says "enterprise tier requires ARR >= $50K" is useful. A constraint that says "never use orders.customer_tier for current-state routing — use customers.current_tier" is the difference between a routing agent that works and one that silently misclassifies for months.
The freshness object anchors the SLA. Without it, the context document is a snapshot with no indication of when it expires or what drives it. Agents that can see freshness state can include uncertainty in their output when a rule is stale — instead of making a confident wrong decision.
Here's a minimal but complete schema for a context document:
context_store_schema.yaml

```yaml
# Example: business rule context document
id: customer-tier-classification
type: business-rule        # business-rule | metric-definition | entity-map | workflow-constraint
version: "2.3.1"           # semver — bump minor for rule updates, major for breaking changes
owner: data-platform-team  # team responsible for keeping this current
tags: [routing, support, customers]
content: |
  The customer_tier column in the orders table reflects the tier at
  time of order creation, not the customer's current tier. For real-time
  routing decisions, use customers.current_tier. Enterprise threshold:
  ARR >= 50000 OR has_sla_agreement = true.
constraints:
  - "NEVER use orders.customer_tier for current-state routing — it is immutable at order creation"
  - "Enterprise classification requires BOTH ARR check AND SLA flag — either alone misclassifies"
  - "Tier reclassification propagates in up to 24h — check customers.tier_updated_at to detect lag"
freshness:
  sla_hours: 24
  source: "dbt model: dim_customers.tier_recalculation (runs: 02:00 UTC daily)"
  last_verified: "2026-05-06T02:14:00Z"
  status: fresh            # fresh | stale | expired | conflicted
retrieval:
  embedding_hint: "customer tier enterprise classification routing support"
  citation_required: true  # agent must surface this rule when used in a decision
```

Freshness as a First-Class Guarantee
Why stale context is worse than no context — and how to monitor it
Freshness monitoring is where most context store implementations fall apart in production. Teams build the initial schema, populate it with current rules, and ship it. Six months later, the business rules have drifted, the context documents haven't, and agents are confidently applying stale constraints to live decisions. The problem is that freshness degradation is silent by default. A stale context document doesn't throw an error — it just serves wrong information with full confidence.
The four-state model from the Context Kubernetes architecture maps this clearly; the sketch after the list shows how an agent might branch on each state:[6]
- Fresh — document verified against its source within the SLA window; agents can act on it
- Stale — the SLA window passed but the document hasn't been explicitly invalidated; agents should note uncertainty
- Expired — the owner must re-verify before agents can use it; treat as unavailable
- Conflicted — two documents make contradictory claims about the same data; a human must resolve before agents proceed
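A minimal sketch of that branching in Python, assuming each document carries the freshness object from the schema above; the return shape is illustrative, not a standard API:

```python
# Sketch: how an agent might branch on freshness state at query time.
# Assumes the document dict carries the freshness object from the schema
# above; the {"usable", "caveat"} return shape is an illustration.

def apply_context(doc: dict) -> dict:
    status = doc["freshness"]["status"]
    if status == "fresh":
        # Verified within the SLA window: act on the rule directly.
        return {"usable": True, "caveat": None}
    if status == "stale":
        # SLA window passed but not invalidated: use it, and surface uncertainty.
        last = doc["freshness"]["last_verified"]
        return {"usable": True,
                "caveat": f"rule {doc['id']} last verified {last}; may be outdated"}
    if status == "expired":
        # Owner must re-verify: treat as unavailable rather than guessing.
        return {"usable": False,
                "caveat": f"rule {doc['id']} expired; owner re-verification required"}
    if status == "conflicted":
        # Contradictory documents: surface the conflict, never decide.
        return {"usable": False,
                "caveat": f"rule {doc['id']} conflicts with another document; "
                          f"human resolution required"}
    raise ValueError(f"unknown freshness status: {status}")
```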
Different context types warrant different SLA windows. Inventory levels or real-time pricing rules need freshness within seconds — a stale inventory rule causes agents to promise availability that doesn't exist. Financial reporting context can tolerate daily freshness. Customer support routing rules need near-real-time updates to avoid misrouting live cases.[3] The freshness SLA belongs in the schema itself, not in a separate monitoring config, because agents need to inspect it at query time to decide whether to use the document or flag uncertainty to the caller.
The production-grade pattern is CDC-driven freshness, not polling. When the dim_customers table updates, the freshness monitor checks whether any context document's source field references that model and updates the staleness state automatically.[3] Polling hourly means you can be 59 minutes behind a breaking rule change. CDC-driven updates catch it in minutes.
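A sketch of the CDC pattern, assuming change events arrive as simple table-and-timestamp records (from Debezium, a warehouse stream, or dbt run hooks) and a store interface with documents_by_source, save, and notify methods; all of these are assumptions, not a specific product's API:

```python
# Sketch: CDC-driven freshness. When a source model changes, every context
# document whose `source` field references it transitions to stale and the
# owning team is alerted. The event shape and store interface are assumed.

def on_source_change(event: dict, store) -> None:
    changed_model = event["table"]  # e.g. "dim_customers.tier_recalculation"
    for doc in store.documents_by_source(changed_model):
        if doc["freshness"]["status"] == "fresh":
            doc["freshness"]["status"] = "stale"
            store.save(doc)
            # Alert before the SLA expires, not after it.
            store.notify(doc["owner"],
                         f"context doc {doc['id']} went stale: "
                         f"{changed_model} changed at {event['ts']}")
```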
The Ownership Inversion Model
How data engineers shift from reactive answerers to proactive context publishers
The traditional data engineering model is reactive: analysts ask questions, engineers answer them. When agents enter the picture, this model doesn't hold — agents generate questions continuously and at volume, with no natural throttle. A data engineer who spends 30% of their week answering semantic questions from analysts will hit a ceiling immediately when agents ask the same questions at 10x the rate.
The ownership inversion shifts the direction of the work. Data engineers become context publishers, not question answerers. They formalize the business rules they currently hold in their heads into governed context documents — and agents query those documents directly. The engineer's output changes from verbal explanations to schema artifacts. The questions stop routing to humans.
This is what a16z means by "your data agents need context":[5] the move from data as a pipeline artifact to data as a context product, where the same formalized rule can be queried by any number of agents without increasing the data team's workload per query. One well-maintained context document serves a dozen agents consistently. The marginal cost of adding another agent drops near zero once the schema exists.
The friction is the quality of initial formalization. Getting data engineers to write constraints[] fields — not just column descriptions — requires a different mental model: documenting not just what the data is, but how an agent could misuse it. That's unfamiliar work, and teams that rush it produce context documents that are technically accurate but operationally incomplete. The most successful implementations treat initial formalization as a sprint goal, not a backlog item — and pair engineers with agents during the discovery phase to surface the questions the agents actually ask before writing the schema.
1. Inventory the semantic questions agents will ask. Before writing any schema, identify the top 20 questions that agents in your system ask about data — these are the questions that currently route to a data engineer. They become your first 20 context documents. Start with the ones that cause the most support tickets or silent agent failures in staging.
2. Write context documents with constraints, not just descriptions. For each identified rule, create a context document with the full schema: type, version, owner, content, constraints[], freshness, and retrieval hints. The constraints field is mandatory — a context document without negative knowledge is incomplete. Version at 1.0.0 for initial publication.
3. Wire freshness monitoring to source changes. Set up CDC-based freshness tracking so that when the underlying data source changes, affected context documents transition to stale automatically. Don't rely on periodic polling — CDC-driven updates catch changes in minutes. Configure staleness alerts to the owning team before expiry, not after.
4. Expose the store as a queryable API with RBAC and freshness in every response. Agents should query the context store via an API, not by reading YAML files directly. The API enforces access control, returns freshness state in every response, and logs every lookup for audit and coverage analysis. A simple REST endpoint returning JSON context documents with staleness metadata is sufficient to start — invest in purpose-built infrastructure only after the schema pattern is validated. A minimal sketch of such an endpoint follows this list.
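Here is that sketch, using FastAPI as one assumed framework choice (any HTTP framework works), with an in-memory dict standing in for the registry and a print standing in for the audit log:

```python
# Sketch of the step-4 endpoint: RBAC check, audit log, freshness in every
# response. FastAPI, the in-memory DOCS registry, and the x-agent-id header
# convention are assumptions for illustration.
from fastapi import FastAPI, Header, HTTPException

app = FastAPI()

# In-memory stand-in for the registry; in practice this reads the governed
# YAML documents from the context store.
DOCS = {
    "customer-tier-classification": {
        "id": "customer-tier-classification",
        "version": "2.3.1",
        "content": "Use customers.current_tier for routing, not orders.customer_tier.",
        "constraints": ["NEVER use orders.customer_tier for current-state routing"],
        "freshness": {"status": "fresh", "sla_hours": 24},
        "allowed_agents": {"support-router", "eval-harness"},
    }
}


@app.get("/context/{doc_id}")
def get_context(doc_id: str, x_agent_id: str = Header(...)):
    doc = DOCS.get(doc_id)
    if doc is None:
        raise HTTPException(status_code=404, detail="unknown context document")
    if x_agent_id not in doc["allowed_agents"]:  # RBAC: not every agent sees every doc
        raise HTTPException(status_code=403, detail="agent not permitted")
    print(f"audit: {x_agent_id} read {doc_id} v{doc['version']}")  # stand-in audit log
    # Freshness rides along in every response so the agent can branch on
    # fresh / stale / expired / conflicted at query time.
    return {k: doc[k] for k in ("id", "version", "content", "constraints", "freshness")}
```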
Pre-Production Governance Checklist
What to verify before agents start querying your context store in production
- Every context document has an explicit owner — a team, not an individual
- Every document has a non-empty constraints[] field
- Freshness SLAs are set per document type, not as a global default
- CDC-based freshness monitoring is live — documents transition to stale automatically on source changes
- Staleness alerts route to the owning team before expiry, not after
- Agents receive freshness state in every context response and handle stale / expired cases explicitly
- RBAC enforced — not all agents can access all context documents
- Every context document lookup is logged for audit trail and coverage analysis
- Version history maintained — agents can query at a specific version for reproducibility in evals
- Initial document set covers the rules that caused the most agent failures or misroutes in staging
How is a context store different from a RAG knowledge base?
RAG retrieves document chunks based on semantic similarity. A context store serves governed business rules with explicit ownership, version history, and freshness SLAs. The practical difference: a RAG document becomes stale silently and retrieval quality degrades gradually. A context store document has an owner responsible for keeping it current and a freshness state that agents observe at query time. The two are complementary — RAG handles document retrieval for unstructured knowledge; the context store handles governed business rule delivery for operational decisions.
What should the first 20 context documents cover?
Start with the rules that cause the most agent failures or the most questions routed to data engineers. The most valuable initial documents cover: business classifications (what 'enterprise' means in each system), column semantic corrections (this column means X not Y — use the other column for Z), threshold definitions (the exact numbers and conditions that define a category), and workflow routing rules (which values map to which queues or behaviors). Mine incident postmortems and Slack threads for recurring patterns — the questions that appear more than twice are your top candidates.
How do you handle conflicting context documents?
The 'conflicted' freshness state exists for exactly this case — two documents making contradictory claims about the same data. When the freshness monitor detects a conflict, both documents transition to conflicted, and agents receiving them treat the conflict as a signal to surface uncertainty rather than make a decision. A human from the owning team resolves it, updates one or both documents, and re-publishes. In practice, conflicts surface most often during business rule changes when the old rule hasn't been explicitly retired. The fix is a retirement protocol: when a new rule supersedes an old one, the old document's status is set to 'archived' before the new one is published.
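A sketch of that retirement protocol, assuming a store interface with get and save methods; 'archived' here is a lifecycle status distinct from the four freshness states:

```python
# Sketch: supersede-and-retire. Archive the old rule before publishing the
# new one so the two never serve contradictory claims side by side.
# The store interface (get/save) is an assumption.

def supersede(store, old_id: str, new_doc: dict) -> None:
    old = store.get(old_id)
    if old is not None:
        old["status"] = "archived"        # retired, not deleted: history stays queryable
        old["superseded_by"] = new_doc["id"]
        store.save(old)
    store.save(new_doc)                   # publish only after the old rule is retired
```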
Do we need dedicated infrastructure, or can we start simpler?
For a first iteration, a Git repository of YAML files plus a simple REST API is enough. The schema design, versioning discipline, and ownership model matter more than the infrastructure. Atlan, Snowflake Cortex, and similar platforms provide native context layers for teams already using those tools. The Context Kubernetes architecture offers a complete reference design for declarative orchestration at scale. Don't invest in purpose-built infrastructure until you've validated the schema with at least 20 governed documents and real agent queries — the API surface is small and the early iteration will likely reveal schema changes that are easier to make before tooling hardens.
How do we prevent context store maintenance from becoming its own bottleneck?
The goal is shifting work from reactive question answering to proactive schema authoring — a one-time cost per rule that pays dividends per agent query. The bottleneck re-emerges if schema authoring isn't systematic. The fix: treat context document creation as part of the definition-of-done for any significant data model change. When a new dbt model ships, the owning team publishes the corresponding context document before the model is available to agents. This makes context documents a first-class deliverable, not an afterthought, and distributes the authoring work to the teams who already own the underlying data.
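One way to enforce that definition-of-done is a CI gate. A minimal sketch, assuming one YAML context document per dbt model and the models/ and context_store/ directory layout shown (both assumptions about your repo):

```python
# Sketch: CI gate that fails when a dbt model ships without a corresponding
# context document. The directory layout and one-doc-per-model naming
# convention are assumptions.
import pathlib
import sys

MODELS_DIR = pathlib.Path("models")          # dbt models
CONTEXT_DIR = pathlib.Path("context_store")  # governed YAML documents


def missing_context_docs() -> list[str]:
    models = {p.stem for p in MODELS_DIR.rglob("*.sql")}
    documented = {p.stem for p in CONTEXT_DIR.rglob("*.yaml")}
    return sorted(models - documented)


if __name__ == "__main__":
    missing = missing_context_docs()
    if missing:
        print("models shipping without a context document:", ", ".join(missing))
        sys.exit(1)  # block the merge until the owning team publishes the doc
```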
The context store isn't a new invention — semantic layers and data catalogs have existed for years. What's changed is the ownership model: treating every business rule as a governed, versioned, freshness-monitored artifact that agents query directly, rather than institutional knowledge that humans mediate on demand.
The teams that get this right are the ones where data engineers stop seeing themselves as answerers and start seeing themselves as publishers. The work is the same — encoding the rules they already know — but the delivery changes from a Slack reply to a schema artifact that serves every future agent automatically.
Start with the failures. Pick the five agent decisions that went wrong in the last sprint and trace each one back to the business rule that was missing or stale. Write those five rules as governed context documents. Add the constraints[] field. Set a freshness SLA. Ship them to a simple API and measure whether the same failures recur. That's a week of work and a meaningful signal about whether the pattern scales for your system.
- [1] Arming Data Agents with Tribal Knowledge (arXiv 2602.13521, Feb 2026). arxiv.org
- [2] How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines (Engineering at Meta, Apr 2026). engineering.fb.com
- [3] The Context Layer: What AI Agents Need Beyond Raw Data (Streamkap). streamkap.com
- [4] Context Engineering Framework for Enterprise AI in 2026 (Atlan). atlan.com
- [5] Your Data Agents Need Context (Andreessen Horowitz). a16z.com
- [6] Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems (arXiv 2604.11623). arxiv.org
- [7] The Data Problem Behind Agentic AI, and What You Can Do About It (Equinix Blog, Jan 2026). blog.equinix.com
- [8] 6 Agentic Knowledge Base Patterns Emerging in the Wild (The New Stack). thenewstack.io