Eight in ten organizations report data limitations as their top barrier to scaling agentic AI.[1] The frustrating part: most of these companies already have data governance. They have data owners, quality policies, access controls, and a governance council that meets monthly. The governance exists. The agents still fail.
The gap isn't governance — it's that analytics governance and agentic governance are fundamentally different problems. BI systems ask: Can a human analyst read this data and build a report? That question tolerates stale data, broad access permissions, and informal ownership because a human is in the loop to catch anomalies. Agents ask something harder: Can an autonomous system make a real-time decision on this data, act on that decision, and be held accountable afterward without human review? No standard governance audit answers that.
This self-assessment scores your data environment on 10 dimensions directly relevant to autonomous agent deployment. Your total score maps to one of four readiness tiers, each with a defined deployment strategy. Most teams find they already know the answer by question 4 — and most find they score lower than expected.
The Governance Theater Trap
Why passing a governance audit doesn't mean your agents will work
Here is the pattern that repeats across enterprise agentic AI deployments: a team gets greenlit for an autonomous agent because the data governance audit passed. They have policies documented in Confluence, data owners assigned in a spreadsheet, quality rules defined in a standards document. Six weeks into the build, the agent is acting on records that are 14 hours stale. The listed data owner left the company three months ago. The quality rules fire in a downstream report nobody checks before the agent reads.
This is governance theater: governance that satisfies an audit but fails at the operational level agents demand. The artifacts are real. The connective tissue between the artifacts and the agent's actual data access path is absent.
The reason this happens so consistently is that governance frameworks were built for a world where humans were the final consumers of data. A human analyst encountering stale data notices something is off and asks. An autonomous agent does not — it continues executing with the confidence of a system that has been told its data is reliable. The difference between a BI report built on stale data and an agent making customer refund decisions on stale data isn't a matter of degree. It's a categorically different failure mode.
| Dimension | BI / Analytics Governance | Agentic Governance |
|---|---|---|
| Freshness | Daily batch ETL is acceptable; humans notice anomalies | SLA enforced at read time; sub-minute for real-time decisions |
| Access | Table-level read permissions via AD groups | Column/row-level enforced at query execution, not the application layer |
| Quality | Checked weekly in a monitoring dashboard | Gate fires before the agent reads; circuit breaker on degradation |
| Ownership | Named data steward on a RACI spreadsheet | On-call rotation with contractual SLA, not a spreadsheet entry |
| Explainability | Analysts reconstruct queries informally post hoc | Full data state captured at decision time, legally auditable |
| Failure mode | Stale report caught in the next review cycle | Agent takes wrong autonomous action at scale, cascading |
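The freshness difference between the two columns can be made concrete. Below is a minimal sketch of a read-time circuit breaker: the agent never sees a record whose age violates the SLA. `guarded_read`, `StaleDataError`, and the `fetch_record` interface are hypothetical names for illustration, not an existing API.

```python
from datetime import datetime, timedelta, timezone

class StaleDataError(Exception):
    """Raised when data age exceeds the freshness SLA at read time."""

def guarded_read(fetch_record, max_age):
    """Enforce a freshness SLA before the agent sees the data.

    `fetch_record` is any callable returning a dict with an aware
    `updated_at` timestamp (a hypothetical interface, not a real API).
    """
    record = fetch_record()
    age = datetime.now(timezone.utc) - record["updated_at"]
    if age > max_age:
        # Circuit breaker: refuse the read instead of letting the
        # agent act on stale state.
        raise StaleDataError(f"record is {age} old, SLA allows {max_age}")
    return record

# A record updated 14 hours ago fails a 1-minute SLA.
stale = lambda: {"updated_at": datetime.now(timezone.utc) - timedelta(hours=14)}
try:
    guarded_read(stale, max_age=timedelta(minutes=1))
except StaleDataError as exc:
    print("blocked:", exc)
```

The design point is where the check lives: in the access path itself, not in a dashboard that a human might review later.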
Supervised vs. Autonomous: Your Target Score Changes
The readiness bar for human-in-the-loop agents is materially lower than for fully autonomous ones
Most organizations don't jump straight to fully autonomous agents. They start with supervised agents — systems that draft actions or recommendations, with a human approving each before execution. This is a sensible default, and it has a meaningfully lower data readiness requirement.
A supervised agent can tolerate data freshness measured in hours rather than seconds, because the human approval window provides a natural check. It can operate without full query-time access control, because the reviewer catches permission anomalies before they become executed actions. It doesn't require complete decision explainability at the data layer, because human review creates a natural accountability point.
The problem is that most teams building "supervised" agents have ambitions for autonomy within six months. They build on the data infrastructure that passes the supervised threshold, then discover they need to rebuild from scratch to reach the autonomous bar. This assessment distinguishes between the two explicitly — so you know what you're optimizing toward before infrastructure decisions lock in.
A score of 6 is sufficient for supervised deployment but insufficient for narrow autonomy. Know your target before you run the assessment.
The 10-Question Data Readiness Assessment
Score each dimension 0 (gap), 0.5 (partial), or 1 (operational). Total out of 10.
Score each question against the data environment your agent will actually use in production — not the ideal state you're planning to build. A policy that exists but isn't enforced scores 0.5, not 1. A quality check that fires downstream of your agent's read path scores 0.5, not 1. The purpose here is to surface gaps before your agent does.
Run this against your most critical data domain first. If that score surprises you, run it against two more domains before drawing conclusions about your overall posture.
| # | Dimension | Score 0: Gap | Score 0.5: Partial | Score 1: Operational |
|---|---|---|---|---|
| 1 | Data Freshness SLA | No defined maximum data age; freshness undocumented or assumed | SLA documented but not enforced; violations detected only after the agent has acted on stale data | SLA enforced in the access layer; staleness violations trigger a circuit breaker before any agent read |
| 2 | Schema Contracts | No formal contracts; tables evolve without notifying agent teams | Contracts documented but not CI/CD tested; breaking changes discovered at runtime | Contracts are version-controlled, automatically tested, and require agent team sign-off before upstream changes go live |
| 3 | Cross-Domain Entity Consistency | Each system uses its own identifiers; no cross-system entity resolution in place | Partial mapping exists; some entities resolved, others cause silent mismatches | A semantic layer resolves entity identities across all agent-accessible domains at query time |
| 4 | Query-Time Access Control | Agent service account has broad read access; permissions not scoped to the data actually required | Role-based access at table level; no row- or column-level enforcement at query execution | Column and row-level security enforced at query execution; agent reads only what its role explicitly permits |
| 5 | Pre-Agent Quality Gates | Quality monitoring exists in dashboards or reports only; no gate fires before agent reads data | Some quality checks exist in the pipeline but run downstream of the agent's read path | Quality gate fires before agent access; agent receives a degraded-data signal and can pause or halt autonomously |
| 6 | Decision Explainability | Agent decisions logged at output level only; input data at decision time not captured | Decision inputs partially logged; data state at query time not preserved or reconstructable | Every decision logs exact data state: records, timestamps, versions, and quality signal at read time — legally auditable |
| 7 | Operational Data Ownership | Data owners listed in documentation but have no operational SLA or on-call rotation | Data owners exist and respond informally, but no formal incident response SLA | Named on-call owner per domain with a contractual SLA (e.g., 4h critical, 24h standard) tracked in an incident system |
| 8 | Graceful Degradation | No defined agent behavior for data quality drops; agent proceeds on bad data or crashes | A fallback behavior exists but is undocumented and untested in production conditions | Tested degradation path: agent detects quality signal, halts with a clear error, or escalates — never silently continues |
| 9 | End-to-End Audit Trail | No agent-level audit logging; standard DB logs exist but are not mapped to agent actions | Agent actions logged but without the data version or triggering condition; correlation requires manual investigation | Full audit trail: every agent action links to data version, triggering condition, and agent identity — queryable within minutes |
| 10 | Domain Isolation | Agent access not bounded by domain; can reach any data the service account permits | Domain boundaries enforced at the application layer only, not at the data layer | Isolation enforced at the data layer; adding a new domain requires an explicit grant and is auditable |
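Q5 and Q8 together describe a gate that runs before the agent reads and emits a signal the agent can act on. A minimal sketch, with illustrative field names and thresholds:

```python
from dataclasses import dataclass
from enum import Enum

class DataHealth(Enum):
    OK = "ok"
    DEGRADED = "degraded"
    FAILED = "failed"

@dataclass
class GateResult:
    health: DataHealth
    reasons: list

def quality_gate(rows, required_fields=("customer_id", "amount"),
                 null_tolerance=0.01):
    """Pre-read quality gate sketch; field names and thresholds are
    illustrative. Runs before the agent reads and returns a signal
    the agent can use to pause or halt instead of silently continuing."""
    if not rows:
        return GateResult(DataHealth.FAILED, ["no rows returned"])
    reasons = []
    for field in required_fields:
        null_rate = sum(1 for r in rows if r.get(field) is None) / len(rows)
        if null_rate > null_tolerance:
            reasons.append(f"{field}: {null_rate:.0%} nulls > {null_tolerance:.0%}")
    if reasons:
        return GateResult(DataHealth.DEGRADED, reasons)
    return GateResult(DataHealth.OK, [])

# One null customer_id in ten rows (10%) trips the 1% tolerance.
rows = [{"customer_id": None, "amount": 12.0}] + \
       [{"customer_id": "c1", "amount": 12.0}] * 9
print(quality_gate(rows).health)  # a degraded-data signal, not a crash
```

The contract matters more than the checks themselves: the agent receives `DEGRADED` or `FAILED` before acting, which is what a score of 1 on Q5 and Q8 requires.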
Interpreting Your Score
Four tiers, four deployment strategies — and which questions to fix in which order
Total scores cluster around recognizable patterns. Teams scoring 3–4 typically have solid schema documentation and a freshness SLA on paper, but fail on operational ownership (Q7) and quality gates (Q5). Teams scoring 6–7 usually pass the structural questions and stall on explainability (Q6) and audit trail (Q9). Teams at 9–10 are genuinely uncommon — and they usually got there by running exactly this kind of assessment, finding three or four gaps, and fixing them before deployment.
One important calibration: a score of 6 is not a failure. It's information. If your deployment target is a supervised agent with human approval queues, score 6 is operationally sufficient. If your target is autonomous execution across multiple domains, score 6 means you have three or four specific dimensions to address first. The score means different things depending on what you're building toward.
| Score | Tier | Deployment Model | Fix These Dimensions First |
|---|---|---|---|
| 0–3 | Analytics Tier | Agents assist humans with research, summarization, and reporting only — no autonomous actions of any kind | Q1 (freshness SLA), Q2 (schema contracts), Q7 (operational ownership) — structural gaps that block all higher tiers |
| 4–6 | Supervised Tier | Agents propose actions; a human approves every execution before it runs — no autonomous decisions | Q5 (pre-agent quality gates), Q8 (graceful degradation), Q10 (domain isolation) — runtime governance gaps |
| 7–8 | Narrow Autonomy | Autonomous execution within a single, bounded domain on low-stakes action types only | Q6 (decision explainability), Q9 (audit trail) — accountability gaps that block multi-domain or high-stakes use |
| 9–10 | Full Autonomy | Multi-domain, high-stakes autonomous decisions in production with legal and operational accountability | Ongoing: monitor Q1 (freshness drift as domains expand), Q3 (entity consistency across new sources), Q8 (degradation paths for each new domain) |
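The tier boundaries above can be encoded directly. One assumption in this sketch: half-point totals (e.g. 3.5) land in the higher tier, since the table only defines whole-number ranges.

```python
def readiness_tier(score: float) -> str:
    """Map a 0-10 assessment total to the deployment tiers above.
    Assumption: half-point totals (3.5, 6.5, 8.5) fall into the
    higher tier, since the table defines whole-number ranges only."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 3:
        return "Analytics Tier"
    if score <= 6:
        return "Supervised Tier"
    if score <= 8:
        return "Narrow Autonomy"
    return "Full Autonomy"

print(readiness_tier(6))    # Supervised Tier: sufficient for human-approved agents
print(readiness_tier(7.5))  # Narrow Autonomy: single bounded domain only
```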
The Four Gaps Teams Almost Always Find
Patterns that appear across industries, tech stacks, and analytics maturity levels
After tracking agentic AI data readiness across organizations at various stages of their first production deployments, four specific gaps appear with near-universal consistency — regardless of industry, team size, or how sophisticated the analytics stack is.
The freshness assumption gap (Q1). Every team believes their data is fresh enough. Almost none have measured it. When teams run `SELECT MAX(updated_at) FROM your_table` in production for the first time, they routinely discover that what they called "real-time" data arrives with a 6–23 hour lag. Agents making customer service decisions, financial approvals, or supply chain allocations on a data state that a human analyst would immediately flag as stale — that's where silent failures originate.
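Measuring actual data age is a one-query exercise. The sketch below runs the same `SELECT MAX(updated_at)` check against an in-memory SQLite table seeded with a 14-hour-old row; the table and column names are illustrative.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def max_data_age_hours(conn, table, ts_column="updated_at"):
    """Actual data age: latest timestamp in the table vs. now."""
    latest = conn.execute(
        f"SELECT MAX({ts_column}) FROM {table}").fetchone()[0]
    return (datetime.now(timezone.utc)
            - datetime.fromisoformat(latest)).total_seconds() / 3600

# In-memory demo: the newest row is ~14 hours old.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, updated_at TEXT)")
stamp = (datetime.now(timezone.utc) - timedelta(hours=14)).isoformat()
conn.execute("INSERT INTO orders VALUES (1, ?)", (stamp,))
print(f"max data age: {max_data_age_hours(conn, 'orders'):.1f}h")  # 14.0h
```

Run this against each table your agent will read, and compare the result to the freshness language in your documentation.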
The cross-domain entity mismatch gap (Q3). This one is invisible until an agent starts joining across system boundaries. The same "customer" has a different unique identifier in the CRM, the billing system, and the support platform. An agent touching all three — say, an intelligent support agent checking billing status while resolving a ticket — silently matches the wrong records at a rate of 3–12% without a semantic resolution layer. That error rate is too low to catch in testing and too high to accept in production.
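A semantic resolution layer can start as something as simple as a maintained crosswalk from (system, local ID) pairs to a canonical entity ID. The identifiers below are invented for illustration; the important behavior is refusing to match rather than guessing.

```python
def resolve_entity(crosswalk, system, local_id):
    """Resolve a system-local ID to a canonical entity ID, or refuse.
    `crosswalk` stands in for the semantic layer named in Q3; returning
    None forces the agent to halt rather than silently join the wrong
    records."""
    return crosswalk.get((system, local_id))

# Invented identifiers: one customer known under three local IDs.
crosswalk = {
    ("crm", "C-1042"): "cust-7f3a",
    ("billing", "BL-99812"): "cust-7f3a",
    ("support", "TCK-USER-55"): "cust-7f3a",
}
assert resolve_entity(crosswalk, "billing", "BL-99812") == "cust-7f3a"
assert resolve_entity(crosswalk, "billing", "UNKNOWN") is None  # halt, don't guess
```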
The governance theater gap (Q7). As described above: the data owner exists on paper but has no operational SLA in practice. Until an agent fails because a data domain was unexpectedly migrated and nobody notified the team, this gap is effectively invisible. The signal to watch: if you can't identify who gets paged at 2am when this data source breaks, you have paper governance.
The audit trail gap (Q9). Teams building toward autonomous deployment rarely implement full audit trails upfront, because the regulatory or operational pressure hasn't hit yet. Then they attempt to explain an autonomous agent decision to a regulator or a customer and discover the log shows "action executed" but not what data state the agent operated on or what triggered the decision. Retrofitting end-to-end audit lineage into a production agent is substantially more expensive than building it in from the start.
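What "full data state captured at decision time" means in practice: each agent action writes a record linking its inputs, trigger, and data version, not just the output. A minimal sketch with hypothetical field names:

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    """One audit entry per agent action (Q6 + Q9): capture what the
    agent read and why it acted, not just that it acted. All field
    names here are illustrative."""
    agent_id: str
    action: str
    trigger: str       # condition that fired the decision
    data_version: str  # snapshot or commit ID of the data that was read
    inputs: dict       # exact records and their age at read time
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = DecisionRecord(
    agent_id="refund-agent-01",
    action="approve_refund",
    trigger="refund_amount < auto_approve_limit",
    data_version="warehouse-snap-0042",
    inputs={"order_id": "O-123", "amount": 42.50, "read_age_s": 31},
)
print(json.dumps(asdict(record), indent=2))
```

A log shaped like this, written at decision time, is what makes "why did the agent do that?" answerable after the fact; retrofitting it later means the answer simply doesn't exist for past decisions.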
From Score to Improvement Roadmap
Four steps that work regardless of your starting score
1. Score your most critical data domain against production reality.
   Run this assessment against the data environment your highest-priority agent will actually read — not the cleaned warehouse, not the dev environment. Document specific evidence for each dimension: the actual `MAX(timestamp)` data age for Q1, the actual on-call test result for Q7, the actual location of quality checks in the pipeline for Q5.
2. Set your deployment target before reading the score.
   Decide before you score whether your first agent will be supervised (human approval required) or autonomous (no human in the loop). That target determines what your score means. A 6 for a supervised agent is sufficient. A 6 for an autonomous agent means you have gaps to close. Teams that skip this step often misread their score and deploy at the wrong autonomy level.
3. Fix structural gaps before operational ones — always in this order.
   Questions 1, 2, and 7 — freshness SLAs, schema contracts, and operational ownership — are structural. If these fail, investing in Q5 quality gates or Q9 audit trails is premature: the foundation they rely on isn't there. Structural gaps propagate; fixing operational gaps on top of them creates technical debt that surfaces later. Fix Q1, Q2, Q7 first, every time.
4. Re-run the assessment before each autonomy level change.
   Run this assessment before moving from supervised to narrow autonomy, and again before moving to full autonomy. The data environment changes as scope expands — new domains added, new upstream sources integrated, new agents joining a multi-agent workflow. A score of 7 on a single-domain agent can drop to 5 when a second domain is introduced without the corresponding entity resolution and isolation work.
How is this different from the 3-tier data readiness checklist?
The 3-tier checklist (Foundation, Workflow, Autonomous) is a pre-build gate: pass or fail, go or no-go before writing agent code. This assessment is a positioning tool: it tells you where your organization currently sits on the readiness spectrum, and what deployment model you can responsibly target. Use this assessment first to understand your baseline and calibrate your ambitions. Use the checklist once you've committed to a specific build to verify individual gates before code is written. They answer different questions.
We scored 4 but plan to reach 9 within six months. Can we start building now?
Yes — with discipline. Build for the supervised tier first. Ship real users through it with human approval queues. Use that supervised period to fix the gaps that separate you from narrow autonomy. The mistake is designing the agent for your future data infrastructure and deploying it as though the infrastructure already exists. Build for what's real now, improve in parallel, and upgrade the agent's autonomy level when the data is actually ready.
Q3 (cross-domain entity consistency) feels like a six-month data architecture project. How do we unblock our agent?
Scope your first agent to a single domain so Q3 doesn't apply to it. Build with explicit domain boundaries so the agent cannot make cross-domain joins. Then invest in entity resolution for a second domain before expanding there. The key discipline: never ship cross-domain joins before the entity resolution layer is in place. That's where the silent 3–12% mismatch rate described earlier originates — too low to catch in testing, too high to accept in production decisions.
Our agent is internal-only and relatively low-stakes. Do we still need a score of 9–10?
Probably not. Score requirements scale with decision stakes and audit exposure. An internal agent that drafts meeting summaries comfortably operates at score 4. An agent approving expense reports needs 6–7. An agent making pricing or credit decisions needs 9–10. The right question is: what is the worst-case consequence of this agent acting on stale, mismatched, or incorrect data — and can your organization explain and accept that outcome without a full audit trail? That answer determines your required score more accurately than any blanket rule.
- [1] McKinsey Technology — Building the foundations for agentic AI at scale (mckinsey.com)
- [2] CIO Dive / Deloitte — Governance gaps stifle agentic AI adoption (ciodive.com)
- [3] World Economic Forum — Agentic AI: Overcoming 3 obstacles to adoption and innovation (weforum.org)
- [4] TDWI — Agentic AI Readiness Assessment (tdwi.org)
- [5] Streamkap — AI Agent Data Infrastructure: How to Build the Data Layer Autonomous Agents Need (streamkap.com)