Five enforcement layers anchored to documented production incidents. Permission scoping, dry-run gates, deletion protection, blast radius scoring, and audit trails the agent cannot reach. Built before you need them, not after the first escape.
Why documented production incidents are permission failures, not model failures
A three-tier authorization model with a hard "NEVER" row for destructive operations
How to wire mandatory dry-run gates into every write path — with runnable TypeScript
Cloud-native deletion protection: which flags to flip and where they live on AWS, GCP, and Azure
Blast radius scoring across three dimensions: scope, reversibility, downstream coupling
Audit log architecture — append-only, out-of-reach, with a seven-field schema
Monday-morning rollout sequence across five steps
OWASP Agentic AI Top 10 (December 2025) mapping to each layer
FAQ answering the real objections: latency, CI/CD pipelines, chained operations, internal tools
In July 2025, an engineer ran a routine change through an agent-assisted coding tool during an active code freeze. The agent deleted the production database. Not staging. Not a test instance. The live database paying customers were sitting on top of. The post-mortem called it a "catastrophic failure."[1]
This was not a thought experiment from a safety paper. The incident landed in the AI Incident Database[2], got covered by Fortune, and chewed through engineering Slack channels for weeks. It was not unique. The same month, an AWS Cost Explorer environment in a Mainland China region went offline for 13 hours — also traced to an AI coding agent that deleted and recreated a production environment. By late 2025, the pattern was visible from a hundred feet up: agentic systems shipped into production with permission scopes that wildly exceeded the work they were doing, and with nothing meaningful between intent and execution.
This is not a playbook for slowing adoption. It covers the five enforcement layers that stop a helpful automation from becoming an expensive incident review. Each layer maps to a specific failure mode that has already happened to someone. Build them now, or build them in the post-mortem.
Each layer is anchored to a documented incident. None of this is theoretical.
The Replit database deletion[1] is the loudest example, not the only one. The AI Incident Database (Incident 1152)[2] catalogs the same shape across stacks and industries.
An agent with broad Terraform permissions runs a configuration change that includes a destroy on a production database. The IaC tool executes it faithfully — exactly what the agent asked for. The execution was not the failure. The permission scope that allowed a routine task to cascade into a destructive operation was.
A deployment agent rolls back a failed canary, picks a target three versions back, and reintroduces a security vulnerability that was patched two releases ago. Nobody had estimated the blast radius of the rollback path itself.[5]
A cleanup agent reads "older than 90 days" one way, the team meant another, and 18 months of audit logs under legal hold get deleted. No dry-run step ever showed what the operation would touch.
These are not failures of model intelligence. They are failures of architecture. The agents executed competently inside the permissions they had. The gap between agent capability and consequence containment is where the incidents live.
The OWASP Top 10 for Agentic Applications — released in December 2025 with input from over 100 security experts and endorsed by NIST, Microsoft, and NVIDIA[7] — formalizes what the incident record already showed: the two highest-risk agentic failure modes are Excessive Agency (an agent granted capabilities beyond the task it's doing) and Insufficient Authorization (no policy layer between the agent and a destructive tool call). Both are architecture failures, not model failures. Both map directly to layers 1 and 2 of this playbook.
Permission scoping, dry-run gates, deletion protection, blast radius scoring, audit design.
The most-skipped layer, and the highest leverage. The shared admin service account is the production default.
Permission scoping means each agent runs with the minimum credentials its actual function requires. Nothing more. The reason this is the most-skipped layer is structural: nobody owns scope cleanup. One admin service account is faster to wire up than per-agent scoped credentials, and "we will tighten it later" never gets prioritized over the next feature.[4]
The Replit incident happened in part because the agent held write access to production infrastructure during a code freeze.[1] A scoped agent doing code review would have held read-only access on the codebase and zero infrastructure permissions. The deletion path simply would not have existed.
Scoping happens at three levels:
Tool-level — which tools the agent can invoke at all. A documentation agent does not get deployment tools.
Resource-level — which targets each tool can hit. A query agent gets read access to specific tables, not the whole database.
Action-level — which operations are permitted within each tool. Read and list are near-zero risk. Create is moderate. Update and delete are high-risk and demand additional authorization paths.[3]
The OWASP Agentic AI Top 10 (December 2025) names this constraint precisely: treat agents as first-class identities with explicit, scoped permissions, use short-lived credentials that rotate frequently, and never allow implicit trust between agents.[7] The practical corollary is that a service account shared between five agents is five attack surfaces pretending to be one.
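The three scoping levels can be sketched as one deny-by-default lookup. This is a minimal illustration, not any vendor's API: `AgentScope`, `isAllowed`, the `sql` tool, and the `analytics.orders` table are hypothetical names chosen for the example.

```typescript
// Hypothetical scope model; every name here is an illustrative assumption.
type Action = "read" | "list" | "create" | "update" | "delete";

interface ToolCall {
  tool: string;
  resource: string;
  action: Action;
}

// tool name -> resource name -> permitted actions
type AgentScope = Map<string, Map<string, ReadonlySet<Action>>>;

// Deny by default: a call passes only if all three levels match.
function isAllowed(scope: AgentScope, call: ToolCall): boolean {
  const resources = scope.get(call.tool);
  if (resources === undefined) return false; // tool-level: tool not granted
  const actions = resources.get(call.resource);
  if (actions === undefined) return false; // resource-level: target not granted
  return actions.has(call.action); // action-level: operation not granted
}

// A query agent: one tool, one table, read and list only.
const queryAgentScope: AgentScope = new Map([
  ["sql", new Map([["analytics.orders", new Set<Action>(["read", "list"])]])],
]);
```

Anything missing from the map is refused, so a forgotten entry fails closed instead of open.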
| Tier | Permission Level | Actions Allowed | Human Approval | Examples |
|---|---|---|---|---|
| Tier 1 | Read-Only | Read, list, search, analyze | None required | Log analysis, code review, report generation |
| Tier 2 | Write-with-Confirmation | Create, update (with preview) | Required before execution | Config changes, PR creation, ticket updates |
| Tier 3 | Write-Autonomous | Create, update (within guardrails) | Post-hoc audit only | Formatting, dependency updates, test generation |
| NEVER | Destructive | Delete, drop, destroy, force-push | Always blocked for agent principals | Database drops, infrastructure teardown, data purge |
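The tier table can be read as a small decision function in which the NEVER row is checked before the tier is even consulted. The sketch below is one possible encoding. The type names, the `authorize` function, and the choice to let every tier read without approval are assumptions made for illustration.

```typescript
// Illustrative encoding of the tier table; names are assumptions.
type Tier = "tier1-read-only" | "tier2-write-with-confirmation" | "tier3-write-autonomous";
type Operation =
  | "read" | "list" | "search" | "analyze"
  | "create" | "update"
  | "delete" | "drop" | "destroy" | "force-push";
type Decision = "allow" | "require-approval" | "allow-with-audit" | "deny";

const DESTRUCTIVE: ReadonlySet<Operation> = new Set<Operation>([
  "delete", "drop", "destroy", "force-push",
]);
const READ_ONLY: ReadonlySet<Operation> = new Set<Operation>([
  "read", "list", "search", "analyze",
]);

function authorize(tier: Tier, op: Operation): Decision {
  // NEVER row: destructive operations are blocked for every agent tier.
  if (DESTRUCTIVE.has(op)) return "deny";
  // Assumption: reads need no approval in any tier.
  if (READ_ONLY.has(op)) return "allow";
  // Remaining operations are create and update.
  switch (tier) {
    case "tier1-read-only":
      return "deny";
    case "tier2-write-with-confirmation":
      return "require-approval";
    case "tier3-write-autonomous":
      return "allow-with-audit";
  }
}
```

Checking the destructive set first means no tier assignment, however permissive, can unlock a drop or a force-push.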
Every write gets a preview. The gate is the human review — an unread plan is decoration, not enforcement.
A dry-run gate is a mandatory preview between proposal and execution. The agent generates a plan — files modified, records affected, infrastructure resources created or destroyed — and a human approves before anything runs.
Terraform got this right with terraform plan. Before any apply, you see exactly what will be created, modified, and destroyed. The bitter joke about Terraform-related agent incidents is that the plan step existed and was either skipped or auto-approved without anyone reading it.[2] The gate is not the generation. The gate is the read.
For agentic systems, the dry-run gate is mandatory and non-bypassable for any Tier 2 operation. The plan must show what changes, how many entities are touched, and whether the operation is reversible.
Microsoft's Agent Governance Toolkit (released April 2026, open-source under MIT) implements this pattern as a stateless policy interceptor that runs at sub-millisecond latency (<0.1ms p99) before every tool call.[9] The same architecture principle applies whether you're using their toolkit or building your own: separate the decision to act from the authorization to execute, with the policy engine as the wall between them.
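One way to keep the decision to act apart from the authorization to execute is a two-step gate: the agent can only propose, and execution requires a plan id plus an approver. The sketch below is a simplified in-memory illustration and is not taken from any toolkit. `DryRunGate`, `WriteOp`, and the plain string approver are assumptions, and a real gate would verify the approver's identity.

```typescript
// Minimal two-step gate; all names are hypothetical.
interface Plan {
  id: string;
  summary: string;       // what changes
  affectedCount: number; // how many entities are touched
  reversible: boolean;   // whether the operation can be undone
}

interface WriteOp {
  describe(): Omit<Plan, "id">; // side-effect-free preview
  run(): string;                // the real write
}

class DryRunGate {
  private readonly pending = new Map<string, WriteOp>();
  private nextId = 1;

  // Step 1: the agent proposes. Nothing executes; only a plan is returned.
  propose(op: WriteOp): Plan {
    const id = `plan-${this.nextId++}`;
    this.pending.set(id, op);
    return { id, ...op.describe() };
  }

  // Step 2: reached only through the human approval path.
  execute(planId: string, approvedBy: string | null): string {
    const op = this.pending.get(planId);
    if (op === undefined) throw new Error(`unknown or already executed plan: ${planId}`);
    if (approvedBy === null || approvedBy.trim() === "") {
      throw new Error(`plan ${planId} has no approver`);
    }
    this.pending.delete(planId); // each approval is single-use
    return op.run();
  }
}
```

Because the write only exists behind `execute`, skipping the preview is not an option the agent has, and an approval cannot be replayed against a second run.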
Below permissions, below approvals: hard infrastructure constraints that the agent cannot reason around.
Deletion protection is the last line — a hard infrastructure constraint that prevents destruction of critical resources regardless of who or what issues the call. This is not access control. It's making certain actions physically impossible without a separate, deliberate process executed by something other than the agent.
AWS ships deletion protection on RDS and DynamoDB, and termination protection on CloudFormation stacks. GCP has it for Cloud SQL and GKE clusters. Azure has resource locks that apply at the subscription, resource group, or individual resource level. Terraform has prevent_destroy lifecycle rules. Every major cloud provider arrived at the same conclusion independently: some resources need protection that lives below access controls.[6] Every production environment running agentic systems should have these flags on before the first agent touches the environment.
The pipeline agent that deleted 18 months of audit logs would have failed at the storage layer if object lock and a retention policy had been enabled. The Terraform agent could not have destroyed the database with prevent_destroy = true set on the resource. These are five-minute configuration changes that close the door on million-dollar incidents.
| Provider | Resource Type | Protection Mechanism | CLI / Config |
|---|---|---|---|
| AWS | RDS, DynamoDB, CloudFormation | Deletion Protection flag | aws rds modify-db-instance --deletion-protection |
| AWS | S3 backups / audit logs | Object Lock + MFA Delete | aws s3api put-object-lock-configuration |
| GCP | Cloud SQL, GKE clusters | Deletion Protection flag | gcloud sql instances patch --deletion-protection |
| Azure | Resource groups, critical resources | Management Lock (CanNotDelete) | az lock create --lock-type CanNotDelete |
| Terraform | Any resource | prevent_destroy lifecycle rule | lifecycle { prevent_destroy = true } |
| Kubernetes | Namespaces, CRDs | Admission webhooks + finalizers | kubectl annotate namespace production protect=true |
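A pre-flight check can confirm the flags in the table are actually on before an agent is enabled. The sketch below assumes a hypothetical inventory that has already been collected from the provider APIs. It makes no cloud calls, and `ResourceRecord` and `unprotectedProduction` are illustrative names.

```typescript
// Hypothetical inventory record; populate it from your provider's APIs.
interface ResourceRecord {
  id: string;
  kind: "database" | "bucket" | "stack";
  environment: "production" | "staging";
  deletionProtected: boolean;
}

// Returns the ids of production resources that still lack deletion protection.
function unprotectedProduction(inventory: readonly ResourceRecord[]): string[] {
  return inventory
    .filter((r) => r.environment === "production" && !r.deletionProtected)
    .map((r) => r.id);
}

const inventory: ResourceRecord[] = [
  { id: "orders-db", kind: "database", environment: "production", deletionProtected: true },
  { id: "audit-logs", kind: "bucket", environment: "production", deletionProtected: false },
  { id: "scratch-db", kind: "database", environment: "staging", deletionProtected: false },
];

const gaps = unprotectedProduction(inventory);
if (gaps.length > 0) {
  console.error(`Deletion protection missing on: ${gaps.join(", ")}`);
}
```

Run as a deployment precondition, a non-empty result would block the agent rollout until the gaps are closed.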
Score the damage envelope before execution. Scope, reversibility, downstream coupling.
Blast radius scoring calculates the worst-case impact of an action before it runs. One question, answered before execution: if this fails or behaves outside spec, what is the maximum damage?[5]
We got this wrong on a live system. An early version of our scoring labeled a config file change "low" — single file, fully versioned, clean rollback path. The file was read by seven services at startup. The change took all seven down on the next rolling restart. The blast radius of the file modification was low. The blast radius of its downstream effects was critical. Dependency mapping is not optional.
Three dimensions decide the score:
Scope — how many resources, records, or systems land in the affected set. An operation touching 5 files is smaller than one touching 5,000 database rows. These thresholds are starting points; calibrate against your workload.
Reversibility — can it be undone, and at what cost. A file modification under version control is fully reversible. A DELETE against a database with no recent backup is effectively irreversible. Irreversible actions are categorically more dangerous regardless of how few entities they touch.
Downstream coupling — what depends on the resources being modified. Changing a shared API contract hits every consumer. Modifying a configuration file hits every service that reads it. The blast radius is not the target. It's the target plus everything coupled to it.
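The three dimensions can be combined by scoring each one separately and taking the worst. The sketch below does that. The thresholds, the rule that irreversible actions score at least high, and names such as `blastRadius` are assumptions to calibrate against your own workload.

```typescript
type Level = "low" | "medium" | "high" | "critical";
const LEVELS: readonly Level[] = ["low", "medium", "high", "critical"];

interface Impact {
  affectedEntities: number;      // scope
  reversible: boolean;           // reversibility
  dependents: readonly string[]; // downstream coupling
}

// Assumed starting thresholds; tune them per workload.
function scopeLevel(count: number): Level {
  if (count <= 5) return "low";
  if (count <= 500) return "medium";
  if (count <= 5000) return "high";
  return "critical";
}

// Assumption: irreversible actions are never scored below high.
function reversibilityLevel(reversible: boolean): Level {
  return reversible ? "low" : "high";
}

function couplingLevel(dependentCount: number): Level {
  if (dependentCount === 0) return "low";
  if (dependentCount <= 2) return "medium";
  if (dependentCount <= 5) return "high";
  return "critical";
}

function maxLevel(...levels: Level[]): Level {
  return levels.reduce((a, b) => (LEVELS.indexOf(b) > LEVELS.indexOf(a) ? b : a), "low");
}

// The overall score is the worst of the three dimensions.
function blastRadius(impact: Impact): Level {
  return maxLevel(
    scopeLevel(impact.affectedEntities),
    reversibilityLevel(impact.reversible),
    couplingLevel(impact.dependents.length),
  );
}
```

Under these thresholds, the config file from the anecdote (one file, fully reversible, seven dependent services) scores critical on coupling alone, which is the outcome the original single-dimension scoring missed.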
When something goes wrong — and it will — the audit trail is the difference between forensics and folklore.
Every agent action — including denials and escalations — gets logged with enough context to reconstruct the incident. The audit log is not compliance theater. It is the forensic instrument that turns a vague production failure into a diagnosable, preventable event.[3]
A real audit log captures seven fields per action: agent identity, action requested, timestamp, authorization decision (approved, denied, escalated), the dry-run preview that was shown, the execution result, and the rollback status if applicable.
The load-bearing design decision is where the log lives. Outside the agent's reach. If the agent can write to its own audit trail, a malfunctioning agent can erase the evidence of its malfunction during the exact failure mode the log was meant to capture. Append-only storage in a separate account or service. The agent has zero write access. No exceptions.
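The seven fields and the append-only contract can be sketched as a record type plus a sink that exposes no update or delete method. The field names and `AppendOnlyLog` class are illustrative assumptions. An in-memory class only shows the interface shape, and the real isolation comes from the separate account and its access controls.

```typescript
type AuthorizationDecision = "approved" | "denied" | "escalated";

// One record per agent action; field names are illustrative.
interface AuditRecord {
  agentId: string;                 // 1. agent identity
  action: string;                  // 2. action requested
  timestamp: string;               // 3. ISO 8601 timestamp
  decision: AuthorizationDecision; // 4. authorization decision
  dryRunPreview: string | null;    // 5. preview that was shown
  executionResult: string | null;  // 6. execution result
  rollbackStatus: "not-applicable" | "available" | "rolled-back" | "failed"; // 7.
}

// The sink offers append and read only. There is no update or delete.
class AppendOnlyLog {
  private readonly records: AuditRecord[] = [];

  // Stores a frozen copy so later changes to the caller's object cannot alter history.
  append(record: AuditRecord): number {
    this.records.push(Object.freeze({ ...record }));
    return this.records.length - 1;
  }

  // Returns a copy of the array so callers cannot truncate or reorder the log.
  read(): readonly AuditRecord[] {
    return this.records.slice();
  }
}
```

In this design the policy layer holds the `append` handle and the agent holds nothing, so denials and escalations are recorded by the component that made the decision.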
AI incident investigations without tamper-evident audit trails consistently devolve into blame games. Detailed logs that tie entitlements directly to actions are what separate a 30-minute forensic reconstruction from a three-day postmortem that concludes with "we're not sure what happened."[10]
Sequenced rollout. Each step closes a documented incident class.
Inventory every tool and resource your agents touch. Classify each as read-only, write-with-confirmation, or write-autonomous. Mark destructive operations (delete, drop, destroy) as permanently blocked for agent principals. Until the model exists in writing, scope drift is the default state.
Build the preview step into every write-capable tool. Tier 2 routes through human approval before execution. Tier 3 logs the preview and proceeds within guardrails. Skipping the preview is a safety violation, not a performance optimization.
One-time hardening. Enable every available deletion protection mechanism on databases, storage buckets, infrastructure stacks, and git branches. This layer is independent of the agent — it protects against any deletion source, including a tired engineer at 2am.
Extend the dry-run preview to include a blast radius score. Count affected entities, check reversibility, map downstream dependencies. Anything scoring critical auto-escalates regardless of tier. The dependency map is what catches the seven-service config change masquerading as a one-file edit.
Append-only sink in a separate account or service. Log every action — including denials and escalations — with the seven required fields. Set retention beyond your compliance floor. Verify the agent has zero write access by attempting to write from agent credentials and asserting the IAM denial.
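The zero-write-access verification in the last step can be sketched as a negative test: attempt a write with the agent's credentials and pass only on an explicit access denial. In the sketch below, `attemptWrite` is a hypothetical function you would implement against your own log sink, and the `AccessDenied` code is a placeholder for whatever your provider returns.

```typescript
type WriteResult = { ok: true } | { ok: false; errorCode: string };

// Passes only when the write is refused for the expected reason.
// A timeout or network error is not proof that the agent lacks access.
function verifyWriteDenied(
  attemptWrite: () => WriteResult,
  expectedDenialCode: string = "AccessDenied", // placeholder, provider-specific
): void {
  const result = attemptWrite();
  if (result.ok) {
    throw new Error("agent credentials were able to write to the audit log");
  }
  if (result.errorCode !== expectedDenialCode) {
    throw new Error(`write failed for an unexpected reason: ${result.errorCode}`);
  }
}
```

Treating any other failure as a test failure matters: a write that fails because of a transient outage would otherwise pass the check while the permission is still wide open.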
Agent inherits a broad service account with admin permissions
Writes execute the moment the agent decides to call them
Production databases can be dropped by any authenticated caller
Downstream impact is discovered after the rolling restart
Audit logs sit alongside application data — agent-writable
Incident response starts cold: what happened, who did it, when
Agent runs with credentials scoped to its actual function, rotated and short-lived
Every write hits a preview; Tier 2 blocks on human approval with a 30-minute timeout
Deletion protection makes database drops physically impossible from any automated caller
Blast radius scored before execution; critical auto-escalates to human queue
Audit log lives in a separate account, append-only, agent cannot write, IAM-verified
Incident response starts hot: here is the trail, here is the call, here is the rollback path
The December 2025 peer-reviewed framework maps directly to the five layers.
| Layer | OWASP Risk | What It Addresses |
|---|---|---|
| L1: Permission Scoping | Excessive Agency (LLM06:2025) / ASI01 | Agent granted capabilities beyond task scope — the #1 production failure mode in the incident record |
| L2: Dry-Run Gates | Insufficient Authorization / ASI02 | No policy boundary between the agent and a destructive tool call; approval state not verified before execution |
| L3: Deletion Protection | Insecure Output Handling | Agent output (a tool call) executes without infrastructure-level containment; no hard stop below access controls |
| L4: Blast Radius Scoring | Excessive Agency / Supply Chain Risks | Agent acts without estimating downstream coupling; shared resources treated as isolated targets |
| L5: Audit Logging | Insufficient Logging & Monitoring (ASI09) | No tamper-evident record of agent actions; forensic reconstruction impossible after an incident |
Does this slow agent execution to the point of being useless?
Tier 1 read-only calls clear the stack in milliseconds — no approval, just logging. Tier 3 write-autonomous adds dry-run and blast radius scoring, measured in one to three seconds. Microsoft's Agent Governance Toolkit reports sub-millisecond policy enforcement at p99 for the intercept layer itself. Only Tier 2 stops for a human, and those are the operations where a few minutes of delay buys you hours not spent on incident response. The throughput cost is real and small. The throughput cost without it is extracted in unplanned 3am pages.
How do you handle agents that chain multiple operations?
Treat the chain as one unit of work with a combined blast radius. If an agent modifies a config file, restarts a service, and verifies health, the dry-run shows the entire chain, the score reflects the combined impact, and the approval covers the full sequence. Per-step approvals across a chain produce approval fatigue, which is how the gate stops enforcing anything. One approval, one audit entry, one rollback plan for the full chain.
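Treating a chain as one unit of work can be sketched as folding the per-step scores into a single plan. The version below assumes the combined level is the worst step and that the chain is reversible only if every step is. Both rules and all names are illustrative, and a real chain can be worse than any single step.

```typescript
type Level = "low" | "medium" | "high" | "critical";
const LEVELS: readonly Level[] = ["low", "medium", "high", "critical"];

interface Step {
  name: string;
  level: Level;
  reversible: boolean;
}

interface ChainPlan {
  steps: string[];     // the full sequence shown in one preview
  level: Level;        // combined blast radius for the single approval
  reversible: boolean; // true only if every step can be undone
}

function planChain(steps: readonly Step[]): ChainPlan {
  if (steps.length === 0) throw new Error("a chain needs at least one step");
  const worst = steps.reduce<Level>(
    (acc, s) => (LEVELS.indexOf(s.level) > LEVELS.indexOf(acc) ? s.level : acc),
    "low",
  );
  return {
    steps: steps.map((s) => s.name),
    level: worst,
    reversible: steps.every((s) => s.reversible),
  };
}

const restartChain = planChain([
  { name: "modify config", level: "low", reversible: true },
  { name: "restart service", level: "high", reversible: true },
  { name: "verify health", level: "low", reversible: true },
]);
```

The approver sees one plan with the highest level in the chain, which is what keeps a harmless-looking first step from pulling a risky later step through on its coattails.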
What about agents in CI/CD pipelines where there is no human in the loop?
Classify CI/CD agents as Tier 3 inside strict scope limits. They create PRs, run tests, update non-production resources autonomously. Production deployments and infrastructure changes still hit a human approval — wired as a pipeline gate, not an interactive prompt. If no human approves before the timeout window closes, the pipeline pauses. Pausing is a feature. Auto-proceeding on timeout is the failure. AWS AgentCore Evaluations runs 13 pre-built behavioral evaluators continuously against CI/CD agent interactions to catch unexpected patterns before they reach prod.
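The pause-on-timeout rule can be made structural by writing the gate so that no code path leads from a timeout to execution. The sketch below is a pure function over a hypothetical approval state. The type names and outcome labels are assumptions, not the API of any pipeline product.

```typescript
type ApprovalState =
  | { kind: "pending"; requestedAtMs: number }
  | { kind: "approved"; by: string }
  | { kind: "rejected"; by: string };

type GateOutcome = "proceed" | "waiting" | "paused" | "aborted";

// Only an explicit approval yields "proceed". A timeout pauses the pipeline.
function evaluateGate(state: ApprovalState, nowMs: number, timeoutMs: number): GateOutcome {
  switch (state.kind) {
    case "approved":
      return "proceed";
    case "rejected":
      return "aborted";
    case "pending":
      return nowMs - state.requestedAtMs >= timeoutMs ? "paused" : "waiting";
  }
}
```

With a 30-minute window, a request that is still pending at minute 31 evaluates to `paused`, and it stays paused until a human moves the state to approved or rejected.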
Is five layers overkill for internal tools that touch non-critical data?
You can simplify, but not below three: permission scoping (always), audit logging (always), and one of the three middle layers. The tier model handles this naturally — most low-risk operations are Tier 1 or 3 and never hit the approval gate. The trap is the inverse: too many gates on low-stakes operations train humans to rubber-stamp without reading, and the gate stops enforcing anything. Calibrate so that real approvals demand real attention. The minimum viable stack is: scoped credentials + dry-run for Tier 2 + append-only audit log.
Do I need to build all of this from scratch?
No. AWS Bedrock AgentCore ships Cedar-based policy enforcement, per-agent identity scoping, and continuous behavioral evaluation. Microsoft's Agent Governance Toolkit (open-source, MIT license, April 2026) covers all 10 OWASP Agentic Top 10 risks and supports Python, TypeScript, Go, and Rust. GCP Vertex AI Agent Engine and Azure AI Agent Service both emit per-agent traces to Cloud Trace and Application Insights respectively. The implementation gap is configuration, not missing primitives — most teams ship without safety controls because nobody prioritized the integration sprint, not because the tools don't exist.
What is the minimum I should do before an agent touches production this week?
Three things, in order. First: provision scoped credentials for the agent — not the shared admin account. Takes under an hour. Second: turn on deletion protection for every database and storage bucket the agent can reach. Also under an hour. Third: stand up an append-only audit log in a separate account before the first action executes — AWS CloudWatch Logs in a separate account is a 30-minute setup. These three controls would have prevented the Replit incident, the AWS Cost Explorer outage, and the audit log deletion scenario. Start there. Add dry-run gates and blast radius scoring in the next sprint.
Delete, drop, destroy, force-push are permanently blocked on agent principals. When a legitimate destructive operation is needed, a human runs it manually with their own credentials. There is no shortcut around this one.
No write skips the plan. The preview shows what changes, how many entities are touched, and whether the operation is reversible. The gate is the human reading the plan, not the system generating it.
If the agent can write to its own audit trail, the failure mode the log was supposed to catch is the one that erases the evidence. Append-only, separate account, separate access controls. Verify this in CI.
When the score returns critical, the operation lands on a human queue regardless of the agent's tier. This is the catch-net for the cases permission scoping alone cannot anticipate, including downstream coupling nobody mapped yet.
All five layers exist before the agent gets near production. Not after the first incident review. Every documented production agent failure was preventable by enforcement layers that were known and not yet implemented. The first escape is not a learning opportunity. It is a customer outage.
The OWASP Agentic Top 10, AWS AgentCore, and Microsoft's open-source Agent Governance Toolkit all arrived at the same architecture independently in 2025 and 2026. That convergence is the signal. The controls are known, the primitives exist on every major cloud, and the incident record is long enough to remove all ambiguity about what happens without them. The only remaining question is whether you build the cage before the agent ships, or after the agent finds the door.