In July 2025, an engineer ran a routine change through an agent-assisted coding tool during an active code freeze. The agent deleted the production database. Not staging. Not a test instance. The live database paying customers were sitting on top of. The post-mortem called it a "catastrophic failure."[1]
This was not a thought experiment from a safety paper. The incident landed in the AI Incident Database[2], got covered by Fortune, and chewed through engineering Slack channels for weeks. It was also not unique. By late 2025 the pattern was visible from a hundred feet up: agentic systems shipped into production with permission scopes that wildly exceeded the work they were doing, and with nothing meaningful between intent and execution.
The playbook here is not about slowing adoption. It is about the five enforcement layers that stop a helpful automation from becoming an expensive incident review. Each layer maps to a specific failure mode that has already happened to someone. Build them now or build them in the post-mortem.
Every Pattern Here Comes From Something That Already Broke
Each layer is anchored to a documented incident. None of this is theoretical.
The Replit database deletion[1] is the loudest example, not the only one. The AI Incident Database (Incident 1152)[2] catalogs the same shape across stacks and industries:
- An agent with broad Terraform permissions runs a configuration change that includes a destroy on a production database. The IaC tool executes it faithfully, exactly as the agent asked. The execution was not the failure. The permission scope that allowed a routine task to cascade into a destructive operation was.
- A deployment agent rolls back a failed canary, picks a target three versions back, and reintroduces a security vulnerability that was patched two releases ago. Nobody had estimated the blast radius of the rollback path itself.[5]
- A cleanup agent reads "older than 90 days" one way, the team meant another, and 18 months of audit logs under legal hold get deleted. No dry-run step ever showed what the operation would touch.
These are not failures of model intelligence. They are failures of architecture. The agents executed competently inside the permissions they had. The gap between agent capability and consequence containment is where the incidents live.
Five Layers. Each One Catches What the Last One Lets Through.
Permission scoping, dry-run gates, deletion protection, blast radius scoring, audit design.
Layer 1: Permission Scoping. An Agent Cannot Break What It Cannot Touch.
The most-skipped layer, and the highest leverage. The shared admin service account is the production default.
Permission scoping means each agent runs with the minimum credentials its actual function requires. Nothing more. The reason this is the most-skipped layer is structural: nobody owns scope cleanup. One admin service account is faster to wire up than per-agent scoped credentials, and "we will tighten it later" never gets prioritized over the next feature.[4]
The Replit incident happened in part because the agent had write access to production infrastructure during a code freeze.[1] A scoped agent doing code review would have held read-only access on the codebase and zero infrastructure permissions. The deletion path simply would not have existed.
Scoping happens at three levels:
- Tool-level: which tools the agent can invoke at all. A documentation agent does not get deployment tools.
- Resource-level: which targets each tool can hit. A query agent gets read access to specific tables, not the whole database.
- Action-level: which operations are permitted. Read and list are near-zero risk. Create is moderate. Update and delete are high-risk and demand additional authorization paths.[3]
| Tier | Permission Level | Actions Allowed | Human Approval | Examples |
|---|---|---|---|---|
| Tier 1 | Read-Only | Read, list, search, analyze | None required | Log analysis, code review, report generation |
| Tier 2 | Write-with-Confirmation | Create, update (with preview) | Required before execution | Config changes, PR creation, ticket updates |
| Tier 3 | Write-Autonomous | Create, update (within guardrails) | Post-hoc audit only | Formatting, dependency updates, test generation |
| NEVER | Destructive | Delete, drop, destroy, force-push | Always blocked for agents | Database drops, infrastructure teardown, data purge |
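The tier table and the three scoping levels can be enforced with a deny-by-default check that runs before any tool call. The following is a minimal sketch under stated assumptions: `AgentScope`, `ToolCall`, `authorize`, and the action names are hypothetical and not taken from any particular agent framework.

```typescript
// Hypothetical scoping check: deny by default, destructive actions never allowed.
type Action = 'read' | 'list' | 'create' | 'update' | 'delete' | 'drop' | 'destroy' | 'force-push';
type Tier = 'read' | 'write-confirm' | 'write-auto';

interface AgentScope {
  agentId: string;
  tier: Tier;
  tools: string[];     // tool-level: which tools the agent may invoke
  resources: string[]; // resource-level: exact targets those tools may hit
}

interface ToolCall {
  tool: string;
  resource: string;
  action: Action;
}

type Decision =
  | { allowed: true; needsApproval: boolean }
  | { allowed: false; reason: string };

const DESTRUCTIVE: readonly Action[] = ['delete', 'drop', 'destroy', 'force-push'];
const READ_ONLY: readonly Action[] = ['read', 'list'];

function authorize(scope: AgentScope, call: ToolCall): Decision {
  // Action-level: the NEVER row applies to every agent principal.
  if (DESTRUCTIVE.indexOf(call.action) !== -1) {
    return { allowed: false, reason: 'destructive actions are blocked for agents' };
  }
  // Tool-level and resource-level: anything not listed is out of scope.
  if (scope.tools.indexOf(call.tool) === -1) {
    return { allowed: false, reason: `tool not in scope: ${call.tool}` };
  }
  if (scope.resources.indexOf(call.resource) === -1) {
    return { allowed: false, reason: `resource not in scope: ${call.resource}` };
  }
  if (READ_ONLY.indexOf(call.action) !== -1) {
    return { allowed: true, needsApproval: false };
  }
  // Writes: Tier 1 agents cannot write; Tier 2 needs a human before execution.
  if (scope.tier === 'read') {
    return { allowed: false, reason: 'read-only agent cannot write' };
  }
  return { allowed: true, needsApproval: scope.tier === 'write-confirm' };
}
```

Because every branch that is not explicitly allowed returns a denial, forgetting to list a tool or resource fails closed rather than open.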
Layer 2: Dry-Run Gates. The Plan, Then the Apply.
Every write gets a preview. The gate is the human review — an unread plan is decoration, not enforcement.
A dry-run gate is a mandatory preview between proposal and execution. The agent generates a plan — files modified, records affected, infrastructure resources created or destroyed — and a human approves before anything runs.
Terraform got this right with terraform plan. Before any apply, you see exactly what will be created, modified, and destroyed. The bitter joke about Terraform-related agent incidents is that the plan step existed and was either skipped or auto-approved without anyone reading it.[2] The gate is not the generation. The gate is the read.
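One way to keep a plan from being auto-approved unread is to check it mechanically for destroy actions first. The sketch below assumes the plan has been exported with `terraform show -json` and reads only the `resource_changes[].address` and `change.actions` fields of that output; the function names `findDestructiveChanges` and `canAutoApprove` are hypothetical.

```typescript
// Minimal shape of the fields this sketch reads from Terraform's JSON plan output.
interface ResourceChange {
  address: string;
  change: { actions: string[] };
}

interface TerraformPlan {
  resource_changes?: ResourceChange[];
}

// Returns the addresses of resources the plan would delete or replace.
// A replacement shows up as both "delete" and "create" in the actions list.
function findDestructiveChanges(plan: TerraformPlan): string[] {
  return (plan.resource_changes ?? [])
    .filter((rc) => rc.change.actions.some((a) => a === 'delete'))
    .map((rc) => rc.address);
}

// Any plan that destroys something must go to a human reader.
function canAutoApprove(plan: TerraformPlan): boolean {
  return findDestructiveChanges(plan).length === 0;
}
```

A check like this does not replace the human read. It only guarantees that the plans most in need of one cannot skip it.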
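For agentic systems, the dry-run gate is mandatory and non-bypassable for every write operation, Tier 2 and Tier 3 alike. Tier 2 blocks on human approval of the plan; Tier 3 logs the plan and proceeds within guardrails. The plan must show: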
- What changes (specific files, records, resources)
- How many entities are touched ("3 files" vs "4,200 database rows")
- Whether the change is reversible — and if so, by what mechanism
- What the rollback path actually looks like when something fails
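dry-run-gate.ts

```typescript
// Dry-run gate. Plan first, approve where required, then execute. No write skips the plan.
interface DryRunResult {
  action: string;
  tier: 'read' | 'write-confirm' | 'write-auto';
  affectedResources: {
    type: string;
    count: number;
    identifiers: string[];
  }[];
  reversible: boolean;
  rollbackPlan?: string;
  estimatedBlastRadius: 'none' | 'low' | 'medium' | 'high' | 'critical';
  requiresApproval: boolean;
}

// Assumed shapes and helpers, declared here only so the gate type-checks;
// the real definitions live elsewhere in the codebase.
type AgentAction = { type: string; target: string };
type ExecutionContext = { agentId: string; sessionId: string };
type ExecutionResult = { status: string; reason?: string };
declare function generateDryRun(action: AgentAction, context: ExecutionContext): Promise<DryRunResult>;
declare function requestHumanApproval(preview: DryRunResult): Promise<{ granted: boolean; reason?: string }>;
declare function executeWithAudit(
  action: AgentAction,
  preview: DryRunResult,
  context: ExecutionContext
): Promise<ExecutionResult>;

async function executeSafely(
  action: AgentAction,
  context: ExecutionContext
): Promise<ExecutionResult> {
  // Generate the plan before anything touches a real resource.
  const preview = await generateDryRun(action, context);

  // Critical blast radius escalates regardless of tier, before any approval prompt.
  if (preview.estimatedBlastRadius === 'critical') {
    return { status: 'escalated', reason: 'Blast radius exceeds threshold' };
  }

  // Approval gate. Tier 2 (write-confirm) blocks here.
  if (preview.requiresApproval) {
    const approval = await requestHumanApproval(preview);
    if (!approval.granted) {
      return { status: 'denied', reason: approval.reason };
    }
  }

  // Execute, audit, log every field. The audit log is the source of truth.
  return await executeWithAudit(action, preview, context);
}
```

Layer 3: Make Catastrophic Actions Physically Impossible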
Below permissions, below approvals: hard infrastructure constraints that the agent cannot reason around.
Deletion protection is the last line — a hard infrastructure constraint that prevents destruction of critical resources regardless of who or what issues the call. This is not access control. This is making certain actions physically impossible without a separate, deliberate process executed by something other than the agent.
AWS ships deletion protection on RDS and DynamoDB, and termination protection on CloudFormation stacks. GCP has it for Cloud SQL and GKE. Terraform has prevent_destroy lifecycle rules. Every major cloud provider arrived at the same conclusion: some resources need protection that lives below access controls. Every production environment running agentic systems should turn these flags on.[6]
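The pipeline agent that deleted 18 months of audit logs would have failed at the storage layer if object lock and a retention policy had been enabled. The Terraform agent's plan would have been rejected with prevent_destroy = true set on the resource, as long as the resource block itself stayed in the configuration. Terraform does not apply the rule to a resource whose block has been removed entirely, which is one more reason to enable the provider's own deletion protection alongside it. These are five-minute configuration changes that close the door on million-dollar incidents.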
Infrastructure Deletion Protection Checklist
- Deletion protection enabled on every production database (RDS, Cloud SQL, equivalent)
- prevent_destroy lifecycle set on every Terraform resource that cannot survive recreation
- Object lock enabled on audit log buckets, with the retention period set to the legal-hold floor
- Production Kubernetes namespaces protected behind admission webhooks
- Branch protection enforced on main and production-tracking branches
- MFA delete enabled on every S3 bucket holding backups
- Termination protection set on production EC2 instances and ECS services
- Soft-delete with retention enabled on every data store an agent can reach
Layer 4: Blast Radius. What Is the Worst Case Before You Run It?
Score the damage envelope before execution. Scope, reversibility, downstream coupling.
Blast radius scoring calculates the worst-case impact of an action before it runs. One question, answered before execution: if this fails or behaves outside spec, what is the maximum damage?[5]
We got this wrong on a live system. An early version of our scoring labeled a config file change "low" — single file, fully versioned, clean rollback path. The file was read by seven services at startup. The change took all seven down on the next rolling restart. The blast radius of the file modification was low. The blast radius of its downstream effects was critical. Dependency mapping is not optional. We learned that on a Tuesday afternoon.
Three dimensions decide the score:
Scope — how many resources, records, or systems land in the affected set. An operation touching 5 files is smaller than one touching 5,000 database rows. An operation against one service is smaller than one against a shared database used by twelve. These thresholds are starting points; calibrate against your workload.
Reversibility — can it be undone, and at what cost. A file modification under version control is fully reversible. A DELETE against a database with no recent backup is effectively irreversible. Irreversible actions are categorically more dangerous regardless of how few entities they touch.
Downstream coupling — what depends on the resources being modified. Changing a shared API contract hits every consumer. Modifying a configuration file hits every service that reads it. The blast radius is not the target. It is the target plus everything coupled to it.
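The three dimensions can be folded into a single score. The sketch below is illustrative only: the `ImpactInput` shape, the function name `scoreBlastRadius`, and every numeric threshold (100 and 1,000 entities, 5 dependents) are assumptions chosen for the example, not calibrated values.

```typescript
type BlastRadius = 'none' | 'low' | 'medium' | 'high' | 'critical';

interface ImpactInput {
  affectedCount: number; // scope: entities directly touched
  reversible: boolean;   // reversibility: is there a working undo path?
  dependents: string[];  // downstream coupling: services that read the target
}

const LEVELS: readonly BlastRadius[] = ['none', 'low', 'medium', 'high', 'critical'];

// Illustrative thresholds only; calibrate against your own workload.
function scoreBlastRadius(input: ImpactInput): BlastRadius {
  if (input.affectedCount === 0 && input.dependents.length === 0) return 'none';

  let level = 1; // start at 'low'
  if (input.affectedCount > 100) level = 2;
  if (input.affectedCount > 1000) level = 3;

  // Irreversible actions never score below 'high', however few entities they touch.
  if (!input.reversible) level = Math.max(level, 3);

  // Coupling: any dependent raises the floor to 'medium'; many make it 'critical'.
  if (input.dependents.length > 0) level = Math.max(level, 2);
  if (input.dependents.length >= 5) level = 4;

  return LEVELS[level];
}
```

With these assumed thresholds, a reversible one-file change read by seven services scores critical, because the coupling term overrides the small direct scope.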
Layer 5: The Audit Log Is the Only Source of Truth
When something goes wrong — and it will — the audit trail is the difference between forensics and folklore.
Every agent action — including denials and escalations — gets logged with enough context to reconstruct the incident. The audit log is not compliance theater. It is the forensic instrument that turns a vague production failure into a diagnosable, preventable event.[3]
A real audit log captures seven fields per action: agent identity, action requested, timestamp, authorization decision (approved, denied, escalated), the dry-run preview that was shown, the execution result, and the rollback status if applicable.
The load-bearing design decision is where the log lives. Outside the agent's reach. If the agent can write to its own audit trail, a malfunctioning agent can erase the evidence of its malfunction during the exact failure mode the log was meant to capture. Append-only storage in a separate account or service. The agent has zero write access. No exceptions on this one.
audit-logger.ts

```typescript
// Audit entry schema. Append-only sink in a separate account.
// The agent has read access to its own actions only, never write.
interface AgentAuditEntry {
  id: string;          // Unique event ID
  timestamp: string;   // ISO 8601
  agentId: string;     // Which agent principal
  sessionId: string;   // Conversation or task session
  action: {
    type: string;      // e.g., 'database.query', 'file.write'
    target: string;    // What resource
    parameters: Record<string, unknown>;
  };
  authorization: {
    tier: 'read' | 'write-confirm' | 'write-auto';
    decision: 'approved' | 'denied' | 'escalated';
    approvedBy?: string; // Human approver, if any
    denialReason?: string;
  };
  dryRun?: {
    affectedCount: number;
    blastRadius: string;
    preview: string;   // Summary of planned changes
  };
  execution: {
    status: 'success' | 'failure' | 'partial' | 'not-executed';
    duration: number;  // Milliseconds
    error?: string;
  };
  rollback?: {
    available: boolean;
    executed: boolean;
    result?: string;
  };
}
```

How the Five Layers Run as One System
Sequenced rollout. Each step closes a documented incident class.
- [01] Define the tiered authorization model and stop accepting one shared service account
  Inventory every tool and resource your agents touch. Classify each as read-only, write-with-confirmation, or write-autonomous. Mark destructive operations (delete, drop, destroy) as permanently blocked for agent principals. Until the model exists in writing, scope drift is the default state.
- [02] Wire mandatory dry-run gates into every write path
  Build the preview step into every write-capable tool. Tier 2 routes through human approval before execution. Tier 3 logs the preview and proceeds within guardrails. Skipping the preview is a safety violation, not a performance optimization.
- [03] Turn on every deletion protection your cloud already ships
  One-time hardening. Enable every available deletion protection mechanism on databases, storage buckets, infrastructure stacks, and git branches. This layer is independent of the agent: it protects against any deletion source, including a tired engineer at 2am.
- [04] Add blast radius scoring to the dry-run output
  Extend the dry-run preview to include a blast radius score. Count affected entities, check reversibility, map downstream dependencies. Anything that scores critical auto-escalates regardless of tier. The dependency map is what catches the seven-service config change masquerading as a one-file edit.
- [05] Stand up the audit log in an account the agent cannot reach
  Append-only sink in a separate account or service. Log every action, including denials and escalations, with the seven required fields. Set retention beyond your compliance floor. Verify the agent has zero write access by attempting to write and confirming the IAM denial.
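The verification in the last step, attempting a write as the agent and confirming the denial, can be scripted so that it runs on every deploy. This is a minimal sketch: `AuditSink`, `tryAppend`, and `verifyAgentCannotWrite` are hypothetical names, and a real sink would wrap whatever storage and IAM calls your platform provides.

```typescript
// Hypothetical wrapper over the audit storage. 'denied' means the access layer
// rejected the write; 'error' covers anything else (timeouts, misconfiguration).
type WriteOutcome = 'written' | 'denied' | 'error';

interface AuditSink {
  tryAppend(principal: string, entry: string): WriteOutcome;
}

type ProbeResult = { ok: true } | { ok: false; problem: string };

// Attempts a write as the agent principal and treats success as the failure case.
function verifyAgentCannotWrite(sink: AuditSink, agentPrincipal: string): ProbeResult {
  const outcome = sink.tryAppend(agentPrincipal, 'probe: this write should be rejected');
  if (outcome === 'denied') {
    return { ok: true };
  }
  if (outcome === 'written') {
    return { ok: false, problem: `${agentPrincipal} was able to write to the audit log` };
  }
  // An unrelated error proves nothing about permissions, so it does not pass.
  return { ok: false, problem: 'probe was inconclusive; check the sink and retry' };
}
```

Only an explicit denial counts as a pass here. A timeout or a misconfigured endpoint could otherwise be mistaken for a correctly locked-down log.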
| Without the five layers | With the five layers |
|---|---|
| Agent inherits a broad service account with admin permissions | Agent runs with credentials scoped to its actual function |
| Writes execute the moment the agent decides to call them | Every write hits a preview; Tier 2 blocks on human approval |
| Production databases can be dropped by any authenticated caller | Deletion protection makes database drops physically impossible |
| Downstream impact is discovered after the rolling restart | Blast radius is scored before execution; critical auto-escalates |
| Audit logs sit alongside application data, agent-writable | Audit log lives in a separate account, append-only, agent cannot write |
| Incident response starts cold: what happened, who did it, when | Incident response starts hot: here is the trail, here is the call, here is the rollback |
Does this slow agent execution to the point of being useless?
Tier 1 read-only calls clear the stack in milliseconds — no approval, just logging. Tier 3 write-autonomous adds the dry-run and blast radius scoring, on the order of one to three seconds. Only Tier 2 stops for a human, and those are the operations where a few minutes of delay buys you hours not spent on incident response. The throughput cost is real and small. The throughput cost without it is extracted in unplanned 3am pages.
How do you handle agents that chain multiple operations?
Treat the chain as one unit of work with a combined blast radius. If an agent modifies a config file, restarts a service, and verifies health, the dry-run shows the entire chain, the score reflects the combined impact, and the approval covers the full sequence. Per-step approvals across a chain produce approval fatigue, which is how the gate stops enforcing anything.
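A chain-level preview can be assembled from the per-step previews. The sketch below uses hypothetical names (`StepPreview`, `ChainPreview`, `combineChain`) and takes the worst step score as the chain score. That is a floor, not a full answer: interactions between steps may warrant a higher score than any single step.

```typescript
type Severity = 'none' | 'low' | 'medium' | 'high' | 'critical';
const ORDER: readonly Severity[] = ['none', 'low', 'medium', 'high', 'critical'];

interface StepPreview {
  description: string;
  affectedCount: number;
  reversible: boolean;
  blastRadius: Severity;
}

interface ChainPreview {
  steps: string[];
  totalAffected: number;
  reversible: boolean;   // true only if every step can be undone
  blastRadius: Severity; // at least as severe as the worst step
}

// Folds per-step previews into the single preview a human approves once.
function combineChain(steps: StepPreview[]): ChainPreview {
  let worst: Severity = 'none';
  let totalAffected = 0;
  let reversible = true;
  for (const step of steps) {
    if (ORDER.indexOf(step.blastRadius) > ORDER.indexOf(worst)) {
      worst = step.blastRadius;
    }
    totalAffected += step.affectedCount;
    reversible = reversible && step.reversible;
  }
  return {
    steps: steps.map((s) => s.description),
    totalAffected,
    reversible,
    blastRadius: worst,
  };
}
```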
What about agents in CI/CD pipelines where there is no human in the loop?
Classify CI/CD agents as Tier 3 inside strict scope limits. They create PRs, run tests, update non-production resources autonomously. Production deployments and infrastructure changes still hit a human approval — wired as a pipeline gate, not an interactive prompt. If no human approves before the timeout window closes, the pipeline pauses. Pausing is a feature. Auto-proceeding is the failure.
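The pause-on-timeout behaviour can be expressed as a small state function that the pipeline polls. This is a sketch under assumptions: `ApprovalRequest`, `resolveGate`, and `mayDeploy` are hypothetical names, and time is passed in explicitly so that the logic stays easy to test.

```typescript
type GateState = 'waiting' | 'approved' | 'denied' | 'paused';

interface ApprovalRequest {
  requestedAt: number;              // epoch milliseconds
  timeoutMs: number;                // length of the approval window
  decision?: 'approved' | 'denied'; // set once a human responds
}

// Resolves the gate at time `now`. There is no branch that turns silence into approval.
function resolveGate(req: ApprovalRequest, now: number): GateState {
  if (req.decision !== undefined) {
    return req.decision;
  }
  // No answer before the window closes: pause the pipeline, never proceed.
  if (now - req.requestedAt >= req.timeoutMs) {
    return 'paused';
  }
  return 'waiting';
}

// The deploy step runs only on an explicit human approval.
function mayDeploy(req: ApprovalRequest, now: number): boolean {
  return resolveGate(req, now) === 'approved';
}
```

In this sketch a decision that arrives after the window has closed still resolves the gate, so a paused pipeline resumes as soon as a human answers.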
Is five layers overkill for internal tools that touch non-critical data?
You can simplify, but not below three: permission scoping (always), audit logging (always), and one of the three middle layers. The tier model handles this naturally — most low-risk operations are Tier 1 or 3 and never hit the approval gate. The trap is the inverse: too many gates on low-stakes operations train humans to rubber-stamp without reading, and the gate stops enforcing anything. Calibrate so that real approvals demand real attention.
Non-Negotiable Rules for Production Agents
No agent holds standing destructive permissions in production
Delete, drop, destroy, force-push are permanently blocked on agent principals. When a legitimate destructive operation is needed, a human runs it manually with their own credentials. There is no shortcut around this.
Every write operation has a dry-run preview attached to a human read
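No write skips the plan. The preview shows what changes, how many entities are touched, and whether the operation is reversible. For Tier 2, a human reads and approves the plan before execution; for Tier 3, the plan is logged and read in the post-hoc audit. Either way, the gate is the human reading the plan, not the system generating it.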
Audit logs live in a separate system the agent cannot mutate
If the agent can write to its own audit trail, the failure mode the log was supposed to catch is the one that erases the evidence. Append-only, separate account, separate access controls.
Critical blast radius auto-escalates. No exceptions for trusted agents.
When the score returns critical, the operation lands on a human queue regardless of the agent's tier. This is the catch-net for the cases permission scoping alone cannot anticipate, including downstream coupling nobody mapped yet.
Build the cage before you grant production access
All five layers exist before the agent gets near production. Not after the first incident review. Each incident described above was preventable by enforcement layers that were known and not yet implemented. The first escape is not an opportunity for learning. It is a customer outage.
- [1] Fortune: Replit AI Coding Tool Wiped Production Database (fortune.com)
- [2] AI Incident Database: Incident 1152 (incidentdatabase.ai)
- [3] OWASP GenAI: Top 10 Risks and Mitigations for Agentic AI Security (genai.owasp.org)
- [4] Cobbai: AI Agent Tool Security Support (cobbai.com)
- [5] LoginRadius: Limiting Data Exposure and Blast Radius for AI Agents (loginradius.com)
- [6] Google ADK: Safety Documentation (google.github.io)