Billing anomaly alerts run on a 24–48 hour lag. The retry loop is already an invoice by the time anyone sees it. The control that catches it is per-session, in-process, and lives in the orchestration layer — profiled envelope, 3x P95 trip, defined degradation.
You approved Copilot. Then Claude Code. The invoice is a surprise and nobody owns the line item. The window for token FinOps is open right now — proxy, attribution, routing, anomaly detection. Build it before the next quarterly review.
Two LangChain agents burned $47K in eleven days. The model worked. The budget math didn't. Multi-agent cost is a heavy-tailed distribution, monitoring is structurally too late, and only synchronous SDK-level enforcement stops the spiral.
Latency, error rate, and token cost stay green while LLM output quality degrades for weeks. The infrastructure layer cannot see semantic failure. Sampled evals, prompt hash drift, and distribution alerts are the signals that catch it before users do.