Why production inference bills always exceed estimates — and the Finance-Engineering governance framework for per-agent budgets, model routing, context compression, and cost forecasting without capability degradation.
Why single-inference cost estimates fail for agentic workflows — the four-component inference multiplier (call count, context accumulation, tool schema overhead, retry tax) with concrete workflow examples and measurement patterns.
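
As a rough illustration of how the four components can compound, here is a minimal Python sketch. The way the components combine is an assumption made for illustration: linear context growth, tool schemas resent on every call, and a flat retry rate. The `Workflow` fields, the function names, and all token figures are hypothetical, not taken from the source or from any measured workload.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    # All fields are hypothetical illustration parameters.
    calls: int               # model calls per task (call count)
    base_prompt_tokens: int  # prompt tokens of the first call, excluding tool schemas
    growth_per_call: int     # tokens added to the context after each call (context accumulation)
    tool_schema_tokens: int  # tool definitions resent with every call (tool schema overhead)
    retry_rate: float        # fraction of calls repeated once (retry tax)


def input_tokens(wf: Workflow) -> float:
    """Expected input tokens for one task under the assumptions above."""
    total = 0
    for i in range(wf.calls):
        # Call i sees the base prompt, i rounds of accumulated context, and the schemas.
        total += wf.base_prompt_tokens + i * wf.growth_per_call + wf.tool_schema_tokens
    # Assumed retry model: a retried call costs the same as the original call.
    return total * (1 + wf.retry_rate)


def inference_multiplier(wf: Workflow) -> float:
    """Ratio of workflow input tokens to a single-call estimate (base prompt only)."""
    return input_tokens(wf) / wf.base_prompt_tokens


wf = Workflow(
    calls=8,
    base_prompt_tokens=1000,
    growth_per_call=500,
    tool_schema_tokens=750,
    retry_rate=0.25,
)
print(f"{inference_multiplier(wf):.1f}x")  # prints "35.0x"
```

With these made-up inputs, a task estimated at 1,000 input tokens consumes 35,000, and most of the gap comes from context accumulation and schema overhead rather than from call count alone. Real workflows would need measured values for each field, and output tokens and per-model pricing are left out of this sketch.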