AI test generation raises coverage metrics while the test oracle problem bakes bugs into passing tests. The diff-scoped, spec-first, mutation-filtered pattern that generates tests worth keeping — and the failure modes to screen before they reach human review.