Roughly nine in ten skill files fail one of five basic checks. The body is rarely the problem. The description is — that 100-token blurb is the only thing the agent reads when deciding whether to load you. Engineer it, or stay invisible.
Every dismiss, modify, and escalate action is a labeled training signal. Most teams log these as debug artifacts and move on. Here is the audit schema, the weekly tuner, and the human approval gate that turn that signal into thresholds that converge in eight weeks.
Most teams promote to multi-agent before proving the single agent. Three gates — observability, override readiness, behavioral consistency — decide whether orchestration is earned or inherited. Skip them and a $3.50 task becomes a $47,000 incident.