Quality at five users is self-regulating. At fifty, it is a liability. Build the rubric layer, gate stack, and federated ownership model before consensus rots into theater — or your AI program gets cancelled with the next budget cycle.
Every dismiss, modify, and escalate is a labeled training signal. Most teams log it as a debug artifact and move on. Here is the audit schema, the weekly tuner, and the human approval gate that turn that signal into thresholds that converge in eight weeks.