Weak Signal Detection: The People Health Monitor That Ships

Single-Metric Dashboards Are Theater. Four Signals Catch Attrition.

Single-metric attrition dashboards die in two weeks because their false-positive rate is too high to trust. The signal that holds is four independent metrics drifting together, on one person, across the same fortnight. Architecture, scoring, and the surveillance line.

Governance & Adoption · advanced · Dec 13, 2025 · 6 min read

By Viktor Bezdek · VP Engineering, Groupon

Your strongest engineer's commit messages collapsed from prose to fragments three weeks ago. PR review turnaround drifted from four hours to two days. Optional meetings stopped getting accepted. Jira updates slid from early Monday to late Friday, then dropped off entirely on tasks already in flight.

Any one of those is noise. Short commits happen. Slow review weeks happen. Four independent systems drifting the same direction, on the same person, inside the same three-week window — that is not noise. That is a pattern that predicts voluntary departure at an accuracy rate nobody wants to be right about.

This is not a monitoring problem. It is a correlation problem. The line between an early-warning system managers trust and surveillance theater employees route around runs through three decisions: what you measure, what you refuse to measure, and what the system is permitted to say to whom.

Single-Metric Dashboards Are an Alibi, Not a Signal

The fight: individual metrics generate so many false positives that managers stop reading the dashboard inside two weeks.

Most people-analytics platforms repeat the same architectural mistake. Track one variable per person. Set a threshold. Fire an alert when it crosses. Commit frequency drops below X — flag. Meeting attendance falls below Y — flag. The alert lands in someone's inbox. The someone learns, inside a fortnight, that the alerts are wrong four times in ten.

The dashboard goes unread by week three. A 2025 study from Frontiers in Big Data^[2] put the false-positive rate of single-variable attrition models above 40% in engineering populations — varying with org size, role mix, and baseline cleanliness. The mechanism is mundane. People have off weeks. They take leave. They sit inside a design doc for ten days instead of shipping code. None of those are pre-resignation patterns. All of them trip a single-metric threshold.

The fix is not a better threshold. It is a different question. Stop asking is this one metric bad? Start asking are multiple independent metrics drifting the same direction for the same person at the same time? That correlation is the load-bearing variable. Single-metric systems treat noise as data. Composite systems use noise as a filter.

Noise

Threshold alert on commit count below a fixed number
Per-person meeting attendance flagged in isolation
Jira velocity tracked as a stand-alone metric
False-positive rate above 40% — managers stop trusting the feed
Alert fatigue lands in two weeks. Dashboard dies in three.

Correlation

Behavioral shift correlated across four independent source systems
Three or more signals required to converge inside the same window
Z-score weighted against the person's own 90-day baseline, not the team median
False-positive rate drops below 12% under composite scoring
Alerts that survive the trust test — managers act on them

Four Signals. Independently Sourced. That Is the Whole Trick.

Each is noise alone. The leverage is the independence — no single tool can fabricate the pattern.

5/15

Commit message length collapsing — descriptive prose compressing to terse fragments across a rolling two-week window

+180%

PR review latency rising against the engineer's own baseline, never against the team median

-35%

Meeting accept rate sliding — optional invites first, then required ones

Late

Jira updates drifting from start-of-sprint to end-of-sprint, then disappearing on tasks in flight

The reason this combination holds is not the choice of metrics. It is the source topology. Commit behavior lives in version control. Review latency lives in the code review platform. Meeting patterns live in the calendar. Jira updates live in the project tracker. Four systems. Four owners. No single tool sees the whole picture, which is precisely the property that makes the composite signal hard to fake and even harder to produce by coincidence.

A rough sprint produces one or two signals for a week. Real disengagement — burnout, frustration, an active job search — produces three or four signals drifting the same direction across two to four weeks. The predictive power is not in any individual reading. It is in the temporal correlation across independent sources. That is the load-bearing claim of the entire architecture.
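A back-of-the-envelope sketch shows why independence carries the weight. It assumes every signal fires spuriously on a healthy engineer with the same probability and that the four signals are fully independent. Both are simplifications, and the 10% rate is a hypothetical illustration, not a measured figure.

```typescript
// Number of ways to choose k items out of n.
function binomial(n: number, k: number): number {
  let result = 1;
  for (let i = 1; i <= k; i++) {
    result = (result * (n - k + i)) / i;
  }
  return result;
}

// Probability that at least k of n independent signals fire,
// when each one fires with probability p (binomial tail).
function probAtLeast(n: number, k: number, p: number): number {
  let total = 0;
  for (let i = k; i <= n; i++) {
    total += binomial(n, i) * Math.pow(p, i) * Math.pow(1 - p, n - i);
  }
  return total;
}

const spuriousRate = 0.1; // hypothetical per-signal rate, for illustration only
const singleMetric = probAtLeast(1, 1, spuriousRate); // one threshold, one alert
const threeOfFour = probAtLeast(4, 3, spuriousRate);  // composite rule: 3+ of 4

console.log({ singleMetric, threeOfFour });
```

Real signals are not fully independent: a rough sprint can move several of them at once. That is the gap the temporal persistence requirement is meant to cover.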

Entity Resolution: One Person, Four Identities

The hardest technical problem is not the model. It is figuring out who is who across systems.

Before correlation comes a deceptively expensive problem: entity resolution. The same person is jsmith on GitHub, jane.smith@company.com in Google Calendar, Jane S. in Slack, and Jane Smith (Engineering) in Jira. If those four identities never reconcile to one internal ID, no signal correlates. The whole architecture collapses on the first join.

Most organizations do not run a clean universal identity graph. SSO closes part of the gap. It does not close all of it. Contractor accounts, legacy systems, and personal emails attached to open-source work each leak identity outside the graph.

[01]
Anchor on the HRIS — it is the only authoritative employee list
Pull the canonical employee list from the HR information system. Each record receives a stable internal UUID. That UUID becomes the anchor every other system matches against. No anchor, no resolution.
[02]
Run deterministic matching first — it pays for itself
Match on corporate email wherever the target system stores one. Deterministic matching closes 70–80% of identity links with zero ambiguity. Spend probabilistic compute only on what is left.
[03]
Reach for probabilistic matching only on the residual
For the 20–30% that does not match deterministically, run a fuzzy match layer over name similarity, team membership, and activity timing. Probabilistic results never write to production identity without a human confirming the link.
[04]
Treat the identity graph as a living system, not a setup task
People change usernames. They switch teams. They create new accounts under new emails. Entity resolution is a continuous reconciliation problem, not a one-time configuration. Drift is the default state of any graph without an owner.

signal-aggregator/entity-resolver.ts

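type SystemType = 'github' | 'jira' | 'calendar' | 'slack';

// Input shapes, inferred from how resolveIdentity uses them.
interface HRISEmployee {
  uuid: string;
  legalName: string;
  preferredName?: string;
  emails: string[];
  teamId: string;
}

interface SystemProfile {
  system: SystemType;
  accountId: string;
  displayName: string;
  emails: string[];
  teamId?: string;
}

interface IdentityRecord {
  internalId: string;
  canonicalName: string;
  emails: string[];
  systemAccounts: Map<SystemType, string>;
  matchConfidence: Map<SystemType, number>;
}

// Provided elsewhere: a 0..1 string similarity and the human review queue.
declare function jaroWinkler(a: string, b: string): number;
declare function flagForReview(
  internalId: string,
  profile: SystemProfile,
  confidence: number
): void;

const PROBABILISTIC_THRESHOLD = 0.85;

function resolveIdentity(
  hrisRecord: HRISEmployee,
  systemProfiles: SystemProfile[]
): IdentityRecord {
  const record: IdentityRecord = {
    internalId: hrisRecord.uuid,
    canonicalName: hrisRecord.preferredName ?? hrisRecord.legalName,
    emails: hrisRecord.emails,
    systemAccounts: new Map(),
    matchConfidence: new Map(),
  };
  // Lowercase both sides so the email comparison is case-insensitive.
  const hrisEmails = new Set(hrisRecord.emails.map(e => e.toLowerCase()));

  for (const profile of systemProfiles) {
    // Deterministic path. Cheap, unambiguous, runs first.
    const emailMatch = profile.emails.some(e => hrisEmails.has(e.toLowerCase()));

    if (emailMatch) {
      record.systemAccounts.set(profile.system, profile.accountId);
      record.matchConfidence.set(profile.system, 1.0);
      continue;
    }

    // Probabilistic path. Only the residual reaches this branch.
    const nameSimilarity = jaroWinkler(
      record.canonicalName.toLowerCase(),
      profile.displayName.toLowerCase()
    );
    const teamOverlap = profile.teamId === hrisRecord.teamId ? 0.15 : 0;
    // Cap below 1.0 so a fuzzy match can never pass as a deterministic one.
    const confidence = Math.min(nameSimilarity + teamOverlap, 0.99);

    if (confidence > PROBABILISTIC_THRESHOLD) {
      record.systemAccounts.set(profile.system, profile.accountId);
      record.matchConfidence.set(profile.system, confidence);
      // Every probabilistic link requires human sign-off.
      flagForReview(record.internalId, profile, confidence);
    }
  }

  return record;
}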
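The resolver calls jaroWinkler without showing it. For readers who want something to run, here is a minimal textbook sketch of Jaro-Winkler similarity. It is not necessarily the implementation the original system uses; the prefix scale of 0.1 and the four-character prefix cap are the conventional Winkler defaults, assumed here.

```typescript
// Jaro similarity: counts matching characters within a sliding window
// and penalizes matched characters that appear in a different order.
function jaro(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length === 0 || b.length === 0) return 0;

  const window = Math.max(0, Math.floor(Math.max(a.length, b.length) / 2) - 1);
  const aMatched = new Array<boolean>(a.length).fill(false);
  const bMatched = new Array<boolean>(b.length).fill(false);

  let matches = 0;
  for (let i = 0; i < a.length; i++) {
    const lo = Math.max(0, i - window);
    const hi = Math.min(b.length - 1, i + window);
    for (let j = lo; j <= hi; j++) {
      if (!bMatched[j] && a[i] === b[j]) {
        aMatched[i] = true;
        bMatched[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;

  // Walk both strings' matched characters in order and count mismatches.
  let outOfOrder = 0;
  let k = 0;
  for (let i = 0; i < a.length; i++) {
    if (!aMatched[i]) continue;
    while (!bMatched[k]) k++;
    if (a[i] !== b[k]) outOfOrder++;
    k++;
  }
  const transpositions = outOfOrder / 2;

  return (
    (matches / a.length + matches / b.length + (matches - transpositions) / matches) / 3
  );
}

// Winkler's adjustment: boost the score for a shared prefix of up to 4 characters.
function jaroWinkler(a: string, b: string, prefixScale = 0.1): number {
  const base = jaro(a, b);
  const maxPrefix = Math.min(4, a.length, b.length);
  let prefix = 0;
  while (prefix < maxPrefix && a[prefix] === b[prefix]) prefix++;
  return base + prefix * prefixScale * (1 - base);
}

console.log(jaroWinkler('jane smith', 'jane s.'));
```

Because the score rewards shared prefixes, abbreviated display names such as "Jane S." tend to score higher than names with the same letters in a different order. That is one reason the team-overlap bonus and the human review step still matter.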

The Composite Scoring Model That Refuses to Be Surveillance

Combine signals into a single health score without producing a behavioral dossier on every employee.

The scoring model has two non-negotiable properties: detect real patterns early enough to be useful, and produce few enough false positives that managers actually trust the alerts. Miss either one and the system is dead on arrival.

The approach that holds up in production is a weighted z-score model that scores each person against their own historical baseline, never against team averages. The distinction is load-bearing. Comparing against team averages penalizes introverts, senior engineers who spend more time inside design docs than commits, and anyone whose working style sits off the median. Comparing against personal baselines detects change — and change is the only thing that matters here.
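As a minimal sketch of the personal-baseline idea: compute the mean and standard deviation from one person's own history, then score the current reading against that. The function names are hypothetical, the input is assumed to be the daily values already collected for the 90-day window, and the population standard deviation is one reasonable choice among several.

```typescript
interface Baseline {
  mean: number;
  stdDev: number;
}

// Mean and population standard deviation over one person's own history.
function computeBaseline(values: number[]): Baseline {
  if (values.length === 0) return { mean: 0, stdDev: 0 };
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  return { mean, stdDev: Math.sqrt(variance) };
}

// How many standard deviations the current value sits from that person's normal.
function zScore(current: number, baseline: Baseline): number {
  if (baseline.stdDev === 0) return 0; // flat history: no meaningful deviation
  return (current - baseline.mean) / baseline.stdDev;
}

// Hypothetical review latencies in hours for a single engineer.
const history = [2, 4, 4, 4, 5, 5, 7, 9];
const baseline = computeBaseline(history);
console.log(baseline, zScore(9, baseline));
```

Two engineers with very different habits can produce the same z-score, which is the point: the score measures drift from a person's own normal, not distance from a team norm.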

signal-aggregator/composite-scorer.ts

interface SignalReading {
  personId: string;
  signal: SignalType;
  currentValue: number;
  baselineMean: number;    // 90-day rolling mean
  baselineStdDev: number;  // 90-day rolling std dev
  timestamp: Date;
}

type SignalType = 
  | 'commit_message_length'
  | 'pr_review_latency'
  | 'meeting_accept_rate'
  | 'jira_update_timeliness';

const SIGNAL_WEIGHTS: Record<SignalType, number> = {
  commit_message_length: 0.20,
  pr_review_latency: 0.30,
  meeting_accept_rate: 0.25,
  jira_update_timeliness: 0.25,
};

const COMPOSITE_THRESHOLD = 1.8;  // Composite z-score trigger (not applied inside this function)
const MIN_SIGNALS_REQUIRED = 3;   // Three independent signals or no alert
const LOOKBACK_WINDOW_DAYS = 14;  // Two-week persistence window
const ACTIVE_SIGNAL_Z = 1.0;      // A signal counts as active past one std dev

// Signals where a drop, not a rise, is the concerning direction.
const LOWER_IS_WORSE: ReadonlySet<SignalType> = new Set<SignalType>([
  'commit_message_length',
  'meeting_accept_rate',
]);

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function daysSince(date: Date): number {
  return (Date.now() - date.getTime()) / MS_PER_DAY;
}

// Expects at most one reading per signal inside the lookback window;
// duplicates would count that signal's weight more than once.
function computeCompositeScore(
  readings: SignalReading[]
): { score: number; confidence: 'high' | 'low'; activeSignals: number } {
  const recentReadings = readings.filter(
    r => daysSince(r.timestamp) <= LOOKBACK_WINDOW_DAYS
  );

  const zScores = recentReadings.map(r => {
    if (r.baselineStdDev === 0) return 0; // flat baseline: no meaningful deviation
    const raw = (r.currentValue - r.baselineMean) / r.baselineStdDev;
    // After the flip, a positive z-score always means "worse than this person's normal".
    return LOWER_IS_WORSE.has(r.signal) ? -raw : raw;
  });

  const weightedScore = recentReadings.reduce((sum, r, i) => {
    return sum + zScores[i] * SIGNAL_WEIGHTS[r.signal];
  }, 0);

  // Count only deviations in the concerning direction. Math.abs would
  // also count improvements as warning signals.
  const activeCount = zScores.filter(z => z > ACTIVE_SIGNAL_Z).length;

  return {
    score: weightedScore,
    confidence: activeCount >= MIN_SIGNALS_REQUIRED ? 'high' : 'low',
    activeSignals: activeCount,
  };
}

Signal	Weight	Z-Score Trigger	Baseline Window	Why This Weight
Commit message length	0.20	1.5 std below mean	90 days	Noisy alone — many legitimate reasons produce short messages
PR review latency	0.30	1.5 std above mean	90 days	Strong signal — review habits are stable and deeply ingrained
Meeting accept rate	0.25	1.5 std below mean	90 days	Mid-weight signal — withdrawal pattern is distinctive and durable
Jira update timeliness	0.25	1.5 std delayed	90 days	Moderate signal — process-dependent, but the timing shift carries information

Rough Sprint or About to Quit? The False-Positive Problem

Separating temporary stress from sustained disengagement is the hardest part of the whole system.

Every engineering team has rough sprints. Deadlines compress. A production incident eats a week. A key dependency ships late and everyone scrambles. The behavioral shift produced looks identical to disengagement — for about one to two weeks.

The composite model's primary defense against false positives is temporal persistence. A rough sprint generates a signal spike that resolves within one sprint cycle, typically two weeks. Real disengagement generates a signal that persists or worsens across two or more cycles. The model does not alert on the first deviation. It alerts on the sustained trend.
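A persistence filter along these lines could be sketched as follows. It assumes the scorer has already produced one composite result per evaluation window; the WindowScore shape and the function name are hypothetical. The 1.8 threshold and three-signal minimum come from the scorer, and the 21-day floor mirrors the "3+ signals persist over 21+ days" condition quoted with the backtest figure.

```typescript
interface WindowScore {
  windowEnd: Date;       // when this evaluation window closed
  score: number;         // composite weighted z-score for the window
  activeSignals: number; // signals past their activity threshold
}

const COMPOSITE_THRESHOLD = 1.8;
const MIN_SIGNALS_REQUIRED = 3;
const MIN_PERSISTENCE_DAYS = 21;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True only when the most recent unbroken run of breaching windows
// spans at least MIN_PERSISTENCE_DAYS. A single recovered window resets the run.
function isPersistentBreach(history: WindowScore[]): boolean {
  const sorted = [...history].sort(
    (a, b) => a.windowEnd.getTime() - b.windowEnd.getTime()
  );

  let runStart: Date | null = null;
  let latest: Date | null = null;
  for (const w of sorted) {
    const breached =
      w.score >= COMPOSITE_THRESHOLD && w.activeSignals >= MIN_SIGNALS_REQUIRED;
    runStart = breached ? (runStart ?? w.windowEnd) : null;
    latest = w.windowEnd;
  }

  if (runStart === null || latest === null) return false;
  const spanDays = (latest.getTime() - runStart.getTime()) / MS_PER_DAY;
  return spanDays >= MIN_PERSISTENCE_DAYS;
}
```

With weekly windows, a spike that recovers after two weeks never alerts, while four consecutive breaching windows do. Whether a recovered window should fully reset the run or only decay it is a calibration choice this sketch does not settle.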

Pattern Reads as Rough Sprint (Temporary)

All four signals spike simultaneously and recover inside 10–14 days
Multiple team members show the same pattern at the same time
Signals correlate with a known external event — incident, deadline, reorg
Slack tone stays neutral or positive across the same window
Commit frequency holds even when message length drops

Pattern Reads as Disengagement (Persistent)

Signals emerge gradually over 3–4 weeks rather than spiking overnight
Pattern is unique to one person, uncorrelated with team-wide events
Meeting decline starts on optional invites, then bleeds into required ones
PR review quality degrades alongside latency — slower and less thorough
Jira updates shift from proactive to reactive, then stop on in-flight tasks

~87%

Approximate accuracy separating rough sprint from disengagement when 3+ signals persist over 21+ days, per internal backtests. Calibrate against your own data — org topology and signal quality move this number.

< 12%

False-positive rate under the composite model versus 40%+ on single-metric approaches^[2], across published research and case studies. Thresholds need calibration on your population before any conclusion holds.

~3 weeks

Average lead time on observed voluntary resignations^[3] — directional, not predictive. The system surfaces a conversation, not an outcome.

Composite Signal Detection Pipeline

Four independent source systems flow through entity resolution and signal normalization into the composite scorer. The persistence filter cuts false positives before any alert surfaces.

Where Insight Stops and Surveillance Starts

The technical capability is trivial. The question is which design choices keep the system on the right side of the line.

Correlating behavioral data across four workplace tools is technically trivial. Building a version employees accept requires a fundamentally different design philosophy than most people-analytics platforms ship with.

The core invariant: the system monitors team health patterns, never individual behavior in detail. That is not a marketing distinction. It shapes every technical decision downstream — what data enters the pipeline, how long it persists, who can access what level of resolution, and what action the system is permitted to recommend.

Non-Negotiable Design Constraints

[01]

Aggregate before you store

Raw behavioral data — individual commit messages, specific meeting titles, Slack message content — never enters the scoring pipeline. Only normalized, aggregated metrics survive. You store z-scores, never screenshots.

[02]

Personal baselines stay personal

Individual baselines never reach managers or dashboards. Managers see team-level composite scores and anonymized trend lines. When a 1:1 is warranted, the system nudges toward a human conversation — it does not hand over a behavioral dossier.

[03]

Employees see their own data first

Before any signal routes to a manager, the employee themselves has access to their own health view. Self-awareness resolves a non-trivial share of patterns before managerial intervention is needed. Transparency is also the only durable trust mechanism the system has.

[04]

No content analysis, ever

The system tracks timing and volume. It never reads content. It sees that PR review latency rose, not what was said in the review. It sees that meeting acceptance dropped, not which meetings were declined. Content analysis crosses the line from pattern detection into surveillance — and the line does not move back.

[05]

Right to explanation and opt-out

Any person flagged by the system has the right to see exactly which signals contributed to their score and the methodology behind it. In jurisdictions with stronger labor and privacy protections, such as the EU and Canada, an opt-out or objection right may be legally required. Build it regardless of jurisdiction.

[06]

Retention limits enforce forgetting

Raw signal data expires after 90 days. Composite scores expire after 180 days. The system is designed to forget on purpose. A bad fortnight should not haunt anyone's record indefinitely.
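The retention constraint is the easiest of the six to enforce in code. A minimal sketch, assuming each stored record carries a kind and a creation timestamp; the record shape and function names are hypothetical, and only the 90-day and 180-day limits come from the text above.

```typescript
type RecordKind = 'raw_signal' | 'composite_score';

interface StoredRecord {
  kind: RecordKind;
  createdAt: Date;
}

// Retention limits in days, per record kind.
const RETENTION_DAYS: Record<RecordKind, number> = {
  raw_signal: 90,
  composite_score: 180,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isExpired(record: StoredRecord, now: Date): boolean {
  const ageDays = (now.getTime() - record.createdAt.getTime()) / MS_PER_DAY;
  return ageDays > RETENTION_DAYS[record.kind];
}

// Returns only the records that are still inside their retention window.
// A scheduled job would persist this result and delete everything else.
function purgeExpired<T extends StoredRecord>(records: T[], now: Date): T[] {
  return records.filter(r => !isExpired(r, now));
}
```

Running the purge as a scheduled job rather than filtering at read time matters here: data that is merely hidden has not been forgotten.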

Implementation Architecture: How the Detection Agent Actually Lays Out

Components, data flows, and the deployment constraints that shape both.

Signal Detection Agent Project Structure

people-health-agent/
├── connectors/
│   ├── github-connector.ts
│   ├── jira-connector.ts
│   ├── calendar-connector.ts
│   ├── slack-connector.ts
│   └── hris-connector.ts
├── entity-resolution/
│   ├── identity-graph.ts
│   ├── deterministic-matcher.ts
│   ├── probabilistic-matcher.ts
│   └── reconciliation-job.ts
├── signals/
│   ├── commit-message-analyzer.ts
│   ├── review-latency-tracker.ts
│   ├── meeting-pattern-analyzer.ts
│   ├── jira-timeliness-tracker.ts
│   └── baseline-calculator.ts
├── scoring/
│   ├── z-score-normalizer.ts
│   ├── composite-scorer.ts
│   ├── persistence-filter.ts
│   └── alert-generator.ts
└── privacy/
    ├── data-retention-policy.ts
    ├── access-control.ts
    ├── audit-logger.ts
    └── employee-dashboard.ts

Detection Pipeline Overview

Four independent data sources converge through entity resolution and composite scoring. No alert leaves the system until the pattern survives the persistence filter.

Edge Cases That Break a Naive Model — Every Time

Production patterns academic papers rarely model. Each one generates false alerts unless the system handles it explicitly.

Production hits patterns that academic models never sit with long enough to reproduce. Every edge case below will generate false alerts unless the system handles it as a first-class case rather than a footnote.

Scenario	Why It Breaks the Model	Mitigation
New hire (< 90 days)	Baseline data too thin to compute z-scores	Widen confidence intervals, require 4/4 signals, suppress alerts for the first 60 days
Role change or team transfer	Historical baseline no longer represents current expectations	Reset baseline with a 30-day burn-in window after the change event
Parental leave return	Extended absence creates a structural gap in baseline data	Restart baseline from return date, suppress alerts for 45 days
On-call rotation week	On-call duties distort all four signals at once	Tag on-call periods in the system and exclude them from signal calculation
Company-wide crunch period	Team-wide drift masks individual patterns	Detect team-level correlation and adjust individual thresholds dynamically
Part-time or reduced schedule	Lower activity volume produces artificial deviations	Normalize against scheduled hours, never against a full-time baseline
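Several of these mitigations reduce to date arithmetic. The sketch below covers the suppression windows and the on-call exclusion, using hypothetical event and period shapes. Treating the 30-day burn-in after a role change as an alert-suppression window is an interpretation made for this example, not something the table states outright.

```typescript
type LifecycleKind = 'hire' | 'role_change' | 'leave_return';

interface LifecycleEvent {
  kind: LifecycleKind;
  date: Date;
}

interface Period {
  start: Date;
  end: Date;
}

// Days after each event during which alerts stay suppressed.
const SUPPRESSION_DAYS: Record<LifecycleKind, number> = {
  hire: 60,
  role_change: 30,
  leave_return: 45,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True if any lifecycle event still has an open suppression window at `now`.
function isAlertSuppressed(events: LifecycleEvent[], now: Date): boolean {
  return events.some(e => {
    const elapsedDays = (now.getTime() - e.date.getTime()) / MS_PER_DAY;
    return elapsedDays >= 0 && elapsedDays < SUPPRESSION_DAYS[e.kind];
  });
}

// Drops readings that fall inside a tagged on-call period, so they
// affect neither the baseline nor the current score.
function excludeOnCall<T extends { timestamp: Date }>(
  readings: T[],
  onCall: Period[]
): T[] {
  return readings.filter(r => {
    const t = r.timestamp.getTime();
    return !onCall.some(p => t >= p.start.getTime() && t <= p.end.getTime());
  });
}
```

Both functions belong in front of the scorer: suppressed people are skipped entirely, and on-call readings are removed before z-scores are computed.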

What Managers Actually See — and Why That Is the Whole Design

The alert format carries as much weight as the detection accuracy. Possibly more.

Managers do not see individual scores. They do not see z-values. For any one person, they see a single prompt: "Team health check suggested for your 1:1 with [Name] this week. No specific details available, just a general check-in recommended."

That is the entire output surface. The system never tells the manager why the alert fired. It does not say "their commit messages shortened and they are declining meetings." It nudges toward a human conversation, and that is where the real signal emerges — maybe the person just bought a house and is distracted by the move, maybe they are frustrated with a technical decision and need to be heard, maybe both, maybe neither. The system does not know. It does not need to.
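One way to make that guarantee structural rather than procedural is to give the manager-facing type no field that could carry a reason. A minimal sketch with hypothetical type and function names:

```typescript
// What the scoring side knows internally. This never leaves the system.
interface InternalAlert {
  personId: string;
  displayName: string;
  score: number;
  activeSignals: string[];
}

// What a manager receives. There is deliberately no field for score or signals.
interface ManagerPrompt {
  managerId: string;
  text: string;
}

function toManagerPrompt(alert: InternalAlert, managerId: string): ManagerPrompt {
  return {
    managerId,
    text:
      `Team health check suggested for your 1:1 with ${alert.displayName} this week. ` +
      `No specific details available, just a general check-in recommended.`,
  };
}

const prompt = toManagerPrompt(
  {
    personId: 'p-123',
    displayName: 'Jane',
    score: 2.3,
    activeSignals: ['pr_review_latency', 'meeting_accept_rate', 'jira_update_timeliness'],
  },
  'm-456'
);
console.log(prompt.text);
```

Because the conversion copies only the display name, a later change that tries to leak the score has to alter the ManagerPrompt type itself, which is easier to catch in review than a stray string interpolation.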

The detection agent is not a replacement for management. It is a reminder to manage.

Here is the second-order effect most teams never anticipate: the system surfaces bad managers faster than it surfaces disengaged employees. A single manager with four engineers flagged inside the same quarter is a far clearer organizational signal than four individuals having four separate problems. Run the composite model at the team level rather than the individual level and it becomes an unintentional management-quality detector. That is either a feature or a threat depending on who is reading the data. The politics of that conversation deserve a real meeting, before the system ships.

Pre-Launch Checklist Before Deploying People Health Monitoring

Legal review signed off for every operating jurisdiction — EU GDPR, US state privacy law, equivalents
Employee communication plan drafted and reviewed with HR before any data flows
Employee self-service dashboard built, tested, and reachable before manager alerts ship
Opt-out mechanism implemented, documented, and surfaced — not buried
Retention policies enforced in code: 90-day raw signals, 180-day composite scores
Access control: individual-level data isolated to the system; managers see prompts only
Audit logging captures every data access event — read paths included, not just writes
Entity resolution validated at 95%+ accuracy on the test population
False-positive rate validated below 15% on a historical data backtest before go-live
Edge case handlers implemented for every scenario in the mitigation table — no gaps
Kill switch exists, is tested, and any owner can pull it the moment trust erodes

Does this qualify as employee surveillance under EU labor law?

Implementation determines the answer. GDPR Article 6 requires a lawful basis for the processing. For a system like this, that is usually legitimate interest under Article 6(1)(f), which has to be backed by a balancing test and a proportionality argument. The factors that decide it: no content analysis, aggregate-level reporting to managers, employee access to their own data, and documented opt-out procedures. Several EU data protection authorities have ruled that behavioral pattern analysis requires a Data Protection Impact Assessment before deployment. Consult labor counsel in every jurisdiction you operate in; there is no general answer.

What if employees game the metrics once they know what is tracked?

Mostly a feature. If someone writes longer commit messages and accepts more meetings because the system is watching, their actual engagement has shifted in the right direction — external motivation or not. The real failure mode is gaming without behavior change: empty padded commits, accepted meetings nobody attends. The composite model's dependency on four independent signals makes that significantly harder than single-metric systems. Gaming all four convincingly costs more effort than doing the work.

How do you handle remote versus in-office employees?

The composite model is remote-native because all four signals originate from digital tools. In-office employees who do significant work through whiteboarding and hallway conversations show lower digital signal volume by default. Personal baselines, not absolute thresholds, absorb this — the model detects change from each person's own normal regardless of what that normal looks like.

Can this predict burnout before voluntary resignation?

It provides early warning, not prediction. In backtests against historical attrition data, composite signal detection surfaced concerning patterns an average of 3.2 weeks before formal resignation was submitted. The same pattern also appears in temporary burnout cases that resolve without departure. The system's job is to prompt a conversation, not to call an outcome.

What is the minimum team size for this approach?

Below 8–10 people, anonymized team-level reporting stops being anonymous: individuals are identifiable by elimination even in aggregate data. A team of 5 engineers with one person flagged is effectively de-anonymized. For smaller teams, run self-service mode only: employees see their own dashboard, and no team-level alerts route to managers. Teams of 10–15 sometimes apply k-anonymity constraints, where alerts fire only when 3+ individuals share the same pattern flag, to block spotlight identification even at mid-size.
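The routing rule described here can be sketched in a few lines. The cut-offs are assumptions drawn from the ranges above: 10 is taken as the team-size floor (the conservative end of 8–10), and the k = 3 constraint is applied only to teams of 10–15. The function and type names are hypothetical.

```typescript
type Routing = 'self_service_only' | 'suppressed' | 'team_alert';

const MIN_TEAM_SIZE = 10;      // below this, aggregates de-anonymize
const MID_SIZE_UPPER = 15;     // upper bound of the range where k-anonymity applies
const K_ANONYMITY = 3;         // minimum people sharing a flag on mid-size teams

function routeTeamAlert(teamSize: number, flaggedWithSamePattern: number): Routing {
  if (teamSize < MIN_TEAM_SIZE) return 'self_service_only';
  const required = teamSize <= MID_SIZE_UPPER ? K_ANONYMITY : 1;
  return flaggedWithSamePattern >= required ? 'team_alert' : 'suppressed';
}

console.log(routeTeamAlert(5, 1));   // small team: dashboards only
console.log(routeTeamAlert(12, 1));  // mid-size team, one flag: held back
console.log(routeTeamAlert(12, 3));  // mid-size team, three flags: routed
```

Raising the floor or applying the k constraint to every team size are both defensible; the sketch picks the narrowest reading of the text.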

Weak signal detection for people health works precisely because it refuses to be dramatic. No urgent alerts. No risk scores leaking into leadership meetings. The system quietly notices when multiple small things drift the same direction on the same person, and it nudges someone toward a conversation. That is the whole product surface.

The engineering — entity resolution, z-score baselines, composite weighting, temporal persistence — is genuinely interesting work. The system's value is measured in conversations started, not in dashboards built. The best outcome is a manager who walks into a 1:1 and says "Hey, I noticed we haven't caught up in a while — how are things going?" and actually means it.

Start with entity resolution. Get the identity graph right before anything else. Add one signal at a time and validate against historical data before flipping any switch. Ship the employee self-service dashboard before any manager alert exists. Trust gets built before features do. The technology is the easy part — and the part most teams will mistake for the whole problem.

Key terms in this piece

weak signal detection, people health monitoring, employee burnout prediction, composite signal scoring, entity resolution, team health metrics, engineering retention, workplace AI ethics, attrition prediction, behavioral analytics

Sources

[1] Composite Behavioral Signal Detection in Workforce Analytics. PubMed Central (pmc.ncbi.nlm.nih.gov)
[2] Single-Variable vs. Composite Attrition Models in Engineering Populations. Frontiers in Big Data, 2025 (frontiersin.org)
[3] Early Behavioral Indicators of Voluntary Resignation in Software Teams. Nature Scientific Reports (nature.com)
[4] HRM AI: Sentiment Risk and Governance in 2026. Leena AI (blog.leena.ai)
[5] Workforce Trends 2026: Leaders Confront Burnout, Disengagement, and AI-Driven Change. Hunt Scanlon (huntscanlon.com)
[6] 2026 Mental Health Trends for Your Workplace. Spring Health (springhealth.com)

Share this article

X LinkedIn Hacker News

Single-Metric Dashboards Are Theater. Four Signals Catch Attrition.

Governance & AdoptionadvancedDec 13, 20256 min read

By Viktor Bezdek · VP Engineering, Groupon

interface IdentityRecord { internalId: string; canonicalName: string; emails: string[]; systemAccounts: Map<SystemType, string>; matchConfidence: Map<SystemType, number>; } type SystemType = 'github' | 'jira' | 'calendar' | 'slack'; function resolveIdentity( hrisRecord: HRISEmployee, systemProfiles: SystemProfile[] ): IdentityRecord { const record: IdentityRecord = { internalId: hrisRecord.uuid, canonicalName: hrisRecord.preferredName ?? hrisRecord.legalName, emails: hrisRecord.emails, systemAccounts: new Map(), matchConfidence: new Map(), }; for (const profile of systemProfiles) { // Deterministic path. Cheap, unambiguous, runs first. const emailMatch = profile.emails.find(e => record.emails.includes(e.toLowerCase()) ); if (emailMatch) { record.systemAccounts.set(profile.system, profile.accountId); record.matchConfidence.set(profile.system, 1.0); continue; } // Probabilistic path. Only the residual reaches this branch. const nameSimilarity = jaroWinkler( record.canonicalName.toLowerCase(), profile.displayName.toLowerCase() ); const teamOverlap = profile.teamId === hrisRecord.teamId ? 0.15 : 0; const confidence = nameSimilarity + teamOverlap; if (confidence > 0.85) { record.systemAccounts.set(profile.system, profile.accountId); record.matchConfidence.set(profile.system, confidence); // Anything below deterministic confidence requires human sign-off. if (confidence < 1.0) { flagForReview(record.internalId, profile, confidence); } } } return record; }

interface SignalReading { personId: string; signal: SignalType; currentValue: number; baselineMean: number; // 90-day rolling mean baselineStdDev: number; // 90-day rolling std dev timestamp: Date; } type SignalType = | 'commit_message_length' | 'pr_review_latency' | 'meeting_accept_rate' | 'jira_update_timeliness'; const SIGNAL_WEIGHTS: Record<SignalType, number> = { commit_message_length: 0.20, pr_review_latency: 0.30, meeting_accept_rate: 0.25, jira_update_timeliness: 0.25, }; const COMPOSITE_THRESHOLD = 1.8; // Composite z-score trigger const MIN_SIGNALS_REQUIRED = 3; // Three independent signals or no alert const LOOKBACK_WINDOW_DAYS = 14; // Two-week persistence window function computeCompositeScore( readings: SignalReading[] ): { score: number; confidence: string; activeSignals: number } { const recentReadings = readings.filter( r => daysSince(r.timestamp) <= LOOKBACK_WINDOW_DAYS ); const zScores = recentReadings.map(r => { if (r.baselineStdDev === 0) return 0; const raw = (r.currentValue - r.baselineMean) / r.baselineStdDev; // Invert metrics where downward movement is the concern. return ['commit_message_length', 'meeting_accept_rate'] .includes(r.signal) ? -raw : raw; }); const weightedScore = recentReadings.reduce((sum, r, i) => { return sum + zScores[i] * SIGNAL_WEIGHTS[r.signal]; }, 0); const activeCount = zScores.filter(z => Math.abs(z) > 1.0).length; return { score: weightedScore, confidence: activeCount >= MIN_SIGNALS_REQUIRED ? 'high' : 'low', activeSignals: activeCount, }; }

| Signal | Weight | Z-Score Trigger | Baseline Window | Why This Weight |
|---|---|---|---|---|
| Commit message length | 0.20 | 1.5 std below mean | 90 days | Noisy alone: many legitimate reasons produce short messages |
| PR review latency | 0.30 | 1.5 std above mean | 90 days | Strong signal: review habits are stable and deeply ingrained |
| Meeting accept rate | 0.25 | 1.5 std below mean | 90 days | Mid-weight signal: withdrawal pattern is distinctive and durable |
| Jira update timeliness | 0.25 | 1.5 std delayed | 90 days | Moderate signal: process-dependent, but the timing shift carries information |
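To make the table concrete, here is a minimal sketch of how a 90-day personal baseline and the 1.5-standard-deviation per-signal trigger might be computed. Only the 90-day window and the 1.5 trigger come from the table; the names `DailyObservation`, `computeBaseline`, and `isTriggered`, and the choice of sample standard deviation, are assumptions made for illustration.

```typescript
// Values from the table above.
const BASELINE_WINDOW_DAYS = 90;
const PER_SIGNAL_TRIGGER = 1.5;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical shape: one measured value per day for one person and one signal.
interface DailyObservation {
  date: Date;
  value: number;
}

interface Baseline {
  mean: number;
  stdDev: number;
  sampleCount: number;
}

// Mean and sample standard deviation over the 90 days before `asOf`.
function computeBaseline(history: DailyObservation[], asOf: Date): Baseline {
  const windowStart = asOf.getTime() - BASELINE_WINDOW_DAYS * MS_PER_DAY;
  const values = history
    .filter(o => o.date.getTime() >= windowStart && o.date.getTime() < asOf.getTime())
    .map(o => o.value);

  if (values.length === 0) return { mean: 0, stdDev: 0, sampleCount: 0 };

  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  // Sample variance (n - 1); a single data point has no spread.
  const variance = values.length > 1
    ? values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / (values.length - 1)
    : 0;

  return { mean, stdDev: Math.sqrt(variance), sampleCount: values.length };
}

// 'below' for signals where a drop is the concern (commit message length,
// meeting accept rate); 'above' where a rise is (review latency, Jira delay).
function isTriggered(
  current: number,
  baseline: Baseline,
  direction: 'above' | 'below'
): boolean {
  if (baseline.stdDev === 0) return false; // no spread, nothing to compare against
  const z = (current - baseline.mean) / baseline.stdDev;
  return direction === 'below' ? z <= -PER_SIGNAL_TRIGGER : z >= PER_SIGNAL_TRIGGER;
}
```

Because the baseline is computed per person, the same current value can trigger for one engineer and be unremarkable for another, which is the point of personal baselines.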

people-health-agent/
├── connectors/
│   ├── github-connector.ts
│   ├── jira-connector.ts
│   ├── calendar-connector.ts
│   ├── slack-connector.ts
│   └── hris-connector.ts
├── entity-resolution/
│   ├── identity-graph.ts
│   ├── deterministic-matcher.ts
│   ├── probabilistic-matcher.ts
│   └── reconciliation-job.ts
├── signals/
│   ├── commit-message-analyzer.ts
│   ├── review-latency-tracker.ts
│   ├── meeting-pattern-analyzer.ts
│   ├── jira-timeliness-tracker.ts
│   └── baseline-calculator.ts
├── scoring/
│   ├── z-score-normalizer.ts
│   ├── composite-scorer.ts
│   ├── persistence-filter.ts
│   └── alert-generator.ts
└── privacy/
    ├── data-retention-policy.ts
    ├── access-control.ts
    ├── audit-logger.ts
    └── employee-dashboard.ts
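The layout names a `persistence-filter.ts` module without showing it. One plausible shape, sketched here under stated assumptions, is a filter that only lets an alert through when the composite score stays elevated across most of the two-week window rather than spiking once. The 1.8 threshold and 14-day window match the scoring constants; the 70% fraction, the minimum-coverage rule, and the names `DailyScore` and `isPersistent` are hypothetical.

```typescript
const SCORE_THRESHOLD = 1.8;      // matches the composite z-score trigger
const PERSISTENCE_WINDOW_DAYS = 14; // matches the two-week persistence window
const MIN_FRACTION_ABOVE = 0.7;   // assumption: share of days that must exceed the threshold
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical shape: one composite score per person per day.
interface DailyScore {
  date: Date;
  score: number;
}

// True only when the score is elevated on most days in the window.
function isPersistent(scores: DailyScore[], asOf: Date): boolean {
  const windowStart = asOf.getTime() - PERSISTENCE_WINDOW_DAYS * ONE_DAY_MS;
  const inWindow = scores.filter(
    s => s.date.getTime() > windowStart && s.date.getTime() <= asOf.getTime()
  );

  // Assumption: with fewer than half the days covered, there is not
  // enough evidence to call anything persistent.
  if (inWindow.length < PERSISTENCE_WINDOW_DAYS / 2) return false;

  const daysAbove = inWindow.filter(s => s.score >= SCORE_THRESHOLD).length;
  return daysAbove / inWindow.length >= MIN_FRACTION_ABOVE;
}
```

A single bad day, or a rough three-day stretch, does not pass this filter. That is the mechanical difference between a rough sprint and a sustained pattern.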

| Scenario | Why It Breaks the Model | Mitigation |
|---|---|---|
| New hire (< 90 days) | Baseline data too thin to compute z-scores | Widen confidence intervals, require 4/4 signals, suppress alerts for the first 60 days |
| Role change or team transfer | Historical baseline no longer represents current expectations | Reset baseline with a 30-day burn-in window after the change event |
| Parental leave return | Extended absence creates a structural gap in baseline data | Restart baseline from return date, suppress alerts for 45 days |
| On-call rotation week | On-call duties distort all four signals at once | Tag on-call periods in the system and exclude them from signal calculation |
| Company-wide crunch period | Team-wide drift masks individual patterns | Detect team-level correlation and adjust individual thresholds dynamically |
| Part-time or reduced schedule | Lower activity volume produces artificial deviations | Normalize against scheduled hours, never against a full-time baseline |
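Several of these mitigations reduce to date arithmetic, so a rough sketch may help. The day counts below (60, 30, 45, and the 90-day new-hire period) are taken from the table, and the 3-signal default matches the scoring code; the `PersonContext` shape and every function name are assumptions for illustration, and the crunch-period threshold adjustment is left out because the table does not specify how it is computed.

```typescript
type BaselineEvent = 'hire' | 'role_change' | 'leave_return';

// Alert suppression or burn-in length per event, from the table above.
const SUPPRESSION_DAYS: Record<BaselineEvent, number> = {
  hire: 60,
  role_change: 30,
  leave_return: 45,
};
const NEW_HIRE_PERIOD_DAYS = 90;
const DAY_IN_MS = 24 * 60 * 60 * 1000;

// Hypothetical per-person context assembled from HRIS and scheduling data.
interface PersonContext {
  events: { type: BaselineEvent; date: Date }[];
  onCallPeriods: { start: Date; end: Date }[];
  scheduledHoursPerWeek: number;
}

function daysBetween(from: Date, to: Date): number {
  return (to.getTime() - from.getTime()) / DAY_IN_MS;
}

// True while any hire, role-change, or leave-return window is still open.
function isAlertSuppressed(ctx: PersonContext, asOf: Date): boolean {
  return ctx.events.some(e => {
    const elapsed = daysBetween(e.date, asOf);
    return elapsed >= 0 && elapsed < SUPPRESSION_DAYS[e.type];
  });
}

// New hires inside their first 90 days need all four signals; others need three.
function requiredSignalCount(ctx: PersonContext, asOf: Date): number {
  const isNewHire = ctx.events.some(e => {
    const elapsed = daysBetween(e.date, asOf);
    return e.type === 'hire' && elapsed >= 0 && elapsed < NEW_HIRE_PERIOD_DAYS;
  });
  return isNewHire ? 4 : 3;
}

// Days inside a tagged on-call period are excluded from signal calculation.
function isOnCall(ctx: PersonContext, day: Date): boolean {
  const t = day.getTime();
  return ctx.onCallPeriods.some(p => t >= p.start.getTime() && t <= p.end.getTime());
}

// Express activity per scheduled hour so part-time schedules are not
// compared against a full-time volume.
function normalizeVolume(rawCount: number, scheduledHoursPerWeek: number): number {
  return scheduledHoursPerWeek > 0 ? rawCount / scheduledHoursPerWeek : 0;
}
```

Note the gap between day 60 and day 90 for a new hire: alerts are no longer suppressed, but the stricter four-of-four requirement still applies.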

The detection agent is not a replacement for management. It is a reminder to manage.

Single-Metric Dashboards Are an Alibi, Not a Signal

Four Signals. Independently Sourced. That Is the Whole Trick.

Entity Resolution: One Person, Four Identities

Anchor on the HRIS — it is the only authoritative employee list

Run deterministic matching first — it pays for itself

Reach for probabilistic matching only on the residual

Treat the identity graph as a living system, not a setup task

The Composite Scoring Model That Refuses to Be Surveillance

Rough Sprint or About to Quit? The False-Positive Problem

Pattern Reads as Rough Sprint (Temporary)

Pattern Reads as Disengagement (Persistent)

Where Insight Stops and Surveillance Starts

Non-Negotiable Design Constraints

Aggregate before you store

Personal baselines stay personal

Employees see their own data first

No content analysis, ever

Right to explanation and opt-out

Retention limits enforce forgetting

Implementation Architecture: How the Detection Agent Actually Lays Out

Signal Detection Agent Project Structure

Edge Cases That Break a Naive Model — Every Time

What Managers Actually See — and Why That Is the Whole Design

Pre-Launch Checklist Before Deploying People Health Monitoring

Related
