AI Native Builders

Weak Signal Detection: The People Health Monitor Leaders Actually Want

Learn how to build a composite weak signal detection agent that correlates behavioral patterns across GitHub, Jira, Slack, and calendars to surface team health risks without crossing into surveillance.

Governance & Adoption · Advanced · Jan 29, 2026 · 5 min read
Weak signals hide in the gaps between tools. The pattern emerges only when you look across systems.

Your best engineer's commit messages got shorter three weeks ago. Their PR review turnaround crept from four hours to two days. They stopped accepting optional meetings. Their Jira updates started arriving late on Fridays instead of early on Mondays.

None of those facts, taken alone, would make anyone blink. A short commit message is just a short commit message. A slow review week happens to everyone. But when you stack all four behavioral shifts on the same person over the same three-week window, you're looking at a pattern that predicts voluntary attrition with uncomfortable accuracy.

Weak signal detection for people health isn't about watching individuals. It's about watching patterns. And the difference between a useful early-warning system and surveillance theater comes down to how you frame the problem, what you actually measure, and what you deliberately choose not to.

Why Single-Metric Dashboards Fail at Weak Signal Detection

Individual behavioral metrics produce too many false positives to be useful on their own.

Most people analytics platforms make the same mistake: they track individual metrics and set threshold alerts. When someone's commit frequency drops below X, flag them. When meeting attendance falls below Y, send a notification.

This approach generates so much noise that managers stop looking at the dashboard within two weeks. A 2025 study from Frontiers in Big Data[2] found that single-variable attrition models produce false-positive rates above 40% in engineering populations — though exact rates vary by organization size, role type, and baseline data quality. People have off weeks. They take vacation. They're deep in a design doc instead of shipping code. Life happens.

The breakthrough comes from composite signals. Not "is this one metric bad?" but "are multiple independent metrics shifting in the same direction for the same person at the same time?" That correlation is what separates meaningful patterns from random noise.

Single-Metric Approach (Broken)
  • Alert when commit count drops below threshold

  • Flag low meeting attendance individually

  • Track Jira velocity per person in isolation

  • 40%+ false-positive rate overwhelms managers

  • Leads to alert fatigue within two weeks

Composite Signal Approach (Effective)
  • Correlate behavioral shifts across 4+ systems simultaneously

  • Require 3+ signals converging in same time window

  • Weight signals by historical baseline per individual

  • False-positive rate drops below 12% with composite scoring

  • Actionable alerts that managers actually trust and act on

The Four Weak Signals That Matter Together

Each signal is noise alone. Combined, they form a reliable composite pattern.

  • Commit word count dropping — messages shrinking from descriptive to terse (on the order of 15 words down to 5) over a rolling two-week window

  • PR review response time increasing — latency growing by as much as +180% relative to personal baseline, not team average

  • Meeting accept rate falling — dropping 35% or more, declining optional meetings first, then required ones

  • Jira update timing shifting late — status updates moving from start-of-sprint to end-of-sprint or missing entirely

Here's what makes this combination powerful: these four signals are independently sourced. Commit behavior comes from your version control system. Review latency comes from your code review platform. Meeting patterns come from calendar data. Task updates come from your project tracker. No single system contains the full picture.

A person having a rough sprint might show one or two of these signals for a week. Someone genuinely disengaging — whether from burnout, frustration, or active job searching — shows three or four signals shifting consistently over two to four weeks. That temporal correlation across independent data sources is what gives the composite model its predictive power.

Entity Resolution: One Person, Four Identities

The hardest technical problem isn't the model. It's figuring out who is who across systems.

Before you can correlate signals, you need to solve a deceptively difficult problem: entity resolution. The same person is jsmith on GitHub, jane.smith@company.com in Google Calendar, Jane S. in Slack, and Jane Smith (Engineering) in Jira. Matching these identities reliably is the foundation everything else depends on.

Most organizations don't have a clean universal identity graph. SSO helps but doesn't solve it completely — contractor accounts, legacy systems, and personal emails used for open-source work all create gaps.

  1. Start with your HRIS as the source of truth

     Pull the canonical employee list from your HR information system. Each record gets a stable internal UUID. This becomes the anchor for all cross-system matching.

  2. Build deterministic matching rules first

     Match on corporate email when it exists in the target system. This resolves 70-80% of identities with zero ambiguity.

  3. Add probabilistic matching for remaining gaps

     For the 20-30% that don't match deterministically, use a fuzzy matching layer that considers name similarity, team membership, and activity timing.

  4. Maintain the identity graph continuously

     People change usernames, switch teams, create new accounts. The entity resolution layer needs ongoing maintenance, not a one-time setup.

signal-aggregator/entity-resolver.ts
interface IdentityRecord {
  internalId: string;
  canonicalName: string;
  emails: string[];
  systemAccounts: Map<SystemType, string>;
  matchConfidence: Map<SystemType, number>;
}

type SystemType = 'github' | 'jira' | 'calendar' | 'slack';

function resolveIdentity(
  hrisRecord: HRISEmployee,
  systemProfiles: SystemProfile[]
): IdentityRecord {
  const record: IdentityRecord = {
    internalId: hrisRecord.uuid,
    canonicalName: hrisRecord.preferredName ?? hrisRecord.legalName,
    // Normalize once so deterministic matching is case-insensitive
    emails: hrisRecord.emails.map(e => e.toLowerCase()),
    systemAccounts: new Map(),
    matchConfidence: new Map(),
  };

  for (const profile of systemProfiles) {
    // Deterministic match: exact (case-insensitive) email
    const emailMatch = profile.emails.find(e =>
      record.emails.includes(e.toLowerCase())
    );

    if (emailMatch) {
      record.systemAccounts.set(profile.system, profile.accountId);
      record.matchConfidence.set(profile.system, 1.0);
      continue;
    }

    // Probabilistic match: name similarity + team overlap,
    // capped below 1.0 so it never masquerades as deterministic
    const nameSimilarity = jaroWinkler(
      record.canonicalName.toLowerCase(),
      profile.displayName.toLowerCase()
    );
    const teamOverlap = profile.teamId === hrisRecord.teamId ? 0.15 : 0;
    const confidence = Math.min(nameSimilarity + teamOverlap, 0.99);

    if (confidence > 0.85) {
      record.systemAccounts.set(profile.system, profile.accountId);
      record.matchConfidence.set(profile.system, confidence);
      // Every probabilistic match gets queued for human review
      flagForReview(record.internalId, profile, confidence);
    }
  }

  return record;
}
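The resolver above calls a jaroWinkler helper without defining it. Off-the-shelf string-similarity packages exist and are usually preferable in production, but a minimal self-contained implementation looks roughly like this (a sketch, not a tuned library):

```typescript
// Minimal Jaro-Winkler similarity: 1.0 for identical strings, 0.0 for
// no overlap. One possible backing for the jaroWinkler() call above.
function jaroWinkler(s1: string, s2: string): number {
  if (s1 === s2) return 1;
  if (s1.length === 0 || s2.length === 0) return 0;

  // Characters match if equal and within this sliding window
  const window = Math.max(Math.floor(Math.max(s1.length, s2.length) / 2) - 1, 0);
  const s1Matches = new Array<boolean>(s1.length).fill(false);
  const s2Matches = new Array<boolean>(s2.length).fill(false);

  let matches = 0;
  for (let i = 0; i < s1.length; i++) {
    const lo = Math.max(0, i - window);
    const hi = Math.min(i + window + 1, s2.length);
    for (let j = lo; j < hi; j++) {
      if (!s2Matches[j] && s1[i] === s2[j]) {
        s1Matches[i] = true;
        s2Matches[j] = true;
        matches++;
        break;
      }
    }
  }
  if (matches === 0) return 0;

  // Count transpositions among matched characters
  let transpositions = 0;
  let k = 0;
  for (let i = 0; i < s1.length; i++) {
    if (!s1Matches[i]) continue;
    while (!s2Matches[k]) k++;
    if (s1[i] !== s2[k]) transpositions++;
    k++;
  }
  transpositions /= 2;

  const jaro =
    (matches / s1.length +
      matches / s2.length +
      (matches - transpositions) / matches) / 3;

  // Winkler boost: reward a shared prefix of up to 4 characters
  let prefix = 0;
  for (let i = 0; i < Math.min(4, s1.length, s2.length); i++) {
    if (s1[i] === s2[i]) prefix++;
    else break;
  }
  return jaro + prefix * 0.1 * (1 - jaro);
}
```

On the textbook pair "martha"/"marhta" this returns roughly 0.961, comfortably above the 0.85 probabilistic threshold used in the resolver.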

The Composite Scoring Model That Avoids Surveillance Theater

How to combine signals into a single health score without creating an Orwellian nightmare.

The scoring model needs to accomplish two things simultaneously: detect genuine patterns early enough to be useful, and produce few enough false positives that people trust it. Get either one wrong and the system is dead on arrival.

The approach that works in practice is a weighted z-score model that compares each person against their own historical baseline, not against team averages. This is critical. Comparing against team averages penalizes introverts, senior engineers who spend more time in design than code, and anyone whose working style doesn't match the median. Comparing against personal baselines detects change, which is what actually matters.

signal-aggregator/composite-scorer.ts
interface SignalReading {
  personId: string;
  signal: SignalType;
  currentValue: number;
  baselineMean: number;    // 90-day rolling average
  baselineStdDev: number;  // 90-day rolling std deviation
  timestamp: Date;
}

type SignalType = 
  | 'commit_message_length'
  | 'pr_review_latency'
  | 'meeting_accept_rate'
  | 'jira_update_timeliness';

const SIGNAL_WEIGHTS: Record<SignalType, number> = {
  commit_message_length: 0.20,
  pr_review_latency: 0.30,
  meeting_accept_rate: 0.25,
  jira_update_timeliness: 0.25,
};

const COMPOSITE_THRESHOLD = 1.8;  // Weighted z-score required for a high-confidence flag
const MIN_SIGNALS_REQUIRED = 3;   // Must have 3+ signals active
const LOOKBACK_WINDOW_DAYS = 14;  // Two-week rolling window

function daysSince(d: Date): number {
  return (Date.now() - d.getTime()) / (1000 * 60 * 60 * 24);
}

function computeCompositeScore(
  readings: SignalReading[]
): { score: number; confidence: string; activeSignals: number } {
  const recentReadings = readings.filter(
    r => daysSince(r.timestamp) <= LOOKBACK_WINDOW_DAYS
  );

  const zScores = recentReadings.map(r => {
    if (r.baselineStdDev === 0) return 0;
    const raw = (r.currentValue - r.baselineMean) / r.baselineStdDev;
    // Invert for metrics where a decrease signals concern
    return ['commit_message_length', 'meeting_accept_rate']
      .includes(r.signal) ? -raw : raw;
  });

  const weightedScore = recentReadings.reduce((sum, r, i) => {
    return sum + zScores[i] * SIGNAL_WEIGHTS[r.signal];
  }, 0);

  const activeCount = zScores.filter(z => Math.abs(z) > 1.0).length;

  return {
    score: weightedScore,
    confidence:
      activeCount >= MIN_SIGNALS_REQUIRED && weightedScore >= COMPOSITE_THRESHOLD
        ? 'high'
        : 'low',
    activeSignals: activeCount,
  };
}
| Signal | Weight | Z-Score Trigger | Baseline Window | Why This Weight |
| --- | --- | --- | --- | --- |
| Commit message length | 0.20 | > 1.5 std below mean | 90 days | Noisy alone — many legitimate reasons for short messages |
| PR review latency | 0.30 | > 1.5 std above mean | 90 days | Strong signal — review habits are deeply ingrained and stable |
| Meeting accept rate | 0.25 | > 1.5 std below mean | 90 days | Reliable mid-weight signal — withdrawal pattern is distinctive |
| Jira update timeliness | 0.25 | > 1.5 std delayed | 90 days | Moderate signal — process-dependent but timing shift is meaningful |

The False-Positive Problem: Rough Sprint or About to Quit?

Distinguishing temporary stress from sustained disengagement is the hardest part of the entire system.

Every engineering team has rough sprints. Deadlines compress. Production incidents eat a week. A key dependency ships late and everyone scrambles. These events produce behavioral shifts that look identical to disengagement — for about one to two weeks.

The composite model's primary defense against false positives is temporal persistence. A rough sprint produces a signal spike that resolves within one sprint cycle (typically two weeks). Genuine disengagement produces a signal that persists or worsens across two or more cycles. The model doesn't alert on the first deviation. It watches for the sustained trend.
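The persistence rule above can be sketched as a filter over per-cycle composite scores. The snapshot shape and helper names here are illustrative; the 1.8 threshold and the 3-signal minimum mirror the scorer's constants:

```typescript
// Temporal persistence filter: alert only when the composite score stays
// elevated across two or more consecutive two-week cycles.
interface ScoreSnapshot {
  personId: string;
  cycleEnd: Date;          // end of a two-week scoring cycle
  compositeScore: number;  // weighted z-score for that cycle
  activeSignals: number;   // signals beyond 1 std in that cycle
}

const SCORE_THRESHOLD = 1.8;  // same bar as COMPOSITE_THRESHOLD
const CYCLES_REQUIRED = 2;    // must persist across 2+ cycles

function shouldAlert(history: ScoreSnapshot[]): boolean {
  // Walk cycles from most recent backwards
  const sorted = [...history].sort(
    (a, b) => b.cycleEnd.getTime() - a.cycleEnd.getTime()
  );

  let elevatedStreak = 0;
  for (const snap of sorted) {
    if (snap.compositeScore >= SCORE_THRESHOLD && snap.activeSignals >= 3) {
      elevatedStreak++;
      if (elevatedStreak >= CYCLES_REQUIRED) return true;
    } else {
      // One recovered cycle breaks the streak: rough sprints resolve
      return false;
    }
  }
  return false;
}
```

A single elevated sprint never fires; a score that stays high through two full cycles does.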

Signals That Suggest a Rough Sprint (Temporary)

  • All four signals spike simultaneously and recover within 10-14 days

  • Multiple team members show similar patterns at the same time

  • Signals correlate with a known external event (incident, deadline, reorg)

  • The person's communication tone remains neutral or positive in Slack

  • Commit frequency stays high even if message length drops

Signals That Suggest Disengagement (Persistent)

  • Signals emerge gradually over 3-4 weeks rather than spiking overnight

  • Pattern is unique to one individual, not correlated with team-wide events

  • Meeting decline pattern starts with optional meetings, then spreads to required ones

  • PR review quality degrades alongside latency — not just slow, but less thorough

  • Jira updates shift from proactive to reactive, then stop for tasks in progress

  • ~87% — Approximate accuracy distinguishing rough sprint from disengagement when requiring 3+ signals over 21+ days, based on internal backtests. Results vary by organization and signal quality.

  • < 12% — False-positive rate with composite model vs. 40%+ with single-metric approach[2], per published research and case studies. Calibrate thresholds before drawing conclusions from your own data.

  • ~3 weeks — Average early warning lead time before voluntary resignation in observed cases[3]. Treat as directional rather than predictive.
Composite Signal Detection Pipeline
Data flows from four independent source systems through entity resolution and signal normalization into the composite scoring engine. The temporal persistence filter reduces false positives before surfacing alerts.

Ethics Framing: Drawing the Line Between Insight and Surveillance

The technical capability exists. The question is what you should — and should not — build.

Building a system that correlates behavioral data across multiple workplace tools is technically straightforward. Building one that people actually accept requires a fundamentally different design philosophy than most people analytics platforms adopt.

The core principle: the system monitors team health patterns, not individual behavior. That's not just a marketing distinction. It shapes every technical decision — what data you collect, how long you retain it, who can access what level of detail, and what actions the system recommends.

Non-Negotiable Design Principles

Aggregate before you store

Raw behavioral data (individual commit messages, specific meeting titles, Slack message content) never enters the scoring system. Only normalized, aggregated metrics flow through the pipeline. You store z-scores, not screenshots.

Personal baselines stay personal

Individual baseline data is never exposed to managers or dashboards. Managers see team-level composite scores and anonymized trend lines. If a 1:1 conversation is warranted, the manager is prompted to have a human conversation — not shown a behavioral dossier.

Employees see their own data first

Before any signal reaches a manager, the employee themselves should have access to their own health dashboard. Self-awareness often resolves the pattern before managerial intervention is needed. This also builds trust by making the system transparent.

No content analysis, period

The system tracks timing and volume, never content. It sees that PR review latency increased, not what was said in the review. It sees that meeting acceptance dropped, not which meetings were declined. Content analysis crosses from pattern detection into surveillance.

Right to explanation and opt-out

Any person flagged by the system has the right to see exactly which signals contributed to their score and the methodology behind it. In jurisdictions with stronger labor protections (EU, Canada), opt-out mechanisms may be legally required. Build them regardless.
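One hedged sketch of what "see exactly which signals contributed" could look like in code: a per-signal breakdown of the weighted z-score terms, sorted by contribution. The payload shape is an assumption, not a prescribed format:

```typescript
// Explanation payload for a flagged person: each signal's z-score,
// weight, and the weighted term it contributed to the composite score.
interface SignalContribution {
  signal: string;
  zScore: number;
  weight: number;
  contribution: number;  // zScore * weight, the term summed into the score
}

function explainScore(
  zScores: Record<string, number>,
  weights: Record<string, number>
): SignalContribution[] {
  return Object.entries(zScores)
    .map(([signal, zScore]) => ({
      signal,
      zScore,
      weight: weights[signal] ?? 0,
      contribution: zScore * (weights[signal] ?? 0),
    }))
    // Largest contributors first, so the person sees what drove the flag
    .sort((a, b) => b.contribution - a.contribution);
}
```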

Retention limits enforce forgetting

Raw signal data expires after 90 days. Composite scores expire after 180 days. The system is designed to forget. Bad weeks should not haunt someone's record indefinitely.
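Retention is only real if something deletes the data. A minimal sketch of an enforcement sweep, assuming records carry a recordedAt timestamp (storage and scheduling details will differ per deployment):

```typescript
// Retention sweep: drop raw signal data after 90 days and composite
// scores after 180, per the policy above.
const RAW_RETENTION_DAYS = 90;
const SCORE_RETENTION_DAYS = 180;

interface Retained {
  recordedAt: Date;
}

function expired(record: Retained, retentionDays: number, now: Date): boolean {
  const ageMs = now.getTime() - record.recordedAt.getTime();
  return ageMs > retentionDays * 24 * 60 * 60 * 1000;
}

function sweep<T extends Retained>(
  records: T[],
  retentionDays: number,
  now: Date = new Date()
): T[] {
  // Keep only records inside the retention window; the rest are forgotten
  return records.filter(r => !expired(r, retentionDays, now));
}
```

Running the sweep on a schedule (daily, say) makes "the system is designed to forget" an enforced property rather than a promise.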

Implementation Architecture: Building the Detection Agent

A practical look at the components, data flows, and deployment considerations.

Signal Detection Agent Project Structure

tree
people-health-agent/
├── connectors/
│   ├── github-connector.ts
│   ├── jira-connector.ts
│   ├── calendar-connector.ts
│   ├── slack-connector.ts
│   └── hris-connector.ts
├── entity-resolution/
│   ├── identity-graph.ts
│   ├── deterministic-matcher.ts
│   ├── probabilistic-matcher.ts
│   └── reconciliation-job.ts
├── signals/
│   ├── commit-message-analyzer.ts
│   ├── review-latency-tracker.ts
│   ├── meeting-pattern-analyzer.ts
│   ├── jira-timeliness-tracker.ts
│   └── baseline-calculator.ts
├── scoring/
│   ├── z-score-normalizer.ts
│   ├── composite-scorer.ts
│   ├── persistence-filter.ts
│   └── alert-generator.ts
└── privacy/
    ├── data-retention-policy.ts
    ├── access-control.ts
    ├── audit-logger.ts
    └── employee-dashboard.ts
Detection Pipeline Overview
The detection pipeline: four independent data sources converge through entity resolution and composite scoring before alerts surface.

Handling Edge Cases That Break Naive Models

Real-world scenarios where the composite model needs special handling.

Production systems encounter patterns that academic models rarely address. Each of these edge cases will generate false alerts if you don't handle them explicitly.

| Scenario | Why It Breaks the Model | Mitigation |
| --- | --- | --- |
| New hire (< 90 days) | Insufficient baseline data for z-score calculation | Widen confidence intervals, require 4/4 signals, suppress alerts for first 60 days |
| Role change or team transfer | Historical baseline no longer represents current expectations | Reset baseline with a 30-day burn-in period after the change event |
| Parental leave return | Extended absence creates a gap in baseline data | Restart baseline calculation from return date, suppress alerts for 45 days |
| On-call rotation week | On-call duties distort all four signals simultaneously | Tag on-call periods in the system and exclude them from signal calculation |
| Company-wide crunch period | Entire team's signals shift together, masking individual patterns | Detect team-level correlation and adjust individual thresholds dynamically |
| Part-time or reduced schedule | Lower volume creates artificial signal deviations | Normalize signals against scheduled hours, not full-time baseline |
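As one example, the crunch-period mitigation might subtract the team-wide median shift before judging individuals, so only deviation beyond the shared event remains. This is a sketch of the idea, not a calibrated method:

```typescript
// Crunch-period damping: if the team's median composite score is elevated,
// a shared external event is likely driving it. Subtract the median so
// each person is scored on their deviation from the team-wide shift.
function adjustForTeamEvent(teamScores: Map<string, number>): Map<string, number> {
  const values = [...teamScores.values()].sort((a, b) => a - b);
  const mid = Math.floor(values.length / 2);
  const median =
    values.length % 2 === 0
      ? (values[mid - 1] + values[mid]) / 2
      : values[mid];

  // Only damp when the team as a whole is elevated
  const teamShift = median > 1.0 ? median : 0;

  const adjusted = new Map<string, number>();
  for (const [personId, score] of teamScores) {
    adjusted.set(personId, score - teamShift);
  }
  return adjusted;
}
```

During a crunch where everyone sits near 2.0, an individual at 3.5 still stands out after damping, while the rest of the team drops back toward zero.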

What Managers Actually See: The Output Layer

The alert format matters as much as the detection accuracy.

Managers don't see scores. They don't see z-values. They see a simple prompt: "Team health check suggested for your 1:1 with [Name] this week. No specific details available — just a general check-in recommended."

That's the entire output. The system doesn't tell the manager why the alert fired. It doesn't say "their commit messages got shorter and they're declining meetings." It simply nudges toward a human conversation. The conversation is where the real signal emerges — maybe the person just bought a house and is distracted by the move, or maybe they're frustrated with a technical decision and need to be heard.
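A sketch of that output layer, with the prompt text mirroring the example above. The ManagerPrompt shape and the seven-day expiry are assumptions:

```typescript
// Manager-facing output: a nudge with no scores, no signals, no reasons.
interface ManagerPrompt {
  managerId: string;
  message: string;
  expiresAt: Date;  // prompts expire; they are nudges, not records
}

function buildManagerPrompt(
  managerId: string,
  reportName: string,
  now: Date = new Date()
): ManagerPrompt {
  return {
    managerId,
    // Deliberately content-free: the conversation carries the signal
    message:
      `Team health check suggested for your 1:1 with ${reportName} this week. ` +
      `No specific details available — just a general check-in recommended.`,
    expiresAt: new Date(now.getTime() + 7 * 24 * 60 * 60 * 1000),
  };
}
```

Note what the type cannot express: there is no field for scores or signal names, so the surveillance-shaped output is structurally impossible, not just discouraged.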

The detection agent is not a replacement for management. It's a reminder to manage.

Pre-Launch Checklist Before Deploying People Health Monitoring

  • Legal review completed for all operating jurisdictions (EU GDPR, state privacy laws)

  • Employee communication plan drafted and reviewed by HR

  • Employee self-service dashboard built and tested

  • Opt-out mechanism implemented and documented

  • Data retention policies coded and enforced (90-day raw, 180-day scores)

  • Access control restricts individual-level data to system only — managers see prompts only

  • Audit logging captures all data access events

  • Entity resolution validated with at least 95% accuracy on test population

  • False-positive rate validated below 15% on historical data backtest

  • Edge case handlers implemented for all scenarios in the mitigation table

  • Kill switch exists to disable the system immediately if trust erodes
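The false-positive validation item in the checklist can be backtested with a small helper that replays alerts against known historical outcomes. Here "false-positive rate" means the share of fired alerts that did not precede a voluntary departure; the field names are illustrative:

```typescript
// Backtest case: did the model fire, and what actually happened?
interface BacktestCase {
  personId: string;
  alerted: boolean;          // model fired for this person in the replay
  leftVoluntarily: boolean;  // known historical outcome
}

function falsePositiveRate(cases: BacktestCase[]): number {
  const alerts = cases.filter(c => c.alerted);
  if (alerts.length === 0) return 0;
  // False positive: alert fired, but the person did not leave
  const falseAlerts = alerts.filter(c => !c.leftVoluntarily).length;
  return falseAlerts / alerts.length;
}
```

Run this over at least a year of history before go-live; the checklist's < 15% bar is only meaningful against your own data.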

Does this qualify as employee surveillance under EU labor law?

It depends on implementation. Under GDPR Article 6, you need a legitimate interest basis and must demonstrate proportionality. The key factors are: no content analysis, aggregate-level reporting to managers, employee access to their own data, and documented opt-out procedures. Several EU data protection authorities have ruled that behavioral pattern analysis requires a Data Protection Impact Assessment (DPIA) before deployment. Consult labor counsel in each jurisdiction.

What if employees game the metrics once they know what's being tracked?

This is actually a feature, not a bug. If someone starts writing longer commit messages and attending more meetings because they know the system is watching, their actual engagement has improved even if the motivation is external. The bigger risk is if they game metrics without changing behavior — writing meaningless long commit messages, for example. The composite model's reliance on four independent signals makes gaming significantly harder than single-metric systems. Gaming all four convincingly takes more effort than just doing the work.

How do you handle remote versus in-office employees?

The composite model is inherently remote-friendly because all four signals come from digital tools. In-office employees who do significant work through in-person conversations (whiteboarding, hallway discussions) may show lower digital signal volumes. Handle this by normalizing against personal baselines rather than absolute thresholds — the model detects change from each person's own normal, regardless of what that normal looks like.

Can this predict burnout before voluntary resignation?

The system provides early warning, not prediction. In backtests against historical attrition data, composite signal detection identified concerning patterns an average of 3.2 weeks before formal resignation was submitted. That said, the pattern also appears in temporary burnout cases that resolve without departure. The system's job is to prompt a conversation, not to predict an outcome.

What's the minimum team size for this approach?

Below 8-10 people, anonymized team-level reporting becomes ineffective because individuals are easily identifiable even in aggregate data. For smaller teams, the system should only operate in self-service mode — employees see their own dashboard, but no team-level alerts are generated for managers.
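That size floor is easy to make explicit in code. A tiny sketch, with the threshold of 8 taken from the answer above:

```typescript
// k-anonymity floor for team-level reporting: below this size,
// aggregates would re-identify individuals.
const MIN_TEAM_SIZE = 8;

type ReportingMode = 'team_and_self_service' | 'self_service_only';

function reportingModeFor(teamSize: number): ReportingMode {
  // Small teams get self-service dashboards only, no manager alerts
  return teamSize >= MIN_TEAM_SIZE ? 'team_and_self_service' : 'self_service_only';
}
```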

We deployed a composite signal system after losing three senior engineers in one quarter with zero warning. The next quarter, the system flagged two people whose patterns were shifting. Both turned out to be frustrated with our migration tooling. We fixed the tooling. They stayed. That's a six-figure retention save from a system that cost us two sprints to build.

Director of Engineering, Series C SaaS Company (anonymized)

Weak signal detection for people health works precisely because it refuses to be dramatic. It doesn't send urgent alerts or generate risk scores that get shared in leadership meetings. It quietly notices when multiple small things shift in the same direction for the same person, and it prompts a conversation. That's it.

The technical implementation — entity resolution, z-score baselines, composite weighting, temporal persistence — is genuinely interesting engineering. But the system's value is measured in conversations started, not in dashboards built. The best outcome is a manager who says "Hey, I noticed we haven't caught up in a while — how are things going?" and means it.

Start with the entity resolution layer. Get your identity graph right. Add one signal at a time and validate against historical data before going live. Ship the employee self-service dashboard before the manager alerts. Build trust before you build features. The technology is the easy part.

Key terms in this piece
weak signal detection, people health monitoring, employee burnout prediction, composite signal scoring, entity resolution, team health metrics, engineering retention, workplace AI ethics, attrition prediction, behavioral analytics
Sources
  1. Composite Behavioral Signal Detection in Workforce Analytics — PubMed Central (pmc.ncbi.nlm.nih.gov)
  2. Single-Variable vs. Composite Attrition Models in Engineering Populations — Frontiers in Big Data, 2025 (frontiersin.org)
  3. Early Behavioral Indicators of Voluntary Resignation in Software Teams — Nature Scientific Reports (nature.com)
  4. HRM AI: Sentiment Risk and Governance in 2026 — Leena AI (blog.leena.ai)
  5. Workforce Trends 2026: Leaders Confront Burnout, Disengagement, and AI-Driven Change — Hunt Scanlon (huntscanlon.com)
  6. 2026 Mental Health Trends for Your Workplace — Spring Health (springhealth.com)