The burned-out engineer says it again: "I mentioned this three 1:1s ago." That sentence is not a feedback failure. It is a memory failure compounded by a tooling gap. The manager skimmed Jira, glanced at last week's standup, and walked in trusting their working memory to surface the pattern. Working memory loses to recency every time.
The fix is not better listening. It is a per-EM (engineering manager) intelligence brief that runs 30 minutes before each scheduled 1:1. It pulls team Jira metrics, recent GitHub pull requests, the most recent 5/15 report, recognition mentions, retro concerns, and every open action item. Then it does the part that working memory cannot: it tags each data point as new since last 1:1, continuing pattern, or resolved. The output is a one-page brief that says exactly what changed and what persisted. You walk in with the conversation already triaged.
Manual Prep Loses to Recency Bias. Every Time.
The signal is scattered across five tools. Working memory cannot reconstruct it under time pressure.
Engineering managers touch four to seven tools in any given week. Jira for sprint work. GitHub for code review. Google Docs for 5/15s. Slack for pulse checks. Whatever HR platform owns performance notes. Preparing for one 1:1 means context-switching across all of them — and that assumes you remember which tool holds which signal.
Jellyfish's research on engineering intelligence puts EM meeting prep at 3+ hours per week, work that could be partially or fully automated.[3] With six direct reports on weekly 1:1s, that is a meaningful chunk of management time spent gathering information rather than having the conversation itself.
The deeper failure mode is recency. When you prep manually, the loud Slack thread from yesterday wins over the quiet pattern of three sprints with declining velocity. Whatever caught your eye dominates. The intelligence brief cuts the bias by surfacing data systematically — not by what was loudest.
3+ hours per week: Time the manager could spend on the conversation, not the tool-switching
4 to 7 tools: Each one is a separate tab, a separate auth flow, a separate mental model
30 minutes before the 1:1: Late enough to capture last-minute changes, early enough to actually read it
Six Sections, Each Tagged With What Changed
The annotation layer is the work. The data is just the input.
The brief has six sections: five built from the data sources and a delta summary that sits on top. The data is not the value; raw numbers are easy. The value is the annotation layer: every metric and every item carries a tag (new since last 1:1, continuing pattern, or resolved). Anything unchanged drops out of the executive summary. What persists gets escalated. What recovered gets acknowledged. The brief filters before you read it.
- [01]
Jira Sprint Metrics
Velocity over the last three sprints, current blocked count, scope churn, cycle time median. Each compared against the value frozen at the previous 1:1. A velocity drop above 15% or a blocked-count jump of 2+ trips a highlight flag. The threshold is the constraint; the comparison is the mechanism.
- [02]
GitHub Pull Request Activity
Open PRs authored by the report, PRs they reviewed, average review turnaround, and any PR open beyond 72 hours. Grouped by status: merged, in review, stale. PR-age drift is one of the earliest signals that someone is stuck — earlier than they will say it out loud.
- [03]
Most Recent 5/15 Report
Themes pulled from the engineer's latest 5/15: self-reported blockers, accomplishments, mood signal. Then cross-referenced against the Jira and GitHub data for consistency. When self-reported mood is fine but Jira shows three sprints of declining velocity, the gap itself is the signal.
- [04]
Recognition and Concerns From Last Sprint
Peer kudos pulled from the team's recognition channel. Retro concerns pulled from the last retrospective doc. Both deduplicated and ranked by recency. Repeat concerns across multiple retros surface as a separate flag — the team has named the same problem twice and nothing changed.
- [05]
Open Action Items
Every action item assigned in previous 1:1s that has not closed. Each shows its age, the meeting that produced it, and any progress logged since. Action items older than two weeks get marked overdue — the working test for whether the 1:1 series is actually shipping outcomes or just generating to-dos.
- [06]
Delta Summary
A top-of-brief executive read distilled to three to five bullets. This is the version you read if you have 90 seconds before the meeting starts. Each bullet carries its tag — new, continuing, or resolved — so the priority is encoded in the format, not implied.
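The overdue rule for open action items can be sketched as follows. This is a minimal sketch, not the system's actual implementation: the `ActionItem` shape and the `ageActionItems` helper are hypothetical names, and only the two-week cutoff comes from the text above.

```typescript
// Hypothetical shape for an action item carried between 1:1s.
interface ActionItem {
  id: string;
  title: string;
  createdAt: string; // ISO 8601 timestamp of the 1:1 that produced it
  closed: boolean;
}

interface AgedActionItem {
  id: string;
  title: string;
  ageDays: number;
  overdue: boolean;
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;
const OVERDUE_AFTER_DAYS = 14; // "older than two weeks" from the text

// Returns only the open items, each with its age in whole days, oldest first.
function ageActionItems(items: ActionItem[], now: Date): AgedActionItem[] {
  return items
    .filter((item) => !item.closed)
    .map((item) => {
      const ageDays = Math.floor(
        (now.getTime() - new Date(item.createdAt).getTime()) / MS_PER_DAY
      );
      return {
        id: item.id,
        title: item.title,
        ageDays,
        overdue: ageDays > OVERDUE_AFTER_DAYS,
      };
    })
    .sort((a, b) => b.ageDays - a.ageDays);
}
```

Sorting oldest first puts the items most likely to be overdue at the top of the section, which matches how the brief is meant to be skimmed.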
Delta Detection: Why Snapshot-vs-Now Beats Raw Numbers
Current state is a number. Compared against the last meeting, it becomes a decision.
The engine compares current state against a snapshot frozen at the moment the previous 1:1 ended — not against rolling averages, not against last week. The snapshot is the load-bearing piece. It is what makes the diff meaningful instead of decorative.
Generation runs in three steps. First, retrieve the stored snapshot from the last meeting: every tracked metric, captured at end-of-1:1. Second, fetch the current values from each integration. Third, run the diff and assign tags.
The diff produces three categories: new since last 1:1, continuing pattern, and resolved. For example:
Snapshot at the previous 1:1
Velocity: 34 story points
Blocked tickets: 1
Open PRs: 3 (avg age: 1.2 days)
Action items: 2 open
5/15 mood: Positive
Current state
Velocity: 28 story points (down 18%, flagged)
Blocked tickets: 3 (new since last 1:1)
Open PRs: 5 (avg age: 2.8 days, continuing pattern)
Action items: 1 open, 1 resolved
5/15 mood: Neutral (downshift detected)
```typescript
// delta-engine.ts
// Snapshot-vs-now diff. The thresholds are the policy: keep them in code, not the prompt.

interface MetricSnapshot {
  timestamp: string;
  velocity: number;
  blockedCount: number;
  openPRs: number;
  avgPRAge: number; // days
  actionItems: { id: string; status: 'open' | 'resolved' }[];
  moodSignal: 'positive' | 'neutral' | 'negative';
}

type ChangeTag = 'new-since-last' | 'continuing-pattern' | 'resolved';

interface DeltaResult {
  metric: string;
  previous: string | number;
  current: string | number;
  tag: ChangeTag;
  severity: 'info' | 'watch' | 'action';
}

function detectDeltas(
  previous: MetricSnapshot,
  current: MetricSnapshot
): DeltaResult[] {
  const deltas: DeltaResult[] = [];

  // Velocity drift. Skipped when the previous velocity is 0, which would divide by zero.
  if (previous.velocity > 0) {
    const velocityChange =
      (current.velocity - previous.velocity) / previous.velocity;
    // Report any swing beyond 10%; a drop of more than 15% is severity 'action'.
    if (Math.abs(velocityChange) > 0.1) {
      deltas.push({
        metric: 'velocity',
        previous: previous.velocity,
        current: current.velocity,
        tag: 'new-since-last',
        severity: velocityChange < -0.15 ? 'action' : 'watch',
      });
    }
  }

  // Blocked tickets: escalation, not just count
  if (current.blockedCount > previous.blockedCount) {
    deltas.push({
      metric: 'blocked-tickets',
      previous: previous.blockedCount,
      current: current.blockedCount,
      // A jump of 2+ is treated as new; a smaller rise as a continuing pattern.
      tag:
        current.blockedCount - previous.blockedCount >= 2
          ? 'new-since-last'
          : 'continuing-pattern',
      severity: current.blockedCount >= 3 ? 'action' : 'watch',
    });
  }

  // PR age drift: the earliest stuck signal (fires when average age grows by more than 50%)
  if (current.avgPRAge > previous.avgPRAge * 1.5) {
    deltas.push({
      metric: 'pr-review-latency',
      previous: previous.avgPRAge,
      current: current.avgPRAge,
      tag: previous.avgPRAge > 2 ? 'continuing-pattern' : 'new-since-last',
      severity: current.avgPRAge > 3 ? 'action' : 'info',
    });
  }

  return deltas;
}
```
New vs Continuing: The Distinction That Makes the Brief Useful
One sprint is a data point. Three is a structural problem the manager has been ignoring.
The split between new since last 1:1 and continuing pattern is where the leverage is. One sprint of low velocity is noise. Three consecutive sprints of declining velocity is a structural problem, and the conversation it requires is fundamentally different.
The tag logic runs on a rolling window. A metric crossing a threshold for the first time gets new since last 1:1. If the same metric was flagged in the previous brief and remains flagged, it escalates to continuing pattern. When a previously flagged metric returns to normal, it gets resolved — and the brief surfaces it explicitly, so the recovery actually gets acknowledged in the meeting.
This three-state model stops two specific failure modes. First, you stop re-litigating something already discussed and dismissed as a temporary blip. Second, you stop letting a temporary blip quietly become the new baseline. A metric that stays flagged for three or more briefs in a row escalates to a fourth tag: persistent trend, structural intervention required. At that point the conversation is not about the metric. It is about the system producing the metric.
| Condition | Tag | Severity | What the Conversation Becomes |
|---|---|---|---|
| Metric crosses threshold for the first time | New since last 1:1 | Watch | Acknowledge, get the engineer's read on the cause |
| Metric flagged last 1:1, still flagged | Continuing pattern | Action | Five to ten minutes on root cause, not symptoms |
| Metric flagged 3+ briefs in a row | Persistent trend | Critical | Structural change required — staffing, scope, ownership |
| Previously flagged metric back to normal | Resolved | Info | Acknowledge the recovery — recovery is also data |
| Metric within range, no prior flag | No tag | None | Drops out of the brief entirely |
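The table above reads naturally as a small state machine keyed on how many briefs in a row a metric has been flagged. The sketch below is one way to express it, assuming a per-metric counter is stored alongside each snapshot. `FlagHistory` and `nextTag` are hypothetical names, and the escalation point (third consecutive flagged brief) follows the table.

```typescript
type BriefTag =
  | 'new-since-last'
  | 'continuing-pattern'
  | 'persistent-trend'
  | 'resolved'
  | null; // null means the metric drops out of the brief

interface FlagHistory {
  // Number of consecutive briefs, up to and including the last one,
  // in which this metric was flagged.
  consecutiveFlags: number;
}

interface TagDecision {
  tag: BriefTag;
  history: FlagHistory; // updated history to store with the new snapshot
}

function nextTag(history: FlagHistory, flaggedNow: boolean): TagDecision {
  if (!flaggedNow) {
    // Back in range: surface the recovery once, then go silent.
    return {
      tag: history.consecutiveFlags > 0 ? 'resolved' : null,
      history: { consecutiveFlags: 0 },
    };
  }
  const streak = history.consecutiveFlags + 1;
  const tag: BriefTag =
    streak === 1
      ? 'new-since-last'
      : streak === 2
        ? 'continuing-pattern'
        : 'persistent-trend';
  return { tag, history: { consecutiveFlags: streak } };
}
```

Keeping the streak as a counter rather than re-deriving it from raw metrics means the tag depends only on what earlier briefs actually flagged, which is what the two parties saw and discussed.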
Wiring the Sources Without Inventing a New Platform
Every data source already has an API. The work is reconciliation, not collection.
Jira Integration Points
- ✓
Sprint velocity from completed story points per sprint, three-sprint window for trend[6]
- ✓
Blocked count read from status field, filtered by assignee — not by label
- ✓
Scope churn from stories added after sprint start vs original commitment
- ✓
Cycle time median from In-Progress to Done transition timestamps
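Two of the Jira metrics above, scope churn and cycle time median, are simple arithmetic once the sprint's issues have been fetched. The sketch below works on already-fetched records and makes no Jira API calls. The `SprintIssue` shape and function names are assumptions for illustration, and churn is computed here as points added after sprint start divided by originally committed points.

```typescript
// Hypothetical, already-fetched issue record; field names are illustrative.
interface SprintIssue {
  key: string;
  storyPoints: number;
  addedToSprintAt: string; // ISO 8601
  inProgressAt?: string;   // ISO 8601, when it entered In-Progress
  doneAt?: string;         // ISO 8601, when it entered Done
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function median(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Points added after sprint start, as a fraction of the original commitment.
function scopeChurn(issues: SprintIssue[], sprintStart: string): number | null {
  const start = Date.parse(sprintStart);
  let committed = 0;
  let added = 0;
  for (const issue of issues) {
    if (Date.parse(issue.addedToSprintAt) > start) added += issue.storyPoints;
    else committed += issue.storyPoints;
  }
  return committed === 0 ? null : added / committed;
}

// Median days from In-Progress to Done, over issues that completed both transitions.
function cycleTimeMedianDays(issues: SprintIssue[]): number | null {
  const durations: number[] = [];
  for (const issue of issues) {
    if (issue.inProgressAt && issue.doneAt) {
      durations.push(
        (Date.parse(issue.doneAt) - Date.parse(issue.inProgressAt)) / MS_PER_DAY
      );
    }
  }
  return median(durations);
}
```

Both functions return `null` when there is nothing to measure, so the brief can show "data unavailable" instead of a misleading zero.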
GitHub Integration Points
- ✓
Open PRs via GitHub API, filtered by author and org membership[5]
- ✓
Review turnaround measured from PR creation to first substantive review (not first comment)
- ✓
Stale PR threshold configurable per team — 72 hours without activity is the default
- ✓
Merge cadence tracked as 7-day and 14-day rolling averages[7]
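The status grouping and the configurable stale threshold could look like the sketch below. It assumes PR records have already been fetched and reduced to the hypothetical `PullRequest` shape shown; the 72-hour default comes from the list above, everything else is illustrative.

```typescript
type PRStatus = 'merged' | 'in-review' | 'stale';

// Hypothetical, already-fetched PR record; field names are illustrative.
interface PullRequest {
  number: number;
  mergedAt: string | null; // ISO 8601, null while the PR is open
  lastActivityAt: string;  // ISO 8601 timestamp of the latest commit, comment, or review
}

const MS_PER_HOUR = 60 * 60 * 1000;
const DEFAULT_STALE_HOURS = 72; // default from the text, configurable per team

function classifyPR(
  pr: PullRequest,
  now: Date,
  staleHours: number = DEFAULT_STALE_HOURS
): PRStatus {
  if (pr.mergedAt !== null) return 'merged';
  const idleHours = (now.getTime() - Date.parse(pr.lastActivityAt)) / MS_PER_HOUR;
  return idleHours > staleHours ? 'stale' : 'in-review';
}

function groupByStatus(
  prs: PullRequest[],
  now: Date,
  staleHours: number = DEFAULT_STALE_HOURS
): Record<PRStatus, PullRequest[]> {
  const groups: Record<PRStatus, PullRequest[]> = {
    merged: [],
    'in-review': [],
    stale: [],
  };
  for (const pr of prs) {
    groups[classifyPR(pr, now, staleHours)].push(pr);
  }
  return groups;
}
```

Passing `now` in as a parameter keeps the classification deterministic, so the same snapshot always renders the same brief.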
5/15 and Performance Data
- ✓
Latest 5/15 parsed for sentiment language and self-reported blockers
- ✓
Mood signal derived from language patterns — not from a self-reported 1-5 score
- ✓
Self-reported blockers reconciled against actual Jira blocked tickets — gaps are the signal
- ✓
Peer recognition pulled from configured Slack channels by keyword match
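Reconciling self-reported blockers against Jira can be as simple as a set difference in both directions. The sketch below assumes the 5/15 mentions tickets by key in the usual `PROJECT-123` form and that the list of blocked ticket keys has already been fetched. The regular expression and the `reconcileBlockers` helper are illustrative assumptions, not part of the original system.

```typescript
// Assumed issue-key pattern: uppercase project key, hyphen, number (e.g. PAY-101).
const ISSUE_KEY = /\b[A-Z][A-Z0-9]+-\d+\b/g;

interface BlockerGaps {
  // Blocked in Jira, but the engineer did not mention it in the 5/15.
  blockedButUnmentioned: string[];
  // Mentioned in the 5/15, but Jira does not show it as blocked.
  mentionedButNotBlocked: string[];
}

function reconcileBlockers(
  report515Text: string,
  jiraBlockedKeys: string[]
): BlockerGaps {
  const mentioned = new Set<string>(report515Text.match(ISSUE_KEY) ?? []);
  const blocked = new Set<string>(jiraBlockedKeys);
  return {
    blockedButUnmentioned: Array.from(blocked).filter((k) => !mentioned.has(k)),
    mentionedButNotBlocked: Array.from(mentioned).filter((k) => !blocked.has(k)),
  };
}
```

Either list being non-empty is a prompt for a question in the 1:1, not a conclusion: the brief reports the gap and leaves the interpretation to the conversation.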
What Runs in the 30 Minutes Before the Meeting
Cron triggers. Snapshot loads. Diff runs. Brief lands. Four steps.
- [01]
T-30 trigger fires from the calendar
```typescript
// Cron scans the calendar for upcoming 1:1s
const upcoming = await calendar.getEvents({
  timeMin: now(),
  timeMax: addMinutes(now(), 35),
  filter: 'one-on-one',
});
```
- [02]
Pull current state from every source in parallel
```typescript
// Parallel fetch: total wall-clock budget under 60 seconds
const [sprintMetrics, prActivity, latest515, kudos, openItems] =
  await Promise.all([
    jira.getSprintMetrics(reportId),
    github.getPRActivity(reportGithubHandle),
    docs.getLatest515(reportId),
    slack.getRecognition(reportId, since),
    actionItems.getOpen(meetingSeriesId),
  ]);
```
- [03]
Load the previous snapshot and run the diff
```typescript
// The snapshot is the comparison anchor, frozen at end of last meeting
const previous = await snapshots.getForMeeting(meetingSeriesId, 'previous');
// buildSnapshot is a hypothetical helper that normalizes the fetched results into a MetricSnapshot
const current = buildSnapshot(sprintMetrics, prActivity, latest515, openItems);
const deltas = detectDeltas(previous, current);
```
- [04]
Render the tagged brief and deliver to both parties
```typescript
// Same brief, two recipients, same timestamp: no asymmetry
const brief = renderBrief({
  deltas,
  snapshot: current,
  actionItems: openItems,
  recognition: kudos,
});
await deliver(brief, [managerEmail, reportEmail]);
```
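The snapshot store that anchors the comparison can start very small. The sketch below is an in-memory stand-in, assuming snapshots are plain JSON-serializable objects. `InMemorySnapshotStore`, `freeze`, and `getPrevious` are hypothetical names, with `getPrevious` playing the role of the `snapshots.getForMeeting(meetingSeriesId, 'previous')` call in the steps above. A real deployment would presumably back this with a database.

```typescript
class InMemorySnapshotStore<T> {
  private readonly bySeries = new Map<string, T[]>();

  // Call at the end of each 1:1. The snapshot is deep-copied so later
  // mutations of the caller's object cannot rewrite history.
  freeze(meetingSeriesId: string, snapshot: T): void {
    const copy = JSON.parse(JSON.stringify(snapshot)) as T;
    const history = this.bySeries.get(meetingSeriesId) ?? [];
    history.push(copy);
    this.bySeries.set(meetingSeriesId, history);
  }

  // Most recently frozen snapshot for the series, or null when none exists yet
  // (for example, before the first 1:1 of a new report).
  getPrevious(meetingSeriesId: string): T | null {
    const history = this.bySeries.get(meetingSeriesId) ?? [];
    return history.length > 0 ? history[history.length - 1] : null;
  }
}
```

Returning `null` for an unknown series, rather than throwing, gives the caller an explicit signal to produce a baseline brief instead of a delta brief.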
The Brief Becomes a Weapon Without These Three Rules
Aggregated performance data without governance is a surveillance dashboard with extra steps.
Any system that aggregates individual performance data is a surveillance system unless explicit governance says otherwise. Used carelessly, the brief turns into the thing engineers most reasonably distrust: a manager-side dossier they cannot see. Three rules keep it on the right side of that line. None of them is optional.
First, both parties get the brief at the same time. The engineer's inbox receives it the same minute the manager's does. No hidden metrics. No five-minute manager preview. The same artifact, the same timestamp. This collapses the information asymmetry that makes data feel adversarial.
Second, the brief reports what changed, not what it means. "Velocity dropped 18%" is data. "This person is underperforming" is interpretation. The brief never crosses that line. Maybe they were onboarding a teammate. Maybe the sprint was estimation-heavy. Maybe nothing happened and it is noise. Interpretation belongs in the conversation, where context lives.
Third, the brief never feeds performance reviews directly. This is a meeting prep tool, not an evaluation instrument. Write the boundary into team documentation. Reinforce it when reviews come around. The moment engineers suspect the brief is building a case, they start gaming every metric it tracks.
We learned this rule from a violation. In an early deployment a manager cited a two-week velocity dip in a mid-year review. The engineer found out. Within a month, half the team was padding Jira estimates to smooth their velocity curves. The brief's data quality on that team never recovered. The governance rules are not advice. They are what keeps the data trustworthy.
Operating Rules for the Brief
Both parties receive the brief at the same minute
No information asymmetry. The engineer reads exactly what the manager reads, on the same timestamp.
Brief data never feeds performance reviews directly
Conversation prep, not evaluation evidence. Cross this line once and the data becomes worthless.
Metrics describe what happened, not what it means
Interpretation belongs in the 1:1, where context exists. Numbers without context are accusations.
Engineers can annotate their own brief before the meeting
Direct reports add context to any flagged metric before the conversation. The brief is editable, not just delivered.
Thresholds are set by the team, not by individual managers
What counts as a significant change is a team decision. No manager runs secret thresholds against their reports.
What Three Months of Briefs Actually Changes
Self-reported outcomes from teams that ran the system past the calibration window.
How to Roll This Out Without Burning Trust
Shadow mode first. Calibration second. Production third.
Pre-Implementation Checklist
Time your current 1:1 prep for one full week — get a number, not an estimate
List the data sources you actually check today (Jira, GitHub, Slack, 5/15s)
Set initial thresholds with the team, not in isolation — secret thresholds break the rule
Wire calendar integration to detect 1:1s automatically — no manual triggers
Build the snapshot store before anything else — it is the comparison anchor
Run shadow-mode briefs for two weeks — generate but do not act on them
Share the format with direct reports before the first live brief — no surprises
Tune thresholds after week four against false-positive rates
Operating Questions
Does the brief replace 1:1 prep entirely?
No. It absorbs the data-collection step — pulling metrics, surfacing changes, flagging patterns. You still own the agenda and the framing of sensitive topics. The brief compresses the gathering phase from 20 minutes to 5. The freed time goes to thinking about what to say. The managers who get the most out of the brief read it, add two or three handwritten notes about context the data cannot see, and walk in with a clear plan.
What if an engineer feels surveilled by the brief?
Transparency is the only working antidote, and it has to be proactive. Before the first brief runs, walk each direct report through exactly what data gets pulled, where it lives, and who can read it. Share the brief simultaneously, not 30 minutes after the manager has already read it. Allow engineers to annotate flagged metrics before the meeting. Most warm to it inside three or four meetings, once the brief flags their wins as reliably as it flags concerns. The ones who do not are usually carrying history with a previous manager who weaponized data. That history is the problem, not the brief.
How does it handle engineers on multiple teams?
Scope the brief to the 1:1 relationship, not the human. The primary manager's brief shows primary-team metrics. A dotted-line manager running their own 1:1 configures a project-scoped brief that covers only the relevant work. Cross-team GitHub contributions appear in the GitHub section regardless of scope — PRs do not respect org boundaries and hiding them is dishonest. Make the scope explicit in the brief header so both parties know what work is and is not being tracked.
What happens when there is not enough data for a meaningful delta?
During onboarding or after a long gap, the system generates a baseline brief instead of a delta brief. It shows current-state metrics without change tags and explicitly notes 'delta detection activates after the next 1:1.' This stops the system from inferring trends from a single data point — one of the most common failure modes in metric-driven tooling. The same applies when a data source goes offline: the brief degrades gracefully with 'data unavailable' rather than silently dropping the section. Silent failure is worse than a missing section.
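Both fallbacks described above, the baseline brief and the "data unavailable" section, can be made explicit in the types so that neither happens silently. The sketch below is one possible shape, with all names (`SourceResult`, `settle`, `renderSection`, `chooseMode`) hypothetical. The two quoted strings come from the answer above.

```typescript
// Each source either produced a value or failed; failures are kept, not dropped.
type SourceResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Wraps one integration call so a single outage cannot fail the whole brief.
async function settle<T>(fetch: () => Promise<T>): Promise<SourceResult<T>> {
  try {
    return { ok: true, value: await fetch() };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

interface BriefSection {
  source: string;
  body: string;
}

// A failed source still gets its section, with an explicit placeholder.
function renderSection<T>(
  source: string,
  result: SourceResult<T>,
  format: (value: T) => string
): BriefSection {
  return { source, body: result.ok ? format(result.value) : 'data unavailable' };
}

type BriefMode<S> =
  | { kind: 'baseline'; note: string }
  | { kind: 'delta'; previous: S };

// No previous snapshot means no change tags: show current state only.
function chooseMode<S>(previous: S | null): BriefMode<S> {
  return previous === null
    ? { kind: 'baseline', note: 'delta detection activates after the next 1:1' }
    : { kind: 'delta', previous };
}
```

Because the baseline case is a distinct variant of `BriefMode`, the rendering code has to handle it explicitly and cannot accidentally compute a trend from a single data point.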
- [1] EM Tools: One-on-One Meetings Guide (em-tools.io)
- [2] Windmill: Best 1-on-1 Meeting Software (gowindmill.com)
- [3] Jellyfish: Jira Performance Metrics for Engineering Leaders (jellyfish.co)
- [4] Cortex: Engineering Intelligence Platforms: Definition, Benefits, Tools (cortex.io)
- [5] DX: Git Metrics at Scale (getdx.com)
- [6] Atlassian: Agile Project Management Metrics (atlassian.com)
- [7] Harness: Top 3 Sprint Metrics to Measure Developer Productivity (harness.io)