The senior engineer who keeps your tax engine running handed in his resignation Monday. Twenty-two years in the company. His head is the only place that holds the wiring for three legacy systems, two acquired-company integrations, and a 2009 production incident that still shapes how your deployment pipeline works — nobody else was in the room for most of it. Last day in eight weeks. The reflex from leadership is already forming: ask him to "document everything before he goes."
He won't. Not because he is unwilling. Because "document everything" is an impossible ask wearing the costume of a reasonable one. No structure, no prioritization, no format, no audience. What it produces — at best — is a 200-page Word document in a folder nobody can find, a handful of orphaned Confluence pages, and a few coffee chats that evaporate the moment he badges out. Eighteen months later, a junior engineer pushes a change to the tax engine and causes a compliance incident that takes three weeks and two contractors to unwind. That re-discovery bill is the actual cost of skipping this work.
The failure mode is structural, not motivational. Between 2024 and 2030, 30.4 million baby boomers will turn 65[1]. In enterprise software, 45% of senior engineering leaders are retirement-eligible inside five years[2]. The people who know why the SAP customization was bolted on in 2003, why the nightly batch runs at 3 AM, and which customer has a bespoke tax exemption wired into a stored procedure are walking out. The only question is whether the knowledge leaves with them or stays as something the AI can retrieve.
The ask fails for three reasons. None of them are about effort or goodwill.
The framing is homework. You hand someone in their final eight weeks — already negotiating transition logistics, an emotional exit, and probably a new job — an unbounded, unprioritized task with no payoff they will see. Engineers who want to leave things clean will try. They write until they cannot write anymore, producing pages that are comprehensive without being organized around the person who comes after them. Effort is not the bottleneck. Structure is.
The engineer cannot see what is load-bearing. Most of what a twenty-year veteran knows feels obvious to him. He cannot reliably tell the difference between "this is critical context that saves the next owner six months" and "this is background any competent engineer figures out in a week." The hardest knowledge to extract is tacit: the moves he makes without thinking, the decisions he made so long ago he has forgotten they were decisions. Self-documentation surfaces what is salient to him, not what is load-bearing for the system.
There is no schema, so there is no destination. Without a defined artifact type, the output is whatever shape the engineer defaults to. Usually prose. Untyped prose carries no system tag to filter on, no link into a knowledge graph, and no metadata telling an AI agent answering a question in 2028 which passage is authoritative. It is a document in a folder, slowly aging into shelfware.
Interview the People Whose Exit Causes an Incident. That's a Short List.
Not everyone needs an hour. Find the bus-factor-one humans first — the ones whose departure shows up in the postmortem.
The goal is not capture from everyone. It is capture from the people whose exit produces a production incident, a compliance miss, or a six-month re-discovery spiral. That list is shorter than you think — and far more urgent than headcount data will ever flag.
Start with a bus-factor audit. For every critical system, the question is one line: how many people can debug a production issue in this system at 2 AM without calling someone? If the answer is one, that is a single point of failure. If that one person is within five years of retirement, the extraction is urgent. Within two, it is an emergency.
The systems that hide the most load-bearing tribal knowledge are almost always the oldest: the core billing engine, the legacy ERP integration, the custom tax module, the deployment pipeline that accreted twelve years of one-off fixes. These rarely have the best documentation. They are usually the systems nobody volunteers to touch — which is precisely why the person who does carries knowledge that has no second copy.
- [01] Pull the on-call rotation and the incident history
  Who actually picks up the page at 2 AM? Who has resolved the most P0/P1 incidents in the past three years without escalating? Incident data tells you who knows the system. Org charts tell you who owns it on paper. Treat the page log as the source of truth.
- [02] Map every system with no documented owner or one named owner
  An undocumented system with one name in CODEOWNERS is a knowledge silo. A system whose architecture diagram was last updated in 2017 is a knowledge silo. List them all, then cross-reference against tenure. The intersection is your extraction backlog.
- [03] Score engineers by tenure × system criticality × successor depth
  Simple matrix. Years of tenure on this system (1-5) × business criticality (1-5, weighted by revenue and compliance blast radius) × inverse successor depth (1 when no backup exists, 0.2 when two qualified backups exist). Sort descending. Top ten is your queue.
- [04] Rank by time horizon and put interviews on the calendar inside 60 days
  Prioritize engineers actively planning retirement or inside a two-year eligibility window. Then expand to high-SPOF engineers regardless of timeline: bus-factor risk does not need an imminent exit to be real. The audit is what makes the risk visible. The schedule is what converts visibility into capture.
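The scoring in step 03 is simple enough to sketch. This is a minimal illustration, not a prescribed tool: the `Candidate` type, the sample names, and the 0.5 factor for exactly one backup are assumptions, since the text only fixes the two endpoints (1 with no backup, 0.2 with two).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    engineer: str
    system: str
    tenure: int       # 1-5, tenure on this system
    criticality: int  # 1-5, revenue and compliance blast radius
    backups: int      # number of qualified successors


def successor_factor(backups: int) -> float:
    """Inverse successor depth: 1.0 with no backup, 0.2 with two or more.

    The 0.5 value for exactly one backup is an assumption.
    """
    if backups <= 0:
        return 1.0
    if backups == 1:
        return 0.5
    return 0.2


def spof_score(c: Candidate) -> float:
    return c.tenure * c.criticality * successor_factor(c.backups)


def extraction_queue(candidates, top_n=10):
    """Sort descending by score and keep the top of the list."""
    return sorted(candidates, key=spof_score, reverse=True)[:top_n]


# Illustrative data only.
candidates = [
    Candidate("lee", "deploy-pipeline", tenure=3, criticality=4, backups=2),
    Candidate("steve", "tax-engine", tenure=5, criticality=5, backups=0),
    Candidate("dana", "billing-core", tenure=4, criticality=5, backups=1),
]
queue = extraction_queue(candidates)
for c in queue:
    print(f"{spof_score(c):5.1f}  {c.engineer:6}  {c.system}")
```

The multiplication is deliberate: a long-tenured engineer on a critical system still scores low once two qualified backups exist, which keeps the queue focused on knowledge with no second copy.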
One Hour. One Topic. Twelve Questions. No Improvisation.
Free-form interviews produce nothing extractable. A structured hour anchors the transcript to retrievable artifacts.
The format is non-negotiable. One hour. One topic. One interviewer. One note-taker or recorder. The questions are written before the session. The interviewer does not improvise. The interviewee is not asked to prepare. The objective is to extract what is already in his head — not to ask him to pre-organize it.
Free-form sessions produce unextractable output because the shape of the conversation tracks what the interviewee finds memorable, not what the organization needs to capture. A structured hour anchors the transcript to specific retrievable artifacts: decision records, runbook gaps, system gotchas, and the "why we did it that way" entries that live nowhere else. The twelve questions below are designed to surface tacit knowledge by demanding specifics — never summaries.
The 12-question structured interview template
1. Walk me through this system as if I'm a capable engineer joining the team today. What do I need to know in week one that is not in any doc?
2. What is the one thing you would never let another engineer change in this system without asking you first, and what is the failure mode if they do?
3. What is the most embarrassing failure this system has ever had? What did you learn? What change was made as a result?
4. Who was the last person who understood this before you? What did you learn from him that is not written down anywhere?
5. What is the runbook step missing from the actual runbook: the move you make mentally that is not in any written procedure?
6. If you got hit by a bus tomorrow, who would I call first? What would I tell them to check before anything else?
7. What decision was made before your time that still shapes how this system works today, and do you know why it was made?
8. What part of this system scares you? What would you change if you had unlimited time and zero blast radius?
9. Where are the landmines: changes that look safe and trigger a non-obvious failure downstream?
10. Which customer, business unit, or compliance requirement has a special branch in this system that is not obvious from the code?
11. What did you know six months ago about this system that you wish you had known on day one?
12. If you were writing the onboarding guide for your replacement right now, what would the first three bullet points be?
The Transcript Is a Source Document, Not the Deliverable
What you do with the hour after it ends decides whether the time was wasted or compounded into memory.
The interview produces a transcript. The transcript is raw source, not the deliverable. The extraction step that follows is what turns it into something retrievable.
Extraction turns unstructured narrative into typed artifacts: ADR-style decision records (why this was built this way), runbook patches (the undocumented step the on-call actually executes), system gotcha entries (the landmines the interviewee named), org-knowledge-map entries (who knows what, traceable to a source), and decision-history entries (question-and-answer pairs with a confidence rating). Each artifact has a defined schema, covered in the next section, and a defined home in your existing knowledge infrastructure. No schema, no home, no extraction.
The LLM is the extraction tool, not the author. You supply the transcript, a system prompt with each artifact schema and a few well-formed and poorly-formed examples, and an explicit instruction to identify discrete knowledge claims and serialize each into the right schema. This is not summarization. Ambiguous or unverifiable claims get flagged with a confidence score — never silently dropped. Every artifact passes through human verification before it enters the live knowledge base. LLMs hallucinate institutional history with full confidence. A second senior engineer signs off on anything that will be treated as authoritative.
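A sketch of what that looks like around the model call, under stated assumptions. No model is called here. `build_system_prompt`, `triage`, the `type` key, and the sample output are hypothetical names for illustration, and the two schemas use the field names from the artifact table in this article.

```python
import json

# Two of the artifact schemas, keyed by an assumed "type" discriminator.
SCHEMAS = {
    "gotcha": ["system", "description", "trigger-condition", "failure-mode",
               "mitigation", "severity", "source-interview"],
    "runbook_patch": ["system", "runbook-id", "step-number", "gap-description",
                      "corrected-procedure", "who-to-call", "urgency"],
}


def build_system_prompt(schemas, good_examples, bad_examples):
    """Assemble extraction instructions. This is not a summarization prompt."""
    return "\n".join([
        "Identify discrete knowledge claims in the transcript.",
        "Serialize each claim as one JSON object matching exactly one schema.",
        "Do not summarize and do not merge separate claims.",
        "Add a 'confidence' field: high, medium, or low.",
        "Mark ambiguous or unverifiable claims as low. Never omit them.",
        "Schemas: " + json.dumps(schemas),
        "Well-formed examples: " + json.dumps(good_examples),
        "Poorly-formed examples (do not imitate): " + json.dumps(bad_examples),
    ])


def triage(raw_model_output, schemas):
    """Split model output into clean drafts and flagged items.

    Both lists still go to a human reviewer; flagged items carry a reason.
    """
    drafts, flagged = [], []
    for item in json.loads(raw_model_output):
        required = schemas.get(item.get("type"))
        if required is None:
            flagged.append((item, "unknown artifact type"))
            continue
        missing = [f for f in required if f not in item]
        if missing:
            flagged.append((item, "missing fields: " + ", ".join(missing)))
        elif item.get("confidence", "low") == "low":
            flagged.append((item, "low confidence"))
        else:
            drafts.append(item)
    return drafts, flagged


prompt = build_system_prompt(SCHEMAS, good_examples=[], bad_examples=[])

# Hypothetical model response, invented for this example.
model_output = json.dumps([
    {"type": "gotcha", "system": "tax-engine",
     "description": "Rate table reload is not atomic",
     "trigger-condition": "Deploy during the nightly batch",
     "failure-mode": "Partial rates applied to invoices",
     "mitigation": "Deploy only after the batch completes",
     "severity": "high", "source-interview": "interview-001",
     "confidence": "high"},
    {"type": "gotcha", "system": "tax-engine",
     "description": "Possibly a customer-specific exemption branch",
     "confidence": "low"},
])
drafts, flagged = triage(model_output, SCHEMAS)
print(len(drafts), "draft(s),", len(flagged), "flagged")
```

Nothing in `triage` publishes anything. Its only job is to make sure an incomplete or low-confidence claim reaches the reviewer labeled as such instead of disappearing.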
Meta's 2026 work on tribal knowledge in large-scale data pipelines used a swarm of specialized agents to read every file and emit structured context files — coverage went from 5% to 100% of code modules[5]. The interview-first approach here is the human-side complement to that kind of automated extraction. Code-side and human-side feed the same retrieval surface.
Define the Schema Before the Interview. The Schema Shapes What You Extract.
Vague extraction prompts produce vague artifacts. Decide the artifact types up front — they are the only reason the hour compounds.
Extraction is bounded by the schema you give the LLM. A claim with no field to land in gets dropped or smeared into free text, so the artifact types have to exist before the first interview. The schema is what converts a transcript from raw text into a retrievable asset.
Each artifact below maps to a specific information need and a specific home in your existing knowledge infrastructure. Fields, destination, and ownership are defined up front. No artifact without an owner. No artifact without a home.
| Artifact | Schema fields | Destination | Owner |
|---|---|---|---|
| Decision Record (ADR-style)[6] | id, system, decision, context, alternatives-considered, consequences, date-approx, source-interview | ADR directory in the repo or a Confluence space, linked from the system wiki | Current system owner. Re-reviewed on every major change. |
| Runbook Patch | system, runbook-id, step-number, gap-description, corrected-procedure, who-to-call, urgency | Appended to the existing runbook or flagged inline as a note | On-call rotation owner. Validated on the next incident. |
| System Gotcha Entry | system, description, trigger-condition, failure-mode, mitigation, severity, source-interview | System-specific gotcha catalog. Surfaced in IDE context and AI retrieval. | System owner. Reviewed quarterly or after any related incident. |
| Org Knowledge Map Entry | knowledge-domain, person, depth (primary/secondary), last-verified, successor-candidates | Internal org knowledge map. Drives the next SPOF re-audit. | Engineering management. Re-validated annually or on role change. |
| Decision History Entry | system, question, answer, rationale, source-interview, confidence (high/medium/low) | Decision history log. Searchable by system and topic. | Anyone who can verify or refute. Confidence updates on verification. |
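One way to make a schema enforceable is to encode a row of the table as a type. A minimal sketch for the System Gotcha Entry follows. The field names come from the table; the `SEVERITIES` scale and the class name are assumptions, since the table names the severity field without defining its values.

```python
from dataclasses import dataclass, fields

SEVERITIES = ("low", "medium", "high")  # assumed scale, not from the table


@dataclass(frozen=True)
class GotchaEntry:
    system: str
    description: str
    trigger_condition: str
    failure_mode: str
    mitigation: str
    severity: str
    source_interview: str

    def __post_init__(self):
        # Reject entries with blank fields instead of storing half an artifact.
        blank = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if blank:
            raise ValueError("blank fields: " + ", ".join(blank))
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}")

    def to_record(self):
        """Serialize with the hyphenated field names used in the table."""
        return {f.name.replace("_", "-"): getattr(self, f.name) for f in fields(self)}


# Illustrative entry; the content is invented for the example.
entry = GotchaEntry(
    system="tax-engine",
    description="Rate table reload is not atomic",
    trigger_condition="Deploy during the nightly batch",
    failure_mode="Partial rates applied to invoices",
    mitigation="Deploy only after the batch completes",
    severity="high",
    source_interview="interview-001",
)
record = entry.to_record()
print(sorted(record))
```

The other four artifact types would follow the same pattern. The point of the validation is that an entry with no mitigation or no source interview fails loudly at creation time instead of quietly entering the catalog.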
Consent and Retention Are Not Optional Scaffolding
Recording without informed consent is a legal failure and a trust failure. The framing decides which one you get.
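Recording a structured interview without explicit informed consent is two failures in one: legal and trust. US federal law and most US states allow a recording with the consent of one party, but a sizable minority of states require consent from every party, and in the EU a recording of an identifiable employee is personal data under GDPR and needs a lawful basis and clear notice. Get explicit consent from everyone in the room and have the form reviewed for your jurisdiction. The legal floor is the easy part. The framing of the conversation is what decides whether the engineer answers honestly or hedges.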
Written consent before the first session. The form specifies: what is being recorded, how the transcript is stored and for how long, who has access to the raw audio versus the extracted artifacts, and that extracted artifacts may carry an attribution back to the interview for traceability while the raw recording stays inside the knowledge management team.
The framing for the conversation with the departing engineer is direct. This is for operational continuity, not his personnel file. The output never enters an HR process. The recording is held for 90 days and then deleted. Extracted artifacts are kept indefinitely as institutional documentation. Most engineers respond well to this framing — they want the knowledge to survive them. What they resist is the feeling of being surveilled or evaluated. Name the difference up front.
The retention policy goes in writing, communicated before recording starts. Ninety days for raw recordings is a reasonable default. Extracted artifacts are permanent documentation assets — versioned, attributed, and auditable. Treat them that way from day one.
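A written policy is easier to honor when the deletion date is computed, not remembered. A small sketch, assuming each recording is tracked as a dict with a `recorded_on` date; the function names and record shape are hypothetical.

```python
from datetime import date, timedelta

RAW_RETENTION_DAYS = 90  # the default suggested above


def deletion_due(recorded_on: date, retention_days: int = RAW_RETENTION_DAYS) -> date:
    """Date on which the raw recording must be deleted."""
    return recorded_on + timedelta(days=retention_days)


def recordings_to_delete(recordings, today: date):
    """Return recordings whose retention window has closed."""
    return [r for r in recordings if deletion_due(r["recorded_on"]) <= today]


# Illustrative data only.
recordings = [
    {"id": "interview-001", "recorded_on": date(2025, 1, 10)},
    {"id": "interview-002", "recorded_on": date(2025, 3, 3)},
]
due = recordings_to_delete(recordings, today=date(2025, 4, 15))
print([r["id"] for r in due])
```

Extracted artifacts are deliberately outside this check: only the raw recording expires.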
Five Anti-Patterns That Burn the Hour
Each one runs the process and produces nothing retrievable. Recognize them before you book the first session.
The "Tell Me Everything" Interview
No topic, no questions, no time box. Output is a monologue. The interviewer cannot extract structured artifacts from a monologue because the structure does not exist. One session, one topic. If the engineer owns five systems, run five interviews.
The Solo Note-Taker
One person trying to listen, follow up, and capture notes simultaneously produces partial notes and misses every follow-up opening. Always pair: one interviewer driving the questions, one recorder or note-taker watching for the threads to pull. If you are recording, the note-taker's job is flagging follow-ups, not transcription.
Asking the Engineer to Write the Docs Himself
This is the default approach. It produces nothing. Writing is slow, unprioritized work that does not happen under time pressure, and his last eight weeks are already full. The interview offloads the writing burden by design — he talks, you extract.
The Video Archive Without a Transcript
An MP4 is not institutional memory. It is a file. It does not appear in search results, an AI agent cannot retrieve it, and nobody watches three hours of video while a production issue is burning. Transcript first. Video is optional and temporary.
LLM Extraction Without Human Verification
LLMs produce plausible-sounding ADRs with fabricated history at full confidence. Extraction is a draft, never a publication. Every artifact entering the live knowledge base needs a human to confirm it reflects actual history — ideally someone who was in the room, or someone who can test the claim against observable system behavior.
Four Stages From Audio to Retrieval. Each One Has Its Failure Mode.
Transcription, extraction, storage, retrieval. Every stage compounds — or compounds the rot.
The technical pipeline has four stages. Each one carries a tool recommendation and a specific failure mode that kills the program if you ignore it.
One honest caveat before the stages. Most extraction programs do not stall on technical failure. They stall because the first batch of artifacts has nobody querying them. If you cannot show an engineer a concrete example of a retrieved gotcha that prevented an incident inside 90 days, organizational appetite collapses and the program quietly dies. Start with a retrieval surface that already has active users — an AI coding assistant, Slack search, the runbook viewer — not a new internal wiki nobody opens.
Stage 1: Transcription. Whisper (local or API) paired with a diarization step, or an equivalent speech-to-text service with speaker diarization built in. Whisper on its own transcribes but does not label speakers. Diarization assigns those labels, which is critical for separating the interviewer's questions from the engineer's answers once the LLM starts extracting artifacts. Output: plain text with speaker labels and timestamps. Store in your document system with access controls matching the consent policy. Failure mode: missing diarization turns the transcript into a single stream of unattributed claims.
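A sketch of the post-processing that turns diarized segments into the labeled plain text described above. The segment format (dicts with `speaker`, `start`, `text`) and the `SPEAKER_00` style labels are assumptions for illustration, not the output format of any particular tool.

```python
def merge_turns(segments):
    """Collapse consecutive segments from the same speaker into one turn."""
    turns = []
    for seg in segments:
        text = seg["text"].strip()
        if turns and turns[-1]["speaker"] == seg["speaker"]:
            turns[-1]["text"] += " " + text
        else:
            turns.append({"speaker": seg["speaker"], "start": seg["start"], "text": text})
    return turns


def render(turns, roles):
    """Render turns as '[mm:ss] role: text' lines for the extraction step."""
    lines = []
    for t in turns:
        minutes, seconds = divmod(int(t["start"]), 60)
        role = roles.get(t["speaker"], t["speaker"])
        lines.append(f"[{minutes:02d}:{seconds:02d}] {role}: {t['text']}")
    return "\n".join(lines)


# Illustrative segments, invented for the example.
segments = [
    {"speaker": "SPEAKER_00", "start": 0.0,
     "text": "What is the runbook step missing from the actual runbook?"},
    {"speaker": "SPEAKER_01", "start": 6.4,
     "text": "Before the restart I always check the batch lock."},
    {"speaker": "SPEAKER_01", "start": 11.9,
     "text": "That is not written down anywhere."},
]
turns = merge_turns(segments)
transcript = render(turns, roles={"SPEAKER_00": "interviewer", "SPEAKER_01": "engineer"})
print(transcript)
```

Mapping anonymous speaker labels to roles is a manual step in this sketch. It matters because the extraction prompt should treat only the engineer's turns as knowledge claims.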
Stage 2: LLM extraction. Run the transcript through a structured extraction prompt against each artifact schema. The system prompt carries the schema definition, well-formed and poorly-formed examples, and an explicit instruction to flag low-confidence entries rather than omit them. Use a long-context model — hour-long transcripts run 8,000-15,000 tokens. GPT-4o, Claude Sonnet, or equivalent. Never use a summarization prompt. The task is to identify discrete knowledge claims and serialize each into the right schema, not to compress what was said.
Stage 3: Storage and indexing. Extracted artifacts go to their defined homes — ADR directories, runbook patches, gotcha catalogs — and are chunked and embedded into your vector store in parallel. Knowledge graphs are emerging as the infrastructure layer for this in enterprise settings[8]: typed entities (system, person, decision) connected by typed relationships (knows, made, depends-on) carry the semantic structure pure vector search lacks. Starting from scratch, a document store with a tight tagging schema and full-text search is enough. The knowledge graph earns its weight once you have the volume to justify the infrastructure cost.
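The starting-from-scratch option, a store with tags and full-text search, can be sketched in a few lines. This is an in-memory illustration only: `ArtifactStore` and its methods are hypothetical, and a real deployment would sit on whatever document store you already run.

```python
import re
from collections import defaultdict


class ArtifactStore:
    """Toy document store: exact-match tags plus an all-words text search."""

    def __init__(self):
        self._docs = {}
        self._by_tag = defaultdict(set)
        self._by_word = defaultdict(set)

    @staticmethod
    def _words(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, artifact_id, text, tags):
        self._docs[artifact_id] = text
        for tag in tags:
            self._by_tag[tag].add(artifact_id)
        for word in set(self._words(text)):
            self._by_word[word].add(artifact_id)

    def search(self, query, tag=None):
        """Return ids of artifacts containing every query word, optionally within a tag."""
        words = self._words(query)
        if not words:
            return []
        hits = set.intersection(*(self._by_word.get(w, set()) for w in words))
        if tag is not None:
            hits &= self._by_tag.get(tag, set())
        return sorted(hits)


# Illustrative artifacts, invented for the example.
store = ArtifactStore()
store.add("gotcha-001", "Deploy during the nightly batch applies partial rates",
          tags={"system:tax-engine", "type:gotcha"})
store.add("adr-007", "Nightly batch moved to 3 AM to avoid the billing window",
          tags={"system:billing-core", "type:adr"})
print(store.search("nightly batch"))
print(store.search("nightly batch", tag="system:tax-engine"))
```

The `system:` tag is what lets a retrieval surface scope a query to the system an engineer is working in, which is the property untagged prose lacks.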
Stage 4: AI retrieval integration. Artifacts have to surface where engineers already work — inside the AI coding assistant context, inside internal Slack search, inside the incident runbook. Fortune 500 companies lose an estimated $31.5 billion a year on information that exists but cannot be found[8]. Indexing in a system nobody queries is the same as not indexing at all. Pick the retrieval surface first. Design the storage format to feed it.
Knowledge graveyard:
- Three-hour video archive in a Google Drive folder nobody can find
- Free-text Confluence page titled "Steve's Notes" with no metadata
- "Ask Steve, he knows" as the de facto runbook entry
- Fifty-page brain dump Word doc, last opened six months after Steve left
- Exit interview notes parked in an HR folder no engineer can access

Retrievable knowledge base:
- Transcript with speaker labels in the document system, 90-day retention
- Tagged ADR set linked to the system wiki, reviewed on every major change
- Structured gotcha catalog entry: trigger, failure mode, mitigation, who to call
- Vector-indexed artifacts surfaced in the AI assistant context for that system
- Knowledge graph node for Steve's expertise, linked to the three systems he owned, with verified successors tagged
Ninety Days From Zero to Running
Treat the first quarter as a sprint with a hard end date — not an initiative that can always be deprioritized.
- [01] Days 1-14: Run the SPOF audit. Produce the interview queue.
  Complete bus-factor scoring across your five most critical systems. Identify the top ten extraction targets: not just the engineers retiring, but every engineer whose exit would produce a measurable incident regardless of timeline. Schedule ten interviews across the next twelve weeks. The ranked queue is the deliverable. Without it, the program drifts.
- [02] Days 15-30: Run three pilot interviews. Validate the pipeline.
  Pick three diverse topics: one legacy system, one business-rules domain, one operational procedure. Run the structured hour, transcribe, extract against the schemas, verify artifacts with a second engineer. The pilot tells you what to add or cut from the schema before the schema becomes load-bearing.
- [03] Days 31-60: Ship the first batch. Wire it into retrieval.
  After verification, publish artifacts to their defined homes. Integrate with at least one retrieval surface engineers already use: the AI coding assistant, internal search, the runbook system. Get retrieval feedback from the engineers who would otherwise have called Steve. The first concrete "the AI surfaced the gotcha and the change was halted" story is what buys the program a second quarter.
- [04] Days 61-90: Establish the quarterly cadence. Name the owner.
  The SPOF list moves as engineers join and leave. Set a quarterly review: re-run bus-factor scoring, add new high-SPOF holders, schedule the next quarter's interviews. Assign a program owner, typically the senior-most engineering lead or an internal knowledge management function. Drift is the default state of any system without an explicit owner.
What Actually Comes Up When You Try to Run This
Operating doctrine for the questions that surface once the program is real, not theoretical.
What if the senior engineer refuses to be interviewed?
Rare when the framing is right. Most engineers want their knowledge to survive them and respond well to a structured, time-bounded request. If there is resistance, do not push on the recording. Offer a note-based session: the interviewer asks the questions and takes notes, no audio captured. You will get 60-70% of what a recorded session produces. The refusal is almost always about surveillance anxiety, not about withholding knowledge. Name that distinction directly and the conversation usually opens.
How do we handle proprietary or customer-sensitive details in the transcript?
Design the interview to capture structural knowledge, not identifying detail. "There is a special tax exemption for one customer class" is what you want. "Customer X has a $2.4M exemption" is what you redact. If sensitive details surface in the recording, scrub them before the transcript enters your document system. The extraction prompt can carry an explicit instruction to flag and omit specific customer names and PII rather than fold them into artifacts. Treat redaction as a pipeline stage, not a one-time pass.
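Redaction as a pipeline stage can start as something this small. A sketch, assuming you maintain a list of customer names to scrub; the `redact` function, the placeholder format, and the dollar-amount pattern are illustrative and will not catch every form of PII.

```python
import re

# Dollar amounts such as "$2.4M", "$1,200", or "$3 million".
MONEY = re.compile(
    r"\$\s?\d[\d,]*(?:\.\d+)?(?:\s?(?:[KMB]|million|billion)\b)?",
    re.IGNORECASE,
)


def redact(text, customer_names):
    """Replace known customer names and dollar amounts with placeholders.

    Returns the scrubbed text plus a list of (kind, original) findings so a
    reviewer can audit what was removed.
    """
    findings = []

    def mark(kind):
        def replace(match):
            findings.append((kind, match.group(0)))
            return f"[REDACTED-{kind}]"
        return replace

    if customer_names:
        names = re.compile(
            r"\b(?:" + "|".join(re.escape(n) for n in customer_names) + r")\b"
        )
        text = names.sub(mark("CUSTOMER"), text)
    text = MONEY.sub(mark("AMOUNT"), text)
    return text, findings


scrubbed, findings = redact(
    "Customer X has a $2.4M exemption wired into a stored procedure.",
    customer_names=["Customer X"],
)
print(scrubbed)
```

Run it on the transcript before storage and again on extracted artifacts. The findings list is what makes the stage auditable instead of a silent filter.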
Should the interview be done by a manager or a peer?
A peer or a senior engineer from an adjacent team produces better results than a direct manager. The interviewee shares failure stories, regrets, and uncomfortable truths more readily with someone who is not on the performance-review path. The interviewer needs the technical depth to recognize when an answer is incomplete or evasive and to follow up on the right detail. A non-technical interviewer misses too much. The reporting line shapes what gets said. Treat the choice of interviewer as a design decision, not a calendaring detail.
What is the right number of interviews per quarter?
Two to four per quarter is a sustainable cadence for most engineering organizations once the initial backlog clears. The backlog itself — your highest-SPOF engineers — usually demands eight to twelve interviews in the first quarter. After that, the program maintains itself by catching new high-SPOF holders surfaced in the quarterly audit and by running exit interviews on any departing senior engineer using the same structured format. Across pilots in twelve organizations, the teams that succeeded treated the first quarter as a sprint with a hard end date. The teams that treated it as an ongoing initiative deprioritized it inside a quarter.
How do we keep the captured knowledge fresh after the interview?
Artifacts need ownership and a review trigger. Every ADR and runbook patch carries an owner who gets notified when the system it describes changes significantly. A simple CI hook that flags relevant ADRs when a file in their scope is modified is usually enough. Gotcha entries are reviewed after any related incident — the incident either confirms the gotcha, expands it, or renders it obsolete. Set a hard staleness threshold: any artifact unreviewed for 18 months is flagged stale and re-verified before it surfaces in AI retrieval. Drift is the default state. The review trigger is the only thing that reverses it.
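The CI hook and the staleness threshold are both a few lines of logic. A sketch under stated assumptions: each ADR declares the paths it covers as glob patterns, the changed-file list comes from something like `git diff --name-only`, and 18 months is approximated as 548 days. The function names and the scope mapping are hypothetical.

```python
from datetime import date
from fnmatch import fnmatchcase

STALE_AFTER_DAYS = 548  # roughly 18 months


def adrs_touched(changed_files, adr_scopes):
    """Map each ADR id to the changed files that fall inside its scope."""
    flagged = {}
    for adr_id, patterns in adr_scopes.items():
        hits = [f for f in changed_files
                if any(fnmatchcase(f, p) for p in patterns)]
        if hits:
            flagged[adr_id] = hits
    return flagged


def is_stale(last_reviewed: date, today: date,
             max_age_days: int = STALE_AFTER_DAYS) -> bool:
    """True when the artifact has gone unreviewed past the threshold."""
    return (today - last_reviewed).days > max_age_days


# Illustrative scopes and change set.
adr_scopes = {
    "ADR-0007": ["services/tax/*"],
    "ADR-0012": ["deploy/*", "scripts/release.sh"],
}
changed = ["services/tax/rates.py", "README.md"]
touched = adrs_touched(changed, adr_scopes)
print(touched)
print(is_stale(date(2023, 1, 1), today=date(2025, 1, 1)))
```

In CI, a non-empty result would notify the ADR owner or add a review comment. It should not block the merge by default, or engineers will learn to route around it.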
Pre-Flight Checklist for the Extraction Program
- Bus-factor scoring complete across every critical system
- Ranked interview queue with the top 10 extraction targets identified
- Written consent form drafted and reviewed for your jurisdiction
- Retention policy documented and communicated to the interviewee in writing
- All five artifact schemas defined before the first interview is run
- Structured 12-question template finalized and shared with interviewers
- Transcription pipeline tested end-to-end on a sample recording
- LLM extraction prompt tested against a pilot transcript and reviewed
- Human verification process assigned: named reviewers, defined rejection threshold
- Artifact storage homes defined (ADR directory, runbook system, gotcha catalog)
- At least one retrieval surface wired in (AI assistant context, internal search, runbook system)
- Quarterly SPOF review on the calendar, program owner named
The exit ramp is dated. Between now and 2030, 30.4 million baby boomers will turn 65[1]. In enterprise software, the SPOF concentration is acute: the engineers who lived through the longest-running systems, sat in the room when the decisions were made, and still pick up the 2 AM page — every one of them is inside striking distance of exit. The bill for doing nothing is paid in incidents, re-discovery spirals, and the quiet morning when nobody on the team can explain why the deployment script runs at 3 AM.
The bill for doing something structured is two hours per interview, a working extraction pipeline, and a program owner with a quarterly calendar entry. Small relative to what it protects. The difference between a knowledge graveyard and a retrievable knowledge base is not effort. It is structure. Build the schema first. Then run the interview.
- [1] Protected Income: Peak 65. Between 2024 and 2030, 30.4 million baby boomers will turn 65 (protectedincome.org)
- [2] Enterprise Knowledge: Navigating the Retirement Cliff: Challenges and Strategies for Knowledge Capture and Succession Planning (enterprise-knowledge.com)
- [3] KS-Agents: Strategic Analysis of Knowledge Loss: Measuring the Cost of Employee Turnover (ks-agents.com)
- [4] Keevee: 27 Succession Planning Statistics for 2025. Only 35% of companies have formal succession plans (keevee.com)
- [5] Engineering at Meta: How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines, April 2026 (engineering.fb.com)
- [6] ADR GitHub: Architectural Decision Records, the community standard for capturing decision context (adr.github.io)
- [7] AWS Architecture Blog: Master Architecture Decision Records: Best Practices for Effective Decision-Making (aws.amazon.com)
- [8] Glean: How Knowledge Graphs Work and Why They Are the Key to Context for Enterprise AI (glean.com)