The senior engineer who knows your tax engine handed in his resignation Monday. He has 22 years at the company. His tenure covers three legacy systems, two acquired companies, and a production incident in 2009 that changed how your deployment pipeline works to this day — and nobody else was in the room for most of it. His last day is in eight weeks. Your response, almost certainly, will be to ask him to "document everything before he goes."
He won't. Not because he's unwilling. Because "document everything" is an impossible ask dressed up as a reasonable one. It has no structure, no prioritization, no format, and no audience. What he'll produce — if anything — is a 200-page Word document in a folder nobody will find, a few Confluence pages with no home, and a series of informal coffee chats that evaporate the moment he's gone. Then eighteen months later, a junior engineer will push a change to the tax engine and cause a compliance incident that takes three weeks and two contractors to debug. That's the actual cost of not doing this right.
This is the tribal knowledge extraction problem — and it's becoming a structural crisis. Between 2024 and 2030, 30.4 million baby boomers will turn 65[1]. In enterprise software, 45% of senior engineering leaders are retirement-eligible within five years[2]. The people who know how the SAP customization was built in 2003, why the nightly batch job runs at 3 AM, and which customer has a bespoke tax exemption wired into a stored procedure — they are leaving. The question is whether the knowledge leaves with them.
The "document everything" ask fails for three reasons that have nothing to do with the departing engineer's effort or goodwill.
First, it's framed as homework. You're asking someone in their final weeks — dealing with transition logistics, emotional dynamics, and probably a new job — to do unbounded, unprioritized work with no immediate payoff. Engineers who genuinely want to leave things in good shape will try. They will write until they can't write anymore, producing content that's comprehensive but not organized around what actually matters to the person who comes after them.
Second, the engineer doesn't know what's load-bearing. Most of what a 20-year veteran knows feels obvious to them. They can't reliably distinguish between "this is critical context that will save someone six months" and "this is background that anyone competent will figure out." The knowledge that's hardest to extract is the tacit knowledge — the stuff they do without thinking, the decisions they made so long ago they've forgotten they made a decision at all. Unstructured self-documentation surfaces what's salient to them, not what's valuable to the organization.
Third, there's no structure. Without a schema, a format, and a destination, the output is whatever form the engineer defaults to. Usually that's prose. Prose is not searchable by system. It's not linked to a knowledge graph. It can't be retrieved by an AI agent answering a question in 2028. It's a document in a folder.
Who to Interview: The Single-Point-of-Failure Map
Not everyone needs an interview. Find the people whose departure would cause a measurable incident.
The goal is not to capture knowledge from everyone. It's to capture the knowledge that would cause a production incident, a compliance failure, or a six-month rediscovery spiral if the person walked out tomorrow. That's a much shorter list than you think — and a much more urgent one.
Start with a bus-factor audit. For every critical system, ask: how many people understand this well enough to debug a production issue at 2 AM? If the answer is one, you have a single point of failure. If that person is within five years of retirement, you have an urgent extraction target. If they're within two years, you have an emergency.
The systems that hide the most critical tribal knowledge are usually the oldest: the core billing engine, the legacy ERP integration, the custom tax calculation module, the deployment pipeline that grew over twelve years of one-off fixes. These are rarely the systems with the best documentation. They're often the systems nobody wants to touch — which is exactly why the person who does touch them carries irreplaceable knowledge.
1. Pull the on-call rotation and incident history. Who gets called at 2 AM when the system breaks? Who has the most incident-response involvement over the past three years? Incident history is a reliable proxy for who actually knows the system — not who theoretically owns it.
2. Map systems with no documented owner or a single named owner. An undocumented system with one person's name in the CODEOWNERS file is a knowledge silo. A system whose architecture was last documented in 2017 is a knowledge silo. List them all, then cross-reference with tenure.
3. Score engineers by tenure × system criticality × successor depth. A simple scoring matrix: years of tenure on this system (1-5) × criticality of the system (1-5, based on revenue/compliance impact) × inverse of successor depth (1 if no backup exists, 0.2 if two qualified backups exist). Sort descending. The top ten are your interview list.
4. Rank by time horizon and schedule interviews in the next 60 days. Prioritize engineers who are actively planning retirement or who are within two years of eligibility. Then expand to high-SPOF engineers regardless of retirement timeline — the bus-factor problem doesn't require an imminent departure to be real.
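The scoring matrix in step 3 can be sketched in a few lines of Python. The names and ratings below are invented; tenure and criticality are the 1-5 ratings described above, and the successor factor reproduces the stated endpoints (1.0 with no backup, 0.2 with two qualified backups).

```python
def successor_factor(backups: int) -> float:
    # 0 backups -> 1.0, 1 backup -> ~0.33, 2 backups -> 0.2
    return 1.0 / (1 + 2 * backups)

def spof_score(tenure: int, criticality: int, backups: int) -> float:
    """Tenure (1-5) x criticality (1-5) x inverse successor depth."""
    return tenure * criticality * successor_factor(backups)

engineers = [
    ("steve", 5, 5, 0),  # long-tenured, critical system, no backup
    ("dana", 4, 5, 1),
    ("omar", 3, 3, 2),
]
# Sort descending; the top entries are the interview list.
ranked = sorted(engineers, key=lambda e: spof_score(*e[1:]), reverse=True)
```

A flat multiplicative score is deliberately crude; its job is triage, not precision, so resist the urge to tune weights before the first interviews are scheduled.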
The Structured Interview Format That Actually Works
One hour, one topic, twelve questions — no free-form, no improvisation
The format is non-negotiable: one hour, one topic, one interviewer, one note-taker or recorder. The questions are written before the interview. The interviewer does not improvise. The interviewee is not asked to prepare anything. The goal is to extract what they already carry in their head — not to ask them to organize it first.
Free-form interviews produce nothing extractable because the output depends entirely on what the interviewee finds memorable, not what the organization needs to capture. A structured interview anchors the conversation to specific, retrievable artifacts: decision records, runbook gaps, system gotchas, and the "why we did it that way" entries that live nowhere else. The twelve questions below are tested for this — they're designed to surface tacit knowledge by asking for specifics, not summaries.
The 12-question structured interview template
1. Walk me through this system as if I'm a capable engineer joining the team today. What do I need to know in the first week that isn't in any doc?
2. What's the one thing you would never let another engineer change in this system without asking you first — and why?
3. What's the most embarrassing failure this system has ever had, and what did you learn? What change was made as a result?
4. Who was the last person who understood this before you? What did you learn from them that isn't written down anywhere?
5. What's the runbook step that's missing from the actual runbook — the thing you do mentally that isn't in any written procedure?
6. If you got hit by a bus tomorrow, who would I call first? What would I tell them to check?
7. What decision was made before your time that still shapes how this system works today — and do you know why it was made?
8. What's the part of this system that scares you? What would you change if you had unlimited time?
9. What are the landmines — the things that look safe to change but will cause a non-obvious failure downstream?
10. Which customer, business unit, or compliance requirement has a special branch in this system that isn't obvious from the code?
11. What do you know about this system now that you wish you'd known when you started?
12. If you were writing the onboarding guide for your replacement right now, what would the first three bullet points be?
The Extraction Pipeline: Transcript to Knowledge Graph
How a one-hour interview becomes structured, retrievable institutional memory
The interview produces a transcript. The transcript is a source document — not a deliverable. What you do with it next determines whether the hour was wasted or compounded.
The extraction process turns unstructured narrative into structured artifacts: ADR-style decision records (why was this built this way), runbook patches (the undocumented step that matters), system gotcha catalog entries (the landmines the interviewee named), and org-knowledge-map entries (who knows what, traceable to a source). Each artifact has a defined schema — covered in the next section — and a defined home in your existing knowledge infrastructure.
The LLM is the extraction tool. You supply the transcript, a system prompt that describes each artifact schema, and a clear instruction to extract structured entries rather than summarize prose. This is not a summarization task. The LLM should be asked to identify discrete knowledge claims in the transcript and serialize each into the appropriate schema. Entries that are ambiguous or unverifiable should be flagged, not silently dropped. Every extracted artifact goes through human verification before it enters the live knowledge base. The LLM hallucinates institutional history. A senior engineer who wasn't in the interview needs to sign off on anything that will be treated as authoritative.
Meta's 2026 approach to tribal knowledge in large-scale data pipelines used a swarm of specialized AI agents to read every file and produce structured context files encoding tribal knowledge — achieving structured coverage for 100% of code modules, up from 5%[5]. The interview-first approach described here is the human-side complement to that kind of automated extraction.
Output Schemas: What Comes Out of the Hour
Define the artifacts before the interview, not after — the schema shapes what you extract
The LLM extraction is only as good as the schema you ask it to fill. Vague instructions produce vague output. Define the artifact types before you run the first interview — the schema is what transforms a transcript from source material into institutional memory.
Each artifact type below maps to a specific information need and a specific home in your knowledge infrastructure. The table describes the fields, destination, and ownership model for each.
| Artifact | Schema fields | Where it goes | Who maintains it |
|---|---|---|---|
| Decision Record (ADR-style)[6] | id, system, decision, context, alternatives-considered, consequences, date-approx, source-interview | ADR directory in repo / Confluence space / linked to system wiki | Current system owner; reviewed on major changes |
| Runbook Patch | system, runbook-id, step-number, gap-description, corrected-procedure, who-to-call, urgency | Appended to existing runbook or flagged in the runbook as a note | On-call rotation owner; validated on next incident |
| System Gotcha Entry | system, description, trigger-condition, failure-mode, mitigation, severity, source-interview | System-specific gotcha catalog; surfaced in IDE context and AI retrieval | System owner; reviewed quarterly or after any related incident |
| Org Knowledge Map Entry | knowledge-domain, person, depth (primary/secondary), last-verified, successor-candidates | Org knowledge map (internal; not public); used for SPOF re-audit | Engineering management; re-validated annually or on role change |
| Decision History Entry | system, question, answer, rationale, source-interview, confidence (high/medium/low) | Decision history log; searchable by system and topic | Anyone who can verify or refute; confidence updated on verification |
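As a sketch, one of these schemas (the System Gotcha Entry) can be pinned down as a Python dataclass before the first interview runs. The field names follow the table; the example values are invented.

```python
from dataclasses import dataclass, asdict

@dataclass
class SystemGotcha:
    """One 'System Gotcha Entry' per the schema table above."""
    system: str
    description: str
    trigger_condition: str
    failure_mode: str
    mitigation: str
    severity: str          # e.g. "high" / "medium" / "low"
    source_interview: str  # traceability back to the source transcript

gotcha = SystemGotcha(
    system="tax-engine",
    description="Nightly batch assumes the exemption table is pre-sorted",
    trigger_condition="Manual insert into the exemption table during business hours",
    failure_mode="Silent mis-calculation for one customer class",
    mitigation="Re-run the sort job before the 3 AM batch",
    severity="high",
    source_interview="2026-03-steve-tax-engine",
)
record = asdict(gotcha)  # serializable form for the catalog or vector store
```

Pinning the schema in code (rather than prose) also gives the LLM extraction step something concrete to validate against.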
Consent, Retention, and the Awkward Conversation
The legal and trust scaffolding you cannot skip
Recording a structured interview without explicit informed consent is both a legal risk and a trust failure. A number of US states and many EU jurisdictions require the consent of all parties to record a conversation; check the rules for your jurisdiction before the first session. Beyond the legal floor, the framing of the conversation matters as much as the consent form.
Get written consent before the first session. The consent form should specify: what is being recorded, how the transcript will be stored and for how long, who has access to the raw recording versus the extracted artifacts, and that extracted artifacts may be attributed to the interview (for traceability) but the raw recording is not shared outside the knowledge management team.
The right framing for the conversation with the departing engineer is direct: this is for the company's operational continuity, not their performance file. The output will not be used in any HR process. The recording will be held for 90 days and then deleted; the extracted artifacts will be held indefinitely as institutional documentation. Most engineers respond well to this framing — they want the knowledge to survive them. What they resist is feeling surveilled or evaluated.
Set the retention policy in writing, communicated to the engineer before recording starts. Ninety days for raw recordings is a reasonable default. Extracted artifacts are permanent documentation assets and should be treated as such — versioned, attributed, and auditable.
Anti-Patterns That Waste the Interview
Five ways to run this process and produce nothing
The 'Tell Me Everything' Interview
No topic, no questions, no time box. The output is a monologue. The interviewer can't extract structured artifacts from a monologue because the structure doesn't exist. One interview = one topic. If the engineer owns five systems, run five interviews.
The Solo Note-Taker
One person trying to simultaneously listen, follow up, and capture notes produces incomplete notes and misses follow-up opportunities. Always pair: one interviewer to drive the questions, one note-taker or a recording. If you're recording, tell the note-taker to focus on follow-up flagging, not transcription.
Asking the Engineer to Write the Docs Themselves
This is the standard approach, and it produces nothing. Writing is hard, slow, unprioritized, and doesn't happen under time pressure. The engineer's last eight weeks are already full. The interview offloads the writing burden — they talk, you extract.
The Video Archive Without Transcript
An MP4 is not institutional memory. It's a file. It doesn't appear in search results, it can't be retrieved by an AI agent, and nobody watches three hours of video when they need to debug a production issue. Transcript first; video is optional and temporary.
LLM Extraction Without Human Verification
LLMs will confidently produce plausible-sounding ADRs with fabricated history. The extraction step is a draft, not a publication. Every artifact that enters the live knowledge base needs a human to confirm it reflects actual history — ideally someone who was there, or who can test the claim against observable system behavior.
The AI Pipeline: From Transcript to Searchable Memory
The technical implementation across transcription, extraction, storage, and retrieval
The technical pipeline has four stages, each with a specific tool recommendation and a specific failure mode to guard against.
Stage 1: Transcription. Use Whisper (local or via API) or an equivalent speech-to-text service with speaker diarization enabled. Diarization assigns speaker labels — critical for distinguishing the interviewer's questions from the engineer's answers, which matters when the LLM extracts artifacts. Output format: plain text with speaker labels and timestamps. Store in your document system with access controls matching your consent policy.
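As a minimal sketch of the output format, the function below renders diarized segments into the plain text described above. The segment dicts mimic what a diarization pipeline (Whisper paired with a separate diarizer, since Whisper alone does not diarize) might emit; the exact field names here are assumptions, not a real library's output.

```python
def format_segment(seg: dict) -> str:
    """One line per segment: '[MM:SS] SPEAKER: text'."""
    minutes, seconds = divmod(int(seg["start"]), 60)
    return f"[{minutes:02d}:{seconds:02d}] {seg['speaker']}: {seg['text'].strip()}"

def format_transcript(segments: list[dict]) -> str:
    return "\n".join(format_segment(s) for s in segments)

segments = [
    {"start": 0.0, "speaker": "INTERVIEWER",
     "text": "Walk me through the tax engine."},
    {"start": 14.5, "speaker": "ENGINEER",
     "text": "The nightly batch is the part nobody touches."},
]
transcript = format_transcript(segments)
```

Keeping the transcript as labeled plain text, rather than a tool-specific format, is what lets the extraction stage treat interviewer questions and engineer answers differently.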
Stage 2: LLM extraction. Run the transcript through a structured extraction prompt against each artifact schema. The system prompt should include the schema definition, examples of well-formed and poorly-formed extractions, and explicit instructions to flag low-confidence entries rather than omit them. Use a model with a long context window — full hour-long transcripts run 8,000-15,000 tokens. GPT-4o, Claude Sonnet, or equivalent. Do not use a summarization prompt. Use an extraction prompt: the task is to find and serialize discrete knowledge claims, not to summarize what was said.
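One way to structure that extraction call, sketched with a stand-in `call_llm` function (hypothetical; substitute your actual model client). The prompt wording is illustrative, not a tested prompt.

```python
import json

EXTRACTION_PROMPT = """\
You are extracting structured knowledge artifacts from an interview transcript.
Identify each discrete knowledge claim and serialize it as a JSON object
matching one of the artifact schemas. Flag ambiguous or unverifiable claims
with "confidence": "low"; do not omit them. Do not summarize.

Example schema (gotcha): system, description, trigger_condition,
failure_mode, mitigation, severity, source_interview.

Return a JSON array only.
"""

def extract_artifacts(transcript: str, call_llm) -> list[dict]:
    """call_llm(system=..., user=...) stands in for your model client;
    it should return the model's text response."""
    raw = call_llm(system=EXTRACTION_PROMPT, user=transcript)
    artifacts = json.loads(raw)
    # These are drafts only: every entry still needs human verification
    # before it enters the live knowledge base.
    return [a for a in artifacts if isinstance(a, dict)]
```

Forcing a JSON array keeps the output machine-checkable: a response that doesn't parse is an immediate signal to retry, not something a reviewer has to notice by eye.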
Stage 3: Storage and indexing. The extracted artifacts go to their defined homes — ADR directories, runbook patches, gotcha catalogs — and are also chunked and embedded into your vector store. Knowledge graphs are emerging as the infrastructure layer for this in enterprise settings[8]: structured entities (system, person, decision) connected by typed relationships (knows, made, depends-on) provide the semantic structure that pure vector search lacks. If you're starting from scratch, a document store with a well-designed tagging schema and full-text search is sufficient. The knowledge graph comes later, when you have enough volume to justify the infrastructure investment.
Stage 4: AI retrieval integration. The extracted artifacts need to be surfaced where engineers actually work — in your AI coding assistant context, in your internal Slack AI search, in your incident response runbooks. Fortune 500 companies lose an estimated $31.5 billion annually from information that exists but can't be found[8]. The extraction work is wasted if the artifacts are indexed in a system nobody queries. Pick the retrieval surfaces first, then design the storage format to feed them.
Knowledge graveyard:
- Three-hour video archive in a Google Drive folder nobody can find
- Free-text Confluence page titled 'Steve's Notes' with no metadata
- 'Ask Steve — he knows' as the de facto runbook entry
- 50-page brain dump Word document, last opened six months after Steve left
- Exit interview notes in an HR folder, inaccessible to engineers

Knowledge base:
- Transcript with speaker labels, stored in document system, 90-day retention
- Tagged ADR set linked to the system wiki, reviewed on every major change
- Structured gotcha catalog entry: trigger, failure mode, mitigation, who to call
- Vector-indexed artifacts surfaced in AI assistant context for that system
- Knowledge graph node for Steve's expertise, linked to the three systems he owned, with verified successors tagged
What This Looks Like in Your First 90 Days
A practical ramp for getting the program from zero to running
1. Days 1-14: Run the SPOF audit and produce the interview list. Complete the bus-factor scoring across your five most critical systems. Identify your top ten extraction targets — not just the engineers who are retiring, but the engineers whose departure would cause a measurable incident regardless of timeline. Schedule ten interviews across the next twelve weeks.
2. Days 15-30: Run three pilot interviews and validate the pipeline. Select three diverse topics — ideally one legacy system, one business-rules domain, one operational procedure. Run the structured interview, transcribe, extract against schemas, verify artifacts with a second engineer. Identify what the schema needs to add or remove based on what the transcripts actually produce.
3. Days 31-60: Publish the first batch of artifacts and connect to retrieval. After verification, publish the artifacts to their defined homes. Integrate with at least one retrieval surface that engineers already use — your AI coding assistant, your internal search, your runbook system. Get feedback on retrieval quality from the engineers who would have needed to call Steve.
4. Days 61-90: Establish the quarterly cadence and program ownership. The SPOF list changes as engineers join and leave. Set a quarterly review: re-run the bus-factor scoring, add new engineers who have become high-SPOF holders, and schedule interviews for the next quarter. Assign a program owner — typically the senior-most engineering lead or an internal knowledge management function if you have one.
> We spent two years building a video archive of every departing senior engineer. We had 40 hours of recordings and nobody had watched any of them. When we finally switched to structured interviews with LLM extraction, we produced more usable runbook content in four weeks than we had in two years of video. The difference was the schema — it forced us to be specific about what we actually needed before we sat down with the engineer.
Common Questions
Practical answers to what actually comes up when you try to run this
What if the senior engineer refuses to be interviewed?
This is rare when the framing is right — most engineers want their knowledge to survive them and respond well to a structured, time-bounded request. If there's resistance, don't push on the recording. Offer a note-based session instead: the interviewer asks the questions and takes notes, no recording involved. You'll capture 60-70% of what a recorded interview would produce. The refusal is often about surveillance anxiety, not about withholding knowledge.
How do we handle proprietary or customer-sensitive details in the transcript?
Design the interview to avoid specific customer data at the raw fact level — you want the structural knowledge ("there is a special tax exemption for one customer class") not the identifying detail ("Customer X has a $2.4M exemption"). If sensitive details do surface in a transcript, redact them before the transcript enters your document system. The extraction prompt can include instructions to flag and omit specific customer names or PII rather than include them in artifacts.
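A redaction pre-pass can catch the obvious cases before the transcript enters the document system. The patterns below are illustrative only, not a complete PII strategy; pair any automated pass with human review.

```python
import re

# Illustrative patterns: dollar figures and named customers.
PATTERNS = [
    (re.compile(r"\$[\d,.]+[MKB]?"), "[AMOUNT]"),
    (re.compile(r"\bCustomer [A-Z]\w*\b"), "[CUSTOMER]"),
]

def redact(text: str) -> str:
    """Replace each sensitive match with a structural placeholder,
    preserving the shape of the claim without the identifying detail."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

masked = redact("Customer Xylon has a $2.4M exemption.")
```

The placeholders keep the structural knowledge ("some customer has an exemption of some size") while stripping the detail, which is exactly the distinction the interview design aims for.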
Should the interview be done by a manager or a peer?
A peer or a senior engineer from an adjacent team typically produces better results than a direct manager. The interviewee is more likely to share failure stories, regrets, and uncomfortable truths with someone who won't evaluate their performance. The interviewer should be technically capable enough to recognize when an answer is incomplete or evasive, and to follow up on the right details. A non-technical interviewer misses too much.
What's the right number of interviews per quarter?
Two to four per quarter is a sustainable cadence for most engineering organizations once the initial backlog is cleared. The backlog — covering your highest-SPOF engineers — may require eight to twelve interviews in the first quarter. After that, the program maintains itself by catching new high-SPOF holders as they emerge from the quarterly audit and by running exit interviews for any departing senior engineers using the same structured format.
How do we keep the captured knowledge fresh after the interview?
The artifacts need ownership and a review trigger. Every ADR and runbook patch should have an owner who is notified when the system it describes changes significantly. A simple CI hook that flags relevant ADRs when a file in their scope is modified is often enough. Gotcha catalog entries should be reviewed after any related incident — the incident may confirm the gotcha, expand it, or render it obsolete. Set a hard staleness threshold: any artifact unreviewed for 18 months is flagged as potentially stale and needs re-verification before it's surfaced in AI retrieval.
Tribal Knowledge Extraction Program Checklist
- Completed bus-factor scoring across all critical systems
- Ranked interview queue with top 10 extraction targets identified
- Written consent form drafted and reviewed for your jurisdiction
- Recording retention policy documented and communicated
- Artifact schemas defined for all five artifact types before first interview
- Structured 12-question template finalized and shared with interviewers
- Transcription pipeline tested with a sample recording
- LLM extraction prompt tested against a pilot transcript
- Human verification process assigned (who reviews, what threshold triggers rejection)
- Artifact storage homes defined (ADR directory, runbook system, gotcha catalog)
- At least one retrieval surface connected (AI assistant context, internal search, runbook system)
- Quarterly SPOF review scheduled and program owner named
The demographic cliff has a date. Between 2024 and 2030, 30.4 million baby boomers will turn 65[1]. In enterprise software specifically, the SPOF concentration is acute: the engineers who have lived through the longest-running systems, the ones who were in the room when the decisions were made, the ones the team calls at 2 AM — they are within striking distance of exit. The cost of doing nothing is paid in incidents, rediscovery spirals, and the quiet realization that nobody knows why the deployment script runs at 3 AM.
The cost of doing something structured is two hours per interview, a working extraction pipeline, and a program owner with a quarterly calendar entry. It is not a large investment relative to what it protects. The difference between a knowledge graveyard and a knowledge base is not effort — it's structure. Build the schema first. Then run the interview.
- [1] Protected Income: "Peak 65" — between 2024 and 2030, 30.4 million baby boomers will turn 65 (protectedincome.org)
- [2] Enterprise Knowledge: "Navigating the Retirement Cliff: Challenges and Strategies for Knowledge Capture and Succession Planning" (enterprise-knowledge.com)
- [3] KS-Agents: "Strategic Analysis of Knowledge Loss: Measuring the Cost of Employee Turnover" (ks-agents.com)
- [4] Keevee: "27 Succession Planning Statistics for 2025" — only 35% of companies have formal succession plans (keevee.com)
- [5] Engineering at Meta: "How Meta Used AI to Map Tribal Knowledge in Large-Scale Data Pipelines," April 2026 (engineering.fb.com)
- [6] ADR GitHub: "Architectural Decision Records" — the community standard for capturing decision context (adr.github.io)
- [7] AWS Architecture Blog: "Master Architecture Decision Records: Best Practices for Effective Decision-Making" (aws.amazon.com)
- [8] Glean: "How Knowledge Graphs Work and Why They Are the Key to Context for Enterprise AI" (glean.com)