
The People-First AI Transformation Playbook: Why Enablement Precedes Automation

A reference-grade playbook for sequencing AI transformation: the Enablement Substrate Model, a scored readiness rubric, and role-differentiated adoption patterns that hold up after the mandate memo fades.

Strategy & Operating Model · Intermediate · Apr 4, 2026 · 18 min read
By Viktor Bezdek · VP Engineering, Groupon
[Editorial illustration: a gardener preparing soil and irrigation beds while seedlings wait on a shelf.]
The substrate is built before anything is planted. Automation grows on top — or it doesn't grow at all.

There is a farming idea that applies cleanly to AI transformation: you do not plant seeds into untilled ground and expect a harvest. You prepare the soil, install irrigation, stake supports, and only then put plants into it. If you skip those steps the seeds may sprout for a week, but nothing durable grows. Leaders reach for the seeds — models, agents, copilots, mandates — because the seeds are photogenic and the soil work isn't. The playbook that follows is about the soil work.

This piece frames AI transformation as a two-layer system. Underneath is the enablement substrate: the shared vocabulary, practice environments, role-specific playbooks, and feedback loops that make any AI tool usable by the people who actually do the work. On top of that substrate is automation: copilots, agents, workflow rewrites, and eventually the ambient AI capability that reshapes an operating model. The substrate has to exist before the automation lands. When leaders invert that order — push the automation first, hope the substrate fills in afterward — the tools sit unused, the metrics go theatrical, and the transformation stalls at a cost roughly double what the substrate work would have cost to do correctly up front.

The argument here is not that people matter more than technology. The argument is that sequencing determines whether the technology produces value at all. This playbook names the sequence, provides a rubric to locate your organization on it, and gives you the concrete artifacts — layer definitions, role-differentiated patterns, five durability principles — that teams can use to build the substrate without waiting for another reorg.

The Enablement Substrate Model

Four layers that must exist before any AI tool is pushed to production use

The Enablement Substrate is the human, process, and tool layer that must exist before AI automation can land usefully. It has four layers, each of which is a precondition for the next. They are not maturity stages; they are preconditions. A team can have all four layers thin or all four thick, but if any layer is missing, automation pushed onto it will fail in the same predictable way — the tool gets rolled out, a small number of champions use it, the rest of the organization demonstrates surface compliance, and six months later the vendor contract is quietly renegotiated.

The four layers are Literacy, Sandbox, Playbooks, and Feedback Loops. They function as a foundation, in that order, and each one answers a specific human question that must be answered before the next layer is even possible.

Prior art matters here. BCG's often-cited ratio — roughly 10% of value from algorithms, 20% from data and technology, 70% from people, processes, and culture[1] — is a useful backdrop but it flattens the people half of the equation into a single bucket. The substrate model splits that bucket into four distinct layers because each one fails in a different way and each one demands a different investment. A leadership team can nod at 'the 70 percent' and still spend the entire budget on pilots. Layers force specificity.

The Enablement Substrate Model
Four substrate layers support the automation layer. Push automation onto a missing layer and it falls through.
1. Literacy
Shared vocabulary, worked examples, and honest bounds on what current models can and cannot do — so every conversation about AI starts from the same ground.
2. Sandbox
Low-stakes practice environments with real-looking data, explicit permission to fail, and zero production reach — so people build intuition before the stakes are real.
3. Playbooks
Role-specific, task-level patterns that show a finance analyst, a support agent, or an engineer exactly what 'good use' looks like in their workflow.
4. Feedback Loops
Champions, measurement, and iteration mechanics that route real usage back into updates to the first three layers — so the substrate keeps pace with the tools.

Layer 1 — Literacy

Before you can negotiate anything, you need a shared language

Literacy is not training. Training is something organizations schedule; literacy is something they develop. The distinction matters because scheduled training produces certificates and a completion rate on a dashboard, while literacy produces the ability for two colleagues to have a precise conversation about what an AI system is actually doing and where its limits are. Most organizations have the first and call it the second.

A literate organization can do four things without hesitation. It can describe what a language model actually is in one paragraph without reaching for a metaphor that breaks under load. It can distinguish retrieval-augmented generation from fine-tuning well enough to know which one to ask for. It can name at least three concrete failure modes — hallucination, stale context, prompt injection — and give an example of each from the team's own domain. And it can tell the difference between a well-specified AI task and one that is going to waste everyone's afternoon.

The artifacts that build literacy are small and boring. A one-page glossary of twenty terms, written in the company's own voice, not copied from a vendor deck. A weekly thirty-minute 'prompt club' where practitioners show one prompt that worked and one that didn't, and the room discusses why. A short internal wiki page per common pitfall, updated whenever the team runs into it. None of these is expensive. All of them together produce a workforce that can hold its own in a conversation about AI without pretending — which is the real precondition for any layer above.

There is a negative definition worth keeping in view. Literacy is not a vendor briefing, a keynote from a conference, or a two-hour mandatory training that ends with a certificate. Those are awareness — useful at the top of the funnel, inadequate on their own. The test of literacy is local: can a product manager sit in a design review and ask the right question about grounding, or does the room go quiet whenever a practical limit comes up? Can a legal reviewer look at a prompt template and spot the place where sensitive data is being passed without redaction, or does the review become a debate about whether the tool is 'generally safe'? The answers depend on whether people have worked with the concepts in their own context, not whether they sat through a session about them.

Literacy has a half-life. Terms like 'agent' and 'tool use' and 'context' shift meaning every couple of model generations, and the glossary written last summer will read as quaint by winter if nobody maintains it. Name an owner for the glossary, give that owner a quarterly calendar reminder to review it, and accept that the literacy layer is not a one-time project but a small ongoing line item. The cost is low; the failure mode of not paying it is a workforce that quietly stops being able to talk about AI with precision at precisely the moment the tooling gets more capable.

Layer 2 — Sandbox

Intuition is not taught; it is earned through low-stakes repetition

A sandbox is a practice environment where people can do real-shaped work with AI tools on data that looks like their actual data, with no possibility of affecting production systems, and with explicit permission to produce bad output. That last clause is the operative one. Without explicit permission to fail — visible, stated, backed by manager behavior — people default to performing competence instead of building it, and the sandbox becomes a compliance exercise.

The technical shape of a sandbox is not exotic. It is a separate workspace with synthetic or masked copies of real documents, tickets, spreadsheets, and logs. It is a Slack channel where people share prompts, responses, and reactions without a quality bar. It is a monthly half-day where team leads run 'bring your own problem' sessions and model the behavior of trying three approaches and keeping the one that works. The engineering budget for a sandbox is small; the cultural budget — permission, psychological safety, time protected from other priorities — is the expensive part.

Sandboxes fail in two recognizable ways. The first is over-engineering: the platform team builds an elaborate internal tool with dashboards and RBAC and makes it too inconvenient to experiment in, and nobody does. The second is over-securing: legal insists that sandbox data be identical to production data with full controls, at which point the sandbox becomes production and the fear that kept people from experimenting in the real system transfers directly over. A healthy sandbox is slightly crude, obviously separate, and trusted by every contributor to be a place where mistakes have no consequences.

The economics of a sandbox are worth stating explicitly. The tokens consumed in a sandbox look like waste on a cost-per-seat dashboard, because each experiment produces no direct business outcome. They are not waste. Sandbox spend is capability spend; it is the line item that shortens the distance between 'our tool can do X' and 'our team can reliably get X out of the tool.' A reasonable rule of thumb is that a team new to a capability should be spending an order of magnitude more sandbox tokens than production tokens in its first quarter of use, with the ratio inverting as the Playbook layer catches up. Leaders who evaluate sandbox spend as overhead rather than capability tend to starve it first when budgets tighten, which reliably postpones the point at which the automation layer starts producing real value.
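To make that ratio auditable, here is a minimal sketch in Python; the record format, field names, and figures are hypothetical stand-ins for whatever your gateway's usage logs actually emit.

```python
# Hypothetical usage records; in practice these would come from the
# gateway's structured logs (see the durability principles below).
usage = [
    {"team": "support", "env": "sandbox",    "tokens": 1_800_000},
    {"team": "support", "env": "production", "tokens": 150_000},
]

def sandbox_ratio(records: list[dict], team: str) -> float:
    """Ratio of sandbox tokens to production tokens for one team."""
    sandbox = sum(r["tokens"] for r in records
                  if r["team"] == team and r["env"] == "sandbox")
    prod = sum(r["tokens"] for r in records
               if r["team"] == team and r["env"] == "production")
    return sandbox / max(prod, 1)

# A team in its first quarter with a new capability should land near 10x;
# as the playbook layer catches up, the ratio should fall below 1x.
print(f"support: {sandbox_ratio(usage, 'support'):.1f}x sandbox-to-production")
```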

One more structural note: the sandbox is the place where the organization's voice and the model's default voice have to meet. Every model has characteristic tics in its output — phrasing, structure, register — and part of building intuition is learning how to steer the model toward the team's own conventions rather than accepting the defaults. That steering is a learned skill; it cannot be taught from a slide. The sandbox is where people learn it.

Layer 3 — Playbooks

Role-specific patterns that turn capability into habit

A playbook is the smallest unit of reusable practice: a specific role, a specific task, a specific pattern that demonstrably works. 'Use AI for customer support' is not a playbook. 'When a ticket contains an attached log and the customer's last contract renewal was within 90 days, use this retrieval prompt to surface the three most likely causes, then escalate if the confidence label comes back low' is a playbook. The difference is operational specificity.

Playbooks are the most under-invested layer of the substrate in most organizations. Leaders assume that once tools are available and people are trained, patterns will emerge organically. They sometimes do, inside small teams with strong tacit knowledge transfer. At scale they do not — the variance in how individual contributors use a shared tool is enormous, and without explicit patterns the top-quartile users look like wizards and the bottom-quartile users look like saboteurs. Neither is accurate. What is actually happening is that the playbook layer does not exist yet.

A mature playbook collection has three properties. It is indexed by role and task, so a new joiner can find the pattern that applies to their work in under two minutes. It is living, with explicit owners who update patterns as models change and as the organization learns. And it is honest about what does not work — every playbook has a 'when not to use this' section, which is the clearest signal that the authors have actually operated the pattern rather than described it aspirationally.

A concrete example helps. Consider a finance team that wants to use AI for variance analysis on monthly P&L. The unhelpful playbook reads: 'Use AI to explain variances.' The useful playbook reads something like: 'Given the current month's actuals and the rolling three-month forecast, generate a first-draft narrative for each line item with an absolute variance over $50K or a relative variance over 8%. Use the attached prompt template; paste the ledger extract in the specified section. Review the narrative for any claim that references a driver not present in the supplied data — this is the most common failure mode. Reject the draft and re-run with a narrower scope if more than two such claims appear. Owner: Senior Finance Analyst (name); last reviewed: current quarter; when not to use: any month with a restated period, any line tied to an unannounced strategic initiative.' That playbook is five paragraphs long. It makes the difference between a team that produces useful drafts in ten minutes and a team that spends an hour arguing about what the tool is for.
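A sketch of how the trigger logic in that playbook might be encoded, assuming a flat list of line items; the class names and prompt template are illustrative, while the $50K and 8% thresholds come from the playbook text above.

```python
from dataclasses import dataclass

ABS_THRESHOLD = 50_000   # absolute variance trigger from the playbook
REL_THRESHOLD = 0.08     # relative variance trigger (8%)

@dataclass
class LineItem:
    name: str
    actual: float
    forecast: float   # rolling three-month forecast

    @property
    def variance(self) -> float:
        return self.actual - self.forecast

    def needs_narrative(self) -> bool:
        rel = abs(self.variance) / max(abs(self.forecast), 1.0)
        return abs(self.variance) > ABS_THRESHOLD or rel > REL_THRESHOLD

def draft_prompt(item: LineItem, ledger_extract: str) -> str:
    # Illustrative template; the real one lives with the playbook's
    # named owner and is versioned alongside it.
    return (
        f"Explain the variance on '{item.name}': actual {item.actual:,.0f} "
        f"vs forecast {item.forecast:,.0f}. Reference only drivers present "
        f"in the ledger extract below.\n---\n{ledger_extract}"
    )

items = [LineItem("Cloud spend", 612_000, 540_000),
         LineItem("Travel", 41_000, 43_000)]
for item in items:
    if item.needs_narrative():   # Travel stays under both thresholds
        print(draft_prompt(item, "<ledger extract here>"))
```

Encoding the trigger this way also makes the 'when not to use' list checkable: a restated period or a flagged strategic line can short-circuit the draft before any prompt is sent.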

Playbooks are also the artifact that makes onboarding tractable. A new joiner on a team with a healthy Playbook layer can be productive with the AI tool in their first week; a new joiner on a team without one has to reconstruct the tacit knowledge by watching colleagues, which is both slow and lossy. The onboarding signal — time to first useful output — is the cleanest indicator of whether this layer exists in practice or only in principle.

| Property | Generic training | Role-specific playbook |
| --- | --- | --- |
| Unit of work | A course or certification | One task pattern a person can run today |
| Updated when | Annually, or when the vendor pushes a new version | Whenever a model changes or a pattern breaks |
| Owner | L&D or central enablement function | A practitioner on the team that uses it |
| Evidence it works | Completion rate on a dashboard | Usage in real tickets, real PRs, real reports |
| Failure mode | Everyone finishes; no one applies | Drifts out of date if no owner keeps it alive |

Layer 4 — Feedback Loops

Champions, measurement, and the mechanics that keep the substrate alive

Feedback loops are the mechanisms that keep the first three layers from decaying. They have three visible components: champions who are explicitly recognized for running the substrate, measurement that distinguishes real usage from theatrical usage, and iteration that routes what is learned from real usage back into updates to literacy, sandbox, and playbooks.

The champion role is a part-time assignment with protected time and clear scope — typically 15–20% of a senior contributor's week, formally recognized in their objectives and in their manager's. Champions do not invent patterns for other people. They surface what is working inside their team, refine playbooks, facilitate sandbox sessions, and feed signal back to the platform and enablement functions. Organizations that try to run this as a volunteer network with no protected time and no recognition discover within two quarters that the champions have drifted back to their primary duties and the substrate has no heartbeat.

Measurement deserves particular care because the wrong metrics actively corrupt the substrate. License utilization, raw prompt counts, and completion percentages are compliance signals; they rise to satisfy dashboards without indicating whether work got better. The metrics that matter are harder and slower: time-to-first-useful-output for a new joiner, percentage of playbooks with an owner and a last-updated date within the last quarter, ratio of sandbox experiments that translate into updates to the playbook layer, and net sentiment among practitioners who have used the tool for more than thirty days. None of these fit on a quarterly slide, which is part of why they are undersupplied.
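One of those metrics is easy to make concrete. A minimal sketch of the playbook-freshness measure, assuming a simple index of playbook entries with hypothetical field names:

```python
from datetime import date, timedelta

# Hypothetical playbook index entries; real ones would come from the repo.
playbooks = [
    {"task": "ticket triage", "owner": "a.chen", "updated": date(2026, 3, 12)},
    {"task": "variance narrative", "owner": None, "updated": date(2025, 6, 1)},
]

def freshness(entries, today=date(2026, 4, 4), window_days=90) -> float:
    """Share of playbooks with a named owner and an update this quarter."""
    cutoff = today - timedelta(days=window_days)
    fresh = [e for e in entries if e["owner"] and e["updated"] >= cutoff]
    return len(fresh) / max(len(entries), 1)

print(f"playbook freshness: {freshness(playbooks):.0%}")
```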

Iteration is the third component and the one most often skipped. Feedback that is collected and not acted on is worse than feedback not collected, because it teaches practitioners that their observations have no weight. A functioning iteration mechanic commits, in writing, to three things: a cadence for reviewing champion feedback (monthly is usually right), a decision rule for which observations update the substrate (not every anecdote becomes a pattern, but recurring ones must), and a visible changelog of what changed and why. The changelog is the part that many organizations omit; without it, practitioners cannot tell whether the system is listening, and stop contributing. The changelog does not have to be long — a two-line entry per change, linked to the original observation, is enough.

The Substrate Readiness Rubric

Score each layer 0–4; read the profile, not the average

The rubric below is the load-bearing artifact of this playbook. It lets a leadership team locate its organization on the substrate in under an hour, surfaces asymmetry between layers, and produces a defensible answer to the question 'are we ready to push automation at this team yet?' The rubric scores each of the four substrate layers from 0 (absent) to 4 (mature), for a composite 0–16. The composite is a coarse signal; the profile — the shape of the four numbers — is what tells you where to invest.

Two rules discipline the scoring. First, a layer scores at most a 2 unless you can point to a physical artifact for every criterion at that stage — a document, a dashboard, a Slack channel with a named owner, a job description, a playbook with a last-updated date. Self-reported scores without artifacts are systematically inflated; the gap between what a leadership team believes and what exists in writing is, in most organizations, at least one full stage. Second, assign the scoring to someone two levels below the executive sponsor. They will find the gaps. The sponsor will rationalize them. The gap between those two scorings is itself a data point about Layer 4.

Literacy
  • 0 — Absent: No shared vocabulary. Conversations about AI use undefined metaphors.
  • 1 — Ad hoc: A few enthusiasts have read up on their own; no shared glossary exists.
  • 2 — Documented: Internal glossary and one-pagers published; most teams can find them.
  • 3 — Practiced: Weekly forum where practitioners show and discuss worked examples.
  • 4 — Mature: New joiners reach conversational fluency in under 30 days; failure modes named and referenced in design reviews.

Sandbox
  • 0 — Absent: No practice environment. Experimentation happens in production or not at all.
  • 1 — Ad hoc: Individuals run personal experiments on personal accounts.
  • 2 — Documented: A sanctioned sandbox exists with masked data and stated permission to fail.
  • 3 — Practiced: Sandbox sessions run monthly with facilitation; outputs flow back to playbooks.
  • 4 — Mature: Sandbox is the default first step for any new use case; data and access parity maintained.

Playbooks
  • 0 — Absent: No role-specific patterns. Each user reinvents.
  • 1 — Ad hoc: A few prompt snippets shared informally in a channel.
  • 2 — Documented: Playbook repository exists, indexed by role and task, with a handful of patterns.
  • 3 — Practiced: Every critical workflow has a named owner and an update cadence; 'when not to use' sections present.
  • 4 — Mature: Playbooks are version-controlled, reviewed like code, and cited in performance conversations.

Feedback Loops
  • 0 — Absent: No champions, no measurement, no iteration. Tools are pushed one-way.
  • 1 — Ad hoc: Volunteer champions with no protected time; vanity metrics tracked.
  • 2 — Documented: Champion role exists with 10–20% protected time; basic usage metrics reported.
  • 3 — Practiced: Champions convene cross-team; quality metrics (time-to-first-useful-output, sentiment) tracked.
  • 4 — Mature: Feedback from real usage systematically updates literacy, sandbox, and playbooks within a sprint.
1. Assemble the evidence. For each of the four layers, produce the physical artifacts that back your claimed stage. No artifact, no score above 2.

2. Score each layer independently. Rate Literacy, Sandbox, Playbooks, and Feedback Loops from 0 to 4. Resist the urge to align them — asymmetry is the diagnostic.

3. Compute the composite and read the profile. Sum the four scores for a 0–16 composite. Then look at the shape — a 2-4-1-3 profile means very different investment than a 3-3-3-3 profile, even though the composites differ by only two points. (A minimal code sketch of this step closes this section.)

4. Calibrate against a two-down-level scorer. Have someone two levels below the executive sponsor score independently. Reconcile the gap. The reconciliation conversation is where most of the learning happens.

5. Pick the single constraining layer. The lowest-scoring layer is the constraint on everything above it. Invest there for the next quarter; reassess before touching anything else.

0–7: Pause all AI-first initiatives. Invest in substrate for two quarters before any production push. The automation you buy at this score will sit unused.

8–12: Selective piloting only. One or two workflows with explicit substrate ownership. Anything broader will metastasize into compliance theater within a quarter.

13–16: Ready for scaled automation. The substrate is carrying its own weight; the constraint has moved to the automation layer itself — eval pipelines, durability, operating model.

A note on the ceiling: very few organizations score above 12. That is not a failure diagnosis; it is the base rate. The rubric is calibrated so that real organizations land in the 6–11 range with meaningful asymmetry between layers. If your first scoring produces four identical numbers, the scoring was not careful enough — real substrates have texture, and the texture is the diagnostic.
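To close the section, a minimal sketch of the rubric mechanics in Python: composite, constraining layer, and the score bands above. The function name and the example scores are illustrative; the 0–4 scale and the bands come directly from the rubric.

```python
LAYERS = ("literacy", "sandbox", "playbooks", "feedback_loops")

def read_profile(scores: dict[str, int]) -> str:
    """Composite, constraining layer, and band, per the rubric above."""
    assert set(scores) == set(LAYERS)
    assert all(0 <= s <= 4 for s in scores.values())
    composite = sum(scores.values())
    constraint = min(scores, key=scores.get)   # invest here next quarter
    if composite <= 7:
        band = "pause automation; invest in substrate for two quarters"
    elif composite <= 12:
        band = "selective piloting with explicit substrate ownership"
    else:
        band = "ready for scaled automation"
    return f"composite {composite}/16; constraint: {constraint}; {band}"

# A 2-4-1-3 profile: a decent sandbox resting on thin playbooks.
print(read_profile({"literacy": 2, "sandbox": 4,
                    "playbooks": 1, "feedback_loops": 3}))
```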

Role-Differentiated Adoption Patterns

Engineering and non-engineering roles need different substrates — one size fits nobody

The substrate is universal in shape but specific in content. Engineering teams and non-engineering teams both need the four layers, but the Literacy glossary that works for a platform engineer will not work for a finance analyst; the sandbox experience that builds intuition for a support agent will bore a senior developer into uninstalling the tool. Treating the two populations with a single enablement curriculum is the single most common substrate-design error outside of skipping the substrate entirely.

Engineering roles come to AI tools with an existing mental model for code assistance, debugging, and review. Their playbook layer can be terse and technical; their sandbox can be a local IDE with safe branches; their feedback loop can ride on top of existing practices — code review, pairing, retros. The risk for engineering is over-adoption without taste: developers who accept suggestions without scrutiny, commit generated code without running evals, and produce plausible-looking diffs that break on edge cases. The substrate work for engineering is therefore weighted toward Playbooks and Feedback Loops — patterns that teach when to refuse a suggestion, and reviews that surface over-acceptance.

Non-engineering roles — operations, finance, legal, marketing, customer support — come in with a different problem. Their existing workflows were not designed to take intermediate outputs from a probabilistic system; the concept of 'ground truth' is softer in a draft email than in a test-passing function. Their Literacy layer has to spend more weight on failure modes and verification, their Sandbox on realistic in-role tasks rather than synthetic exercises, their Playbooks on pairing each AI step with an explicit human verification step, and their Feedback Loops on sentiment rather than raw throughput. The risk for non-engineering adoption is under-adoption masking as under-capability — people who find the tool confusing decide they are the problem and quietly stop using it. The substrate for non-engineering work has to make it safe to be a beginner for longer.

The implication for enablement design is significant: the engineering substrate and the non-engineering substrate are genuinely different products, not a single program with different target audiences. They share structure — four layers, the same rubric — but the contents of each layer diverge. An enablement team that tries to run one curriculum across both populations ends up with material that is too technical for non-engineers and too shallow for engineers, which is the worst of both outcomes. Splitting the two programs, naming separate owners, and accepting the higher coordination cost is the standard move for organizations that see real adoption in both populations.

A further wrinkle: the two populations depend on each other. Engineers build the internal tooling that non-engineers use. Non-engineers surface the real business questions that engineers then instrument. A substrate program that treats the two as independent misses the cross-flow, which is where most of the durable value actually lives. Cross-population practice sessions — engineers watching support agents use the tool, operations analysts sitting with platform engineers during an eval review — sound soft on an agenda and consistently produce the clearest jumps in capability on both sides.

Engineering substrate emphasis
  • Literacy: terse, technical; assumes existing model of code review and CI

  • Sandbox: local IDE, safe branches, eval harness in the loop

  • Playbooks: version-controlled, reviewed as code, fail-closed defaults

  • Feedback: rides existing retros and code review; track over-acceptance and eval drift

  • Primary risk: adoption without taste — plausible code, broken edge cases

Non-engineering substrate emphasis
  • Literacy: heavier weight on failure modes and verification; domain-specific examples

  • Sandbox: realistic in-role tasks, masked data, explicit permission to be a beginner

  • Playbooks: pair every AI step with a human verification step; 'when not to use' is prominent

  • Feedback: sentiment-first, time-to-first-useful-output, champion hours protected

  • Primary risk: under-adoption disguised as under-capability — quiet abandonment

Five Principles for Durable AI Tooling

What separates tooling that survives two model generations from tooling that doesn't

Substrate investment is wasted if the tools it supports are fragile. The model vendors will release a major update at least twice a year; the one you deploy against today will be two versions behind before the end of the fiscal year. Tooling built without durability in mind must be rebuilt with every model generation, and the rebuild cost eventually exceeds the adoption cost. The five principles below are the minimum design discipline that keeps internal tooling durable across model generations. They are not novel — they are the AI-specific restatement of practices that have been taken for granted in systems engineering for decades.

These principles also serve as a diagnostic: a tool that violates two or more of them is a tool that will not survive its second year. Before signing off on a build, walk the design against each principle and note where the tool is exposed. The cost of conforming is small when paid early; retrofitting later is where it gets expensive.

Five Design Principles for Durable AI Tooling

1. Abstraction

Route every model call through an internal gateway that exposes a stable interface. Prompt text, model name, and provider are configuration, not code. When a better model appears, you change the configuration; you do not change twenty callsites.
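A minimal sketch of that gateway seam, with hypothetical class and field names; the point is the shape of the interface, not any specific provider integration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelConfig:
    provider: str          # configuration, not code
    model: str
    prompt_template: str   # versioned alongside the config

class Gateway:
    """Stable internal interface; callers never name a provider or model."""

    def __init__(self, config: ModelConfig):
        self.config = config

    def complete(self, **fields) -> str:
        prompt = self.config.prompt_template.format(**fields)
        return self._call_provider(prompt)

    def _call_provider(self, prompt: str) -> str:
        # Stand-in for the provider SDK call; swapping models means
        # changing ModelConfig, not any of the twenty callsites.
        return f"[{self.config.model}] response to: {prompt[:40]}"

gateway = Gateway(ModelConfig("some-provider", "model-v2",
                              "Suggest likely causes for: {ticket}"))
print(gateway.complete(ticket="checkout latency spike"))
```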

2. Observability

Every model call produces a structured log entry with inputs, outputs, tool invocations, latencies, and cost. Without this you cannot diagnose regressions when a new model rolls in; with it, 'the tool got worse' becomes a specific, answerable question.
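A sketch of the structured record one such call might emit; the field names are illustrative, and a real deployment would route the record to a log sink rather than stdout, with prompt text redacted or logged size-only per policy.

```python
import json
import time
import uuid

def log_model_call(model: str, prompt: str, output: str, tools: list[str],
                   cost_usd: float, started: float) -> None:
    """Emit one structured record per model call."""
    record = {
        "call_id": str(uuid.uuid4()),
        "model": model,
        "prompt_chars": len(prompt),   # sizes only; redact text per policy
        "output_chars": len(output),
        "tools": tools,
        "latency_ms": round((time.time() - started) * 1000),
        "cost_usd": cost_usd,
    }
    print(json.dumps(record))          # stand-in for the real log sink

t0 = time.time()
log_model_call("model-v2", "Suggest likely causes...", "1. Pool exhaustion",
               tools=["ticket_lookup"], cost_usd=0.004, started=t0)
```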

3. Graceful degradation

When the model fails — unavailable, hallucinating, producing low-confidence output — the tool degrades to a clearly labeled fallback rather than a silent failure. The worst tooling posts plausible nonsense when the model has an off day; durable tooling tells the user to take the wheel.
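A sketch of the degradation pattern, assuming the underlying call returns an answer with a confidence signal; the floor value and labels are illustrative and should be tuned against the eval set.

```python
def triage_with_fallback(ticket: str, suggest,
                         confidence_floor: float = 0.6) -> str:
    """Degrade to a clearly labeled fallback instead of failing silently.

    `suggest` is any callable returning (answer, confidence).
    """
    try:
        answer, confidence = suggest(ticket)
    except Exception:
        return "[AI unavailable] Routed to the manual triage queue."
    if confidence < confidence_floor:
        return f"[Low confidence: verify before acting] {answer}"
    return answer

print(triage_with_fallback(
    "DB timeout in checkout",
    lambda t: ("Likely connection-pool exhaustion", 0.4)))
```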

4. Evaluation infrastructure

A curated eval set, versioned alongside the tool, runs on every prompt change and every model update. Without it, you cannot tell the difference between 'the new model is better' and 'the new model is worse in ways you will discover in production.'
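A minimal sketch of the gate, with a toy two-case eval set and substring checks standing in for real scoring; production eval sets are larger and the scoring richer, but the shape is the same.

```python
# A tiny illustrative eval set, versioned alongside the tool.
EVAL_SET_VERSION = "2026-04-01"
EVAL_CASES = [
    {"input": "refund request, order never arrived", "must_contain": "refund"},
    {"input": "login loop after password reset",     "must_contain": "reset"},
]

def run_evals(generate) -> float:
    """Pass rate for a candidate model or prompt; run on every change."""
    passed = sum(1 for case in EVAL_CASES
                 if case["must_contain"] in generate(case["input"]).lower())
    return passed / len(EVAL_CASES)

# Gate the rollout: the candidate must not regress against the baseline.
baseline = run_evals(lambda text: f"suggested action for: {text}")
candidate = run_evals(lambda text: f"likely cause: {text}")
assert candidate >= baseline, "new model is worse on the eval set"
```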

5. Model-agnostic interfaces

Design the tool's domain API around the task, not the model. A support-triage tool exposes 'suggest likely causes given this ticket'; it does not expose 'run GPT-X with this system prompt.' The model is an implementation detail behind the interface.
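A sketch of that model-agnostic seam using a Python Protocol; the interface name and method are illustrative, and the gateway is the abstraction seam from the first principle.

```python
from typing import Protocol

class TriageAssistant(Protocol):
    """Domain interface: named for the task, silent about the model."""
    def suggest_likely_causes(self, ticket_text: str,
                              top_k: int = 3) -> list[str]: ...

class LLMTriageAssistant:
    """One implementation; the model behind it is an implementation detail."""

    def __init__(self, gateway):
        self._gateway = gateway   # the abstraction seam from principle 1

    def suggest_likely_causes(self, ticket_text: str,
                              top_k: int = 3) -> list[str]:
        raw = self._gateway.complete(ticket=ticket_text)
        lines = [line.strip("- ").strip() for line in raw.splitlines()]
        return [line for line in lines if line][:top_k]
```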

Sequencing Substrate and Automation

A two-year rhythm that lets both layers grow without either starving

The substrate does not get built once and set aside. It runs in a continuous rhythm alongside automation, with each reinforcing the other. The practical sequencing question for a leadership team is not 'when will the substrate be done?' — it will never be done — but 'what does a quarter of substrate investment look like in parallel with a quarter of automation investment?' The diagram below describes a rhythm that teams have reported as sustainable across multiple planning cycles.

The rhythm has four beats. Every quarter, a fixed share of capacity — typically 20–30% of platform and enablement hours — is reserved for substrate maintenance: keeping playbooks current, refreshing the sandbox, running the measurement cycle, convening champions. The remaining capacity goes to automation: new use cases, platform capability, agent runtime. Each major automation release triggers a substrate pulse: a playbook update, a literacy session on the new capability, a sandbox refresh. The annual planning cycle includes a full substrate readiness rubric rescoring as input to the next year's automation portfolio. When substrate and automation fall out of sync — automation running ahead by two quarters, or substrate stagnant for three — the rubric score will move first, and the planning cycle can adjust before a visible failure.
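A back-of-envelope sketch of the capacity split, with hypothetical headcount; the 20–30% band is the one from the rhythm above.

```python
# Hypothetical quarter: 12 platform/enablement FTEs, 13 weeks.
quarter_hours = 12 * 40 * 13
substrate_share = 0.25          # within the 20-30% band described above

substrate_hours = round(quarter_hours * substrate_share)
automation_hours = quarter_hours - substrate_hours

# Each automation release ships with a substrate pulse; a release
# arriving without one is deferred by default.
substrate_pulse = ["playbook update", "literacy session", "sandbox refresh"]

print(f"substrate: {substrate_hours}h, automation: {automation_hours}h")
print("release checklist:", ", ".join(substrate_pulse))
```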

The rhythm has a useful second-order property: it gives the organization a shared vocabulary for saying 'no, not yet.' Without the rhythm, pressure to ship a new automation use case always looks the same — a stakeholder with a compelling case, a platform team with capacity, a deadline. With the rhythm, the question becomes tractable: where does this use case sit in the substrate, and if the substrate for the target team is thin, what is the substrate pulse that ships alongside the automation? Use cases that arrive without a substrate pulse get deferred by default. Use cases that arrive with one get prioritized against the rest of the automation portfolio on normal terms. Neither decision is made by a single person; the rhythm makes the decision partly self-executing, which reduces the political cost of saying no to a powerful stakeholder.

The rhythm also prevents the most common failure of substrate work: treating it as a one-time project with a completion date. Organizations that fund substrate as a program with a start and an end typically produce a respectable artifact — a glossary, a sandbox, a playbook repo, a champion network — and then watch the artifact decay over the following year as nobody is explicitly accountable for its upkeep. Running it as a rhythm with reserved capacity converts the substrate from a project into an ongoing practice, which is the only form in which it durably holds up.

Substrate and Automation in a Sustained Rhythm
Substrate and automation run in parallel cycles. Each automation release triggers a substrate pulse.

Common Failure Modes

The predictable ways substrate work goes wrong, and what each one looks like from the inside

Most organizations that try to build a substrate run into at least one of the failure modes below. Naming them in advance makes them easier to recognize when they begin, which is typically a quarter or two before they become visible in any dashboard. Each one looks rational from the inside; none of them produces the outcome the substrate is there to produce.

| Failure mode | What it looks like | The underlying sequencing error |
| --- | --- | --- |
| Mandate-first | A top-down directive to use AI by a fixed date, followed by compliance dashboards and a quiet drop in sentiment. | Automation pushed onto an absent substrate. Produces usage numbers without capability. |
| Platform-only | A sophisticated internal platform with gateway, evals, observability — used by a small circle of enthusiasts. | Automation-layer investment without the human layers beneath it. The tool is durable but unloved. |
| Training-as-substrate | A completed certification program with strong completion rates and no change in how work actually gets done. | Literacy confused with training. Certificates do not produce the fluency the substrate requires. |
| Champion burnout | Initial enthusiasm from a volunteer network fading within two quarters; champions back in their primary lanes. | Feedback Loops layer treated as a cultural phenomenon rather than a role with protected time. |
| Centralized CoE isolation | A center of excellence producing elegant proofs of concept that never graduate into embedded team workflows. | Substrate built in isolation from the teams it should support. Playbooks exist but belong to the wrong owners. |

Leadership Commitments

Six commitments that distinguish a real substrate investment from a slide deck about one

The Six Leadership Commitments

  • Sequence enablement before automation, and state the sequence publicly in writing.

  • Reserve 20–30% of platform and enablement capacity every quarter for substrate maintenance, protected from automation pressure.

  • Fund the champion role with protected time (15–20%) and explicit recognition in performance reviews.

  • Retire vanity metrics — license utilization, prompt counts, certification completion — from executive dashboards.

  • Commit to the rubric: score quarterly, share the profile, act on the constraining layer.

  • Treat each model generation as a substrate event — plan the literacy, sandbox, and playbook updates alongside the capability upgrade.

Frequently Asked Questions

The objections worth answering head-on

Isn't this just change management with new vocabulary?

Partly — change management is one of the disciplines the substrate draws on. What is specific to AI is that the capability itself changes every six to nine months, which means the substrate has to be designed as a living system, not a one-time rollout. A traditional change management program with a start date and an end date will be out of date before the end date arrives. The substrate model borrows from change management but demands a continuous rhythm.

What if we are already deep into rollout — do we rewind?

No. Run the rubric on your current state, identify the lowest-scoring layer, and invest there for the next quarter before adding new automation surface area. The substrate can be built in parallel with existing automation; the cost of doing so is higher than starting in the correct sequence, but it is still substantially lower than the cost of a visible failure followed by a forced redo.

How small is too small? Does a ten-person team need a substrate?

A ten-person team needs a proportional substrate, not no substrate. The Literacy layer might be a shared document and a weekly thirty-minute conversation; the Sandbox might be a single shared workspace; the Playbooks might be a handful of patterns in a repo; the Feedback Loops might be the regular retro with one standing agenda item. What does not scale down is the sequence — enablement still precedes automation, even at ten people.

How do we pay for substrate work when the board wants automation wins?

Frame substrate investment as the risk mitigation for the automation portfolio, because that is what it actually is. Every dollar spent on substrate reduces the probability that an automation initiative fails in a way that requires a public recovery effort — and the empirical evidence for the cost asymmetry of unaddressed adoption issues is well-documented.[3] In most boardrooms, a defensible risk argument carries more weight than a generic appeal to culture.

Who owns the substrate — platform, enablement, or HR?

A cross-functional group with a single named owner. The platform team owns the Sandbox's technical shape and the tooling that supports the Playbook layer. Enablement or L&D owns the Literacy layer. HR owns the performance integration for champions. A single named owner — typically a VP-level leader with both platform and people leverage — is accountable for the composite and for quarterly rescoring.

What if our people are genuinely opposed to AI, not just skeptical?

Some are, and the substrate is not a mechanism for converting opposition into advocacy. It is a mechanism for making engagement productive for the people who are willing, and for surfacing real disagreements clearly enough that they can be addressed rather than suppressed. A small fraction of any workforce will prefer to leave rather than adopt AI, and that is a legitimate outcome. Compliance theater disguises that signal; a functioning substrate reveals it.


Appendix: State of AI Adoption, Q2 2026

Time-stamped snapshots of the adoption landscape. Useful as backdrop; not load-bearing for the playbook.

The numbers below were current as of April 2026. They are included as a dated snapshot of the environment in which the substrate model was formalized, not as evidence for the model itself — the playbook stands independent of the specific figures, which will drift as the adoption curve matures. Treat this section as an appendix, returning to the body of the playbook for anything operational.

Fortune reported in April 2026 that roughly 80% of white-collar workers were actively rejecting mandated AI adoption, a pattern the piece labeled FOBO — fear of becoming obsolete.[5] A survey of 2,400 knowledge workers in the same cycle found 29% admitting to actively slowing or sabotaging their employer's AI strategy, with the figure rising to 44% among Gen Z respondents.[7] A widely discussed case resurfaced in August 2025, when a CEO publicly reaffirmed, two years on, his decision to lay off nearly 80% of his workforce after employees declined to adopt AI fast enough.[6] On the business-outcome side, BCG's Build for the Future x AI global study of roughly 1,250 companies reported that only around 5% were generating substantial financial returns from AI, with the 10–20–70 capital-allocation ratio (algorithms–data/tech–people/process/culture) cited as the most reliable differentiator between outcomes.[1] HBR's November 2025 piece on organizational barriers to AI adoption catalogued the same pattern from the management-literature side, emphasizing the rarity of sustained behavior change without explicit enablement investment.[3]

These numbers will change. The sequencing argument will not — the substrate has to come first whether 80% of white-collar workers are rejecting mandates or 20%, and whether BCG's next study puts the people share at 70% or 60%. That is why the playbook body does not lean on the appendix. Read the body; glance at the appendix; act on the rubric.

Footnote on prior art. The Enablement Substrate Model borrows the broad intuition that people and process dominate AI outcomes from BCG's 10–20–70 framing[1] and from Prosci's people-first change-management tradition[4], but splits the people side into four layers — Literacy, Sandbox, Playbooks, Feedback Loops — because each layer fails in a distinct way and requires a distinct investment. The rubric, the durability principles, and the sequencing rhythm in this piece are proprietary constructs developed for the AI Native Builders strategy library; the prior art is acknowledged above, not embedded in the model.

Key terms in this piece
AI transformation playbook · people-first AI adoption · enablement substrate · AI change management · AI readiness rubric · AI operating model
Sources
[1] Boston Consulting Group, "From Potential to Profit: Closing the AI Impact Gap" (bcg.com)
[2] Boston Consulting Group, "AI Transformation Is a Workforce Transformation" (bcg.com)
[3] Harvard Business Review, "Overcoming the Organizational Barriers to AI Adoption" (hbr.org)
[4] Prosci, "AI Adoption: Driving Change With a People-First Approach" (prosci.com)
[5] Fortune, "White-collar workers are quietly rebelling against AI as 80% outright refuse adoption mandates" (fortune.com)
[6] Fortune, "This CEO laid off nearly 80% of his staff because they refused to adopt AI fast enough" (fortune.com)
[7] EQ4C Tools, "The Corporate AI Mandate: Why Forcing Workers to Adopt AI or Face Termination is Backfiring" (tools.eq4c.com)