Most AI use case selection is workshop theater. Process mining reads the actual event logs and ranks workflows by volume, variance, and structure — so you find out whether you need an LLM, an RPA bot, or nothing before spending a dollar.
High variant count is the cleanest signal that RPA will shatter and LLMs may carry the load
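48% of organizations report company-wide adoption, up from lower adoption baselines in 2021; satisfaction improved in the past 12 months for 49% of respondents
74% plan to integrate AI into their mining programs, but only 25% are doing it today; the gap between intent and execution is where most programs stall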
Why workshop-driven AI use case selection fails — structurally, not just culturally
How process mining reconstructs real process paths from event logs (not SOPs)
The Volume × Variance × Structure scoring model for ranking automation candidates
A concrete Python snippet for building a minimal event log from a SQL source
Object-centric process mining and when it matters over case-centric approaches
Tool selection by stack: Celonis vs. Apromore vs. Disco vs. UiPath vs. IBM vs. SAP Signavio
The five-step discovery pipeline and a 90-day sequence from zero to committed pilots
Pre-mortem checklist: the five blockers that kill automations that look great on paper
Task mining for screen-level workflows that emit no system logs
Most enterprise AI pilots were chosen in a conference room. Someone drew a 2×2 on a whiteboard, the senior voice picked the workflow that sounded exciting, and three months later a team is shipping something the system logs never asked for. The data already knew which processes were painful, repetitive, and worth automating. Nobody pulled it.
Process mining — Celonis, Apromore, Disco, IBM Process Mining — extracts event logs from the systems that actually run your business (SAP, Oracle, ServiceNow, Salesforce, Jira) and reconstructs every path your processes take[5]. Not the SOP from 2019. Not the BPMN diagram a consultant drew. The literal sequence of timestamped events your production systems emit. The output is a ranked list of workflows by volume, cycle time, bottleneck severity, and variant count — the raw material for an evidence-based AI use case pipeline.
The market is moving fast. Process mining software is projected to grow from roughly $850M in 2026 toward $2B by 2031[7], with Celonis claiming 60% market share and holding category leader status across analyst assessments[1]. The Deloitte Global Process Mining Survey 2025 found that 48% of organizations have reached company-wide adoption — up from 45% in 2021 — and 74% plan to integrate AI into their mining programs[9]. Most adoption is still RPA-flavored. The LLM-era question is sharper: which mined workflows justify a language model, which want rule execution, and which are not worth touching at all?
One contrarian point worth stating directly: process mining will not fix your AI selection problem if your executives refuse to act on unglamorous findings. One logistics company ran a full Celonis deployment, surfaced accounts payable exception handling as the highest-ROI candidate, and ignored it because the CFO wanted to do AI for customer experience. The data is only as useful as the decision-making culture that absorbs it.
Three structural reasons workshop-driven selection fails — and why log data cuts past every one
Workshop-driven selection fails predictably. The same three structural failures show up across industries.
The loudest voice wins. Every prioritization session has someone senior enough to override the analysis. They watched a vendor demo, they have a pet project, they find AI for contract review more interesting than AI for invoice routing. The team builds the interesting idea, not the high-ROI one. The incentive points at narrative, not throughput.
Nobody admits which workflow is actually painful. Ask a room of VPs which process is most broken and they describe a process in someone else's department. The order-to-cash exception handling that costs the finance team forty hours a week never surfaces because the finance VP will not air dirty laundry in front of the CIO. Mining surfaces the pain from system data. Politics has no vote.
The interesting idea beats the measurable one. AI for customer sentiment sounds innovative. AI to reduce the eight percent exception rate in supplier onboarding sounds unglamorous. ROI can be identical. The second one has a cleaner feedback loop, less hallucination surface, and a sharper success metric. Workshop groups pick the first one anyway. Mining-based selection lets the bottleneck data set the initial ranking before any human applies preferences.
The Deloitte 2025 survey identified management support as a barrier in 41% of organizations — up from 26% in 2021[9]. That gap is not random. Mining programs surface unglamorous, politically charged findings, and organizations without C-suite sponsors shelve them.
ETL plus clustering plus visualization on event logs — and why the output changes what AI selection looks like
If your background is ML rather than BPM, here is what process mining actually does. Every enterprise system of record — SAP, Oracle, ServiceNow, Salesforce, Jira — emits an event log: a timestamped record of which case (invoice, ticket, contract, employee record) hit which step, when, and in what sequence. Mining tools ingest those logs and run clustering and graph algorithms over the case sequences to reconstruct a process graph.
Every event log needs exactly three mandatory fields[10]: a case identifier (the invoice number, ticket ID, or contract ID that groups all events belonging to one process instance), an activity name (the step label — 'Invoice Created', 'Sent for Approval', 'Exception Raised'), and a timestamp (when that activity occurred, in any parseable datetime format). Everything else — resource, cost, originating department — is optional enrichment. If your system has those three columns, you have the minimum viable event log.
The output is not a theoretical BPMN diagram. It is a map of every distinct path actually taken, annotated with frequency and duration. A workflow your SOP claims has two paths might have forty-seven variants in reality — each deviation a data quality issue, an edge case someone built a workaround for, or a policy exception that became the default. Variant count is one of the most diagnostic signals in the entire analysis. It tells you whether you have a stable process (low variants, safe for RPA) or a judgment-intensive one (high variants, candidate for an LLM).
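To make the variant idea concrete, here is a minimal sketch using only the Python standard library. The sample rows and the `variant_counts` helper are hypothetical illustrations, not the output of any mining tool; commercial tools apply far more sophisticated discovery algorithms, but the core step of grouping events by case and ordering them by timestamp looks roughly like this.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical sample rows: (case_id, activity, timestamp). Not real data.
EVENTS = [
    ("INV-1", "Invoice Created", "2024-03-01T09:00:00"),
    ("INV-1", "Sent for Approval", "2024-03-01T10:30:00"),
    ("INV-1", "Approved", "2024-03-02T08:15:00"),
    ("INV-2", "Invoice Created", "2024-03-01T09:05:00"),
    ("INV-2", "Sent for Approval", "2024-03-01T11:00:00"),
    ("INV-2", "Exception Raised", "2024-03-03T14:00:00"),
    ("INV-2", "Approved", "2024-03-05T16:45:00"),
    # Rows for INV-3 arrive out of order on purpose; sorting fixes that.
    ("INV-3", "Sent for Approval", "2024-03-04T10:00:00"),
    ("INV-3", "Invoice Created", "2024-03-04T09:00:00"),
    ("INV-3", "Approved", "2024-03-04T15:00:00"),
]


def variant_counts(events):
    """Group events by case, order them by timestamp, and count distinct paths."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((datetime.fromisoformat(ts), activity))
    traces = (tuple(activity for _, activity in sorted(evts)) for evts in by_case.values())
    return Counter(traces)


variants = variant_counts(EVENTS)
for trace, n_cases in variants.most_common():
    print(n_cases, " -> ".join(trace))
```

Each distinct tuple of activities is one variant. On a real log the same counter is what reveals whether a "two-path" process actually has dozens of paths.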
Mining tools also compute conformance (how often the actual process matches the intended one), bottleneck attribution (which steps accumulate wait time), and root cause attribution (which case attributes correlate with slow cases). For every workflow in your event log, you get a quantified profile: volume, cycle time, variability, bottleneck location. That profile is the input to an AI suitability scoring model. The whiteboard is not.[5]
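A rough sketch of two of those measures, under simplifying assumptions: conformance is reduced here to an exact match against one intended path (real tools use techniques such as token replay or alignments), and wait time is attributed to the activity a case was waiting to reach. The sample events, the `INTENDED` path, and the function names are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical purchase order events: (case_id, activity, timestamp).
EVENTS = [
    ("PO-1", "Created", "2024-01-02T09:00:00"),
    ("PO-1", "Approved", "2024-01-02T15:00:00"),
    ("PO-1", "Goods Received", "2024-01-04T15:00:00"),
    ("PO-2", "Created", "2024-01-03T09:00:00"),
    ("PO-2", "Approved", "2024-01-05T09:00:00"),
    ("PO-2", "Goods Received", "2024-01-06T09:00:00"),
    ("PO-3", "Created", "2024-01-03T10:00:00"),
    ("PO-3", "Goods Received", "2024-01-03T12:00:00"),  # approval skipped
]
INTENDED = ("Created", "Approved", "Goods Received")  # assumed reference path


def ordered_cases(events):
    """Return {case_id: [(timestamp, activity), ...]} sorted by timestamp."""
    cases = defaultdict(list)
    for case_id, activity, ts in events:
        cases[case_id].append((datetime.fromisoformat(ts), activity))
    return {case_id: sorted(evts) for case_id, evts in cases.items()}


def conformance_rate(cases, intended):
    """Share of cases whose full path equals the intended path (exact match only)."""
    hits = sum(1 for evts in cases.values() if tuple(a for _, a in evts) == intended)
    return hits / len(cases)


def wait_hours_by_step(cases):
    """Total hours that cases spent waiting before each activity occurred."""
    totals = defaultdict(float)
    for evts in cases.values():
        for (prev_ts, _), (ts, activity) in zip(evts, evts[1:]):
            totals[activity] += (ts - prev_ts).total_seconds() / 3600
    return dict(totals)


cases = ordered_cases(EVENTS)
rate = conformance_rate(cases, INTENDED)
waits = wait_hours_by_step(cases)
print(f"conformance: {rate:.0%}")
for step, hours in sorted(waits.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{step}: {hours:.0f} h waiting in total")
```

Sorting steps by total accumulated hours, rather than by average duration, is what separates a bottleneck ranking from a list of slow steps.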
Most teams overestimate the data prep barrier — here is a minimal working example from a typical ERP table
The data preparation barrier is real but consistently overstated. Most teams assume they need a dedicated ETL pipeline before they can start mining. For a proof-of-concept sprint, you need a SQL query and a CSV export.
Take a purchase order workflow in a typical ERP. The PO_EVENTS table has a PO_NUMBER (case ID), a STATUS column (activity), and a CHANGED_AT timestamp. That is all you need to build an event log good enough to run your first variant analysis.
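A minimal sketch of that extraction, using an in-memory SQLite database as a stand-in for the ERP. The table and column names follow the example above; the sample rows, the output column names (`case_id`, `activity`, `timestamp`), and the filtering of incomplete rows are assumptions for illustration. Against a real ERP you would swap the connection for your own database driver.

```python
import csv
import io
import sqlite3

# In-memory stand-in for the ERP. Rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PO_EVENTS (PO_NUMBER TEXT, STATUS TEXT, CHANGED_AT TEXT)")
conn.executemany(
    "INSERT INTO PO_EVENTS VALUES (?, ?, ?)",
    [
        ("4500001", "Created", "2024-02-01 08:00:00"),
        ("4500001", "Approved", "2024-02-01 16:30:00"),
        ("4500001", "Goods Received", "2024-02-05 11:00:00"),
        ("4500002", "Created", "2024-02-02 09:15:00"),
        ("4500002", "Rejected", "2024-02-03 10:00:00"),
        ("4500003", "Created", None),  # missing timestamp: unusable as an event
    ],
)

# Map the three source columns onto the three mandatory event log fields.
EVENT_LOG_SQL = """
SELECT PO_NUMBER  AS case_id,
       STATUS     AS activity,
       CHANGED_AT AS timestamp
FROM PO_EVENTS
WHERE PO_NUMBER IS NOT NULL
  AND STATUS IS NOT NULL
  AND CHANGED_AT IS NOT NULL
ORDER BY PO_NUMBER, CHANGED_AT
"""

rows = conn.execute(EVENT_LOG_SQL).fetchall()

# Write CSV to a string buffer; use open("event_log.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["case_id", "activity", "timestamp"])
writer.writerows(rows)
event_log_csv = buffer.getvalue()
print(event_log_csv)
```

The resulting CSV has one row per status change and can be imported into any tool that accepts a case ID, activity, and timestamp column.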
If you want to go further and run a lightweight variant analysis in Python before committing to a full tool deployment, the pm4py library (an open-source Python process mining framework originally developed at Fraunhofer FIT) reads XES files, and CSV exports loaded through pandas, and produces variant statistics with about fifteen lines of code.
SOP documents written in 2019 and never updated
Happy-path diagrams showing 1-2 process variants
Estimated cycle times from interviews with process owners
Bottlenecks identified by whoever complained loudest last quarter
BPMN diagrams that no longer match the system configuration
Log-derived event sequences from actual production systems
Every variant captured — often 20–80 distinct paths through the same process
Real durations from timestamped event data, not estimates
Bottleneck ranking by total accumulated wait time across all cases
Conformance scores showing where the actual process deviates from intent
Real enterprise processes involve multiple interacting objects — case-centric mining forces a choice that distorts the analysis
Traditional process mining is case-centric: every event belongs to exactly one process instance. In a purchase order workflow, the case ID is the PO number, and every event — creation, approval, goods receipt, invoice match — attaches to that case. This works cleanly when a process has one primary object.
Most enterprise processes do not. An order-to-cash process involves a sales order, multiple delivery notes, one or more invoices, and potentially a customer dispute — all linked, all evolving in parallel. Forcing these into a single case ID produces convergence issues (one event appears to affect multiple cases) and divergence issues (one case fans out into multiple parallel objects). The resulting process graph is distorted.
Object-centric process mining (OCPM), standardized through the OCEL 2.0 format, solves this by letting events reference multiple objects simultaneously. A "Goods Receipt" event can link to both the purchase order and the supplier delivery note without duplicating the event. The OCEL 2.0 standard (SQLite, XML, and JSON exchange formats) adds object-to-object relationships on top of event-to-object relationships, enabling analysis of how objects interact across a process — not just how individual cases progress through steps.
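The distortion is easier to see with a toy example. The structure below is a hypothetical, simplified representation of object-centric events; it is loosely inspired by the idea of events referencing several objects and is not the actual OCEL 2.0 schema. The `flatten` helper shows what happens when such a log is projected onto a single case notion.

```python
from collections import Counter

# Hypothetical object-centric events: each event references objects of several types.
EVENTS = [
    {"id": "e1", "activity": "Create Order", "objects": {"order": ["O1"]}},
    {"id": "e2", "activity": "Create Delivery",
     "objects": {"order": ["O1"], "delivery": ["D1"]}},
    {"id": "e3", "activity": "Create Delivery",
     "objects": {"order": ["O1"], "delivery": ["D2"]}},
    {"id": "e4", "activity": "Send Invoice",
     "objects": {"order": ["O1"], "delivery": ["D1", "D2"], "invoice": ["I1"]}},
]


def flatten(events, object_type):
    """Project the log onto one object type, producing (case_id, activity, event_id) rows."""
    rows = []
    for event in events:
        for obj in event["objects"].get(object_type, []):
            rows.append((obj, event["activity"], event["id"]))
    return rows


by_delivery = flatten(EVENTS, "delivery")
by_order = flatten(EVENTS, "order")
copies = Counter(event_id for _, _, event_id in by_delivery)

print("delivery view:", by_delivery)
print("events duplicated in delivery view:", [e for e, n in copies.items() if n > 1])
print("events missing from delivery view:",
      [e["id"] for e in EVENTS if e["id"] not in copies])
```

Choosing the delivery as the case duplicates the single invoice event across both deliveries and drops the order creation entirely. Choosing the order as the case keeps every event but merges two parallel deliveries into one sequence. Either way the flattened log misstates what happened.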
For AI use case discovery, OCPM matters most in these scenarios: procure-to-pay (PO, invoice, payment), order-to-cash (order, delivery, invoice, payment), and HR onboarding (employee record, system provisioning tickets, training completions). When the workflow you are trying to automate spans two or more interacting objects, case-centric mining will systematically underestimate the variant count and misattribute bottlenecks. Check whether your mining tool supports OCEL 2.0 before signing the contract if multi-object processes are on your candidate list.
Three axes, one product. Ranks every mined workflow against itself and forces a verdict on tooling
Once the mined process data exists, you need a scoring model that turns it into a ranked candidate list. The AI Suitability Score is a three-axis product: Volume × Variance × Structure. Score each axis 1–5, multiply, and every workflow in your inventory gets a comparable number on a 125-point ceiling.
Volume measures the ROI ceiling. A workflow that processes 50,000 cases per year at 25 minutes per case represents 20,000+ hours of annual labor. A workflow that runs 50 times a year is a rounding error. Volume is the non-negotiable axis — without throughput, no automation pays back. Score it: 1 = under 1,000 cases/year, 3 = 10,000–50,000, 5 = 100,000+.
Variance is where LLMs eat RPA's lunch. RPA breaks on variance — every new path through a process needs a new rule. Research confirms LLMs outperform standard methods on task grouping and labeling in process-mining-to-RPA workflows, achieving average NMI scores of 0.75 using GPT-4 on task clustering[13]. A high-variant process (score 4–5) is exactly where you want a language model. Score it: 1 = 1–3 variants (pure RPA), 3 = 10–20 (borderline), 5 = 30+ (LLM territory).
Structure measures input parsability. A fully structured workflow — every input a database field, every decision rule-based — does not need an LLM. RPA or a deterministic rule engine handles it cheaper. When inputs include unstructured text (emails, PDFs, free-form comments, contracts), audio, or contextual judgment (is this expense within policy?), the calculus flips. Score it: 1 = fully structured, 3 = mixed, 5 = primarily unstructured or judgment-required.
The model produces a ranked list. Workflows above 40 are real AI candidates. Workflows between 15 and 40 are tool-selection conversations: some are clean RPA, some need better data first, some belong on the future roadmap. Workflows under 15 are usually best left alone.
| Workflow | Volume | Variance | Structure | Score | Better Tool | Primary Signal |
|---|---|---|---|---|---|---|
| Invoice approval (3-way match exceptions) | 5 | 3 | 3 | 45 | LLM | Unstructured vendor notes + PO mismatches |
| Customer support ticket triage | 5 | 5 | 5 | 125 | LLM | Free-text inputs, high urgency variance |
| Contract redlining (standard templates) | 3 | 4 | 5 | 60 | LLM | Document understanding + clause judgment |
| Expense report validation | 4 | 2 | 2 | 16 | RPA | Structured fields, binary policy rules |
| Employee onboarding (system provisioning) | 3 | 2 | 1 | 6 | RPA or neither | Fully structured, low variance |
| Supplier onboarding (documentation review) | 3 | 4 | 4 | 48 | LLM | Unstructured docs, regulatory variation |
| Insurance claims adjudication (complex) | 4 | 5 | 5 | 100 | LLM | Policy interpretation, multi-doc inputs |
| IT ticket routing (category assignment) | 5 | 3 | 4 | 60 | LLM | Free-text descriptions, ambiguous categories |
| Accounts payable duplicate detection | 5 | 1 | 2 | 10 | Rule engine | Fully structured, deterministic match logic |
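The scoring model is simple enough to express in a few lines. This is a minimal sketch: the `Workflow` class and `band` function are hypothetical names, the axis scores are copied from the table above, and the band boundaries follow the thresholds stated earlier (above 40, 15 to 40, under 15).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    volume: int      # 1-5: ROI ceiling from case throughput
    variance: int    # 1-5: number of distinct variants
    structure: int   # 1-5: how unstructured or judgment-heavy the inputs are

    def __post_init__(self):
        for axis in (self.volume, self.variance, self.structure):
            if not 1 <= axis <= 5:
                raise ValueError(f"{self.name}: axis scores must be 1-5, got {axis}")

    @property
    def score(self) -> int:
        """Volume x Variance x Structure, on a 125-point ceiling."""
        return self.volume * self.variance * self.structure


def band(score: int) -> str:
    """Map a composite score to the three bands described in the text."""
    if score > 40:
        return "AI candidate"
    if score >= 15:
        return "tool-selection conversation"
    return "leave alone"


WORKFLOWS = [
    Workflow("Invoice approval (3-way match exceptions)", 5, 3, 3),
    Workflow("Customer support ticket triage", 5, 5, 5),
    Workflow("Contract redlining (standard templates)", 3, 4, 5),
    Workflow("Expense report validation", 4, 2, 2),
    Workflow("Employee onboarding (system provisioning)", 3, 2, 1),
    Workflow("Supplier onboarding (documentation review)", 3, 4, 4),
    Workflow("Insurance claims adjudication (complex)", 4, 5, 5),
    Workflow("IT ticket routing (category assignment)", 5, 3, 4),
    Workflow("Accounts payable duplicate detection", 5, 1, 2),
]

ranked = sorted(WORKFLOWS, key=lambda w: w.score, reverse=True)
for w in ranked:
    print(f"{w.score:>3}  {band(w.score):<28} {w.name}")
```

Because the score is a product, a 1 on any axis caps the result at 25, which is why a fully structured workflow rarely reaches the AI candidate band no matter how much volume it carries.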
A five-step sequence that takes raw system event data to a prioritized shortlist. Sequencing is the leverage point
The pipeline is not complicated. The sequencing is the leverage point. Teams that skip straight to AI suitability scoring before running variant analysis and bottleneck profiling end up scoring assumptions rather than what the log data shows.
Step 1: Source system selection. Not every system of record produces clean event logs. SAP and Oracle are the richest sources for finance and procurement. ServiceNow surfaces IT and HR. Salesforce covers sales and customer success. Jira and similar tools capture engineering and operations. The practical constraint is connector availability — Celonis ships hundreds of pre-built extractors; Apromore and open-source tools require manual ETL. Start with the two systems covering your highest-volume workflows, not the ones easiest to connect.
Step 2: Variant analysis. Once the log is loaded, the mining tool reconstructs every distinct case sequence and ranks by frequency. This is the moment of truth. The variant map tells you whether what you thought was a single workflow is six workflows under one name. High-variant processes (30+) flag immediately as LLM territory. Low-variant, high-frequency processes are clean RPA candidates. Processes with high variant counts but low frequency might be worth decomposing — the top three variants might be automatable even if the long tail is messy.
Step 3: Bottleneck identification. The mining tool shows where cases accumulate wait time: which step holds cases the longest in aggregate hours across the year. This is not the same as per-case cycle time. A step that takes two minutes per case but processes 100,000 cases per year at 60% yield is a massive bottleneck even though individual cases move fast: two minutes across 100,000 cases is already more than 3,300 hours a year, before counting the cases that do not pass on the first attempt. Automate the bottleneck, not the step that is technically interesting. The ROI calculation is straightforward: hours accumulated × hourly cost × fraction automatable.
Step 4: AI suitability overlay. Apply Volume × Variance × Structure to the top twenty bottlenecks by accumulated wait time. That produces your ranked candidate list. Anything above 40 earns a pre-mortem. Anything above 70 earns a funded pilot.
Step 5: Pre-mortem and commit. Before committing to the top three, run a structured pre-mortem against each candidate. The candidates that survive become committed pilots. The ones that do not go back into the pipeline for the next cycle.
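The arithmetic in steps 3 and 4 can be sketched as follows. The candidate names, hours, hourly cost, and automatable fractions are hypothetical placeholders; only the ROI formula and the 40 and 70 thresholds come from the steps above, and the "back to pipeline" label is an assumption for anything at or below 40.

```python
def annual_savings(hours_accumulated: float, hourly_cost: float,
                   fraction_automatable: float) -> float:
    """ROI estimate from step 3: hours accumulated x hourly cost x fraction automatable."""
    if not 0.0 <= fraction_automatable <= 1.0:
        raise ValueError("fraction_automatable must be between 0 and 1")
    return hours_accumulated * hourly_cost * fraction_automatable


def next_step(score: int) -> str:
    """Gate from step 4: above 70 earns a funded pilot, above 40 earns a pre-mortem."""
    if score > 70:
        return "funded pilot"
    if score > 40:
        return "pre-mortem"
    return "back to pipeline"


# Hypothetical candidates: (name, suitability score, hours/year, hourly cost, fraction).
CANDIDATES = [
    ("Claims adjudication", 100, 12_000, 45.0, 0.5),
    ("Supplier onboarding", 48, 4_000, 45.0, 0.4),
    ("Expense validation", 16, 2_500, 45.0, 0.8),
]

results = [
    (name, next_step(score), annual_savings(hours, cost, fraction))
    for name, score, hours, cost, fraction in CANDIDATES
]
for name, step, savings in results:
    print(f"{name:<22} {step:<18} {savings:>10,.0f} per year")
```

Keeping the savings estimate and the suitability gate separate is deliberate: a workflow can have a large savings figure and still score too low to justify a language model.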
Six platforms ranked by fit, cost, and what they actually integrate with — not by analyst-quadrant decoration
The process mining market is dominated by Celonis — roughly 60% market share, top of analyst rankings[1][4]. Market leadership is not the same as fit for your stack. The honest assessment turns on three questions: what are your primary source systems, how much technical capacity do you have for setup, and what is your budget horizon?
The enterprise space consolidated hard between 2019 and 2021. UiPath acquired ProcessGold in 2019. SAP acquired Signavio and IBM acquired myInvenio in 2021, within months of each other. That was a signal: large platform vendors recognized process discovery as a strategic moat for their automation suites[2]. The consequence: choosing a mining tool is now partly a bet on your broader vendor ecosystem.
| Tool | Best fit | Startup cost | SAP connectors | OCEL 2.0 support | When to pick |
|---|---|---|---|---|---|
| Celonis | Large enterprise, SAP-heavy | High (enterprise contract) | Hundreds pre-built | Yes | SAP runs the company; need monitoring + EMS |
| Apromore Community | Mid-market, non-SAP | Free (open-source) | Manual ETL | Partial | Budget constraints; want model control |
| Disco (Fluxicon) | Single-workflow PoC | Low (free trial) | CSV/XES import | No | 90-day sprint, one analyst, no platform budget |
| UiPath Process Mining | UiPath RPA shops | Medium | Limited | Roadmap | Already running UiPath; want integrated discovery |
| IBM Process Mining | IBM / z/OS environments | High | Via IBM stack | Partial | IBM cloud-pak infra; watsonx integration needed |
| SAP Signavio | SAP-native orgs | Medium–High | Native | Roadmap | SAP BTP ecosystem; minimal cross-vendor complexity |
A high suitability score does not guarantee delivery. Run the pre-mortem before committing budget
The AI Suitability Score tells you whether a workflow is technically worth automating. The pre-mortem tells you whether your organization can ship it. Different questions. Conflating them is how teams burn six months building pilots that never deploy.
Five blockers, every time.
Data access: does your team have the permissions and API access for the systems involved, or are you about to spend three months waiting on InfoSec?
Change management: who owns this process today, and are they a sponsor or a blocker?
Audit and regulatory: does the workflow carry compliance constraints that mandate human sign-off at specific steps, and has legal confirmed automation is permissible?
Integration cost: what is the actual API work to wire the upstream source and downstream action systems, and does it fit your pilot budget?
Ownership dispute: does this workflow span multiple departments, and is there a single accountable owner for the automation? If two departments share ownership, the pilot stalls in cross-team approval loops every time.
The 2025 Deloitte survey underscores this: management support is cited as a barrier by 41% of organizations — the single most common obstacle to value realization, ahead of data quality and technical complexity[9]. The pre-mortem is where you surface that barrier before it surfaces during the build.
When the workflow lives on screens instead of in databases, you need a different instrument
Process mining works on system-of-record event logs. A significant share of enterprise work happens on surfaces that emit nothing structured: Excel spreadsheets, email threads, browser-based internal tools, legacy desktop applications. When an analyst exports an ERP table into a spreadsheet, reconciles it against a PDF invoice, pastes the result into a web form, and emails the approver — none of those steps appear in your SAP event log. They are invisible to process mining.
Task mining fills the gap by recording screen activity (keystrokes, mouse movements, application context) and using computer vision and NLP, increasingly LLM-based, to reconstruct the actual task sequence[3]. UiPath Task Mining, Celonis Task Mining, and Microsoft Power Automate Process Mining all offer desktop recording. The output looks similar to process mining: a variant map of how analysts actually execute screen-level tasks, annotated with frequency and duration.
The practical constraint is consent and privacy. Screen recording requires explicit employee consent and disciplined data governance — especially in regulated industries. Setup overhead is higher than connector-based mining. Data quality depends on recording duration and coverage. For knowledge worker workflows with high strategic value but no system-of-record footprint, task mining is the right tool. For everything with clean event logs, stay in traditional process mining.
Workflow runs primarily inside enterprise software (SAP, ServiceNow, Salesforce)
Cases are tracked by a system-assigned ID (PO number, ticket ID, contract number)
You can export a timestamped activity log from the source system
Setup budget is tight and speed matters — CSV export to first variant map in hours
Data governance and employee consent complexity would slow the program
Workflow involves desktop apps, spreadsheets, or email that emit no structured events
The actual task sequence is invisible to system logs (copy-paste, browser forms)
You are targeting knowledge worker processes with high strategic ROI but no database footprint
You need to understand exactly which screen steps consume analyst time
You have employee consent and data governance in place for screen capture
Five ways organizations spend on mining and get no AI candidates out the other end
Process mining surfaces uncomfortable truths about which processes are broken and who owns them. Without a C-suite sponsor explicitly committed to acting on the findings, the output is a beautiful process map nobody uses to make a decision. The Deloitte 2025 survey put management support as the #1 barrier — cited by 41% of respondents[9]. Lock in the sponsor before the tool contract.
Some teams insist on mapping every operational process before selecting any candidates. The result is a six-month enterprise-wide inventory and zero pilots. Mine two source systems, score the top twenty, pick three. The remaining processes are still there next quarter.
Process mining will faithfully show you that cases accumulate for forty-eight hours at a specific approval step. What it will not tell you is that the wait exists because one approver is overloaded and the backlog is a staffing decision, not an automation opportunity. Always ask whether a bottleneck has a human solution before building a technical one.
High variant counts are a signal, not a verdict. Thirty variants might mean twenty-eight of them account for two percent of cases — the high-frequency variants are stable and automatable. Always segment by variant frequency before declaring a process too variable for RPA. The long tail is often noise, not signal.
Mining is most valuable as a continuous intelligence layer, not a one-off discovery exercise. Processes change. Systems are updated. Policies shift. Volume spikes during seasonal periods. Organizations that run one sprint, pick three automations, and cancel the license find their automation portfolio drifting out of sync with how processes actually operate.
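The variant-frequency segmentation mentioned in the failure modes above can be sketched with a simple coverage calculation. The counts are hypothetical and chosen to mirror the thirty-variant example: two dominant variants and a twenty-eight-variant tail that accounts for two percent of cases. The function names are illustrative.

```python
from collections import Counter


def coverage_of_top(variant_counts: Counter, top_n: int) -> float:
    """Share of all cases covered by the top_n most frequent variants."""
    total = sum(variant_counts.values())
    return sum(n for _, n in variant_counts.most_common(top_n)) / total


def variants_needed(variant_counts: Counter, target: float = 0.8) -> int:
    """Smallest number of variants, by frequency, that covers the target share of cases."""
    total = sum(variant_counts.values())
    covered = 0
    for rank, (_, n) in enumerate(variant_counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank
    return len(variant_counts)


# Hypothetical distribution: 30 variants, 7,000 cases, a long tail of 5 cases each.
counts = Counter({"V01": 4000, "V02": 2860})
counts.update({f"V{i:02d}": 5 for i in range(3, 31)})

print(f"variants: {len(counts)}")
print(f"top 2 cover {coverage_of_top(counts, 2):.0%} of cases")
print(f"variants needed for 80% coverage: {variants_needed(counts)}")
```

A process that looks like thirty variants on the map but reaches 98% coverage with two of them is, for automation purposes, a two-variant process with an exception queue.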
A concrete sequence — without boiling the ocean
Pick the two systems of record carrying the highest operational process volume, typically SAP or Oracle for finance and procurement and ServiceNow or Jira for IT and operations. Leave CRM systems for a later cycle; data quality is lower and processes are harder to scope. Stand up the mining tool (Disco trial for a proof-of-concept sprint; Celonis or Apromore for a production deployment), extract six to twelve months of event log data, and render the first process graph.
A focused analysis sprint. Not a full process inventory. The goal is to score the top twenty bottlenecks from the two source systems against the AI Suitability model. One process analyst (or a data-literate ops manager) owns the sprint. Output: a scored table of twenty candidates with notes on the primary blockers for each.
Take the top ten by AI Suitability Score into a structured pre-mortem with the relevant process owners. Most organizations find that two or three of the top ten are technically strong but organizationally blocked — data access, regulatory constraints, ownership disputes. Remove the blocked ones. The survivors become your pilot shortlist.
Commit to three pilots with explicit funding, timelines, and success criteria. Structure them as the first article in this companion series recommends: one low-risk/high-signal, one medium-risk/high-value, one exploratory. Each should test something different about your organization's ability to execute automation. Set a 60-day go/no-go review for each pilot.
The objections that surface in every mining conversation — answered directly
Do we need Celonis specifically, or can we DIY this?
You do not need Celonis. For a proof-of-concept sprint, Disco (Fluxicon) is free for limited datasets and produces variant maps and bottleneck analysis good enough to score your first ten candidates. Apromore's Community Edition is open-source and covers the full mining feature set. Celonis pays back when you are deploying enterprise-wide across hundreds of processes and need pre-built SAP connectors, a monitoring layer, and commercial support. For a 90-day discovery sprint with one analyst, start with a lighter tool. Graduate to a platform vendor after the approach has been validated in your org.
What if our processes do not have clean event logs?
Most processes have better log data than their owners think. The real question is whether you can access it and whether it is in extractable form. SAP stores event data in table structures that Celonis and similar tools already know how to read. ServiceNow has native event logs. Even without a native connector, any table with a case ID, an activity or status field, and a timestamp is enough to build an event log[10]. The blocker is usually InfoSec permissions, not data absence. Start by mapping which of your systems of record have case-level timestamped status tables. You will find more clean log data than you expect.
What is the right minimum event log size for reliable variant analysis?
There is no hard floor, but variant analysis becomes statistically meaningful around 1,000 cases. Below that, variant frequency distributions are noisy — a path that appears in 5% of cases might represent a real edge case or just sampling variance. For workflows under 1,000 annual cases, consider whether the ROI ceiling justifies the analysis at all; a Volume score of 1 rarely produces a composite score worth piloting.
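To see why small logs give noisy variant frequencies, here is a rough sketch of the sampling uncertainty around an observed 5% variant share. It assumes a normal approximation to the binomial, which is a simplification chosen for brevity (a Wilson interval would behave better at small counts), and the function name is hypothetical.

```python
import math


def variant_share_interval(share: float, n_cases: int, z: float = 1.96):
    """Approximate 95% interval for a variant's true share, via the normal approximation."""
    standard_error = math.sqrt(share * (1 - share) / n_cases)
    low = max(0.0, share - z * standard_error)
    high = min(1.0, share + z * standard_error)
    return low, high


intervals = {n: variant_share_interval(0.05, n) for n in (100, 1_000, 10_000)}
for n, (low, high) in intervals.items():
    print(f"{n:>6} cases: a 5% variant is plausibly between {low:.1%} and {high:.1%}")
```

The interval narrows with the square root of the case count, so moving from 100 to 1,000 cases shrinks the uncertainty by roughly a factor of three.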
How do we handle workflows that span systems we do not own?
Cross-system workflows are common and harder to mine cleanly. The standard approach: pick the system that holds the case for the longest portion of the cycle time — that is where bottleneck measurement will be most accurate. For handoff points between systems, you need a shared case identifier that persists across both (an invoice number, a ticket ID, a contract ID). Without a shared key, the cross-system join is unreliable. If you cannot join the logs cleanly, mine each system separately and treat the handoff itself as a candidate bottleneck rather than reconstructing the full cross-system view.
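A minimal sketch of the join check described above, assuming both logs use the same three-field layout and ISO-formatted timestamps in the same timezone (so string ordering matches time ordering). The system names, sample rows, and `join_logs` helper are hypothetical.

```python
# Hypothetical logs from two systems: (case_id, activity, timestamp).
ERP_LOG = [
    ("INV-100", "ERP: Invoice Posted", "2024-05-01T09:00:00"),
    ("INV-100", "ERP: Payment Released", "2024-05-06T12:00:00"),
    ("INV-101", "ERP: Invoice Posted", "2024-05-02T10:00:00"),
]
TICKET_LOG = [
    ("INV-100", "Ticketing: Dispute Opened", "2024-05-02T14:00:00"),
    ("INV-100", "Ticketing: Dispute Closed", "2024-05-05T16:00:00"),
    ("INV-999", "Ticketing: Dispute Opened", "2024-05-03T08:00:00"),
]


def join_logs(primary, secondary):
    """Merge two logs on the shared case ID and report which cases fail to join."""
    primary_ids = {row[0] for row in primary}
    secondary_ids = {row[0] for row in secondary}
    shared = primary_ids & secondary_ids
    merged = sorted(
        (row for row in primary + secondary if row[0] in shared),
        key=lambda row: (row[0], row[2]),  # order by case, then timestamp
    )
    unmatched = {
        "primary_only": sorted(primary_ids - shared),
        "secondary_only": sorted(secondary_ids - shared),
    }
    match_rate = len(shared) / len(primary_ids | secondary_ids)
    return merged, unmatched, match_rate


merged, unmatched, match_rate = join_logs(ERP_LOG, TICKET_LOG)
print(f"match rate: {match_rate:.0%}")
print("unmatched:", unmatched)
for row in merged:
    print(row)
```

A low match rate is the signal to stop trying to reconstruct the cross-system view and mine each system separately instead.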
What is the difference between process mining and task mining?
Process mining works on structured event logs from systems of record — timestamped records of which case hit which state, when. Best for workflows that run primarily inside enterprise software. Task mining works on screen recordings — it captures what users actually do on their desktops, including operations that never touch a system log (copy-paste from email, Excel manipulation, browser forms). Process mining is faster to set up, cheaper to run, produces cleaner data. Task mining is necessary when the workflow lives partly or wholly in desktop applications that emit nothing structured. Most organizations should start with process mining and add task mining only for specific knowledge worker workflows where system log coverage is low.
When should we use object-centric mining instead of case-centric?
Use object-centric process mining when the workflow involves multiple interacting business objects — a purchase order linked to multiple invoices and delivery notes, an order linked to multiple shipments, an HR case linked to multiple provisioning tickets. Case-centric mining forces you to pick one object as the 'case', which distorts bottleneck attribution and undercounts variants for the other objects. The OCEL 2.0 standard is now supported by Celonis and partially by Apromore — check before signing. For simpler, single-object workflows (a ticket, a contract, a single PO line), case-centric mining is faster and produces cleaner output.
How does this connect to picking our first three AI workflows?
Mining is the upstream data layer for the workflow selection framework covered in the companion article on picking your first three AI workflows. That framework (risk × value × signal quality) tells you how to structure your first three pilots once you have candidates. Mining tells you which candidates to put into the framework in the first place. The two approaches run in sequence: mine first to generate a ranked candidate list, then apply the selection framework to decide which three to commit to as pilots. Using the selection framework without mined data means scoring workshop opinions rather than measured process data — which is where most first AI programs go wrong.
The question your organization keeps asking — which workflows should we automate with AI — has a data answer. It is sitting in your SAP tables, your ServiceNow event log, your Salesforce activity history. The workflows worth building share a profile: high volume, high variance, unstructured inputs. Cases where an LLM's ability to read context and handle exceptions outperforms any rule engine you could write. Mining gives you the ranked list. Volume × Variance × Structure gives you the tooling verdict. The pre-mortem tells you which ones you can actually ship. None of this requires a workshop.
Shipping the wrong thing confidently is still shipping the wrong thing. Your event logs have been saying so the whole time.