Measuring What Matters: The 7 Automation Metrics That Actually Predict Success
Ask most teams how their automation is performing and you'll get one of two answers: "It's saving us time" (vague) or "We track 47 KPIs in a dashboard nobody reads" (useless).
Both are wrong. The first tells you nothing actionable. The second buries signal under noise.
After deploying dozens of automation projects, we noticed a pattern: the teams that succeed long-term track 7 specific metrics. Not 47. Not 3. Seven. And they're not the metrics most people assume.
Why Most Automation Metrics Fail
The typical automation dashboard tracks things like "number of tasks automated" or "total runs completed." These are activity metrics — they tell you the automation is running, not whether it's working well.
It's like measuring a restaurant's success by counting how many plates leave the kitchen. High volume means nothing if the food is wrong, cold, or going to the wrong table.
The metrics that actually predict success measure three things:
- Quality — Is the automation producing correct results?
- Efficiency — Is it handling work without human babysitting?
- Value — Is it delivering the business outcomes you expected?
Activity metrics (runs completed, tasks processed) are inputs. Quality, efficiency, and value are outputs. You need both, but most teams only track inputs.
The 7 Metrics That Actually Matter
Straight-Through Processing Rate (STP)
The percentage of items that complete the entire automated workflow without any human intervention. This is the single most important automation metric.
Why it matters: Every item that requires human intervention represents a failure point — either the automation can't handle it, or an edge case wasn't accounted for. A declining STP rate is the earliest warning sign that something is breaking.
Thresholds by type:
- Simple data transfers: 98%+
- Rule-based workflows: 85-95%
- AI-powered classification: 75-90%
- Multi-system orchestrations: 70-85%
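As a concrete sketch of the arithmetic, here is how STP rate could be computed from run logs and checked against the thresholds above. The log field name and threshold table are illustrative assumptions, not any specific tool's schema.

```python
# Minimum healthy STP rate by automation type (from the thresholds above).
# NOTE: field and type names are illustrative assumptions.
STP_THRESHOLDS = {
    "data_transfer": 0.98,
    "rule_based": 0.85,
    "ai_classification": 0.75,
    "orchestration": 0.70,
}

def stp_rate(runs):
    """Fraction of runs that completed with no human intervention."""
    if not runs:
        return 0.0
    clean = sum(1 for r in runs if r["completed_without_intervention"])
    return clean / len(runs)

def stp_healthy(runs, automation_type):
    """True if the measured STP rate clears the floor for this type."""
    return stp_rate(runs) >= STP_THRESHOLDS[automation_type]
```

For example, 93 clean runs out of 100 gives an STP rate of 0.93: healthy for a rule-based workflow, but below the 98% floor for a simple data transfer.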
Error Escape Rate
The percentage of errors that make it through to customers, downstream systems, or final outputs without being caught. Different from error rate — this measures errors that escape your safety nets.
Why it matters: An automation with a 5% error rate but 0.1% escape rate is far healthier than one with a 2% error rate but 1.5% escape rate. The first catches its mistakes; the second lets them through. Escape rate measures the quality of your error handling, not just the quality of your processing.
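The distinction between error rate and escape rate can be shown with a few lines of arithmetic. This is a sketch with assumed field names; the two example batches mirror the 5%/0.1% and 2%/1.5% automations described above.

```python
def error_rate(items):
    """Fraction of processed items that had any error at all."""
    return sum(1 for i in items if i["had_error"]) / len(items)

def escape_rate(items):
    """Fraction of items whose error slipped past every safety net."""
    return sum(1 for i in items if i["had_error"] and not i["error_caught"]) / len(items)

def make_items(total, errors, escaped):
    """Build a synthetic batch: `errors` items erred, `escaped` of them uncaught."""
    items = [{"had_error": False, "error_caught": False} for _ in range(total - errors)]
    items += [{"had_error": True, "error_caught": i >= escaped} for i in range(errors)]
    return items

# The two automations from the text: A catches almost everything it gets wrong.
a = make_items(1000, errors=50, escaped=1)   # 5% error rate, 0.1% escape rate
b = make_items(1000, errors=20, escaped=15)  # 2% error rate, 1.5% escape rate
```

Automation A looks worse on raw error rate but is the healthier system, because its safety nets stop 49 of its 50 mistakes.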
Time-to-Value Ratio
Cumulative net savings divided by total cost (implementation + maintenance + overhead), measured over time. This tells you whether your automation is creating compounding value or just treading water.
Why it matters: Most teams celebrate the initial ROI calculation and never measure again. But automations that break even in month 4 can still end up net negative by month 18 if maintenance costs creep up, edge cases accumulate, or the business process changes underneath them.
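The ratio is simple to compute month over month. Below is a minimal sketch with illustrative numbers; a ratio above 1.0 means cumulative savings have overtaken cumulative cost, and a flat or falling ratio signals the "treading water" case.

```python
def time_to_value(monthly_savings, implementation_cost, monthly_maintenance):
    """Yield (month, cumulative_savings / cumulative_cost) for each month.

    Costs are implementation plus a flat maintenance charge per month;
    the numbers fed in are assumptions for illustration.
    """
    savings = 0.0
    for month, saved in enumerate(monthly_savings, start=1):
        savings += saved
        cost = implementation_cost + monthly_maintenance * month
        yield month, savings / cost
```

With a $10,000 build, $500/month maintenance, and $3,000/month savings, the ratio crosses 1.0 in month 4 and keeps climbing, which is the compounding-value pattern you want to see.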
Track this with our Automation ROI Tracker — it calculates time-to-value automatically from your pre/post metrics.
Human Intervention Frequency
How often someone has to step in to fix, restart, review, or override the automation. Measured as interventions per 100 runs (or per day/week for continuous automations).
Why it matters: This is different from STP rate. STP measures items that complete without help. HIF measures how much human time the automation still consumes. An automation might have 90% STP but require someone to manually restart it twice a day — that's an HIF problem that STP doesn't capture.
⚠️ The "Babysitter Trap"
If someone checks the automation every morning "just in case," that counts as intervention — even if they never find a problem. Routine monitoring should be automated too. Human eyes should only be needed for genuine exceptions.
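HIF is just a normalized count, but the babysitter-trap rule changes what you count. The sketch below treats routine "just in case" checks as interventions; the event names are assumptions for illustration.

```python
# Per the babysitter-trap rule, a routine morning check counts as an
# intervention even when it finds nothing. Event names are illustrative.
INTERVENTION_EVENTS = {"fix", "restart", "review", "override", "routine_check"}

def hif(events, runs):
    """Human intervention frequency: interventions per 100 runs."""
    interventions = sum(1 for e in events if e in INTERVENTION_EVENTS)
    return interventions * 100 / runs
```

Two manual restarts plus one routine check across 300 runs gives an HIF of 1.0 per 100 runs, even if the STP rate for those same runs looks excellent.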
Adoption Rate
The percentage of eligible work that actually flows through the automation, rather than being handled manually through old processes.
Why it matters: You can build the most technically perfect automation in the world, and it's worthless if people route around it. Low adoption is the silent killer of automation ROI — your dashboard shows "100% uptime" while half the team still does the work by hand.
Low adoption usually signals a change management problem, not a technical one. See our change management playbook for strategies to get teams to actually use the automation.
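Adoption is a ratio of two volumes, one from the automation's logs and one from the source system. The sketch below also flags the "silent killer" pattern described above: healthy uptime paired with low adoption. The 80% adoption floor is an illustrative assumption, not a universal benchmark.

```python
def adoption_rate(automated_volume, total_eligible_volume):
    """Share of eligible work that actually went through the automation."""
    if total_eligible_volume == 0:
        return 0.0
    return automated_volume / total_eligible_volume

def silent_killer(uptime, adoption, adoption_floor=0.8):
    """The dashboard-looks-fine failure mode: green uptime, weak adoption.

    The 0.8 floor is an assumed example threshold.
    """
    return uptime >= 0.99 and adoption < adoption_floor
```

An automation handling 500 of 1,000 eligible items has 50% adoption: exactly the "half the team still does the work by hand" case, no matter what uptime says.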
Recovery Time (MTTR)
Mean time to recover when the automation fails, breaks, or produces incorrect results. Measured from "problem detected" to "automation running correctly again."
Why it matters: Every automation will fail eventually — APIs change, data formats shift, edge cases appear. The question isn't "will it fail?" but "how fast can you fix it?" Recovery time trending upward means technical debt is accumulating and your documentation isn't keeping up.
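MTTR is a mean over incident durations, measured detection-to-resolution as defined above. A minimal sketch, assuming incidents are recorded with `detected` and `resolved` timestamps (field names are illustrative):

```python
from datetime import datetime

def mttr_hours(incidents):
    """Mean hours from 'problem detected' to 'running correctly again'."""
    durations = [
        (i["resolved"] - i["detected"]).total_seconds() / 3600
        for i in incidents
    ]
    return sum(durations) / len(durations)
```

Track this over rolling windows: a two-hour fix followed by a four-hour fix averages to three hours, and if next quarter's average is five, that upward trend is the technical-debt warning described above.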
Capacity Utilization
How much of the automation's potential throughput you're actually using. An automation that can handle 1,000 invoices/day but processes 200 is at 20% utilization.
Why it matters: Under-utilization means you're paying for more capacity than you need (common with over-engineered solutions). Over-utilization means you're approaching bottlenecks that will cause failures under load. Both require different responses.
If utilization is consistently below 30%, the automation may be over-engineered for the workload. If it's above 80%, start planning capacity upgrades before you hit failures. Use the Automation Health Monitor to check this alongside your other operational metrics.
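The 30%/80% bands map directly to code. This sketch assumes the capacity benchmark is already known (for example, from load testing); the advice strings paraphrase the guidance above.

```python
def utilization(processed_per_day, capacity_per_day):
    """Fraction of potential throughput actually used."""
    return processed_per_day / capacity_per_day

def utilization_advice(u):
    """Map a utilization figure to the response suggested in the text."""
    if u < 0.30:
        return "over-engineered: consider right-sizing"
    if u > 0.80:
        return "plan capacity upgrades before failures"
    return "healthy"
```

The invoice example above (200 processed against a 1,000/day capacity) lands at 20% and falls in the over-engineered band.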
The Metrics That Don't Matter (As Much As You Think)
Here's what teams commonly track that looks useful but tells you almost nothing:
Vanity Metrics to Stop Obsessing Over
- Total runs completed and tasks processed (activity, not outcomes)
- Number of tasks automated (says nothing about quality or value)
- Uptime percentage (the automation can be "up" while everyone routes around it)
These aren't bad metrics — they're just secondary. Track them for operational awareness, but don't use them to judge whether your automation is succeeding.
Building Your Measurement Framework
Step 1: Establish baselines before automation
You can't measure improvement without knowing where you started. Before launching any automation, capture:
- Current processing time (end-to-end, not just the "main" step)
- Error rate (with a clear definition of what counts as an error)
- Volume handled per day/week
- Number of people involved in the process
- Cost per unit of work (labor + tools + overhead)
Our pre-project checklist includes baseline measurement as one of the 30 items to complete before starting any automation project.
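One way to make baseline capture concrete is a small record type that derives cost per unit as (labor + tools + overhead) divided by volume. The field names here are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Baseline:
    """Pre-automation baseline for one process. Field names are illustrative."""
    processing_minutes: float   # end-to-end, not just the "main" step
    error_rate: float           # with an agreed definition of "error"
    weekly_volume: int
    people_involved: int
    weekly_labor_cost: float
    weekly_tool_cost: float
    weekly_overhead: float

    @property
    def cost_per_unit(self):
        total = self.weekly_labor_cost + self.weekly_tool_cost + self.weekly_overhead
        return total / self.weekly_volume
```

Capturing this once, before launch, is what makes every later "improvement" claim measurable instead of anecdotal.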
Step 2: Set up automated collection
If measuring a metric requires someone to manually check a dashboard or run a report, it won't get measured consistently. Automate the measurement itself:
- STP Rate: Log every run with a "completed_without_intervention" flag
- Error Escape: Downstream systems report mismatches back; count them
- Time-to-Value: Monthly automated calculation from cost tracking
- HIF: Every manual override or restart triggers a log entry
- Adoption: Compare automation throughput vs. total work volume (from source system)
- MTTR: Incident tracking from "alert fired" to "resolution confirmed"
- Utilization: Daily max throughput sampling divided by capacity benchmark
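A single append-only run log can feed several of the collection points above. This is a sketch of the STP item; the helper and field names are illustrative, not a real tool's API.

```python
# One log entry per run; manual overrides or restarts set intervention_type.
# Helper and field names are illustrative assumptions.
run_log = []

def log_run(run_id, completed_without_intervention, intervention_type=None):
    run_log.append({
        "run_id": run_id,
        "completed_without_intervention": completed_without_intervention,
        "intervention_type": intervention_type,
    })

def collected_stp():
    """STP rate computed straight from the log, no manual report needed."""
    return sum(1 for r in run_log if r["completed_without_intervention"]) / len(run_log)
```

Because the automation writes the log itself, the metric exists without anyone remembering to run a report, which is the whole point of this step.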
Step 3: Set review cadence
| Frequency | What to Review | Who | Action |
|---|---|---|---|
| Real-time | Error escape rate, STP rate | Automated alerts | Auto-notify on threshold breach |
| Daily | HIF, queue depth, processing time | Tech maintainer | Investigate anomalies within 4 hrs |
| Weekly | Adoption rate, MTTR trend | Automation owner | Adjust training or communication |
| Monthly | Time-to-value, capacity utilization | Business owner | Report to stakeholders, adjust budget |
| Quarterly | All 7 metrics + trend analysis | Full governance team | Strategic review, roadmap adjustment |
For more on who owns what in this review process, see our automation governance framework.
Reading the Warning Signs
Individual metrics tell you something. Metric combinations tell you much more. Here are the patterns that predict trouble:
🚨 Dangerous Metric Combinations
- High STP rate + high HIF: items complete on their own, but someone still restarts or checks the automation daily. The babysitting cost is hidden.
- High uptime + low adoption: the dashboard shows green while half the team does the work by hand.
- Low error rate + rising escape rate: the automation makes few mistakes, but the ones it makes reach customers.
The One-Page Metrics Dashboard
Here's exactly what a good automation metrics dashboard looks like. One page. Seven numbers. Three colors.
Automation Health Dashboard — Template
For each automation, track these 7 metrics with current value, trend arrow (↑↓→), and status color (🟢🟡🔴).
Try our interactive Automation Documentation Generator to create a runbook that includes a metrics tracking section — fill in your automation details and get a formatted, ready-to-use document you can share with your team.
Common Measurement Mistakes
⚠️ Mistake #1: Measuring too late
If you don't capture baselines before automation, you're guessing at improvement. "It feels faster" isn't a metric. Measure first, then automate.
⚠️ Mistake #2: Measuring too much
A dashboard with 30 metrics is a dashboard nobody reads. Start with STP rate and error escape rate. Add others only when those two are stable and you need deeper insight.
⚠️ Mistake #3: Measuring without acting
Every metric needs an owner and a response plan. If error escape rate hits 2%, who gets notified? What's the next step? A metric without a response plan is just a number on a screen.
⚠️ Mistake #4: Comparing incomparable automations
A data transfer automation and an AI classification system have completely different healthy thresholds. Don't use a single benchmark for all automations — match thresholds to automation type and complexity.
⚠️ Mistake #5: Ignoring trend direction
A metric at 82% that was 90% last month is more concerning than a metric at 78% that was 72% last month. Trend matters more than absolute value. Always look at the arrow, not just the number.
Metrics by Automation Type
Different automations need different emphasis. Here's which metrics matter most for each type:
| Automation Type | Primary Metrics | Secondary Metrics |
|---|---|---|
| Data transfers | STP rate, Error escape | Utilization, Recovery time |
| Customer-facing (support, onboarding) | Error escape, Adoption, Recovery time | STP rate, Time-to-value |
| Internal workflows (approvals, reports) | Adoption, HIF, Time-to-value | STP rate, Utilization |
| AI/ML-powered | Error escape, STP rate, HIF | Adoption, Recovery time |
| Multi-system orchestrations | Recovery time, STP rate, Utilization | Error escape, Time-to-value |
5-Minute Metrics Quick Start
Don't try to track all 7 metrics on day one. Here's the progression:
- Week 1: Set up STP rate tracking. This single metric tells you more about automation health than anything else.
- Week 2: Add error escape rate. Now you know quality (STP) and safety (escapes).
- Month 1: Add adoption rate and HIF. Now you know whether people are using it and how much babysitting it needs.
- Month 2: Add time-to-value. Now you know whether the business impact is real.
- Month 3: Add recovery time and utilization. Full picture.
Use our Automation Health Monitor to input your current metrics and get an instant health assessment with specific recommendations.
Measurement Checklist
✓ Automation Metrics Readiness Checklist
Before Launch
- Baselines captured: processing time, error rate, volume, people involved, cost per unit
- Every metric has an owner and a response plan
First 30 Days
- STP rate tracked daily from run logs
- Error escape rate added once STP is stable
Ongoing (Monthly)
- Time-to-value recalculated from cost tracking
- Capacity utilization checked against the 30%/80% bands
- Adoption rate and HIF reviewed for drift
Quarterly Review
- All 7 metrics reviewed with trend analysis by the governance team
- Thresholds re-matched to automation type and current workload
For the detailed governance framework that wraps around these metrics — including who owns each review and how to escalate — see the automation governance guide.
If you're building documentation for your automation (and you should be), our documentation guide covers how to include metrics tracking in your runbooks.
To assess how your overall automation stack is performing, the ROI Tracker calculates time-to-value and projections automatically from your numbers.
And if you're earlier in the journey — still figuring out which processes to automate — the maturity ladder helps you understand where your organization stands and what metrics are most important at your current level.
Need to plan your next automation project with metrics built in from the start? Our timeline estimator includes measurement setup as part of the project phases.
The pre-project checklist makes sure you don't skip baseline capture — one of the most common measurement mistakes.
Metrics aren't about proving your automation works. They're about knowing — actually knowing — whether it's getting better or getting worse, so you can act before a small drift becomes a big problem.
Start with STP rate. Just that one number. Track it daily for two weeks. You'll learn more about your automation's health from that single metric than from any 47-KPI dashboard.
That's the difference between measuring everything and measuring what matters.
Want automation that comes with measurement built in?
Every Moshi engagement includes metric dashboards, alert setup, and review cadence as standard deliverables — not afterthoughts.
Get a Custom Proposal →