Ops Metrics: Cycle Time, Throughput, Error Rate, Vendor SLA

Most Ops dashboards are activity reports dressed up as metrics. Tickets touched. Meetings attended. Tasks closed. Slack messages sent. The COO sits through fifteen minutes of bar charts and walks out knowing nothing about whether the function is getting faster, cheaper, or more reliable. They will nod politely and then ask Finance to model what happens if they cut your headcount by two.

That is what an activity report buys you. An expensive seat at a meeting where you are losing the argument before you start.

A COO wants three answers. Is throughput going up? Is cost-per-process going down? Where is the next breakage going to come from? Six metrics, segmented properly and trended over four quarters, give those answers. Anything else is decoration.

Why Activity Metrics Fail Upward

"We closed 1,400 tickets this quarter" tells the room nothing. It does not say whether the work was right, fast, or cheap. It does not say whether 1,400 was more or less than the same effort produced last quarter. It does not separate a refund request from a vendor onboarding from a payroll exception, even though those three things take wildly different amounts of time and money to deliver.

Activity metrics fail upward for one structural reason. They reward motion, not outcome. Once a team is measured on tickets closed, the team gets very good at closing tickets, including the trick of splitting one real problem into four small tickets to inflate the number. The Ops manager who ships that dashboard is not lying. They are just measuring the wrong thing, and the COO will figure that out by the second QBR.

The fix is to swap the verb. Stop counting what the team did. Start measuring what the function delivered.

The Six Metrics That Belong on a QBR Slide

These six survive scrutiny from a numerate COO. They map to throughput, cost, and risk: the three things a finance-literate executive cares about. Each one needs to be segmented by process type, because an average across onboarding, refunds, and vendor POs is a number that describes nothing.

1. Cycle Time per Process

Measured from request-in to delivery-out. Median, not average. Segmented by process type.

Average cycle time is a trap. One stuck vendor PO that took 47 days will drag a fleet of 2-day refunds up to an average that looks bad when the queue is fine. Worse, a single outlier can mask a queue that is quietly degrading underneath it. Median tells you what the typical request actually feels like. P90 tells you what the worst-tolerable case looks like. Both belong on the slide.

Rough ranges to anchor the conversation:

  • Customer refund: median 1-3 business days is healthy. Above 5 days, you have a sign-off bottleneck.
  • Vendor PO from request to issued: median 5-10 business days. Above 15, procurement is the constraint.
  • New employee onboarding (request to fully provisioned): median 2-4 business days. Above a week, IT and HR are not coordinating.
  • Customer escalation from inbound to resolution: median 4-24 hours depending on tier. Above 48 hours, the queue is broken or under-staffed.

The number on the slide is "Refund cycle time, median: 2.1 days, prior quarter: 2.8 days, target: 2.0." Three numbers, one process. Then repeat for the next process. Five processes on a slide is the upper limit before the eye glazes.
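A minimal sketch of the median-plus-P90 calculation, segmented by process. The record layout (process name, cycle days) and the sample data are assumptions made for illustration, not output from any particular ticketing tool.

```python
from collections import defaultdict
from statistics import median, quantiles


def cycle_time_summary(records):
    """Return {process: (median_days, p90_days)} from (process, days) pairs."""
    by_process = defaultdict(list)
    for process, days in records:
        by_process[process].append(days)

    summary = {}
    for process, days in by_process.items():
        # quantiles(n=10) returns nine cut points; the last one is P90.
        # It needs at least two data points, so fall back to the single value.
        if len(days) > 1:
            p90 = quantiles(days, n=10, method="inclusive")[-1]
        else:
            p90 = days[0]
        summary[process] = (median(days), p90)
    return summary


# Hypothetical sample: nine 2-day refunds plus one 47-day vendor PO.
sample = [("refund", 2.0)] * 9 + [("vendor_po", 47.0)]

# The blended average is dragged up by the single stuck PO.
blended_mean = sum(days for _, days in sample) / len(sample)

# The segmented view keeps the refund queue and the PO apart.
per_process = cycle_time_summary(sample)
```

On this sample the blended average is 6.5 days, while the refund median and P90 both stay at 2 days. The stuck PO shows up under its own process instead of distorting the refund number.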

2. Throughput per Quarter

Units of work completed, by process category, trended across the last four quarters.

Month-over-month is noise. A full quarter is long enough to smooth out a holiday week or a vendor freeze, so quarter-over-quarter is the right cadence for a QBR. It is also the cadence finance plans against, which means your throughput trend lines up with their revenue trend, and the two charts talk to each other.

The trap to avoid: do not aggregate. "Total tickets closed: 1,400" is the activity-report number you are trying to leave behind. The version that earns budget is:

Process                  Q1    Q2    Q3    Q4    Trend (Q1 to Q4)
Refunds processed        612   671   740   812   +33%
Vendor POs issued         84    91    88    96   +14%
Onboardings completed     22    28    34    41   +86%

That table tells the COO three different stories. Refunds are scaling with the business, which is fine. Vendor POs are close to flat, up 14% with a dip in Q3, so explain or fix. Onboardings have nearly doubled, and that is the headcount-justification line. It should be at the top.
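The trend figures in the table are the change from the first quarter shown to the last. A short sketch of that arithmetic, using the table's own numbers (the function name is illustrative):

```python
def pct_change(first, last):
    """Percentage change from first to last, rounded to a whole percent."""
    return round((last - first) / first * 100)


# Quarterly throughput by process, Q1 through Q4, as in the table above.
quarterly = {
    "Refunds processed": [612, 671, 740, 812],
    "Vendor POs issued": [84, 91, 88, 96],
    "Onboardings completed": [22, 28, 34, 41],
}

# One trend number per process, never one number for the whole function.
trend = {name: pct_change(q[0], q[-1]) for name, q in quarterly.items()}

for name, change in trend.items():
    print(f"{name}: {change:+d}%")
```

Keeping the calculation per process is the point. Summing the rows first would produce exactly the aggregate number the slide is trying to avoid.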

3. Error Rate / Rework Percentage

Share of completed work that came back for correction. Segmented, trended, and named.

"Came back" needs a definition the team agrees on before the quarter starts. A refund that the customer disputed because the amount was wrong: rework. A vendor PO that came back because the GL code was off: rework. An onboarding ticket reopened because a system access was missed: rework. Tickets reopened for new requests from the same customer are not rework, they are new work, and conflating the two is how an honest team accidentally inflates its error rate.

Rough industry ranges, treat as anchors not gospel:

  • 3-5%: healthy for most ops processes.
  • 5-8%: normal range, watch trends.
  • 8-12%: a process is straining (staffing, training, or tooling is off).
  • >12%: a process is broken. Stop adding volume until you fix it.
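One way to make the rework definition and the bands mechanical is sketched below. The reopen-reason labels and the record shape are hypothetical. Treating anything at or under 5% as healthy, and assigning boundary values to the lower band, are assumptions the ranges above leave open.

```python
# Hypothetical reopen reasons that count as rework (a correction to delivered work).
REWORK_REASONS = {"wrong_amount", "wrong_gl_code", "missed_access"}


def rework_rate(completed):
    """Percent of completed items that came back for a correction."""
    if not completed:
        return 0.0
    rework = sum(1 for item in completed if item.get("reopen_reason") in REWORK_REASONS)
    return rework * 100 / len(completed)


def band(rate_pct):
    """Map a rework percentage to the rough bands listed above."""
    if rate_pct > 12:
        return "broken"
    if rate_pct > 8:
        return "straining"
    if rate_pct > 5:
        return "normal, watch trends"
    return "healthy"


# Hypothetical quarter: 100 completed refunds, 4 reopened for a wrong amount,
# 3 reopened because the same customer raised a new request (not rework).
items = (
    [{"reopen_reason": "wrong_amount"}] * 4
    + [{"reopen_reason": "new_request"}] * 3
    + [{"reopen_reason": None}] * 93
)
rate = rework_rate(items)
status = band(rate)
```

Counting the three new requests as rework would push this sample from 4% to 7% and into a different band, which is the accidental inflation described above.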

Rework rate is the most useful single number on the slide because it is the leading indicator for everything else. Cost-per-process rises when rework rises. Cycle time rises when rework rises. Customer NPS falls when rework rises. If you only get one extra metric past the COO's attention span, make it this one.

4. Vendor SLA Breach Percentage

Percentage of vendor commitments missed in the period, by vendor. Top three offenders surfaced by name.

This is the metric that turns a passive Ops function into a procurement-influencing one. SLA data lives in the contracts you signed and the tickets you opened. Most ops teams never close the loop between the two, which is why vendor renewals catch them by surprise.

The slide should look like:

Vendor      Commitments   Breaches   Breach %
Vendor A    42            1          2.4%
Vendor B    28            6          21.4%
Vendor C    15            4          26.7%

Two things happen when this slide hits the room. First, finance and procurement want the underlying data. Give it to them, and you have just made yourself the source of truth on vendor performance. Second, the next renewal conversation with Vendor B and Vendor C goes very differently. You walk in with breach data, not vibes, and either the price comes down or the contract goes away.

Surface the top three offenders by name. The COO will remember those names at the renewal meeting. That is the entire point.
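A small sketch of the breach calculation and the top-three ranking, using the vendor counts from the table above. The function name and the tuple layout are illustrative assumptions.

```python
def breach_report(vendors, top_n=3):
    """Rank vendors by breach percentage, worst first.

    vendors: iterable of (name, commitments, breaches) tuples.
    """
    rows = []
    for name, commitments, breaches in vendors:
        # Guard against a vendor with no commitments in the period.
        pct = round(breaches / commitments * 100, 1) if commitments else 0.0
        rows.append((name, commitments, breaches, pct))
    return sorted(rows, key=lambda row: row[3], reverse=True)[:top_n]


vendors = [("Vendor A", 42, 1), ("Vendor B", 28, 6), ("Vendor C", 15, 4)]
report = breach_report(vendors)

for name, commitments, breaches, pct in report:
    print(f"{name}: {breaches}/{commitments} breached ({pct}%)")
```

Sorting by percentage rather than raw breach count puts Vendor C ahead of Vendor B, even though B has more breaches in absolute terms.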

5. Automation Coverage

Percentage of in-scope process steps that run without human touch. Track movement, not absolute level.

Absolute automation coverage is a fake metric. A team that processes simple, scriptable refunds will look 80% automated. A team that handles complex contract exceptions will look 15% automated and be doing harder work better. Comparing the two numbers is meaningless.

What matters is the delta. "Last quarter, refund automation was 62%. This quarter, 71%. Next quarter, target 78%." That is a story about a function getting more leverage from the same headcount, which is the headline a COO needs to hear before the budget conversation.

The denominator matters too. Define "in-scope steps" up front and keep the definition stable. Moving the goalposts to make the number look better is a one-quarter trick that finance will catch the second they ask for the underlying step list.
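A sketch of tracking the delta against a fixed denominator. The step identifiers are placeholders, and the 100-step scope is an assumption chosen so the output matches the 62% and 71% figures used above.

```python
def coverage_pct(in_scope, automated):
    """Percent of in-scope steps that run without human touch."""
    if not in_scope:
        raise ValueError("define the in-scope step list first")
    # Only automations on in-scope steps count toward the numerator.
    return round(len(in_scope & automated) * 100 / len(in_scope))


# Hypothetical refund process: 100 in-scope steps, frozen for the year.
IN_SCOPE = frozenset(range(100))

automated_last_quarter = set(range(62))
# Steps 150 and 151 are automated but out of scope, so they are ignored.
automated_this_quarter = set(range(71)) | {150, 151}

last_q = coverage_pct(IN_SCOPE, automated_last_quarter)
this_q = coverage_pct(IN_SCOPE, automated_this_quarter)
delta_points = this_q - last_q
```

Holding the in-scope set constant means the only way to move the number is to automate a step that was already on the list.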

6. Cost-per-Process

Fully loaded cost (labor plus tooling plus vendor) divided by units delivered.

This is the number a COO can compare to revenue per unit, which is the comparison that decides whether your function expands or contracts. If revenue per onboarded customer is $4,200 and cost-per-onboarding is $380, the function has obvious unit economics. If cost-per-onboarding is $1,100, you have a problem you need to surface before finance does.

Three rules for calculating it without lying to yourself:

  1. Include fully-loaded labor, not base salary. A $90K Ops IC costs the company roughly $115-125K once benefits, taxes, and overhead are included. Use the loaded number.
  2. Include tooling at allocation, not total. If your CRM costs $80K/year and ops uses 30% of seats, allocate $24K to ops cost, not $80K.
  3. Include vendor pass-through where it is part of the process. If vendor B charges $40 per onboarding API call and you do 800 onboardings, that is $32K of vendor cost that belongs in the cost-per-process calculation.

Trend cost-per-process quarter over quarter alongside throughput. The pair tells the leverage story. Throughput up 30%, cost-per-process down 12%: the function is scaling. Throughput up 30%, cost-per-process up 8%: you are buying volume with money, not leverage.
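The three rules can be combined into one calculation, sketched below. The tooling and vendor figures come from the rules above. Assigning a single IC at a $120K loaded cost entirely to onboarding is an assumption made only to complete the example.

```python
def cost_per_process(loaded_labor, tooling_total, tooling_share_pct,
                     vendor_unit_cost, units):
    """Fully loaded cost per unit delivered: labor + allocated tooling + vendor."""
    tooling = tooling_total * tooling_share_pct / 100   # rule 2: allocation, not total
    vendor = vendor_unit_cost * units                   # rule 3: vendor pass-through
    return (loaded_labor + tooling + vendor) / units


onboarding_cost = cost_per_process(
    loaded_labor=120_000,    # rule 1: loaded cost, not the $90K base (assumed value)
    tooling_total=80_000,    # CRM cost per year
    tooling_share_pct=30,    # share of seats used by ops
    vendor_unit_cost=40,     # per onboarding API call
    units=800,               # onboardings delivered
)
print(f"Cost per onboarding: ${onboarding_cost:,.0f}")
```

Under these assumptions the result is $220 per onboarding: $120K labor plus $24K tooling plus $32K vendor, divided by 800 units. Rerun the same function each quarter with the same inputs defined the same way, so the trend reflects the function and not the method.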

The "High Throughput, Rising Error Rate" Diagnostic

Here is the pattern that catches teams off guard, and the one a smart Ops IC names out loud before the COO does.

Throughput is climbing. Cycle times are holding. Rework rate is also climbing. The dashboard looks mostly green, but one yellow line is creeping up.

This is the borrowing-from-quality trap. The team is hitting volume by skipping checks, compressing review steps, or pushing under-trained hires onto live work too early. It looks like a win for two quarters. Then the customer-side complaints arrive, the rework backlog explodes, and the cycle time number breaks because every other request is now a redo.

Typical causes:

  • Under-staffed QA or review layer. Volume grew, the checking layer didn't, so the checks got skipped.
  • New-hire ramp. Three new ICs ramping at once will drive rework up before they drive throughput up. The lag is real and predictable.
  • Tooling change. A new CRM, a new ticketing system, a new approval workflow: all of these cause a temporary rework spike that you should be tracking on purpose, not catching by accident.
  • Process compression. Someone removed a step to speed things up. Now the work is faster and worse.

The fix is not to slow down throughput. It is to find which process is generating the rework, segment to that process specifically, and fix the upstream cause. A 15% rework rate across one process category is a fixable problem. A blended 7% rework rate across the whole function hides the broken process inside the healthy ones.
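A sketch of how a blended rate can hide one straining process. The completed and rework counts are invented for illustration, and the 12% cutoff is borrowed from the rough bands given earlier.

```python
def rework_by_process(counts):
    """Rework percentage per process from {process: (completed, rework)}."""
    return {
        process: round(rework * 100 / completed, 1)
        for process, (completed, rework) in counts.items()
    }


def blended_rework(counts):
    """Single rework percentage across every process combined."""
    completed = sum(c for c, _ in counts.values())
    rework = sum(r for _, r in counts.values())
    return round(rework * 100 / completed, 1)


# Hypothetical quarter: (completed, rework) per process.
counts = {
    "refund": (812, 45),
    "vendor_po": (96, 14),
    "onboarding": (41, 2),
}

blended = blended_rework(counts)
segmented = rework_by_process(counts)

# Flag any process above the "broken" threshold.
flagged = [process for process, rate in segmented.items() if rate > 12]
```

Here the blended rate is 6.4%, which reads as normal, while vendor POs sit at 14.6% on their own. Only the segmented view tells the team where to look.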

Name the diagnostic out loud at the QBR. "Throughput is up 14% quarter over quarter, which is good. Rework is also up, from 5.1% to 7.4%, concentrated in vendor PO processing. Two-thirds of that is one new IC ramping. We expect rework to drop back to 5% next quarter as ramp completes. Here is what we are tracking weekly to confirm." That is the narrative of an Ops IC who is paying attention. The COO will trust the next quarter's forecast because of how this quarter was framed.

QBR Slide Pattern: Six Metrics, One Slide

The whole stack belongs on one slide. Forcing it onto one slide is the discipline. If it does not fit, you are over-complicating.

The pattern that works:

Metric                           Current Quarter   Prior Quarter   Target   One-line Commentary
Cycle time (refund, median)      2.1d              2.8d            2.0d     New approval routing live, on-track.
Throughput (total processes)     949               862             920      Ahead of plan, driven by onboarding volume.
Rework rate                      7.4%              5.1%            <6%      Concentrated in vendor POs, ramp-driven, expect Q+1 normalization.
Vendor SLA breach % (Vendor B)   21.4%             18.6%           <10%     Renewal in Q3, pulling breach data into negotiation.
Automation coverage              71%               62%             78%      Two new automations live, third in build.
Cost-per-process                 $312              $341            $300     Down 8.5%, on track for target.

That is the slide. Six rows. Five columns. One sentence each. No bar charts. No "tickets closed" anywhere. The COO reads it in 30 seconds and knows where to push.

The one-line commentary is the part most Ops ICs skip and the part that does the most work. The number tells the COO what happened. The commentary tells them whether you understand why. The second is what gets you the next budget.

Vanity Metrics to Drop

Kill these on sight. They have no place on a slide that goes to a COO.

  • "Tickets closed": meaningless without process segmentation and rework rate. A team can close more tickets while doing worse work.
  • "Average response time" without segmentation: averages hide the long tail. Use median plus P90, by process.
  • "Hours logged": measures effort, not output. The right metric is hours per unit delivered, which is just cost-per-process under a different name.
  • "Slack messages sent" or "meetings held": activity proxy with no defensible link to outcome. If anyone in your org tracks this on a slide, find out who, and quietly have it removed.
  • "Tickets aging" without thresholds: every ticket is aging. The number that matters is how many are aging past their SLA threshold.
  • "NPS" without volume context: a 72 NPS from 8 customers is a 72 NPS from 8 customers. Pair it with response volume or do not show it.

The test for whether a metric belongs on the slide is simple. If the COO asked "so what" three times in a row, would you have an answer each time? Tickets closed fails on the first "so what." Cost-per-process answers it three layers deep.

What Earns the Next Budget

The Ops IC who walks into the QBR with these six numbers — segmented by process, trended over four quarters, with one sentence of honest commentary on each — is the IC who gets the next headcount. The one who shows up with a tickets-closed bar chart loses the argument before they finish their second slide.

The metrics matter. The framing matters more. A COO is not looking for a dashboard. They are looking for an Ops leader who understands what the function delivers, what it costs, and where it will break next. These six numbers, used well, prove all three.

Bring them next quarter. Then the quarter after that. By the third QBR in a row with this stack, you are not the IC presenting metrics anymore. You are the person finance asks before they build the model.
