AI in the Operations Manager Workflow: Where It Helps, Where It Breaks

Last quarter I sat through a vendor demo where the rep clicked one button and a "perfect" SOP for offboarding contractors appeared on screen in about 30 seconds. Before the slide changed, I'd already spotted four things wrong with it. The doc forgot the badge return step. It told you to "deactivate accounts" without naming a system of record. It listed the wrong manager as the approver. And it ended with "if any issues, escalate to IT," which is the kind of advice that gets you a ticket six weeks later asking why a former contractor still had VPN access.

The rep was beaming. The room was nodding. I was making notes about what I'd have to fix if anyone actually used the output.

That demo is the whole problem with AI in Ops, in one slide. The output looks confident. It reads fluently. It hits the major beats a real SOP would hit. And it skips every exception case that makes the SOP worth having. Ops work IS the exception cases. Anyone can write the happy path. The job is what happens when the happy path doesn't apply, and that's exactly where AI gets confidently wrong.

So this isn't another "AI will transform Operations" piece. It's the opposite. After a year of using AI tools in my own week — some that stayed, several that got cut — here's where I think AI actually earns the seat at the desk, where it doesn't, and the 30-day plan I'd give a new Ops Manager who wants to integrate it without producing a mess they'll spend Q2 cleaning up.

Where AI Actually Helps (Use It Tomorrow)

These are the five places I trust AI in my week. Note that "trust" is doing a lot of work in that sentence. I trust AI to start the work, not finish it.

SOP drafting (first draft only)

This is the highest-ROI use I've found. The workflow looks like this: I record a Loom of myself doing the process once, narrating what I'm clicking and why. I take screenshots of the three or four screens that matter. I drop the Loom transcript and the screenshots into Claude or ChatGPT and ask for a draft SOP in our standard format.

The output is roughly 60% usable. I rewrite about 40%, usually the parts where I narrated something out of order, or where the AI inferred a step that I do reflexively but didn't actually mention. But that 60% scaffolding is real time saved. Writing an SOP from a blank page takes me about 90 minutes. Writing one from an AI draft of my own Loom takes about 35.

The trap is using AI to draft an SOP for a process you haven't recorded. Without the Loom, the AI is making things up from the title and a description. Those drafts read fine and are wrong in ways you'll only catch when someone follows them and breaks production. Don't do this. The Loom is non-negotiable.

Vendor contract review summaries

I have about 40 vendor contracts to track. Most of them are 15 to 30 pages. AI is genuinely good at reading one and giving me back a one-pager: renewal date, auto-renew clause, notice window, liability cap, data processing terms, termination conditions, and any unusual clauses worth flagging. I still read the contract before I sign. But for the renewals where I'm asking "what am I locked into and when do I have to act?", AI gets me to an answer in five minutes instead of an hour.

One caveat. AI hallucinates contract terms about 1 in 20 times in my experience. Always cross-check the renewal date and the notice window against the actual document. Those two fields cost real money if they're wrong.
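One way to keep that cross-check honest is to store the two money fields next to a "verified by a human" flag and compute the act-by date from them. The sketch below is illustrative only: the `ContractSummary` fields, the 30-day lead time, and the assumption that the notice window is counted in calendar days are mine, not something a given contract guarantees.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContractSummary:
    """Hypothetical record for one AI-generated contract one-pager."""
    vendor: str
    renewal_date: date       # taken from the AI summary, then checked by hand
    notice_days: int         # notice window, assumed to be calendar days
    verified: bool = False   # True only after a human checked the source document


def notice_deadline(c: ContractSummary) -> date:
    """Last day to send a non-renewal notice."""
    return c.renewal_date - timedelta(days=c.notice_days)


def needs_action(c: ContractSummary, today: date, lead_days: int = 30) -> bool:
    """Flag contracts that are unverified or whose deadline is within lead_days."""
    if not c.verified:
        return True
    return (notice_deadline(c) - today).days <= lead_days


# Example with made-up values.
acme = ContractSummary("Acme", date(2025, 12, 31), notice_days=60, verified=True)
print(notice_deadline(acme))                   # 2025-11-01
print(needs_action(acme, date(2025, 10, 15)))  # True
```

Treating every unverified summary as "needs action" is deliberate: a hallucinated renewal date should never look safe just because it is far away.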

Intake auto-routing

We get inbound requests through a single Ops queue, and they need to get routed to IT, Facilities, People Ops, or kept in Ops. I trained a simple classifier on six months of historical tickets, using keywords plus the team that ended up handling each one. It now auto-tags new tickets with a suggested route.

It routes about 70% correctly. I handle the other 30%, and most of those are tickets that genuinely could go to two teams. The 70% is pure time saved. The trick is treating the AI tag as a suggestion, not a routing decision. The ticket still lands in my queue first. I just glance at the suggested tag and confirm or change it. Two seconds per ticket instead of ten.
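For a sense of how little machinery "a simple classifier" can mean, here is a minimal keyword-count sketch. It is not the classifier described above, just one plausible shape for it: the function names, the fallback to "Ops", and the sample tickets are all hypothetical, and a real version would at least drop common filler words so busy teams don't win by volume.

```python
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(history: list[tuple[str, str]]) -> dict[str, Counter]:
    """Count, per team, how many past tickets contained each word.

    history holds (ticket_text, team_that_handled_it) pairs.
    """
    counts: dict[str, Counter] = defaultdict(Counter)
    for text, team in history:
        counts[team].update(set(tokenize(text)))
    return dict(counts)


def suggest_route(text: str, counts: dict[str, Counter], default: str = "Ops") -> str:
    """Return a suggested team. This is a tag only; a human confirms or changes it."""
    words = set(tokenize(text))
    scores = {team: sum(c[w] for w in words) for team, c in counts.items()}
    best = max(scores, key=scores.get, default=default)
    # No keyword overlap at all: keep the ticket in the default queue.
    return best if scores.get(best, 0) > 0 else default


# Example with made-up tickets.
history = [
    ("laptop cannot connect to VPN", "IT"),
    ("badge reader broken at front door", "Facilities"),
    ("new hire paperwork question", "People Ops"),
]
model = train(history)
print(suggest_route("VPN access not working", model))  # IT
print(suggest_route("hello", model))                   # Ops
```

The function returns a suggestion and nothing else on purpose. Routing stays a human click, which is what keeps a wrong tag cheap.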

Anomaly detection in process metrics

Dashboards are good at showing you what's happening. They're bad at telling you when something has changed. I've started running our weekly process metrics through a simple anomaly check (ticket volume, cycle time, error rate, vendor SLA hits) and asking the AI to flag anything that's more than two standard deviations from the trailing four-week average.

It catches things I'd miss for a week or two. Last month it flagged that our average cycle time on procurement tickets had doubled. Turned out one approver had gone on leave and nobody had reassigned their queue. Found it Tuesday instead of the following Monday's review.
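The two-standard-deviation rule is simple enough to run without an AI tool at all, which is also a handy way to sanity-check whatever the AI flags. A rough sketch using only Python's standard library follows; the metric names and numbers are invented, and with only four weeks in the baseline the standard deviation is a coarse estimate, so treat a flag as a prompt to look, not a verdict.

```python
from statistics import mean, stdev


def flag_anomalies(
    weekly: dict[str, list[float]],
    window: int = 4,
    threshold: float = 2.0,
) -> list[str]:
    """Return metrics whose latest value is more than `threshold` sample
    standard deviations from the mean of the preceding `window` weeks."""
    flagged = []
    for name, values in weekly.items():
        if len(values) < window + 1:
            continue  # not enough history to form a baseline
        baseline = values[-(window + 1):-1]  # the trailing weeks, excluding the latest
        latest = values[-1]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Perfectly flat baseline: any change at all is worth a look.
            if latest != mu:
                flagged.append(name)
            continue
        if abs(latest - mu) / sigma > threshold:
            flagged.append(name)
    return flagged


# Example with made-up weekly values, oldest first.
metrics = {
    "procurement_cycle_days": [3.0, 3.2, 2.9, 3.1, 6.2],
    "ticket_volume": [120, 118, 125, 122, 121],
}
print(flag_anomalies(metrics))  # ['procurement_cycle_days']
```

Excluding the latest week from its own baseline matters. If the doubled cycle time were averaged into the mean it is compared against, the spike would partly hide itself.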

Meeting notes synthesis

Cross-functional syncs produce a specific kind of waste: the "wait, what did we actually decide?" Slack thread that fires up 90 minutes later. AI meeting notes (Otter, Fireflies, or just dropping a transcript into Claude with a prompt for "decisions, owners, due dates") kill that thread. I get a summary in my inbox 10 minutes after the meeting ends. I edit it, post it in the channel, done.

Two rules. First, never trust the action item owners without confirming. AI assigns owners based on who spoke last about the topic, which is wrong about a third of the time. Second, AI summaries are a backup, not a substitute for someone in the room actually paying attention. If you're using AI notes to skip the meeting mentally, you're not getting the real value of the meeting in the first place.

Where AI Breaks (Do Not Delegate)

These are the four places where AI output costs more than it saves. I've tried each of them. Each one taught me a specific lesson about why Ops judgment doesn't compress into a prompt.

Judgment calls. "Should we escalate this to legal?" is not a procedural question. It depends on who the customer is, how exposed we are, what the GC is dealing with this week, and whether the situation is novel or a variant of one we've handled before. The AI will give you a confident answer. The answer will be reasonable on the surface. It will also be wrong about a quarter of the time, in ways that will hurt.

Escalation timing. Knowing when to ping the VP versus wait 24 hours is political, not procedural. It depends on whether the VP is in a board prep cycle, whether you've burned escalation credit recently, whether the issue is genuinely time-sensitive or just feels that way to the requester. AI cannot read the room. It will tell you to escalate now because the policy says so. The policy is wrong about half the time.

Vendor negotiation. AI can give you the script. It can summarize comparable pricing. It can draft the email. The leverage is yours. The relationship is yours. The willingness to walk away is yours. I tried using AI to negotiate a vendor renewal once, in the sense of letting it suggest the counter-offer. The counter it suggested was technically defensible and emotionally inert. The vendor agreed instantly, because it was a number they would have agreed to in five seconds. I left money on the table. Never again.

Cross-functional politics. Sales blames Ops for slow lead routing. Ops blames Finance for the contract delay. Finance blames Legal for the redlines. AI cannot navigate that. It will summarize the situation for you, which is occasionally useful. It will not tell you that the actual blocker is a 1:1 between two VPs that hasn't happened in three weeks, because the AI doesn't know the org chart that lives between the org chart.

The "AI-Written SOP" Trap

I want to spend an extra paragraph on this because it's the single biggest mistake I see new Ops Managers make with AI.

AI-drafted SOPs read clean. The grammar is good. The numbered steps look professional. The tone is appropriately neutral. They will pass a casual review. They will fail in production, because every step describes the happy path, and Ops is 60% exception handling. The AI describes how to onboard a new vendor; reality is what to do when the vendor's W-9 is missing, when their banking details fail validation, when their contact person quits two weeks in.

The rule I follow now: never publish an AI-drafted SOP without someone running through it once with the screen recording on. If the SOP says "click Approve in the procurement system," the screen recording catches that there are actually three Approve buttons on that screen and only one is right. The AI can't see the screen. You have to.

If you only take one thing from this article: AI generates the scaffolding, a human runs the script live, and the script gets fixed. Skip that step and you're shipping fiction.

What I Actually Use

People expect me to have a 14-tool AI stack. I don't.

  • Claude or ChatGPT for SOP first drafts and contract summaries.
  • Loom for the source recording that the SOP draft is built from.
  • A screenshot tool for inline visuals in the SOP.
  • The same AI as a rubber duck ("did I miss a step in this checklist?") when I'm too close to a process to see what's missing.

That's the whole stack. No autonomous agents. No AI-driven workflow automation. No "AI Operations Copilot" that promised to run my queue while I slept. I tried two of those. Both produced output I had to undo. Both got cut.

The Optional ACE Framework Lens

If you're tracking how AI shows up across operational work in general, the ACE Framework is a useful map. Five capabilities: Ingest, Analyze, Predict, Generate, Execute.

  • Ingest: intake classification, ticket tagging, contract data extraction. AI is solid here today.
  • Analyze: anomaly detection in process metrics, dashboard summarization. AI is solid here when the metrics are clean.
  • Predict: cycle-time forecasting, capacity planning. AI is mediocre here. Vendors over-promise.
  • Generate: SOP drafts, meeting summaries, vendor email drafts. AI is solid here as a first draft, never as a final.
  • Execute: auto-routing, auto-approving, auto-anything. AI is the most over-promised here. Most "Execute" features are actually Ingest dressed up to look autonomous.

Most Ops AI value today lives in Ingest and Generate. Treat anything labeled "Predict" or "Execute" with extra skepticism, especially in vendor demos.

The 30-Day Plan

If you're a new Ops Manager and you want to integrate AI into your week without producing a mess, this is the path. One workflow per week. Measure honestly. Cut what doesn't work.

| Week | Workflow | What to do | Success metric |
|------|----------|------------|----------------|
| 1 | SOP drafting | Pick one process you own. Record a Loom. Draft the SOP with Claude or ChatGPT. Track time-to-publish vs. your previous drafts. | Time saved net of cleanup. If the AI draft costs you more time to fix than it saves, kill it. |
| 2 | Intake classification | Add an AI tag suggestion on a low-stakes queue (not the executive queue). Don't auto-route yet. Just measure how often the suggested tag is correct. | Tag accuracy ≥ 70%. If lower, the model needs more training data or a smaller scope. |
| 3 | Meeting notes synthesis | Run AI notes on three cross-functional meetings. Compare each AI summary to your own handwritten notes. Note what the AI missed. | Decisions captured: 100%. Owners assigned correctly: ≥ 90% after your edit. |
| 4 | Audit | Review the three workflows. What actually saved time? What created cleanup work? What did your team think? Document the keep/cut list and circulate it. | Honest keep/cut list. No tool stays just because it's new. |

Two rules for the audit. First, count cleanup time as time spent. An AI draft that takes you 35 minutes to fix didn't save you the 90 minutes of writing from scratch; it saved you 90 minus the 35 of cleanup, about 55 minutes. Second, if a teammate doesn't trust an AI workflow, that mistrust is data. Don't override it with enthusiasm. Find out what they saw that you missed.
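The audit arithmetic fits in a few lines, which makes it harder to fudge. The sketch below is one hypothetical way to score the week 1 and week 2 metrics: the function names and the strict "keep only if net savings are positive" rule are my assumptions, and the 90 and 35 minute figures are the SOP numbers from earlier in this piece.

```python
def net_minutes_saved(baseline: float, setup: float, cleanup: float) -> float:
    """Minutes saved versus doing the task by hand.

    baseline: time to do the task with no AI
    setup:    time spent prompting or preparing inputs
    cleanup:  time spent fixing the AI output
    A negative result means the AI cost time.
    """
    return baseline - (setup + cleanup)


def tag_accuracy(suggested: list[str], actual: list[str]) -> float:
    """Share of tickets where the suggested tag matched the final route."""
    if len(suggested) != len(actual):
        raise ValueError("suggested and actual must have the same length")
    if not suggested:
        return 0.0
    hits = sum(s == a for s, a in zip(suggested, actual))
    return hits / len(suggested)


def verdict(net_saved: float) -> str:
    """Keep a workflow only if it saves time after cleanup is counted."""
    return "keep" if net_saved > 0 else "cut"


# SOP example: 90 minutes from a blank page, 35 minutes to fix the AI draft.
saved = net_minutes_saved(baseline=90, setup=0, cleanup=35)
print(saved, verdict(saved))  # 55 keep

# Made-up week 2 sample: three of four suggested tags were right.
print(tag_accuracy(["IT", "Ops", "IT", "Facilities"],
                   ["IT", "Ops", "People Ops", "Facilities"]))  # 0.75
```

If recording the Loom or writing the prompt takes real time, put it in `setup` rather than leaving it at zero. That is the same rule as counting cleanup: time spent is time spent.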

Closing

AI is a junior teammate who never sleeps and is sometimes wrong with great confidence. Treat it like one. Give it scoped tasks. Review its output. Don't let it ship to customers, vendors, or executives unsupervised. Use it for the parts of your job that are scaffolding, not judgment. The judgment is still yours, and that's the part that makes you an Operations Manager rather than someone who copy-edits AI output for a living.

The job hasn't changed. The blank page has gotten a little less blank. Everything that mattered before still matters.
