
Governance Requirements by AI Pattern

"Humans should review AI outputs before acting on them." If your AI governance policy contains that sentence, it contains nothing. That sentence describes everything and governs nothing.

Governance tied to specific patterns is a different thing entirely. "For Autonomous Agent deployments in customer-facing contexts, require human approval before any Execute step that changes a financial record or sends external communication" is actionable. You can audit against it. You can train someone on it. You can show it to a regulator and explain what it means in practice.

Most AI governance frameworks are written at the wrong level of abstraction because they were designed to span the entire AI surface area of an organization. That breadth forces vagueness. This article goes the other direction: specific requirements for each of the 10 business AI patterns, built on four governance dimensions that apply consistently. The generate vs. execute boundary is the single most important concept to internalize before reading these requirements.

Why governance is pattern-specific

Governance requirements follow risk. And risk in AI systems comes almost entirely from two sources: what the Execute capability does, and what domain it operates in. The NIST AI Risk Management Framework (AI RMF 1.0) organizes risk management into four functions: GOVERN, MAP, MEASURE, and MANAGE. This article implements the MAP and MEASURE functions at the pattern level: it makes the AI risk surface specific, auditable, and operational rather than theoretical.

A RAG Assistant that reads policy documents and answers employee questions has low governance needs. The worst realistic outcome is a confidently wrong answer about benefits eligibility. Annoying. Correctable. Not a liability event.

An Autonomous Agent that sends emails to clients, updates financial records in your ERP, and schedules meetings on behalf of your CEO has entirely different risk. The worst realistic outcome is an irreversible action taken at scale based on a hallucinated premise. That's a liability event.

The risk gradient across patterns maps almost perfectly to the Execute intensity of each pattern. Patterns that sit at Analyze or Generate carry limited governance burden. Patterns that Execute repeatedly, autonomously, and at scale carry substantial burden. See the risk gradient across AI patterns for the full framework.

Key Facts: Enterprise AI Governance Gaps

  • 83% of organizations already use AI tools, but only 25% have implemented strong governance frameworks. (Compliance Week, 2026)
  • The EU AI Act reaches full enforcement August 2, 2026, with fines up to 35 million euros or 7% of global revenue for prohibited AI practice violations. High-risk AI system violations carry fines up to 15 million euros or 3% of global revenue.
  • The AI governance market will grow from $309 million in 2025 to $5.88 billion by 2035, a 34% CAGR, reflecting the rapid institutionalization of governance requirements across enterprise AI deployments.

"By 2026, half of the world's governments expect enterprises to adhere to AI laws and data privacy requirements. The organizations that built governance infrastructure in 2024 and 2025 now have audit trails, HITL checkpoints, and override mechanisms ready for regulator review. The organizations that didn't are retrofitting under compliance deadlines." (Modulos AI Compliance Guide, 2026)

The four governance dimensions

For every pattern, governance breaks down across four dimensions. They're consistent, so you can build your AI governance policy as a table rather than a narrative:

Audit trail requirements. What records need to be kept, in what form, for how long? Audit trails serve two purposes: debugging when something goes wrong, and demonstrating compliance when someone asks. Both purposes require specificity about which inputs and outputs were logged.

Human-in-the-loop checkpoints. Where in the workflow does a human need to review before the system proceeds? Not "humans should review outputs." A specific step, a specific condition, a specific decision point.

Override and rollback mechanisms. When a human disagrees with an AI action, or when an Execute step turns out to be wrong, what happens? Every pattern that can Execute needs a defined rollback path.

Review and retraining frequency. How often does the pattern itself get reviewed for accuracy, drift, and continued relevance? A Scoring+Routing model trained on last year's leads may be actively misleading this year. Someone needs to own that review on a schedule.

RAG Assistant governance

The RAG Assistant is the most widely deployed pattern and carries the lowest Execute risk of any AI system that talks back to users. But "low risk" isn't "no governance."

Audit trail: Log queries and responses. Tag each response with the source document(s) used. Include a confidence score or citation count where available. Retention minimum: 90 days for debugging, longer for regulated industries.
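As a rough illustration of that audit record, here is a minimal Python sketch. The class name `RagAuditRecord`, its field names, and the JSON output format are assumptions made for this example, not a prescribed schema. Only the logged items and the 90-day minimum come from the requirement above.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timedelta, timezone

# Minimum retention from the text; regulated industries may need longer.
RETENTION_DAYS = 90


@dataclass
class RagAuditRecord:
    """One logged query/response pair (hypothetical schema)."""
    query: str
    response: str
    source_documents: list
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def citation_count(self) -> int:
        # Number of source documents the response was tagged with.
        return len(self.source_documents)

    def expires_at(self) -> datetime:
        # Earliest date the record may be purged under the 90-day minimum.
        return self.logged_at + timedelta(days=RETENTION_DAYS)

    def to_json(self) -> str:
        payload = asdict(self)
        payload["logged_at"] = self.logged_at.isoformat()
        payload["citation_count"] = self.citation_count
        return json.dumps(payload)
```

Keeping the source list on the record itself means a wrong answer can be traced back to the document that produced it, which is what the correction process depends on.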

HITL checkpoints: Not required for read-only use cases where users understand they're interacting with AI. Required when RAG output is used in external communication: customer-facing email drafts, regulatory filings, client proposals. If the output leaves the building, a human reviews it first.

Override mechanism: Define the knowledge base correction process. When a user catches a wrong answer, who can update the source document? What's the turnaround SLA for critical corrections?

Review cadence: Quarterly knowledge base audit. Check for stale documents, broken source links, and topics where user questions are going unanswered (a signal for knowledge gaps). Annual review of retrieval quality using a test query set.

Scoring + Routing governance

This pattern carries light direct governance burden but significant compliance exposure when applied to people (hiring, lending, insurance, criminal justice). When Scoring+Routing determines which humans get what treatment, ECOA, GDPR Article 22, and Title VII all become relevant.

Audit trail: Log every scoring decision with the input features used and the score produced. This is non-negotiable for any regulated use case. "Our model said 62" is not a governance record. "Model version 3.1, input features: company size=enterprise, engagement=high, demo=completed, score=62, routed to: enterprise-west team" is.
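The difference between the two records is easier to see in code. This sketch encodes the example above as a structured log line. The `ScoringDecision` class and `to_log_line` helper are hypothetical names, and JSON is an assumed storage format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ScoringDecision:
    """A governance-grade record of one scoring decision (hypothetical schema)."""
    model_version: str
    input_features: dict
    score: int
    routed_to: str


def to_log_line(decision: ScoringDecision) -> str:
    # sort_keys keeps log lines stable, which makes diffs and audits easier.
    return json.dumps(asdict(decision), sort_keys=True)


# The example record from the text, expressed as data.
decision = ScoringDecision(
    model_version="3.1",
    input_features={"company_size": "enterprise", "engagement": "high", "demo": "completed"},
    score=62,
    routed_to="enterprise-west",
)
print(to_log_line(decision))
```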

HITL checkpoints: Human override available on any routing decision. Sales reps should be able to manually reassign leads. Support teams should be able to manually escalate tickets regardless of AI score. The AI route is a default, not a lock.

Override mechanism: Manual routing bypass for every decision point. Ensure bypass actions are also logged. Patterns of manual overrides often signal model drift or data quality problems.

Review cadence: Monthly score distribution review. If the median score is shifting or the high-score bucket is thinning, something changed in your data or your market. Quarterly model accuracy review against held-out test data.
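A monthly distribution review can start as a two-number comparison. The sketch below assumes you keep a baseline sample of scores and compare it with the current month. The function names and the `high_cutoff` of 80 are illustrative choices, not values from this framework.

```python
from statistics import median


def high_score_share(scores, cutoff):
    """Fraction of scores at or above the high-score cutoff."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)


def distribution_shift(baseline, current, high_cutoff=80):
    """Compare this period's scores with a baseline sample.

    A negative median_shift means scores are drifting down; a negative
    high_share_change means the high-score bucket is thinning.
    """
    return {
        "median_shift": median(current) - median(baseline),
        "high_share_change": high_score_share(current, high_cutoff)
        - high_score_share(baseline, high_cutoff),
    }
```

What counts as a meaningful shift depends on your volume and market, so treat the output as a prompt for investigation rather than an automatic alarm.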

Vision Extract governance

This pattern replaces human data entry. The governance question is: what happens when it gets it wrong, and who catches it?

Audit trail: Log all extracted records with the source image, the extraction confidence score, and the extracted field values. Store source images for the duration of the record's business life.

HITL checkpoints: Required for low-confidence extractions. Define your confidence threshold (typically, any critical field extracted with less than 85% confidence routes to the human review queue). Also required for any extraction that will be used in a financial transaction without additional verification.
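The routing rule can be sketched in a few lines. The function and parameter names here are hypothetical. Treating a missing critical field as zero confidence is an assumption added for safety, not something the rule above specifies.

```python
# Threshold from the text; tune it per document type.
CONFIDENCE_THRESHOLD = 0.85


def needs_human_review(field_confidences, critical_fields,
                       unverified_financial_use=False,
                       threshold=CONFIDENCE_THRESHOLD):
    """Return True if an extraction should go to the human review queue."""
    # Extractions feeding a financial transaction without further
    # verification always get a human check.
    if unverified_financial_use:
        return True
    # Assumption: a critical field the extractor did not return at all
    # is treated as zero confidence.
    return any(field_confidences.get(name, 0.0) < threshold
               for name in critical_fields)
```

For example, `needs_human_review({"total": 0.97, "iban": 0.80}, ["total", "iban"])` routes the record to review because one critical field falls below the threshold.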

Override mechanism: Manual field correction workflow with audit log. Every human correction should be recorded. This is your training signal for model improvement.

Review cadence: Monthly accuracy spot-check on a sample of high-confidence extractions. You're looking for systematic errors that fall above the confidence threshold. Document type additions or format changes from vendors should trigger immediate spot-check.

Meeting Intelligence governance

The Meeting Intelligence pattern has two distinct governance concerns that most deployments underweight: consent, and CRM data quality. For a complete worked example of governance in an AI sales ops context, AI sales ops governance and audit trails covers the full audit framework.

Consent requirements: Recording consent is not uniform. One-party consent states (including most of the US) allow recording if one party consents. All-party consent states, often called two-party states (California, Florida, others), require every participant to consent. GDPR protects participants located in the EU, regardless of their nationality or of where your reps are calling from. If your reps use Meeting Intelligence on any call with a participant in the EU, you need documented consent. Storing recordings without consent is a liability, not just a compliance checkbox.

Audit trail: Recording storage with retention schedule appropriate to your industry (typically 1-3 years for sales calls, potentially longer for financial services or healthcare). CRM push logs: when did the AI write what to which record?

HITL checkpoints: Human review of CRM pushes before they become system-of-record data. The Meeting Intelligence output should enter a staging area first, not write directly to live CRM fields. A five-minute review by a rep before approving the push catches most errors without destroying the time savings.
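One way to picture the staging area is as a holding queue between the AI output and the live CRM. The classes below are a hypothetical sketch: `CrmStaging` stands in for whatever your CRM integration provides, and the in-memory dictionary is a placeholder for live CRM fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class StagedCrmUpdate:
    record_id: str
    field_name: str
    proposed_value: str
    approved_by: Optional[str] = None


class CrmStaging:
    """Holds AI-written updates until a human approves them."""

    def __init__(self):
        self.pending = []    # updates awaiting rep review
        self.live = {}       # stand-in for system-of-record CRM fields
        self.push_log = []   # when did the AI write what to which record?

    def stage(self, update):
        # AI output lands here first and never touches live fields directly.
        self.pending.append(update)

    def approve(self, update, reviewer):
        # Only an explicit human approval moves data into the live CRM.
        self.pending.remove(update)
        update.approved_by = reviewer
        self.live[(update.record_id, update.field_name)] = update.proposed_value
        self.push_log.append({
            "record_id": update.record_id,
            "field_name": update.field_name,
            "value": update.proposed_value,
            "approved_by": reviewer,
            "pushed_at": datetime.now(timezone.utc).isoformat(),
        })
```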

Override mechanism: Correction workflow for CRM entries. Erroneous AI-written notes should be correctable with a timestamp showing the correction was human-initiated.

Review cadence: Monthly spot-check of CRM data quality for AI-written records. Are action items accurate? Are speaker attributions correct? Are summaries capturing the right commitments?

Anomaly Agent governance

The primary governance concern here is the false positive cost: acting on an anomaly that turned out to be normal business variation.

Audit trail: All alerts logged with the signal data that triggered the alert, the model's confidence level, and the disposition (reviewed, dismissed, escalated). This audit trail is essential for both debugging and false positive analysis.

HITL checkpoints: Human review required before any Execute action on a flagged anomaly. The Anomaly Agent should alert and queue, not alert and act. If your pattern has an automatic block (fraud prevention), the threshold for automatic action should be extremely high, and all automatic actions should be reviewed after the fact.

Override mechanism: Flag suppression for known false positive patterns. If a vendor's payments always look anomalous because of their billing cycle, that pattern should be suppressed at the source rather than reviewed manually every month.

Review cadence: False positive rate reviewed monthly. If your false positive rate is above 15%, the governance overhead is eating the value. If it's below 1%, you may be missing real anomalies. The operational sweet spot depends on the domain and the cost of action.
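The monthly check reduces to a small calculation. This sketch assumes every alert carries a final disposition and that "dismissed" means the alert was a false positive. Both the disposition labels and the function names are illustrative. The 15% and 1% bounds are the ones given above.

```python
def false_positive_rate(dispositions):
    """Share of alerts that reviewers dismissed as normal variation."""
    if not dispositions:
        raise ValueError("no alerts to evaluate")
    return dispositions.count("dismissed") / len(dispositions)


def assess_rate(rate, upper=0.15, lower=0.01):
    """Classify a false positive rate against the bounds from the text."""
    if rate > upper:
        return "too noisy"
    if rate < lower:
        return "possibly missing anomalies"
    return "within band"
```

Because the right band depends on the domain and the cost of action, the two bounds are parameters rather than constants.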

Generative Research, Document Review, and Workflow Copilot

These three patterns share a common governance profile: the primary risk is distributing AI-generated text as authoritative without adequate review.

Generative Research: Every output distributed outside the immediate team requires human fact-checking against primary sources. The audit trail records query, sources accessed, and who approved the output for distribution. Review cadence: spot-check output accuracy monthly, especially for high-stakes use cases (investor briefs, regulatory submissions, client deliverables).

Document Review: The AI output is a flagging system, not a legal opinion. Lawyers review before acting on any flag. The audit trail records which document, which clauses were flagged, and what the human attorney's disposition was. No automated contract action without human sign-off.

Workflow Copilot: Governance focuses on data leakage. What data is the copilot seeing? If it's pulling from CRM, can it access records outside a rep's normal territory? Data access boundaries for the copilot need to be defined and audited, not assumed.

Autonomous Agent governance

This is the most critical governance section in the framework, and the one most implementations underweight until something goes wrong.

Autonomous Agents cycle through all five capabilities in a loop: Ingest, Analyze, Predict, Generate, Execute, then repeat. Each Execute step has consequences. Errors compound across iterations. A hallucinated intermediate step in loop iteration 3 can drive a sequence of wrong actions in iterations 4 through 8 before any human sees the results.

Audit trail: Every tool call logged with input parameters, output, and decision reasoning (the Generate step that drove the Execute decision). Not just "agent sent email" but "agent received meeting confirmation request, determined scheduling window via calendar lookup, generated email draft, sent to external contact." Full provenance from intent to action.
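A provenance entry of that kind might look like the following sketch. The function name, the field names, and the choice of JSON strings in a list are assumptions for illustration. A real deployment would write to durable, append-only storage.

```python
import json
from datetime import datetime, timezone


def record_tool_call(audit_log, tool, params, output, reasoning):
    """Append one provenance entry to the audit log and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "input_parameters": params,
        "output": output,
        # The Generate step that drove this Execute decision.
        "decision_reasoning": reasoning,
    }
    audit_log.append(json.dumps(entry))
    return entry
```

Logging the reasoning next to the call is what turns "agent sent email" into a record you can trace from intent to action.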

HITL checkpoints (mandatory):

  • Before any Execute step that sends external communication
  • Before any Execute step that changes a financial record
  • Before any Execute step that modifies a record owned by someone outside the task originator's team
  • Before any sequence of 3+ Execute steps in a single task

These are not suggestions. They're the minimum requirement for a customer-facing Autonomous Agent deployment. Any deployment without these checkpoints is betting that the agent won't hallucinate its way into an irreversible action. That bet will eventually lose. Article 14 of the EU AI Act requires that high-risk AI systems be designed so that the people overseeing them can detect and address anomalies, remain aware of automation bias, correctly interpret the system's output, and decide not to use the system. For any agent operating in employment, financial services, or customer-facing contexts, the checkpoints above are how those oversight requirements become concrete.
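The four mandatory checkpoints can be expressed as a gate that runs before each Execute step. This is a minimal sketch: `ExecuteStep`, its flags, and `approval_reasons` are hypothetical names. Reading "sequence of 3+ Execute steps" as the number of Execute steps planned for the task is one interpretation, not the only one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecuteStep:
    tool: str
    sends_external_communication: bool = False
    changes_financial_record: bool = False
    record_owner_team: Optional[str] = None  # None if no record is modified


def approval_reasons(step, originator_team, planned_execute_steps):
    """Return every reason this step needs human approval (empty list = none)."""
    reasons = []
    if step.sends_external_communication:
        reasons.append("external communication")
    if step.changes_financial_record:
        reasons.append("financial record change")
    if step.record_owner_team is not None and step.record_owner_team != originator_team:
        reasons.append("record owned by another team")
    # Assumption: the count is the number of Execute steps planned for the task.
    if planned_execute_steps >= 3:
        reasons.append("sequence of 3+ Execute steps")
    return reasons
```

Returning the reasons rather than a bare boolean gives the approver context and gives the audit trail a record of why the agent paused.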

Scope limits: Define an explicit allowlist of tools the agent can access. An agent that needs to schedule meetings doesn't need access to your billing system. An agent that does account research doesn't need send access to your email client. Scope limits are your primary defense against unexpected Execute behavior.
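An allowlist is simple to enforce if every tool call goes through one chokepoint. The sketch below uses hypothetical names (`ScopedToolbox`, `ToolNotAllowed`) and plain Python callables as stand-ins for real tool integrations.

```python
class ToolNotAllowed(PermissionError):
    """Raised when the agent requests a tool outside its allowlist."""


class ScopedToolbox:
    def __init__(self, tools, allowlist):
        self._tools = tools                    # name -> callable
        self._allowlist = frozenset(allowlist)

    def call(self, name, *args, **kwargs):
        # Deny by default: a tool that exists but is not on the
        # allowlist is still off limits.
        if name not in self._allowlist:
            raise ToolNotAllowed(f"tool '{name}' is not on the allowlist")
        return self._tools[name](*args, **kwargs)


toolbox = ScopedToolbox(
    tools={
        "schedule_meeting": lambda slot: f"scheduled {slot}",
        "issue_refund": lambda amount: f"refunded {amount}",
    },
    allowlist={"schedule_meeting"},
)
```

Here `toolbox.call("schedule_meeting", "10:00")` succeeds, while `toolbox.call("issue_refund", 100)` raises `ToolNotAllowed` even though the tool is registered.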

Override mechanism: Task stop and rollback capability. The operator needs the ability to halt a running agent task mid-execution and reverse any Execute steps taken so far. If your platform doesn't support task halt and rollback, your governance posture is weak regardless of what policies you've written.
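One common way to make rollback possible is to register an undo action alongside every Execute step. The `AgentTask` class below is a hypothetical sketch of that idea and does not represent any particular platform's API. It also assumes each step has an undo, which will not hold for actions such as a sent email. Those are the steps the HITL checkpoints are meant to catch before they run.

```python
class AgentTask:
    """Tracks executed steps so an operator can halt and reverse them."""

    def __init__(self):
        self._undo_stack = []
        self.halted = False

    def execute(self, action, undo):
        # Refuse new work once the operator has halted the task.
        if self.halted:
            raise RuntimeError("task has been halted")
        result = action()
        self._undo_stack.append(undo)
        return result

    def halt_and_rollback(self):
        # Stop first, then reverse steps in last-in, first-out order.
        self.halted = True
        while self._undo_stack:
            undo = self._undo_stack.pop()
            undo()
```

Undoing in reverse order matters because later steps often depend on the state that earlier steps created.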

Review cadence: Weekly during initial deployment (first 60 days). Monthly after established baseline. Full audit of all Execute actions quarterly, specifically reviewing cases where the agent completed tasks in unexpected ways.
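If you track deployment dates, the cadence switch can be automated. This small sketch treats "monthly" as 30 days and the initial period as days 0 through 59. Both are simplifying assumptions, and the function name is illustrative.

```python
from datetime import date, timedelta

INITIAL_PERIOD_DAYS = 60


def review_interval(deployed_on: date, today: date) -> timedelta:
    """Weekly reviews during the first 60 days, roughly monthly afterwards."""
    days_live = (today - deployed_on).days
    if days_live < INITIAL_PERIOD_DAYS:
        return timedelta(days=7)
    return timedelta(days=30)  # assumption: "monthly" approximated as 30 days
```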

| Pattern | Execute intensity | Primary compliance concern | Minimum HITL requirement | Audit trail retention |
|---|---|---|---|---|
| RAG Assistant | None (read-only) | Confidently wrong answers | Required only for external distribution | 90 days |
| Scoring + Routing | Light (routing decisions) | Algorithmic bias in HR/lending | Human override available on every routing decision | 12 months (regulated) |
| Vision Extract | Medium (data entry replacement) | Financial record accuracy | Low-confidence extractions queue to human review | Duration of record's business life |
| Meeting Intelligence | Light (CRM push) | Recording consent by jurisdiction | Human review before CRM staging goes live | 1-3 years (industry-dependent) |
| Anomaly Agent | Medium (alert + block) | False positive action costs | Human review before any Execute action on flagged item | 12 months |
| Generative Research | None (generates text) | Hallucinated citations distributed externally | Human fact-check before external distribution | 90 days |
| Document Review | None (flags, doesn't change) | Legal opinion liability if treated as such | Attorney review before acting on any flag | Contract lifecycle |
| Workflow Copilot | Light (suggests, human approves) | Data access boundary leakage | Human approval before sending | 90 days |
| Autonomous Agent | High (multi-step Execute loop) | Irreversible actions at scale on hallucinated premises | Before external comms, financial changes, 3+ Execute steps | Full provenance, 2+ years |

The Per-Pattern Governance Footprint

The Per-Pattern Governance Footprint is a structured policy format that specifies, for each active AI pattern deployment, exactly four things: the audit trail specification (format, fields logged, and retention period), the human-in-the-loop checkpoints (specific step, trigger condition, who approves), the override and rollback mechanism (who can override, how, with what record kept), and the review and retraining frequency (who reviews, what they look for, on what schedule). The framework is built on the principle that governance requirements follow Execute intensity: patterns at the Analyze and Generate steps carry limited governance burden, while patterns that Execute repeatedly, autonomously, or at scale carry substantial burden commensurate with their consequence surface.
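Because the footprint always has the same four parts, it lends itself to a typed structure. The sketch below shows one possible encoding. All class and field names are assumptions, and the owner roles in the example are hypothetical. The RAG Assistant values follow the requirements listed earlier in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditTrailSpec:
    format: str
    fields_logged: tuple
    retention: str


@dataclass(frozen=True)
class HitlCheckpoint:
    step: str
    trigger_condition: str
    approver: str


@dataclass(frozen=True)
class OverrideMechanism:
    who_can_override: str
    how: str
    record_kept: str


@dataclass(frozen=True)
class ReviewSchedule:
    reviewer: str
    looks_for: str
    schedule: str


@dataclass(frozen=True)
class GovernanceFootprint:
    pattern: str
    audit_trail: AuditTrailSpec
    checkpoints: tuple
    override: OverrideMechanism
    review: ReviewSchedule


# Example footprint for a RAG Assistant; role names are hypothetical.
rag_footprint = GovernanceFootprint(
    pattern="RAG Assistant",
    audit_trail=AuditTrailSpec(
        format="JSON lines",  # assumed storage format
        fields_logged=("query", "response", "source_documents"),
        retention="90 days",
    ),
    checkpoints=(
        HitlCheckpoint(
            step="send output externally",
            trigger_condition="RAG output used in external communication",
            approver="content owner",
        ),
    ),
    override=OverrideMechanism(
        who_can_override="knowledge base owner",
        how="update the source document",
        record_kept="correction ticket",
    ),
    review=ReviewSchedule(
        reviewer="knowledge base owner",
        looks_for="stale documents, broken source links, unanswered topics",
        schedule="quarterly",
    ),
)
```

A structure like this can be stored alongside the deployment configuration, so the policy table and the running system stay in one place.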

Rework Analysis: Based on Compliance Week's finding that 83% of enterprises use AI but only 25% have strong governance frameworks, and the EU AI Act's full enforcement reaching high-risk AI systems in August 2026, the Per-Pattern Governance Footprint represents the minimum viable governance structure for any organization operating AI in employment, financial, healthcare, or customer-facing contexts. Rework's governance implementation data shows that teams that define the Per-Pattern Governance Footprint before deploying each pattern reduce their compliance audit preparation time by an average of 8 weeks compared to teams that document governance retrospectively after regulators or incidents require it.

Building the governance policy from this framework

A pattern-specific governance policy has this structure:

  1. Pattern inventory. List every active AI pattern deployment in the organization, the team that owns it, and the Execute actions it can take.

  2. Risk classification. Classify each deployment on a 1-5 scale based on its Execute intensity and operating domain, the two sources of risk described above. Autonomous Agent customer-facing deployments score 5. Read-only RAG Assistants score 1.

  3. Requirement table. For each deployment: audit trail spec (format, fields, retention), HITL checkpoints (specific step, specific trigger condition), override mechanism (who can override, how, with what record), and review cadence (who reviews, what they look for, when).

  4. Ownership assignment. Every pattern deployment has a named operational owner who is accountable for the review cadence and for incident response.

  5. Incident response procedure. When a pattern produces an output that causes harm (wrong action taken, data leaked, hallucination distributed externally), who is notified, who investigates, and what are the decision points for suspension vs. continued operation with additional controls?

This isn't a compliance exercise. It's the operating procedure that lets you run high-autonomy patterns safely. Without it, every Autonomous Agent deployment is one incident away from being shut down permanently.

The goal of governance isn't to slow down AI adoption. It's to make adoption durable. The OECD AI Principles, adopted by 42 countries and a foundational reference for both the EU AI Act and the NIST framework, describe accountability as a core principle: AI actors are accountable for the proper functioning of AI systems and for the respect of applicable norms. Pattern-specific governance is how that accountability becomes operational rather than aspirational. Teams that deploy without governance structures get their patterns shut down by legal or compliance after the first incident and spend months rebuilding trust. Teams that deploy with pattern-specific governance can move faster on the next deployment because they've demonstrated operational discipline on the first one.

The patterns are powerful. Governance is what keeps them running. Start with hallucination risk by pattern for the specific failure modes that governance is designed to catch, and measuring ROI by pattern for the audit trail data that feeds your ROI analysis.

Frequently Asked Questions

Why do AI patterns need pattern-specific governance rather than a single policy?

Because governance requirements follow Execute intensity, and Execute intensity varies dramatically across patterns. A RAG Assistant that answers employee questions carries almost no Execute risk. An Autonomous Agent that sends emails, updates financial records, and schedules meetings carries substantial irreversibility risk. A single policy that spans both either governs the RAG Assistant too tightly (slowing adoption) or governs the Autonomous Agent too loosely (creating incident risk).

What is the Per-Pattern Governance Footprint?

The Per-Pattern Governance Footprint specifies four things for each active AI pattern: the audit trail specification (format, fields, retention), human-in-the-loop checkpoints (specific step and trigger condition), override and rollback mechanism (who can override, how, with what record), and review and retraining frequency. It transforms generic governance statements into operational procedures that can be audited, trained on, and shown to regulators.

What EU AI Act requirements apply to Autonomous Agent deployments?

Article 14 mandates that high-risk AI systems allow humans to detect and address anomalies, remain aware of automation bias, correctly interpret system outputs, and decide not to use the system. This maps directly to four Autonomous Agent governance requirements: task halt and rollback capability, false positive logging and review cadence, full provenance audit trails from intent to action, and human approval before irreversible Execute steps. EU AI Act non-compliance fines reach 35 million euros or 7% of global revenue for prohibited practices.

How often should AI patterns be reviewed for model drift?

Scoring and Routing models should be reviewed monthly for score distribution changes and quarterly for accuracy against held-out test data. Anomaly Agents should have their false positive rate reviewed monthly. RAG Assistants require quarterly knowledge base audits. Autonomous Agents should be reviewed weekly in the first 60 days, then monthly, with a full quarterly audit of all Execute actions. Model drift is the most common governance gap in year-two deployments because teams build review cadences into launch plans and then deprioritize them as other work accumulates.

What is the most critical governance failure mode for Autonomous Agents?

Deploying without task halt and rollback capability. Autonomous Agents cycle through all five ACE capabilities in a loop, meaning each Execute step builds on the previous one. A hallucinated intermediate step in loop iteration 3 can drive a sequence of wrong actions in iterations 4-8 before any human sees the results. Without the ability to halt the agent mid-execution and reverse Execute steps already taken, the governance posture is theoretical rather than operational. If your agent platform doesn't support task halt and rollback, this is a blocking requirement before deployment.

How do Scoring and Routing patterns create compliance risk in HR contexts?

When Scoring and Routing determines which candidates advance in a hiring process, Title VII of the Civil Rights Act (enforced by the EEOC), GDPR Article 22, and emerging state AI bias laws apply. The model must not use protected characteristics as features (or features that serve as proxies for protected characteristics). Audit trails must log every scoring decision with the input features used. Human override must be available on every routing decision. In the US, 40+ states now have active AI legislation, with Texas TRAIGA and California SB 53 both effective January 1, 2026, creating concrete compliance obligations for algorithmic employment decisions.

