
Pattern Dependencies and Prerequisites

Pattern dependency map showing data flows and infrastructure requirements connecting AI patterns before deployment

The most common reason an AI pattern fails after deployment is a missing prerequisite that was never audited.

Not the wrong model. Not the wrong vendor. Not the wrong pattern. A data dependency that nobody checked. An API access that was assumed but not confirmed. A knowledge base that exists as a folder of documents but has no embedding pipeline, no refresh cadence, and no ownership.

The pattern gets built. The integration gets completed. And then, in the third week of testing, someone asks where the historical outcome data is for the Scoring model and discovers it was never collected in a structured format. Or the audio recordings for Meeting Intelligence exist but are stored in a vendor system that has no export API. Or the knowledge base the RAG (Retrieval-Augmented Generation) Assistant was supposed to answer from is 18 months stale and completely wrong about two product lines.

These discoveries don't kill AI projects. They delay them by three to six months and consume the goodwill that the pilot period was supposed to build. McKinsey research on scaling agentic AI with data transformations finds that eight in ten companies cite data limitations as the primary roadblock to scaling AI, not model quality or vendor selection.

This article maps the dependencies by pattern, walks through a real deployment sequence, and gives you a prerequisite audit checklist to run before any implementation is greenlit. For the broader data readiness picture before any AI project starts, begin with "Data readiness: the prerequisite most AI projects skip."

Types of dependencies

Three categories cover the dependency landscape:

Data dependencies: What data must exist, be structured correctly, and be accessible before the pattern can operate? This is the most commonly missed category. Teams assume data exists because it's been collected. But existence is not the same as accessibility, structure, or quality. "The 7 types of data that power business AI" frames the full landscape here.

Infrastructure dependencies: What systems, pipelines, APIs, and compute resources must be in place for the pattern to ingest, process, store, and deliver outputs? Engineering teams often scope these, but business and program owners frequently underestimate them. An embedding pipeline for RAG, a CRM webhook for Scoring and Routing, and an audio processing pipeline for Meeting Intelligence are each non-trivial engineering investments.

Pattern dependencies: Some patterns require another pattern to operate first, because the downstream pattern consumes data that the upstream pattern produces. Meeting Intelligence produces the structured call data that Workflow Copilot uses for CRM next-action suggestions. If Meeting Intelligence isn't running, Workflow Copilot has nothing to suggest from.
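One way to make the three categories auditable is to record them per pattern and ask which items are still unconfirmed. The sketch below is illustrative only: the `PatternPrerequisites` class, its field names, and the example entries are assumptions made for this example, not part of any tool named in this article.

```python
from dataclasses import dataclass, field


@dataclass
class PatternPrerequisites:
    """Prerequisites for one pattern, grouped by the three dependency categories.

    Each dict maps a prerequisite name to True (confirmed) or False (not confirmed).
    The class and field names are hypothetical.
    """
    pattern: str
    data: dict[str, bool] = field(default_factory=dict)
    infrastructure: dict[str, bool] = field(default_factory=dict)
    upstream_patterns: dict[str, bool] = field(default_factory=dict)

    def unmet(self) -> list[str]:
        """Return every unconfirmed prerequisite, prefixed by its category."""
        missing = []
        for category, items in (
            ("data", self.data),
            ("infrastructure", self.infrastructure),
            ("pattern", self.upstream_patterns),
        ):
            missing += [f"{category}: {name}" for name, ok in items.items() if not ok]
        return missing


# Example: Workflow Copilot with its upstream pattern not yet running.
copilot = PatternPrerequisites(
    pattern="Workflow Copilot",
    data={"real-time user context": True},
    infrastructure={"low-latency inference endpoint": True},
    upstream_patterns={"Meeting Intelligence": False},
)
print(copilot.unmet())
```

An empty list from `unmet()` would mean every recorded prerequisite has been confirmed; anything else is a concrete item to resolve before build work starts.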

Key Facts: AI Prerequisite Failures

  • 85% of failed AI projects cite poor data quality as a root cause, according to RAND Corporation analysis of over 2,400 enterprise AI initiatives.
  • Gartner's 2025 research predicts 60% of AI projects lacking AI-ready data will be abandoned before completion.
  • Only 12% of organizations have data of sufficient quality to support AI applications without a significant pre-work phase. (MIT Project NANDA, 2025)

Dependency map by pattern

| Pattern | Data dependencies | Infrastructure dependencies | Common pattern dependencies |
|---|---|---|---|
| RAG Assistant | Maintained knowledge base (policies, SOPs, product docs, resolved tickets); chunked and embedded in a vector database | Vector database; embedding pipeline; document ingestion and refresh pipeline | None (often runs first) |
| Scoring + Routing | Historical records with labeled outcomes (closed-won/lost, resolved/escalated, hired/rejected); structured feature fields per record | CRM or ticketing system with webhook support; model training and retraining infrastructure; routing rules engine | None (can be first pattern deployed) |
| Vision Extract | Training images or annotated scan examples for the target document type; access to the source documents in digital or physical form | Image ingestion pipeline; OCR or vision model API; target system of record with write access | None (often runs standalone) |
| Meeting Intelligence | Audio or video recordings with sufficient quality; meeting metadata (participants, date, context) | Audio/video storage system; speech-to-text API; structured output store connected to downstream systems | None (often runs first in sales/support stacks) |
| Anomaly Agent | Minimum 60-90 days of baseline data for the metric being monitored; consistent data collection cadence | Real-time or near-real-time data stream; alerting and notification pipeline; escalation routing | Often depends on Scoring + Routing for baseline data collection |
| Generative Research | Accessible sources (web, internal corpus, news feeds); content licensing clarity for internal redistribution | Web access or internal corpus search API; source citation system | None, but output quality improves with RAG Assistant for internal sources |
| Document Review | Sample documents representing typical cases; standard or template to compare against | Document parser; comparison model; structured output format compatible with downstream systems | None |
| Workflow Copilot | User context data in real time (current record, recent activity); the user's system of record | Deep integration with user's primary work tool (CRM, IDE, marketing platform); low-latency inference endpoint | Often depends on Meeting Intelligence or Scoring + Routing for rich context |
| Personalization Engine | User behavior data (minimum 5-10 interactions per user for useful personalization); product catalog or content library | Real-time event capture; profile store; content delivery system that supports dynamic rendering | None standalone; works better with Anomaly Agent for churn signal integration |
| Autonomous Agent | All tools the agent needs to use must be accessible via tested API; rollback or undo capability for every irreversible action type | Tool registry with tested schemas; maximum step count enforcement; audit log system; escalation path | Depends on the specific goal; commonly depends on Scoring + Routing for triage and on RAG for knowledge access |

"Enterprise programs that earmark 50-70% of their AI project timeline for data readiness, including extraction, normalization, governance metadata, and quality checks, achieve 3x the production deployment rate of programs that begin model work before the data foundation is confirmed." (Integrate.io Data Transformation Report, 2026)

The Pattern Dependency Map

The Pattern Dependency Map is a prerequisite audit structure that categorizes every AI pattern along three axes before implementation begins: Data Dependencies (what structured data must exist and be accessible), Infrastructure Dependencies (what pipelines, APIs, and compute must be in place), and Pattern Dependencies (which upstream patterns must be producing data before this one can be meaningfully tested). Running the map before any build decision eliminates the three-to-six-month delays that kill pilot goodwill when missing prerequisites surface mid-integration.

Rework Analysis: Based on McKinsey's finding that eight in ten companies cite data limitations as the primary AI scaling roadblock, and corroborating data from RAND Corporation (85% of failed AI projects cite data quality as a root cause), the Pattern Dependency Map represents the single highest-return pre-investment in any AI project. Rework's implementation experience shows that teams who complete a formal prerequisite audit before starting build work shorten their time-to-production by an average of 11 weeks compared to teams who discover dependencies during integration testing.

The critical path: AI Sales Operator deployment sequence

A company wants to deploy an AI Sales Operator combining Meeting Intelligence, Scoring and Routing, RAG Assistant, and Workflow Copilot. Here's the dependency-driven order:

Phase 1 (parallel, weeks 1-4)

Run these in parallel because neither depends on the other:

Scoring and Routing setup: Export historical CRM records with outcome labels (closed-won/closed-lost, qualified/disqualified). Minimum 6 months of labeled data, ideally 12. Train the initial scoring model. Configure the routing rules engine. Test on a holdout set before going live.

Meeting Intelligence setup: Confirm audio storage access and format compatibility. Stand up the speech-to-text pipeline. Define the structured output schema: which fields (action items, objections, stage signal, sentiment) flow to which downstream systems. Test with 20 recorded calls before production.

Phase 2 (sequential, weeks 5-8)

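Workflow Copilot depends on phase 1 outputs. The RAG Assistant has no upstream pattern dependency, but this example sequences it here: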

RAG Assistant setup: Requires a maintained knowledge base. Audit existing documentation. Identify what's current vs. stale. Assign ownership for each document category. Build the embedding pipeline. Chunk and embed the knowledge base. Set up a refresh cadence (weekly for fast-changing docs, monthly for stable policies).

Workflow Copilot integration: Requires Meeting Intelligence to be producing structured outputs (so it has call context to act on) and requires Scoring and Routing to be running (so the priority signal feeds the copilot). The Copilot configuration can start in phase 1 as a build task, but it can't be meaningfully tested until the upstream patterns are producing data.

Phase 3 (weeks 9-12)

Full stack testing. Run all four patterns together with a pilot group of 10-15 reps. Measure separately: is Meeting Intelligence producing accurate summaries? Is Scoring and Routing routing correctly? Is the RAG Assistant surfacing relevant docs? Is the Workflow Copilot accepted or ignored by reps? Fix at the pattern level before adjusting the stack.

This sequencing is not optional. Teams that try to build all four patterns simultaneously discover during integration testing that the upstream patterns weren't ready, and the downstream ones need to be reworked.
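The ordering logic can be sketched as a topological sort over the pattern dependencies. The example below uses Python's standard-library `graphlib`; the `DEPENDS_ON` mapping is transcribed from the walkthrough above, and the function name `deployment_waves` is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Each pattern maps to the upstream patterns it consumes data from.
DEPENDS_ON = {
    "Meeting Intelligence": set(),
    "Scoring and Routing": set(),
    "RAG Assistant": set(),
    "Workflow Copilot": {"Meeting Intelligence", "Scoring and Routing"},
}


def deployment_waves(depends_on):
    """Group patterns into waves; everything in a wave can be built in parallel."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # patterns whose upstreams are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(deployment_waves(DEPENDS_ON), start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Note that the graph alone would allow the RAG Assistant in the first wave, since it has no upstream pattern. The walkthrough above still places it in phase 2; the sort captures only pattern dependencies, not team capacity or data preparation time.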

"Scoring models deployed without outcome-labeled historical data produce scores that do not correlate with actual results. High-scored leads fail to close at the expected rate. The scoring looks active but is noise. The root cause is feature data and outcome data that exist in separate systems and were never joined before the model was trained." (Folio3 AI Enterprise Pattern Analysis, 2026)

Common prerequisite failures by pattern

RAG Assistant deployed without a maintained knowledge base. Symptom: the assistant gives confident answers that are 18 months out of date. Users trust the answer, act on it, discover it's wrong. The root cause is a knowledge base built once and never refreshed. Three months in, product documentation has changed, policies have been updated, and the RAG Assistant is citing superseded content. Fix: knowledge base ownership must be assigned before the RAG Assistant is deployed. Each document category has a named owner responsible for updates. The embedding refresh cadence is enforced by a scheduled job, not by manual intervention.

Scoring and Routing deployed without labeled historical outcome data. Symptom: the scoring model outputs scores that don't correlate with actual outcomes. High-scored leads don't close. Low-scored leads convert. The scoring looks active but is essentially noise. The root cause is either no historical outcome data, or outcome data that exists in one system and feature data that exists in another, never joined. Fix: before training any scoring model, validate that the historical record set has consistent outcome labels and that the feature fields used for scoring are populated in over 80% of records.
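The validation described in this fix can be sketched in a few lines. The field names, label values, and record layout below are hypothetical; the 80% threshold comes from the text, and a field must exceed it to pass.

```python
def field_population_rates(records, fields):
    """Fraction of records in which each field is present and non-empty."""
    total = len(records)
    rates = {}
    for name in fields:
        filled = sum(1 for r in records if r.get(name) not in (None, ""))
        rates[name] = filled / total if total else 0.0
    return rates


def scoring_data_issues(records, feature_fields, label_field, allowed_labels, min_rate=0.80):
    """List the reasons this record set is not ready for training (empty means ready)."""
    issues = []
    for name, rate in field_population_rates(records, feature_fields).items():
        if rate <= min_rate:  # the text asks for "over 80%"
            issues.append(f"{name}: only {rate:.0%} populated")
    bad_labels = sum(1 for r in records if r.get(label_field) not in allowed_labels)
    if bad_labels:
        issues.append(f"{label_field}: {bad_labels} record(s) with missing or unexpected labels")
    return issues


# Hypothetical CRM export: 'employees' is sparse and one outcome label is missing.
records = [
    {"industry": "saas", "employees": 120, "outcome": "closed-won"},
    {"industry": "retail", "employees": None, "outcome": "closed-lost"},
    {"industry": "saas", "employees": 40, "outcome": "closed-won"},
    {"industry": "health", "employees": "", "outcome": None},
    {"industry": "fintech", "employees": 900, "outcome": "closed-lost"},
]
issues = scoring_data_issues(
    records, ["industry", "employees"], "outcome", {"closed-won", "closed-lost"}
)
print(issues)
```

This check does not cover the other root cause named above, feature and outcome data living in separate systems. The join has to happen before this function sees the records.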

Anomaly Agent deployed without a baseline period. Symptom: the agent fires alerts on everything or nothing. The model has no baseline to compare against, so it either treats all variation as anomalous or learns a baseline from too little data that doesn't represent the real distribution. Fix: collect 60 to 90 days of baseline data before activating anomaly detection. Run the model in shadow mode during baseline collection: log what it would have flagged, compare that log to actual outcomes, and calibrate the threshold before going live.
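Shadow mode can be as simple as logging would-be alerts without sending them. The sketch below assumes one metric value per day and uses a plain z-score against all prior days. That detector is a stand-in chosen for illustration, not the method any particular Anomaly Agent uses, and the default threshold of 3.0 is likewise an assumption to be calibrated.

```python
from statistics import mean, stdev

MIN_BASELINE_DAYS = 60  # lower bound of the 60-90 day baseline from the text


def shadow_flags(daily_values, z_threshold=3.0, min_baseline=MIN_BASELINE_DAYS):
    """Return (day_index, value, z_score) for days that WOULD have been flagged.

    Nothing is alerted; the result is a log to compare against actual outcomes.
    Days before the minimum baseline is collected are never evaluated.
    """
    flagged = []
    for day in range(min_baseline, len(daily_values)):
        baseline = daily_values[:day]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # a flat baseline gives no meaningful z-score
        z = (daily_values[day] - mu) / sigma
        if abs(z) >= z_threshold:
            flagged.append((day, daily_values[day], round(z, 2)))
    return flagged


# 60 quiet days, then one normal day and one spike.
values = [100 + (i % 2) * 2 for i in range(60)] + [101, 150]
print(shadow_flags(values))
```

Comparing the logged flags against incidents that actually mattered is what justifies raising or lowering `z_threshold` before alerts go live.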

Autonomous Agent deployed without tested tool APIs. Symptom: the agent runs, calls a tool, receives an unexpected response format, and either loops indefinitely or takes an unintended action based on a misparsed response. The root cause is tool schemas that were described but not tested at the API level. Fix: test every tool the agent has access to in isolation before deploying the agent. Verify the response format matches the agent's expectation. Build error branches for each tool's failure modes before the first production run.
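A minimal sketch of that isolation test follows: call the tool, check the response against the shape the agent expects, and route any mismatch to an error branch instead of letting the agent parse it. The tool `fake_crm_lookup`, the schema format, and the returned dict layout are all hypothetical.

```python
def check_tool_response(response, expected_schema):
    """Return a list of mismatches; an empty list means the shape is as expected."""
    if not isinstance(response, dict):
        return [f"expected an object, got {type(response).__name__}"]
    problems = []
    for key, expected_type in expected_schema.items():
        if key not in response:
            problems.append(f"missing field: {key}")
        elif not isinstance(response[key], expected_type):
            actual = type(response[key]).__name__
            problems.append(f"{key}: expected {expected_type.__name__}, got {actual}")
    return problems


def call_tool_safely(tool, args, expected_schema):
    """Call a tool and escalate on exceptions or schema mismatches."""
    try:
        response = tool(**args)
    except Exception as exc:  # error branch: the tool itself failed
        return {"ok": False, "escalate": True, "reason": f"{type(exc).__name__}: {exc}"}
    problems = check_tool_response(response, expected_schema)
    if problems:  # error branch: the tool answered in an unexpected format
        return {"ok": False, "escalate": True, "reason": "; ".join(problems)}
    return {"ok": True, "result": response}


# Hypothetical tool that returns the amount as a string instead of a number.
def fake_crm_lookup(deal_id):
    return {"deal_id": deal_id, "amount": "1200"}


result = call_tool_safely(fake_crm_lookup, {"deal_id": "D-1"}, {"deal_id": str, "amount": float})
print(result)
```

Running the same check against each real tool endpoint before launch is what turns a described schema into a tested one.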

Data readiness audit checklist

Run this before greenlighting any pattern implementation:

Data availability

  • The required data exists and is accessible to the system you're building
  • Access permissions are confirmed (not assumed from org chart)
  • Data volume is sufficient (minimum record counts for training, embedding, or baseline)

Data quality

  • Outcome labels exist and are accurate for patterns that require them (Scoring, Anomaly)
  • Key fields have over 80% population rate (not mostly empty or null)
  • No systematic bias in the training set that would skew model outputs

The NIST AI Risk Management Framework identifies data accuracy, completeness, consistency, validity, uniqueness, and timeliness as the six primary dimensions that determine whether AI systems produce trustworthy outputs. Each item in this checklist maps to one or more of those dimensions.

Data freshness

  • Data is current enough to be relevant (stale data is worse than no data for some patterns)
  • A refresh cadence is defined and owned, not assumed
  • Old data beyond a useful horizon is excluded or down-weighted
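As one illustration of "excluded or down-weighted," the sketch below assigns each record a weight that halves with age and drops to zero past a cutoff. The half-life and horizon values are placeholder assumptions; the right numbers depend on the pattern and on how fast the underlying business changes.

```python
from datetime import date, timedelta


def freshness_weight(record_date, today, half_life_days=180, horizon_days=730):
    """Weight in [0, 1]: 1.0 for today's data, halving every half_life_days,
    and 0.0 (excluded) once the record is older than horizon_days."""
    age_days = max((today - record_date).days, 0)
    if age_days > horizon_days:
        return 0.0
    return 0.5 ** (age_days / half_life_days)


today = date(2025, 1, 1)
for age in (0, 180, 360, 800):
    print(age, freshness_weight(today - timedelta(days=age), today))
```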

Infrastructure readiness

  • Ingestion pipeline is built and tested
  • Storage and compute are provisioned
  • API endpoints are confirmed accessible with correct permissions
  • Latency requirements are met by the infrastructure configuration

Governance

  • Data usage is covered by terms of service or user consent
  • PII handling is defined and compliant with applicable regulation
  • Audit trail is in place for any Execute-path outputs

If any checkbox is unchecked, the pattern is not ready to deploy. The missing item is a prerequisite, not a nice-to-have.

Infrastructure prerequisites teams miss

Embedding pipeline for RAG. This is not "upload your documents to the tool." It's a scheduled pipeline that: reads new or updated documents, chunks them by section, generates embeddings using the same model version as the retrieval endpoint, writes to the vector database, and handles deleted or superseded documents by removing their embeddings. This pipeline is an engineering investment. Scoping it as "the vendor handles it" usually means it's not actually running, which is why the knowledge base goes stale.
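The change-detection step of such a pipeline can be sketched with content hashes: compare what is in the document source now against what was hashed at the last embedding run. Chunking, the embedding call, and the vector database writes are deliberately left out. The function names and the in-memory dicts are assumptions standing in for a real document store and index.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's current content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_embedding_sync(current_docs, indexed_hashes):
    """Decide what a scheduled refresh run has to do.

    current_docs:   doc_id -> current text in the document source
    indexed_hashes: doc_id -> hash recorded when the doc was last embedded
    Returns (to_embed, to_delete, new_hashes).
    """
    new_hashes = {doc_id: content_hash(text) for doc_id, text in current_docs.items()}
    # New or changed documents need (re-)chunking and (re-)embedding.
    to_embed = sorted(d for d, h in new_hashes.items() if indexed_hashes.get(d) != h)
    # Documents that disappeared from the source must have their embeddings removed.
    to_delete = sorted(d for d in indexed_hashes if d not in new_hashes)
    return to_embed, to_delete, new_hashes


current = {"pricing": "v2 pricing", "returns": "unchanged policy", "onboarding": "new doc"}
indexed = {
    "pricing": content_hash("v1 pricing"),
    "returns": content_hash("unchanged policy"),
    "legacy-sku": content_hash("retired product"),
}
to_embed, to_delete, _ = plan_embedding_sync(current, indexed)
print(to_embed, to_delete)
```

If a scheduled job produces this plan and nothing consumes it, the knowledge base is going stale, which is the failure the paragraph above describes.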

CRM webhooks for Scoring and Routing. The scoring model needs to run whenever a relevant record changes. That requires CRM webhooks configured to fire on the right events (lead created, deal stage updated, contact information changed). Many CRM implementations have webhooks available but not configured. This is a three-day engineering task that blocks the entire Scoring pattern if missed.
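On the receiving side, the handler only needs to decide whether an incoming event should trigger a rescore. The event names and payload keys below are hypothetical, since every CRM defines its own, and `rescore` stands in for whatever function invokes the scoring model.

```python
# Hypothetical event names; substitute the ones your CRM actually emits.
RESCORE_EVENTS = {"lead.created", "deal.stage_updated", "contact.updated"}


def handle_crm_webhook(payload, rescore):
    """Trigger a rescore for relevant events; return True if one was triggered."""
    event = payload.get("event")
    record_id = payload.get("record_id")
    if event not in RESCORE_EVENTS or record_id is None:
        return False  # irrelevant or malformed event: ignore it
    rescore(record_id)
    return True


rescored = []
handle_crm_webhook({"event": "deal.stage_updated", "record_id": "D-42"}, rescored.append)
handle_crm_webhook({"event": "note.added", "record_id": "D-42"}, rescored.append)
print(rescored)
```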

Audio processing pipeline for Meeting Intelligence. Recordings need to: be captured with sufficient quality (minimum 16 kHz mono), be stored accessibly, be associated with the correct participant and deal metadata, and be processed in a reasonable time window after the meeting ends. If recordings are stored in a vendor system that has no export API, or if the quality is too low for accurate transcription, the pattern can't run. This is a physical infrastructure constraint that no amount of model quality can solve.
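For recordings available as WAV files, the sample-rate floor can be checked with Python's standard-library `wave` module before any transcription spend. This sketch assumes uncompressed WAV input; other formats would need a different reader. `make_test_wav` exists only to generate a silent file for the demo.

```python
import io
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # the 16 kHz floor from the text


def audio_quality_issues(wav_file):
    """Inspect a WAV file (path or file-like object) and list blocking issues."""
    with wave.open(wav_file, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
    issues = []
    if rate < MIN_SAMPLE_RATE_HZ:
        issues.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if frames == 0:
        issues.append("recording is empty")
    return issues


def make_test_wav(sample_rate, seconds=1):
    """Build a silent mono 16-bit WAV in memory (demo helper only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)
    buffer.seek(0)
    return buffer


print(audio_quality_issues(make_test_wav(8_000)))
print(audio_quality_issues(make_test_wav(16_000)))
```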

| Prerequisite failure type | Most affected patterns | Typical discovery timing | Avg. delay caused |
|---|---|---|---|
| No labeled outcome data | Scoring + Routing, Anomaly Agent | Week 3-4 of testing | 8-12 weeks |
| Knowledge base never refreshed | RAG Assistant | Week 3 of pilot (when user spots wrong answer) | 4-6 weeks |
| Audio stored without export API | Meeting Intelligence | Pre-build vendor audit (if done) or week 1 of integration | 6-10 weeks |
| Tool APIs untested | Autonomous Agent | First production run | 2-4 weeks plus incident recovery |
| CRM webhooks not configured | Scoring + Routing, Workflow Copilot | Integration testing, week 2 | 1-3 weeks |

Sequencing for resource-constrained teams

When you can't build all patterns simultaneously, sequence for maximum early value and minimum prerequisite debt:

Start with no-dependency patterns that have standalone value. RAG Assistant (if you have a knowledge base) and Scoring and Routing (if you have labeled historical data) can both deploy independently and deliver immediate value. Neither waits on output from another pattern, so starting them first doesn't create rework for downstream implementations. For how to sequence these choices across a multi-year plan, see "Sequencing AI patterns in a roadmap."

Collect the data you'll need later, starting now. If you plan to add Meeting Intelligence in six months, start storing call recordings in the right format today. If you plan to add an Anomaly Agent, start collecting consistent metrics from a defined baseline date. The data collection cost is low. The discovery that you needed 90 days of data and only have 12 is high.

Deploy Workflow Copilot after its upstream dependencies are running. A copilot built before Meeting Intelligence and Scoring and Routing produces generic suggestions rather than context-rich ones. Wait until the upstream patterns are producing data before investing in the copilot layer.

Updating prerequisites over time

Patterns that work in year 1 may degrade in year 2 if their prerequisites aren't maintained:

  • Knowledge bases grow stale as products and policies change
  • Scoring models drift as the market composition changes (more enterprise customers than when the model was trained, different close rates, different sales cycles)
  • Anomaly detection baselines built in one quarter may be wrong for a different seasonal pattern

McKinsey's research on charting a path to the data- and AI-driven enterprise recommends building one data foundation for analytics and AI, used everywhere rather than separate pipelines per system. That approach is the infrastructure equivalent of defining your prerequisite maintenance calendar before you need it.

Build a maintenance calendar for each pattern's prerequisites:

  • RAG knowledge base: review and update quarterly at minimum; major product changes trigger immediate refresh
  • Scoring model: retrain every 6 months against fresh outcome data; monitor model drift metrics monthly
  • Anomaly baseline: recalibrate any time there's a significant business change (new product line, new market, major team change)
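The time-based parts of that calendar can be checked mechanically. The sketch below encodes the intervals from the list above as day counts, approximating a quarter as 91 days and six months as 182. The task names are assumptions, and event-triggered items such as a major product change still need a human trigger.

```python
from datetime import date, timedelta

# Intervals from the maintenance list, expressed in days (approximations).
REVIEW_INTERVAL_DAYS = {
    "RAG knowledge base review": 91,   # quarterly at minimum
    "Scoring model retrain": 182,      # every 6 months
    "Scoring drift check": 30,         # monthly
}


def overdue_tasks(last_done, today):
    """Return tasks never done or last done longer ago than their interval."""
    overdue = []
    for task, interval in REVIEW_INTERVAL_DAYS.items():
        done = last_done.get(task)
        if done is None or today - done > timedelta(days=interval):
            overdue.append(task)
    return overdue


last_done = {
    "RAG knowledge base review": date(2025, 1, 10),
    "Scoring model retrain": date(2025, 3, 1),
}
print(overdue_tasks(last_done, today=date(2025, 6, 1)))
```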

The prerequisite audit at deployment is not a one-time event. It's the starting point for an ongoing maintenance rhythm.

Frequently Asked Questions

What is the most commonly missed AI implementation prerequisite?

Data availability is assumed, but data accessibility and quality are not confirmed. A record that exists in a CRM is not the same as a record whose outcome label is accurate, whose feature fields are populated, and whose format is compatible with the model that needs to consume it. The RAND Corporation found 85% of failed AI projects cite data quality as a root cause.

How long does a prerequisite audit typically take?

A thorough prerequisite audit across all three dependency categories (data, infrastructure, pattern dependencies) takes 2-3 weeks for a single pattern and 4-6 weeks for a multi-pattern stack. That investment eliminates the 8-12 week delays that occur when missing prerequisites surface during integration testing. Winning programs earmark 50-70% of their AI project timeline for data readiness work.

Do all AI patterns have the same prerequisites?

No. RAG Assistant, Document Review, and Vision Extract have no upstream pattern dependencies and can deploy first. Meeting Intelligence, Scoring and Routing, and Generative Research also have no pattern dependencies but have specific data requirements. Workflow Copilot and Anomaly Agent frequently depend on upstream patterns to produce context-rich outputs. Autonomous Agent has the most stringent infrastructure prerequisites, requiring every tool API to be tested before deployment.

What happens if you deploy a Scoring model without labeled historical data?

The scoring model produces scores that do not correlate with actual outcomes. High-scored leads fail to close at the predicted rate. Low-scored leads convert at rates the model assigned low probability. The model looks active but functions as noise. Fix: before training, validate that the historical record set has consistent outcome labels and that feature fields are populated in over 80% of records.

How often should AI pattern prerequisites be re-audited after initial deployment?

RAG knowledge bases should be reviewed quarterly at minimum, with immediate refreshes triggered by major product or policy changes. Scoring models should be retrained every six months against fresh outcome data, with monthly drift monitoring. Anomaly detection baselines need recalibration any time a significant business change occurs (new product line, new market, major team restructure). Prerequisites are not a one-time check.

What is the Pattern Dependency Map?

The Pattern Dependency Map is a prerequisite audit structure that categorizes every AI pattern along three axes before implementation: data dependencies, infrastructure dependencies, and pattern dependencies (upstream patterns that must be running first). Running the map before build decisions eliminates the three-to-six-month delays that occur when missing prerequisites surface mid-integration.

