
Stage 1 to 2: From Ad-Hoc to Pilot, The Most Common AI Stall Point

Stage 1 to Stage 2 AI maturity transition framework showing the path from ad-hoc AI use to governed piloting

Your team is already using AI. ChatGPT, Copilot, Gemini, Perplexity, maybe a dozen others. Nobody knows exactly which. There's no policy. There's no budget line. Some of it lives on personal accounts. And you have no idea what's working, what's exposing the company to risk, or whether any of it moves the business forward. That's Stage 1.

It's also where most companies in 2026 are sitting. McKinsey research found that just 11% of companies have adopted gen AI at scale, and nearly two-thirds haven't scaled beyond a few pilots. You're not behind. You're in the majority.

The Stage 1-to-2 transition isn't glamorous. It doesn't involve buying a new platform or announcing an AI initiative at an all-hands. It involves making a single intentional choice: picking one use case, naming an owner, defining what success looks like, and measuring it. That's it.

But this is precisely the step most companies skip. Instead they announce more pilots, form a committee, or keep waiting for the "right" AI strategy to emerge. The result is Stage 1 drag: the shadow AI problem grows, compliance risk accumulates, and 12 months later the board asks what the company has actually done.

This article gives you the concrete playbook for making the transition.

What Stage 1 actually looks like

Key Facts: The Stage 1 Reality

  • 78% of AI users bring their own AI tools to work, and most organizations don't know which tools are in use across their workforce (Microsoft Work Trend Index, 2024)
  • 60% of AI projects unsupported by AI-ready data will be abandoned through 2026, making a data readiness audit the highest-leverage action at Stage 1 before any pilot is launched (Gartner, 2025)
  • Only 11% of companies have adopted generative AI at scale; nearly two-thirds haven't scaled beyond a few pilots, meaning the Stage 1-to-2 transition is still the most common stall point in enterprise AI (McKinsey, 2025)

Before you can exit Stage 1, you need to recognize you're in it. Here are the diagnostic signs.

Individual tool use without inventory. Employees are using AI tools you haven't approved, some you don't know exist. A few are paying out of pocket. Others are using free tiers. Nobody's been told not to, because there's no policy.

No formal budget line for AI. Any AI spend is buried in individual expense reports, software subscriptions, or departmental budgets that don't call it AI. The Chief Financial Officer (CFO) can't tell you what the company spends on AI.

IT and Legal are worried but not empowered. IT has heard about data going into ChatGPT. Legal has a vague concern about IP. But neither team has authority, mandate, or guidelines to do anything about it yet.

No return on investment data. Individual employees will tell you AI "saves time" but nobody can quantify how much or attach it to business outcomes. The productivity gains are anecdotal.

Competing tool subscriptions. Multiple teams are evaluating or lightly using different tools for the same problem. Sales is looking at one AI sales tool. Marketing just signed up for a different one. They don't talk to each other.

No security review of AI tools. Vendors haven't gone through the standard procurement security review that applies to your other SaaS. They were simply paid for on an employee credit card.

If three or more of these are true, you're in Stage 1. That's fine. Most companies in 2026 are. But it's not a stable state.
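The "three or more" rule can be written down as a quick self-check. This is a minimal sketch, not a formal assessment tool: the field names are hypothetical labels for the six signs above, and the threshold of three comes from the text.

```python
from dataclasses import dataclass, fields


@dataclass
class Stage1Signs:
    """One flag per diagnostic sign; True means the sign applies to you.

    Field names are illustrative labels, not terms from a standard.
    """
    untracked_tool_use: bool
    no_ai_budget_line: bool
    it_legal_not_empowered: bool
    no_roi_data: bool
    competing_subscriptions: bool
    no_security_review: bool


def signs_present(signs: Stage1Signs) -> int:
    """Count how many of the six signs are true."""
    return sum(bool(getattr(signs, f.name)) for f in fields(signs))


def is_stage_1(signs: Stage1Signs, threshold: int = 3) -> bool:
    """Apply the article's rule: three or more signs means Stage 1."""
    return signs_present(signs) >= threshold


# Hypothetical company: untracked tools, no budget line, no ROI data.
example = Stage1Signs(True, True, False, True, False, False)
print(signs_present(example), is_stage_1(example))
```

The value of writing it down is mostly that someone has to answer each question explicitly instead of assuming.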

Why Stage 1 is both normal and risky

Stage 1 isn't a failure. It's how AI enters every organization. Employees experiment before leadership has a framework. That's actually healthy early on. The problem is staying there.

The risk is real. Gartner predicts that organizations will abandon 60% of AI projects unsupported by AI-ready data through 2026, which means Stage 1 companies that skip governance and data work are setting up expensive failures. The risk accumulates along three dimensions.

Data exposure. Employees are pasting content into public AI tools without knowing whether that content is sensitive. Customer records, financial projections, draft contracts, internal strategy documents. Every paste is a potential data exposure. Without a policy defining what can and can't go into AI tools, the default is "everything goes in."

IP and compliance risk. If an employee uses AI to generate content, code, or analysis, questions about IP ownership, bias, and regulatory compliance don't have a company-sanctioned answer. The employee acted alone. The company is liable.

Opportunity cost. The more teams experiment without coordination, the less organizational learning accumulates. Every team reinvents the same prompts. Nobody shares what works. The company pays for AI effort without getting AI value.

The good news is you don't need to solve all of this to move to Stage 2. You need to solve enough to make one structured pilot possible.

Stage 2 exit criteria

Stage 2 isn't a destination. It's a new floor. Here are the three requirements to call yourself a Stage 2 organization.

Requirement | What it means | What it doesn't mean
AI policy exists | A written policy covering acceptable use, data restrictions, and approval process, shared with all employees | A perfect, comprehensive, legally reviewed policy. A working draft is fine.
At least one defined pilot with a named owner | One use case with a hypothesis, a success metric, a time boundary, and one person accountable for results | Multiple pilots running simultaneously without accountability or measurement
Baseline measurement before the pilot starts | You know the current state: hours spent, cost incurred, or quality level, before AI changes anything | Post-hoc rationalization of results

All three need to be true. If you have a policy but no pilot, you're Stage 1 with better governance. If you have a pilot but no policy, you're Stage 1 with better experimentation. If you have both but no baseline, you can't prove what the pilot changed. Stage 2 requires all three.

How to pick the first pilot

Use case selection is where most Stage 1-to-2 transitions fail. Teams either pick the most exciting use case (customer-facing, highest visibility, hardest data problems) or they let the loudest department drive the choice. Neither approach works.

The right framework has three filters. Apply them in order.

Filter 1: Data readiness. Before picking any use case, ask whether you have clean, accessible, policy-cleared data to support it. Data readiness is the most common silent killer of AI pilots. A use case with great business appeal but poor data readiness will fail. A use case with lower appeal but strong data will teach you something real. Start with the data you have, not the data you wish you had.

Filter 2: Risk profile. For your first pilot, avoid customer-facing Execute capabilities. Execute actions have direct, visible consequences: emails sent, records updated, deals changed, responses delivered. When things go wrong in a pilot, you want the impact internal. Rate each use case on a simple risk scale: low (internal only, human reviews outputs), medium (customer-facing, AI drafts but human sends), high (fully automated customer interaction). Pick a low-risk use case for Pilot 1. The Generate vs. Execute boundary explains why this distinction matters for your first pilot.

Filter 3: Impact potential. Among the low-risk, data-ready options, pick the one with the clearest business impact: hours saved, conversion rate improved, error rate reduced. This doesn't need to be huge. It needs to be measurable.

A concrete example. A 50-person SaaS company applies this framework and surfaces three candidates: (1) AI-assisted lead scoring using CRM data, (2) AI-generated first-draft outbound email sequences for sales development reps, and (3) AI-powered support ticket categorization and routing.

Lead scoring (option 1) fails Filter 1: the CRM has incomplete data for 40% of records, and scoring depends on those fields. Option 3 fails Filter 2 for their risk tolerance, since it touches customer responses. Option 2 passes all three filters. The data the email sequences need is clean and accessible. The AI only produces first drafts, so the work stays inside the SDR team until a rep sends the email. And they can measure reply rate and meeting-booked rate directly. Pilot 1 is AI-generated SDR email sequences.

That's the whole selection framework.
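Applied in order, the three filters amount to a short screening function. The sketch below is illustrative only: the `UseCase` fields are hypothetical, and the ratings for the three candidates are assumptions that mirror the example company's own judgment, not values given in the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UseCase:
    name: str
    data_ready: bool              # Filter 1: clean, accessible, policy-cleared data
    risk: str                     # Filter 2: "low", "medium", or "high"
    impact_metric: Optional[str]  # Filter 3: the measurable outcome, if any


def shortlist_first_pilot(candidates: List[UseCase]) -> List[UseCase]:
    """Apply the filters in order and return the candidates that survive."""
    data_ready = [c for c in candidates if c.data_ready]      # Filter 1
    low_risk = [c for c in data_ready if c.risk == "low"]     # Filter 2
    return [c for c in low_risk if c.impact_metric]           # Filter 3


# Assumed ratings for the 50-person SaaS example.
candidates = [
    UseCase("lead scoring", False, "low", "conversion rate"),
    UseCase("SDR email first drafts", True, "low", "reply rate"),
    UseCase("support ticket routing", True, "medium", "routing error rate"),
]

shortlist = shortlist_first_pilot(candidates)
print([c.name for c in shortlist])
```

If more than one candidate survives, the final choice is still a judgment call about which impact is clearest. The code only removes options that should not be on the table.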

Building the pilot charter

Once you've picked the use case, formalize it. The pilot charter doesn't need to be long. It needs to exist.

A Stage 2 pilot charter has five elements:

1. The hypothesis. State what you believe will happen and why. "We believe AI-assisted SDR emails will increase reply rate by 15% because our reps spend 40% of their prospecting time on email personalization that AI can do faster."

2. The success metric. One primary metric. Not five. One. For the SDR example: reply rate on AI-assisted sequences vs. control group sequences over 60 days.

3. The baseline measurement. The current state, measured before the pilot starts. If you don't measure before, you can't prove after. Pull the current reply rate data before you touch anything.

4. The time boundary. Pilots without end dates run forever. Set 60 or 90 days. At the end, you decide: scale, extend, or kill. All three outcomes are valid. Running indefinitely is not.

5. The named owner. One person is accountable for the pilot results. Not a committee. Not a working group. One person who presents results at the end of the time boundary.

If you can't fill in all five, you're not ready to start the pilot.
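One way to enforce the "all five or don't start" rule is to keep the charter as a small record and check it for blanks. This is a sketch under assumptions: the field names, the example date, and the owner label are hypothetical, and a one-page document serves the same purpose.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class PilotCharter:
    hypothesis: Optional[str] = None        # what you expect to happen and why
    success_metric: Optional[str] = None    # one primary metric
    baseline_value: Optional[float] = None  # measured before the pilot starts
    end_date: Optional[date] = None         # the time boundary
    owner: Optional[str] = None             # one accountable person

    def missing_elements(self) -> List[str]:
        """Return the names of charter elements that are still blank."""
        return [name for name, value in vars(self).items()
                if value is None or value == ""]

    def ready_to_start(self) -> bool:
        return not self.missing_elements()


# Hypothetical draft: everything filled in except the baseline.
draft = PilotCharter(
    hypothesis="AI-assisted SDR emails will increase reply rate by 15%",
    success_metric="reply rate, AI-assisted vs. control sequences",
    end_date=date(2026, 6, 30),  # illustrative date
    owner="SDR team lead",       # illustrative role
)
print(draft.missing_elements(), draft.ready_to_start())
```

A baseline of `0.0` counts as filled in, since zero is a legitimate measurement. Only a missing value blocks the start.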

"The Stage 1-to-2 transition requires exactly one thing: one pilot with a hypothesis, a measurable baseline, and a named owner. Not a strategy deck, not a governance committee, not an enterprise AI platform contract. One bounded, measurable experiment. That's the whole bar." (Rework)

The Stage 1-to-2 Crossing Test

A four-question diagnostic that confirms an organization has genuinely crossed from Stage 1 to Stage 2, rather than relabeled its Stage 1 activities.

  1. Does a written AI use policy exist and have all employees confirmed receipt?
  2. Is there exactly one named pilot with a documented hypothesis and success metric?
  3. Was the baseline measurement captured before the pilot started?
  4. Does the pilot have a named owner and a defined end date?

If any answer is "no," the organization is still in Stage 1. The Crossing Test is deliberately strict: it's easy to claim Stage 2 status based on activity. The Crossing Test measures governance and structure, not activity volume.
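The Crossing Test is strict because it is a logical AND across four answers. A minimal sketch, with hypothetical key names standing in for the four questions:

```python
from typing import Dict, List

# Illustrative keys, one per Crossing Test question.
CROSSING_QUESTIONS = [
    "policy_written_and_receipt_confirmed",
    "one_named_pilot_with_hypothesis_and_metric",
    "baseline_captured_before_start",
    "named_owner_and_end_date",
]


def failed_questions(answers: Dict[str, bool]) -> List[str]:
    """Return the questions answered "no". A missing answer counts as "no"."""
    return [q for q in CROSSING_QUESTIONS if not answers.get(q, False)]


def crossing_test(answers: Dict[str, bool]) -> str:
    """Any single "no" keeps the organization in Stage 1."""
    return "Stage 1" if failed_questions(answers) else "Stage 2"


# Hypothetical organization that started its pilot before measuring.
answers = {
    "policy_written_and_receipt_confirmed": True,
    "one_named_pilot_with_hypothesis_and_metric": True,
    "baseline_captured_before_start": False,
    "named_owner_and_end_date": True,
}
print(crossing_test(answers), failed_questions(answers))
```

Treating an unanswered question as a "no" matches the spirit of the test: if nobody can confirm it, it hasn't happened.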

Minimum viable governance for Stage 2

Your AI policy at Stage 2 doesn't need to be a 40-page legal document. It needs to cover five things.

Approved tools list. The specific AI tools employees are permitted to use, with the conditions under which they're permitted. Start with what people are already using and make those official. Add approval criteria for new tools.

Data restrictions. Which categories of data cannot be entered into external AI tools without explicit approval. At minimum: customer personally identifiable information (PII), financial projections, mergers and acquisitions-related content, and confidential contracts. This single decision addresses the largest source of data-exposure risk at Stage 1.

New tool request process. How an employee gets a new AI tool approved. Keep it lightweight: a form, a named reviewer (IT or Legal), and a 5-business-day turnaround. The goal isn't to block adoption. It's to create a record.

Incident reporting. What employees should do if an AI tool does something wrong: wrong output sent to a customer, data inadvertently exposed, model produces discriminatory content. Even a simple "email [name] immediately" creates accountability.

No-use zones. Specific decisions AI cannot make without human review. Regulated decisions (credit, hiring, medical) are the floor. Add anything specific to your industry.

This policy doesn't need legal sign-off to be useful. It needs to exist and be shared. You refine it as you learn.
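The approved tools list and the data restrictions combine into one simple rule. The sketch below assumes hypothetical tool names and category labels. The four restricted categories follow the minimum list above, and real enforcement would live in training and tooling rather than in a script.

```python
from typing import Set

# Minimum restricted categories from the policy; the labels are illustrative.
RESTRICTED_CATEGORIES = frozenset({
    "customer_pii",
    "financial_projections",
    "m_and_a",
    "confidential_contracts",
})


def may_enter(data_category: str, tool: str, approved_tools: Set[str],
              explicit_approval: bool = False) -> bool:
    """Return True if the policy allows this data category in this tool."""
    if tool not in approved_tools:
        return False              # unapproved tools go through the request process
    if data_category in RESTRICTED_CATEGORIES:
        return explicit_approval  # restricted data needs explicit approval
    return True


# Hypothetical approved list; start with what people already use.
approved = {"tool_a", "tool_b"}
print(may_enter("marketing_copy", "tool_a", approved))
print(may_enter("customer_pii", "tool_a", approved))
print(may_enter("marketing_copy", "tool_c", approved))
```

The point is not automation. Writing the rule this plainly shows that a working draft policy can fit on one page.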

Building Your AI Use Policy covers this in full detail with section-by-section guidance.

The data readiness check before committing

Most Stage 1 companies are surprised by how unready their data is for AI pilots. Before you commit to a use case, run a five-question audit.

  1. Can you access the data the AI would need today, without a multi-week IT project?
  2. Are the key fields at least 70% populated, with no significant null gaps?
  3. Is the data recent enough to reflect current business reality?
  4. Is there one authoritative source, rather than competing systems with conflicting records?
  5. Has legal or security cleared this data category for use in external AI tools?

If you answer "no" or "I don't know" to two or more questions, the use case has a data readiness dependency that will surface as pilot failure. Either fix the data first or pick a different use case. The Data Readiness article gives you the full audit framework.
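The scoring rule for the audit is simple enough to sketch. The question keys below are hypothetical shorthand, phrased so that "yes" is always the good answer, and the threshold of two comes from the text.

```python
from typing import Dict, List

# Illustrative keys for the five audit questions; "yes" is the good answer.
AUDIT_QUESTIONS = (
    "accessible_today",
    "key_fields_70pct_populated",
    "recent_enough",
    "single_authoritative_source",
    "cleared_for_external_ai",
)


def readiness_gaps(answers: Dict[str, str]) -> List[str]:
    """Return questions answered anything other than "yes".

    "no", "unknown", and unanswered questions all count as gaps.
    """
    return [q for q in AUDIT_QUESTIONS if answers.get(q, "unknown") != "yes"]


def has_data_dependency(answers: Dict[str, str], threshold: int = 2) -> bool:
    """Two or more gaps means fix the data or pick another use case."""
    return len(readiness_gaps(answers)) >= threshold


# Hypothetical lead-scoring audit: sparse fields, clearance never checked.
lead_scoring = {
    "accessible_today": "yes",
    "key_fields_70pct_populated": "no",
    "recent_enough": "yes",
    "single_authoritative_source": "yes",
    "cleared_for_external_ai": "unknown",
}
print(readiness_gaps(lead_scoring), has_data_dependency(lead_scoring))
```

Counting "I don't know" as a gap is deliberate. Unknowns in a data audit tend to turn into "no" once someone checks.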

Common Stage 1-to-2 failure modes

Failure mode 1: Picking the wrong first pilot. The highest-profile, most exciting use case is almost never the right first pilot. Customer-facing, high-risk, data-poor. Pick boring and measurable over exciting and complex.

Failure mode 2: Skipping the baseline. "We'll figure out the ROI after the pilot" produces arguments, not evidence. Always measure before you change anything. If you forgot to measure before and the pilot is already running, stop and measure now. Any baseline is better than none.

Failure mode 3: Policy paralysis. Some organizations try to write the perfect AI policy before starting any pilot. They consult Legal, IT, Compliance, HR. The policy review runs six months. Meanwhile, shadow AI expands. A working draft with known gaps beats a perfect policy that doesn't exist yet.

Failure mode 4: Too many pilots. "We should run five pilots in parallel to learn faster." No. Five pilots with no single owner, no control groups, and no shared infrastructure produce five inconclusive data points. One well-run pilot with proper measurement produces one real answer.

Failure mode 5: Changing the metric mid-pilot. If the pilot isn't producing the results you hoped for, the temptation is to switch metrics. Don't. The metric is set in the charter. If the pilot fails by the original metric, that's useful information. "The AI email sequences didn't improve reply rate" is a real finding. Pivoting to a different metric mid-stream to salvage a failing pilot produces misleading data.
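Fixing the metric in advance also means fixing how it will be computed. The sketch below evaluates the SDR example against its charter, under two stated assumptions: the 15% target is read as a relative lift over the control group, and all counts are invented for illustration. It does not test statistical significance, which a real read-out with small samples would also need.

```python
TARGET_LIFT = 0.15  # from the charter hypothesis, read as a relative increase


def reply_rate(replies: int, sent: int) -> float:
    """Share of sent emails that received a reply."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return replies / sent


def relative_lift(pilot_rate: float, control_rate: float) -> float:
    """Relative change of the pilot rate over the control rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (pilot_rate - control_rate) / control_rate


# Invented counts for a 60-day pilot.
control = reply_rate(replies=80, sent=1000)   # control group sequences
assisted = reply_rate(replies=86, sent=1000)  # AI-assisted sequences

lift = relative_lift(assisted, control)
target_met = lift >= TARGET_LIFT
print(f"lift={lift:.1%}, target met: {target_met}")
```

With these numbers the pilot misses its target. Under the charter, that result is reported as it stands rather than swapped for a friendlier metric.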

What Stage 2 actually feels like

A Stage 2 organization has a policy posted somewhere (shared drive, company handbook), one pilot with a charter and an owner, a start date, a scheduled read-out, and a baseline measurement on file. That's genuinely it.

It doesn't feel like a transformation. It feels like one small project managed correctly. That's the point.

The transformation happens because this single pilot, run well, produces real data that builds the case for Stage 3. Companies that sprint to Stage 3 without a Stage 2 foundation find themselves with multiple AI tools, no shared infrastructure, and no evidence that any of it works. They've built Stage 3 complexity on a Stage 1 data foundation.

Rework Analysis: Based on enterprise AI transition patterns, the median time to complete the Stage 1-to-2 Crossing Test for a mid-market company is 8-14 weeks when the CEO has set the mandate. The most common cause of delay is the baseline measurement requirement: teams that discover they can't easily pull the pre-pilot metric realize they have a data readiness problem that must be addressed before the pilot starts. This delay is actually valuable. It surfaces the data gap before it kills the pilot, rather than after.

Stage 2 is unglamorous. Do it anyway.

What comes next

Once your first pilot completes and you've made the scale/extend/kill decision, you're ready to think about moving from pilot to production. That transition (moving to Stage 3) has its own set of requirements, infrastructure decisions, and failure modes. The next step is the hardest one in the whole maturity curve.

Read: Stage 2 to 3: From Pilot to Scaled for the production deployment checklist and infrastructure requirements.

Read: The 5 Stages of AI Maturity to see where this transition fits in the full maturity model.

And if you're wondering whether your transformation will stick: Why Most AI Transformations Fail covers the structural reasons most organizations stall between stages.
