AI in the Growth Marketer Workflow

It's 9:14 on a Tuesday. The SDR tool just emailed you about its "new AI agent for growth marketing." Subject line uses the word unlock. You haven't opened it. You've been twenty minutes deep in a Claude tab, pasting in a CSV of 14,000 signup events, asking it to find the weird drop-off between step three and step four of activation. It found two. One is real. One is you forgetting to filter out internal users. That tab (the one nobody on the team sees, the one that doesn't show up in any vendor pitch deck) is your actual AI workflow.

The Growth Marketing Manager job description you got hired off says you should be "AI-fluent." Nobody told you what that meant at 9 a.m. on a Tuesday. The vendor demos make it sound like AI is a button you press to generate, run, and ship a winning experiment. The reality is messier and a lot more useful: AI is five prompts you keep in a Notion doc, and the discipline to know when it's lying to you. That's the gap this piece is about. The difference between vendor AI (the thing in the email subject line) and workflow AI (the thing that's already in your browser tab).

If you're a growth IC one to four years into the role at a B2B SaaS or PLG company, this is the honest map. Where it saves you hours. Where it quietly produces garbage. The stack you'd actually run, not the Gartner quadrant.

Where AI actually helps

Forget the feature list. Think in moments: the specific points in your week where pasting context into a model changes the next thirty minutes.

Hypothesis generation. This is the biggest unlock and the one that gets the least vendor coverage. Paste your activation funnel and your week-four cohort retention curve into Claude. Ask: "What are the ten weirdest patterns in this data and what would you test next?" You'll throw out seven of the suggestions. They'll be obvious, generic, or wrong. The other three will be tests you wouldn't have thought of, usually because they cross a boundary your team has implicitly decided is someone else's problem (a pricing page nudge that's "marketing's thing," a re-engagement email gate that's "lifecycle's thing"). AI is shameless about crossing boundaries. That's the value.

Lifecycle copy variants. Give it the segment, the trigger event, the prior version's copy, and the goal. Ask for five variants in five voices. You'll keep one and a half. That's faster than briefing a copywriter for a day-three reactivation email that nobody's going to read closely anyway. The honest framing: AI copy is fine for the long tail of lifecycle messages where the marginal ROI of a human-written variant doesn't justify the calendar time. It's not fine for your activation hero copy or your homepage. Match the tool to the stakes.

Cohort analysis sanity-check. Paste the SQL or the chart and ask, "What's wrong with this analysis?" This is the one I use the most. It catches the obvious mistakes before the readout: survivorship bias in your retention curve, weekend seasonality you didn't normalize for, the cohort that turns out to be 80% one big customer who happened to sign up that week. You'd catch most of these in peer review eventually. AI catches them at 9:30 instead of in a Slack thread on Thursday.
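The customer-concentration check is easy to run yourself once the model flags it. Here's a minimal sketch, assuming you can export one row per signup as a (user_id, account_id) pair; the field names and the sample cohort are hypothetical, not from any particular analytics tool.

```python
from collections import Counter


def top_account_share(signups):
    """Return (account_id, share) for the account with the most users in a cohort.

    `signups` is a list of (user_id, account_id) pairs for one signup cohort.
    """
    if not signups:
        raise ValueError("cohort is empty")
    counts = Counter(account_id for _, account_id in signups)
    account_id, n_users = counts.most_common(1)[0]
    return account_id, n_users / len(signups)


# Hypothetical cohort: 8 of 10 users belong to one account.
cohort = [(f"u{i}", "acme") for i in range(8)] + [("u8", "globex"), ("u9", "initech")]
account, share = top_account_share(cohort)
print(f"{account} is {share:.0%} of this cohort")
```

If one account dominates the cohort, rerun the retention curve without it before the readout.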

Behavioral pattern hunting in event data. Dump 5,000 rows of events for users who churned versus users who retained, ask for differences. Not a forecast. Not a "predict who will churn." A hypothesis pump. The output is a list of "users who retained were 4x more likely to invite a teammate in the first 24 hours," which you then go validate properly in your analytics tool. Treat AI as the thing that surfaces the question, not the thing that answers it.
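Before a claim like "4x more likely" goes anywhere near a readout, recompute it from the raw counts. A rough sketch, assuming you have one boolean per user (did they invite a teammate in the first 24 hours?) for each group; the function name and the sample numbers are made up for illustration.

```python
def rate_ratio(retained_flags, churned_flags):
    """How many times more often a behavior appears among retained users than churned users.

    Each argument is a list of booleans, one per user: True if the user did the behavior.
    """
    if not retained_flags or not churned_flags:
        raise ValueError("both groups need at least one user")
    retained_rate = sum(retained_flags) / len(retained_flags)
    churned_rate = sum(churned_flags) / len(churned_flags)
    if churned_rate == 0:
        return float("inf")  # behavior never seen among churned users
    return retained_rate / churned_rate


# Hypothetical data: 40 of 100 retained users invited a teammate, 10 of 100 churned users did.
retained = [True] * 40 + [False] * 60
churned = [True] * 10 + [False] * 90
ratio = rate_ratio(retained, churned)
print(f"Retained users were {ratio:.1f}x as likely to invite a teammate")
```

A ratio like this is still a correlation. It tells you the question is worth testing, not that invites cause retention.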

Readout summaries. Turn a twelve-tab spreadsheet readout into a three-paragraph Slack post that the Head of Growth will actually read. This is the one task ChatGPT does well first try. Give it the test name, the hypothesis, the numbers, and the verdict. Ask for "three paragraphs, plain English, lead with the result, no jargon." Done. You just got fifteen minutes back, and the message is better than the one you'd write tired at 5 p.m.

Five prompts I keep in Notion

Hypothesis pump. "Here's our activation funnel and 4-week retention by cohort. Give me the 10 weirdest patterns and one test for each."

Copy variants. "Day-3 reactivation email. Segment: [X]. Prior version: [Y]. Five variants in five voices, max 80 words each."

SQL sanity-check. "Here's the SQL and the chart. What's wrong with this analysis? List five risks ranked by severity."

Readout summary. "Test results below. Write a 3-paragraph Slack post for the Head of Growth. Lead with the verdict. No jargon."

Cohort sanity-check. "Here's a retention curve. What artifacts could be inflating it? Survivorship, seasonality, customer concentration, anything else?"

That's the whole stack. Five prompts, one Notion doc.

Where AI breaks (and you'll embarrass yourself)

The other half of "AI-fluent" is knowing when to close the tab. Models are confident in exactly the places they shouldn't be.

Causal claims. AI will happily tell you "the email caused the lift." It cannot know that. It has no holdout group. It has no priors about your other launches that week. It will produce a clean, well-written paragraph attributing 12% activation lift to a copy change, and you'll paste that paragraph into a readout, and someone with a stats background will ask one question that sinks you. The rule is simple. AI never adjudicates causation. Always demand a holdout, a pre-registered hypothesis, and a confidence interval before anything ships as "this caused that."
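For a sense of what "a holdout and a confidence interval" looks like in practice, here's a minimal sketch using a normal-approximation (Wald) interval on the difference between two conversion rates. The counts are hypothetical, and your data team may well prefer a different interval or test; this is one common approach, not the only one.

```python
import math


def lift_confidence_interval(treat_conv, treat_n, hold_conv, hold_n, z=1.96):
    """Absolute lift (treatment rate minus holdout rate) with an approximate 95% CI.

    Uses the normal (Wald) approximation, which is reasonable for large samples.
    Returns (lift, lower_bound, upper_bound) as proportions.
    """
    p_treat = treat_conv / treat_n
    p_hold = hold_conv / hold_n
    lift = p_treat - p_hold
    std_err = math.sqrt(
        p_treat * (1 - p_treat) / treat_n + p_hold * (1 - p_hold) / hold_n
    )
    return lift, lift - z * std_err, lift + z * std_err


# Hypothetical test: 11.2% activation with the new email vs 10.0% in the holdout,
# which is a 12% relative lift on 5,000 users per group.
lift, low, high = lift_confidence_interval(560, 5000, 500, 5000)
print(f"lift: {lift:.3%}, 95% CI: [{low:.3%}, {high:.3%}]")
```

With these made-up numbers the interval straddles zero, so a "12% lift" that reads great in a paragraph is not yet distinguishable from no effect.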

B2B nuance. The model doesn't know your buyer is a CFO with a 90-day procurement cycle, three internal stakeholders, and a quarterly budget review that lands on a Thursday. Outputs read like DTC growth-hack Twitter, with "create urgency," "leverage scarcity," "add a countdown timer." You can teach it your context with a long system prompt, but it'll regress every fourth output. For B2B lifecycle, treat AI as a junior copywriter who's never sat in a sales call.

Retention forecasts. It will fit a curve and project month-twelve retention from your month-three data. The curve is wrong. Long-tail retention almost never follows the shape AI wants to fit, and the model doesn't know the difference between PLG self-serve and sales-led patterns. Use Mixpanel/Amplitude/PostHog native cohort projection, or have your data team run a proper retention model. Not an LLM.

North-star definition. Never let AI pick your metric. The north-star is a strategy conversation with your CEO, your CFO, and your product lead. It's downstream of the business model, the buyer, and the moat. AI doesn't know any of that. It'll suggest "weekly active users" because that's what most articles in its training data said, and that's exactly the kind of metric that gets a PLG company optimizing for the wrong loop for two quarters.

Where AI lies to you

Causal claims. Confident attribution without a holdout.

Retention forecasts. Fits a curve, projects, calls it data.

B2B nuance. Defaults to DTC growth-hack patterns.

North-star definition. Never let a model pick your metric.

AI in personalization (Mutiny / dynamic content) — when it works

Dynamic personalization is the place where the vendor pitch and the workflow reality are closest, but only at scale.

It works when three things are true. The page has high traffic (think tens of thousands of visits a month, not hundreds). The segments are obvious and stable (industry, company size, paid traffic source, named-account list), not behavioral micro-segments that flicker between sessions. And the variant is real: a different proof point, a different headline, a different industry case study. Not just swapping the buyer's first name into a hero and calling it personalized.

It doesn't work for low-traffic pages (you'll never reach significance), for grammatically fragile copy (if 8% of visits get a sentence with the wrong article, the "personalized" page reads worse than the control), or for "personalized" emails that swap a first name and a logo. If your version of "personalization" can be done with a mail-merge field, it's not personalization, it's a mail-merge field.

Pricing reality: Mutiny and Intellimize are enterprise-priced. They make sense for a $20M ARR company with a clear ICP and a marketing team that can build segment-specific creative. They do not make sense for a $2M ARR company whose homepage gets 4,000 visits a month. If a vendor is pitching dynamic personalization to a Series A team, they're pitching the wrong thing.
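The traffic math behind "you'll never reach significance" can be roughed out in a few lines. This sketch uses the standard two-proportion sample-size formula; the 3% baseline conversion rate, the 10% relative lift, and the 80% power target are all assumptions I picked for illustration, so plug in your own.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect a relative lift
    in conversion rate with a two-sided test at the given alpha and power."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def months_to_finish(per_variant, n_variants, monthly_visits):
    """Months of traffic needed if visits are split evenly across variants."""
    return per_variant * n_variants / monthly_visits


# Assumed inputs: 3% baseline conversion, hoping to detect a 10% relative lift,
# control plus one personalized variant, on a page with 4,000 visits a month.
per_variant = visitors_per_variant(0.03, 0.10)
months = months_to_finish(per_variant, n_variants=2, monthly_visits=4000)
print(f"{per_variant:,} visitors per variant, about {months:.0f} months of traffic")
```

Under those assumptions the 4,000-visit homepage needs more than two years to read out a single two-way test, and every extra segment splits the traffic further.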

The "fully automated growth loop" trap

Every six months a vendor demo cycles back through the funnel: "AI generates the test, runs it, reads the result, ships the winner. Your growth program runs itself." The slide deck is gorgeous. The diagram has arrows that loop back into themselves.

Three reasons this is dangerous, in order.

First, you lose the institutional learning. The reason your team is good at growth in eighteen months is not that you ran more tests; it's that the people running the tests built intuition about your buyer, your product, and which patterns generalize. Automate the loop and that intuition never compiles. You end up with a team that can't function without the tool, running tests it can't critically read.

Second, the loop ships before anyone audits the hypothesis. Most failed growth tests fail at the hypothesis stage, not the execution stage. A bad hypothesis dressed up in good copy and shipped to 50% of traffic costs you more than the marginal value of running it. The judgment call (is this question worth answering?) is the highest-leverage moment in the whole experiment, and it's the one you can't outsource.

Third, the loop optimizes for short-term clicks over compounding metrics. AI readout systems will tell you the variant won because the click-through went up. They cannot tell you that the variant attracted lower-quality leads who churned at month two. By the time you notice, you've shipped twelve "winners" that collectively dragged retention down four points.

The growth marketer who automates themselves out of the readout meeting also automates themselves out of the next promotion. Keep the human in the loop where the judgment calls live: hypothesis quality, kill criteria, segment definition, what counts as a win. Let AI do the typing, not the thinking.
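One way to keep "what counts as a win" in human hands is to write the rule down before the test runs. A toy sketch, with hypothetical field names and thresholds you'd set yourself: a variant only ships if click-through clears the bar and month-two retention hasn't dropped past the guardrail.

```python
from dataclasses import dataclass


@dataclass
class Readout:
    ctr_lift: float             # relative change in click-through, variant vs control
    retention_delta_pts: float  # change in month-two retention, in percentage points


def verdict(readout, min_ctr_lift=0.05, max_retention_drop_pts=0.5):
    """Apply pre-agreed win and kill criteria; both thresholds are illustrative."""
    # The guardrail is checked first: a retention drop kills the variant
    # no matter how good the click-through looks.
    if readout.retention_delta_pts < -max_retention_drop_pts:
        return "kill"
    if readout.ctr_lift >= min_ctr_lift:
        return "ship"
    return "inconclusive"


# A "winner" on clicks that lost a full point of month-two retention.
print(verdict(Readout(ctr_lift=0.12, retention_delta_pts=-1.0)))
```

The code is trivial on purpose. The work is in agreeing on the thresholds, and that's the conversation the automated loop skips.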

The practical stack (the one I actually use)

No quadrants, no logo dump. Here's what's in my browser:

Claude (Sonnet for daily, Opus for big context). Analysis, SQL review, anything where I paste 5,000 rows or a long context. Better than ChatGPT at "read this carefully and tell me what's wrong." This is where the cohort sanity-checks and hypothesis pumps live.
ChatGPT. Copy variants, quick rewrites, brainstorming subject lines. Faster turnaround for short tasks. Better tone control on consumer-ish copy. Worse at long context.
Cursor or Windsurf. Only if you write your own SQL or Python. Saves about 30% of the time on the analysis script you'd otherwise pair-program with the data team. Skip this if you don't already write code.
Native AI in Amplitude / Mixpanel / PostHog. The "ask in plain English" feature. Useful for the 80% of questions where you'd otherwise file a ticket with the data team. Don't trust it on causal questions; it'll happily run a query that looks right and gives you the wrong answer.
Mutiny / Intellimize. Only at scale, only for top-of-funnel, only if you have the traffic and the segment-specific creative. If you don't, you're not ready for this tier yet.
Avoid: any tool whose pitch is "AI agent that runs your growth program." That's a button that ships untested hypotheses against your funnel.

Optional: the ACE Framework lens

If you want a strategic frame for where AI fits in growth, the ACE Framework (Ingest, Analyze, Predict, Generate, Execute) maps cleanly. AI helps most in Analyze (cohort sanity-checks, pattern hunting in event data) and Generate (copy variants, hypothesis lists). It's weakest at Predict (retention forecasts and causal claims, the two places it's confidently wrong). It's neutral on Ingest and Execute (those are still tooling problems, not model problems). One paragraph, that's it. Read the ACE Framework if you want the deeper version, but for daily workflow the takeaway is: lean on AI for analysis and generation, never for prediction.

A 30-day plan to integrate AI without breaking your workflow

The mistake most growth marketers make is treating "use AI more" as a tool adoption problem. It's a habit problem. Here's the four-week version.

Week 1. Pick three recurring tasks. Not ten. Three. The readout summary, the lifecycle copy brief, the weekly cohort scan. Build a prompt for each one, save it in Notion with the input format spelled out. Don't try to automate everything. The goal of week one is one good prompt per task, used once.

Week 2. Add Claude or ChatGPT to your readout review. Before you ship the readout, paste the test result and the analysis into Claude. Ask: "What would you push back on if I presented this in a meeting?" Treat the answer as a peer review, not gospel. Half of what comes back is junk. The other half is the question someone in the meeting was about to ask. You'll feel the time savings by Friday.

Week 3. Run one experiment where AI generated the hypothesis. Pick a candidate from your hypothesis-pump prompt. Run it the same way you'd run any other test (proper hypothesis, MDE calculation, holdout, readout). Track whether AI-sourced hypotheses win at a different rate than the ones you generated yourself. The honest answer: similar rates, but you'll have generated 3x more candidates, which means your test backlog is now bigger and better-prioritized.

Week 4. Audit. Open the Notion doc. Which prompts saved you time this month? Which produced output you had to redo? Kill the bad ones. Keep three to five, max. The point is a sharper workflow, not more tools. Anyone who tells you they have 40 prompts they use weekly is lying or in a vendor ad.

The closing line

Two things to take into next Tuesday.

AI doesn't make a bad growth marketer good. It makes a good one faster, by removing the typing tax on the parts of the job that don't require judgment. The skill that compounds isn't prompt engineering. It's knowing which questions are worth asking in the first place, and that's still the human's job, all the way down.
