PM Metrics: Outcome vs Output, North Star, and Leading Indicators (B2B SaaS)
I've sat through the QBR where the PM presents fourteen shipped features in Q3. New onboarding flow. Bulk-edit. SAML. Three integrations. Two new dashboards. Mobile parity for the deal pipeline. The slide is a beautiful matrix of green checkmarks. Everyone nods.
Then the CEO leans forward and asks one question: "Did retention move?"
Silence. The PM glances at the analyst, who glances at the data engineer, who glances at the floor. Somebody finally says, "We think so, but the cohort isn't fully baked yet." The CEO writes something in their notebook. The room knows.
That PM didn't get fired. They got something worse. They got quietly removed from the next strategy conversation. Six months later their roadmap is being co-written by sales, because product can't prove its work matters.
This is the gap between output and outcome, and most PMs land on the wrong side of it. Not because they're lazy. Because output is photogenic and outcomes are not.
Cagan's framing — output is what you ship, outcome is what changes
Marty Cagan has been beating this drum for fifteen years and most of us still get it wrong.
Output is everything you can see in a sprint demo. Features shipped. Tickets closed. Story points burned down. Releases tagged. It's countable, weekly, and lives in Jira where every stakeholder can see it.
Outcome is what changed in customer behavior or business performance because of that output. Activation rate moved from 38% to 51%. Net revenue retention held at 112% through a price increase. Time-to-first-deal dropped from 9 days to 4. The deal pipeline pod's accounts now log in 3.2 times per week instead of 1.8.
Output is a noun on a slide. Outcome is a verb in a customer's life.
PMs default to output for one reason: it's the only thing that's reliably observable on a Friday. You shipped seven things this sprint, here are the seven things, here are the screenshots. Outcome takes 30, 60, sometimes 90 days to surface, and the causal chain is messier ("did activation move because of the new flow, or because Q3 is just a high-intent quarter?"). When you're stressed and the deck is due Monday, you reach for the countable thing.
The catch: the countable thing is not what your CEO is paying you for. They're paying for behavior change. Until your slide reflects that, you're a delivery manager with a fancier title.
The 5 outcome metrics that actually hold up at B2B SaaS
You don't need 47 metrics. You need five, defined explicitly, refreshed weekly or monthly depending on how fast each one moves, and trusted by finance.
1. Activation rate. The percentage of new accounts that hit your product's "aha" event within N days of signup. The hard part is defining the event, not measuring it. For a CRM, "3 deals created within 14 days" is a defensible aha. It means the customer has actually moved their pipeline in. For a project management tool, "5 tasks created across 2 projects with 1 assignee other than the creator" works. Pick the event, write it down, defend it for a quarter before changing it. Activation under 40% in B2B SaaS is a leak. Above 60% is rare and worth protecting.
2. Retention. Two flavors, and you need both. Gross revenue retention (GRR) measures what you kept from existing accounts before any expansion. Typically 85-92% is healthy in mid-market SaaS, and anything under 80% is a churn problem. Logo retention measures the percentage of accounts you didn't lose, regardless of contract size. Track both by cohort (signup month or quarter) so you can see whether new cohorts are healthier than old ones. If your Q3 cohort retains better than your Q1 cohort, the team that shipped between them did real work.
3. Expansion. Net revenue retention (NRR) is the headline number. Take the revenue from a cohort 12 months ago, look at what that same cohort generates today (after churn and after expansion), divide. Above 110% in B2B SaaS is good, above 120% is great, above 130% means you've found something most companies haven't. Pair it with expansion ARR per account so you can see whether the average customer is growing or whether you've got a few whales pulling the average up.
4. NPS or CSAT. Lagging, directional, useful as a tripwire. NPS lags real behavior by 60 to 90 days. By the time it drops, the damage has already been done in your churn pipeline. Don't steer with it. Use it to confirm a story you're already seeing in retention and expansion. If NPS drops 8 points and retention is fine, you have a perception problem. If NPS drops 8 points and retention drops too, you have a product problem and you needed to know about it sooner.
5. Revenue per user (or per account). ARPU or ARPA, tracked over time. The cleanest signal that your product is getting more valuable to its customers. If activation is up, retention is up, but ARPU is flat, you've built a stickier free tier and not much else. If ARPU is climbing and retention holds, the product is doing what a product is supposed to do.
These five together cover the customer's journey through your product: did they start using it (activation), did they stay (retention), did they grow (expansion), did they like it enough to recommend it (NPS), and did they spend more (ARPU). Anything you measure outside these five should be in service of explaining one of them.
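The retention and expansion arithmetic is easy to get subtly wrong, so here is a minimal sketch of GRR, NRR, and logo retention for one cohort. The account names, MRR figures, and the `AccountRevenue` shape are made up for illustration. It also assumes GRR caps each account at its starting revenue, which is one common convention; check it against how your finance team defines it.

```python
from dataclasses import dataclass


@dataclass
class AccountRevenue:
    """Recurring revenue for one account at cohort start and 12 months later."""
    account_id: str
    start_mrr: float    # MRR twelve months ago
    current_mrr: float  # MRR today; 0.0 means the account churned


def cohort_retention(accounts: list[AccountRevenue]) -> dict[str, float]:
    """Return GRR, NRR, and logo retention (as percentages) for one cohort."""
    start_total = sum(a.start_mrr for a in accounts)
    # GRR ignores expansion: each account counts for at most its starting revenue.
    kept = sum(min(a.current_mrr, a.start_mrr) for a in accounts)
    # NRR includes expansion: count whatever the cohort pays today.
    current_total = sum(a.current_mrr for a in accounts)
    logos_kept = sum(1 for a in accounts if a.current_mrr > 0)
    return {
        "grr": round(100 * kept / start_total, 1),
        "nrr": round(100 * current_total / start_total, 1),
        "logo_retention": round(100 * logos_kept / len(accounts), 1),
    }


# Hypothetical four-account cohort.
cohort = [
    AccountRevenue("acme", 1000.0, 1500.0),      # expanded
    AccountRevenue("globex", 1000.0, 800.0),     # contracted
    AccountRevenue("initech", 500.0, 0.0),       # churned
    AccountRevenue("umbrella", 1500.0, 2100.0),  # expanded
]
result = cohort_retention(cohort)
print(result)  # {'grr': 82.5, 'nrr': 110.0, 'logo_retention': 75.0}
```

Note how the same cohort reads as "good" on NRR and "churn problem territory" on GRR. That's why you track both.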
The North Star — one per pod, not one per company
Amplitude helped popularize "the North Star metric" as a single number a company organizes around. Daily active users for a consumer app. Nights booked for Airbnb. Time spent listening for Spotify.
For B2B SaaS at Series A or B, one-per-company breaks down. A CRM pod, a billing pod, and a reporting pod do completely different jobs. Forcing them to share a North Star turns it into a vague company-level metric like "weekly active accounts" that nobody actually steers by.
The fix: one North Star per pod. The single metric that, if it moves, your pod did its job this quarter. Examples I've seen work:
- CRM activation pod → % of new accounts with 3+ deals created in the first 14 days
- Lead routing pod → % of qualified leads routed to a rep in under 5 minutes
- Onboarding pod → % of accounts completing the setup checklist within 7 days
- Reporting pod → weekly active dashboard viewers per paying account
- Mobile pod → % of weekly active users who log in from mobile in any given week
Each is specific. Each is one number. Each can be moved by the team that owns it without arguing with another team about attribution. That's the test. If two pods can both claim credit when the metric moves, it's not a North Star, it's a company-level outcome metric.
The pod-level North Star is not the company's North Star. The company still cares about NRR. But the pod cares about the leading thing that drives NRR for the customer journey it owns.
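To show how little it takes to pin down a pod-level North Star, here is a rough sketch of the CRM activation pod's metric computed from raw events. The function name, the input shapes, and the sample accounts are hypothetical. It also assumes "within 14 days" means the signup date through day 14 inclusive, which is exactly the kind of boundary your written definition should spell out.

```python
from datetime import date, timedelta


def activation_rate(
    signups: dict[str, date],
    deal_events: list[tuple[str, date]],
    min_deals: int = 3,
    window_days: int = 14,
) -> float:
    """Percent of signed-up accounts with at least `min_deals` deals in the window."""
    if not signups:
        return 0.0
    counts = {account_id: 0 for account_id in signups}
    for account_id, created_on in deal_events:
        signup = signups.get(account_id)
        if signup is None:
            continue  # deal belongs to an account outside this signup cohort
        if signup <= created_on <= signup + timedelta(days=window_days):
            counts[account_id] += 1
    activated = sum(1 for n in counts.values() if n >= min_deals)
    return round(100 * activated / len(signups), 1)


# Hypothetical cohort of four new accounts.
signups = {
    "a": date(2024, 7, 1),
    "b": date(2024, 7, 1),
    "c": date(2024, 7, 3),
    "d": date(2024, 7, 5),
}
deal_events = [
    ("a", date(2024, 7, 2)), ("a", date(2024, 7, 5)), ("a", date(2024, 7, 10)),
    ("b", date(2024, 7, 2)), ("b", date(2024, 7, 8)), ("b", date(2024, 7, 21)),
    ("c", date(2024, 7, 4)), ("c", date(2024, 7, 4)),
    ("c", date(2024, 7, 10)), ("c", date(2024, 7, 17)),
    ("x", date(2024, 7, 2)),  # not in this cohort, ignored
]
rate = activation_rate(signups, deal_events)
print(rate)  # 50.0
```

Account "b" created three deals but the third landed on day 20, so it doesn't count. One number, one owner, no attribution argument.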
Leading vs lagging indicators — you can't steer with a metric that takes 90 days
Every metric I've named so far is somewhere on the lagging-to-leading spectrum, and you have to know where.
Lagging indicators tell you what already happened. NPS lags by 60-90 days. Retention lags by a quarter or more. Expansion ARR lags by the contract cycle. By the time these move, the work that moved them was done two quarters ago. You don't steer with them. You autopsy with them.
Leading indicators move week-to-week and predict the lagging metric. Activation rate is a leading indicator for retention (accounts that activated retain better, full stop). Setup checklist completion is a leading indicator for activation. Time-to-first-value is a leading indicator for setup completion. Support ticket volume on a newly launched feature is a leading indicator for whether that feature is going to drive churn or stick.
The rule I live by: pair every lagging metric with one or two leading indicators that you trust to predict it. If your North Star is retention and the only thing you measure is retention, you're flying blind for 90 days at a time. If your North Star is retention and you also track activation rate weekly, you'll know you have a problem in week 2 instead of week 14.
For each of the five outcome metrics, here are the leading indicators worth tracking:
- Activation → setup checklist completion %, time-to-first-key-action, % of accounts with 2+ users in week 1
- Retention → weekly active users per account, feature adoption breadth (how many features each account touches), support ticket volume per account
- Expansion → seat utilization %, % of accounts at 80%+ of plan limit, feature requests on premium-tier features
- NPS → support response time, time-to-resolution, % of tickets resolved on first contact
- ARPU → upgrade-prompt impression-to-click rate, % of accounts that hit a paywall in the last 30 days
You're not going to track all of these. You're going to pick one or two leading indicators per outcome metric, write them down, and check them weekly. The discipline matters more than the specific picks.
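The weekly check can be as simple as a tripwire on week-over-week movement. This is a sketch, not a monitoring system: the `LeadingIndicator` shape, the 10% threshold, and the sample values are all assumptions picked for illustration, so tune them to your own data.

```python
from dataclasses import dataclass


@dataclass
class LeadingIndicator:
    name: str
    predicts: str               # the outcome metric this indicator is paired with
    weekly_values: list[float]  # oldest week first
    higher_is_better: bool = True


def flag_regressions(
    indicators: list[LeadingIndicator], threshold_pct: float = 10.0
) -> list[str]:
    """Describe indicators whose latest week moved the wrong way by more than the threshold."""
    flagged = []
    for ind in indicators:
        if len(ind.weekly_values) < 2:
            continue  # need two weeks to compare
        previous, latest = ind.weekly_values[-2], ind.weekly_values[-1]
        if previous == 0:
            continue  # percentage change is undefined
        change_pct = 100 * (latest - previous) / previous
        # A drop is bad when higher is better; a rise is bad otherwise.
        worsening = -change_pct if ind.higher_is_better else change_pct
        if worsening > threshold_pct:
            flagged.append(
                f"{ind.name} (predicts {ind.predicts}): {change_pct:+.1f}% week over week"
            )
    return flagged


# Hypothetical three weeks of data.
indicators = [
    LeadingIndicator("Setup checklist completion %", "Activation", [64.0, 66.0, 67.0]),
    LeadingIndicator("Weekly active users per account", "Retention", [3.2, 3.1, 2.6]),
    LeadingIndicator("Support tickets per account", "Retention", [1.1, 1.0, 1.3],
                     higher_is_better=False),
]
alerts = flag_regressions(indicators)
for alert in alerts:
    print(alert)
```

With this sample data, the two retention indicators get flagged and setup completion does not. The point is that the pairing (indicator, outcome it predicts) is written down next to the number.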
Wiring the metric tree
A metric tree is the visible artifact that says "here's how our pod's work connects to the business." North Star at the top, 3 to 4 driver metrics below it, leading indicators under each driver. Drawn out, the CRM activation pod might look like this:
NORTH STAR: % of new accounts with 3+ deals created in 14 days
├── DRIVER 1: Setup checklist completion rate
│   ├── Leading: Time to first deal created
│   └── Leading: % of accounts inviting a 2nd user in week 1
├── DRIVER 2: Onboarding email engagement (open + CTA click)
│   ├── Leading: Day-3 email open rate
│   └── Leading: Day-7 reply rate to "stuck?" email
├── DRIVER 3: First-week support ticket volume
│   ├── Leading: % of tickets tagged "setup confusion"
│   └── Leading: Median time-to-resolution on setup tickets
└── DRIVER 4: % of accounts with sales-team intro call scheduled
    ├── Leading: Calendly link click rate from welcome email
    └── Leading: No-show rate on intro calls
Three things matter about this tree. First, every driver feeds the North Star with a clear causal story (better setup completion → more accounts hitting 3 deals in 14 days). Second, each leading indicator under a driver is something you can pull from your tooling tomorrow without a data engineering project. Third, the whole thing fits on one page.
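If you'd rather keep the tree as data than as a drawing, here is one possible sketch. The `Driver` and `MetricTree` classes are hypothetical names, not part of any tool mentioned here. The `problems` check encodes only the two structural rules stated above (3 to 4 drivers, leading indicators under each), and it can't tell you whether the causal story holds.

```python
from dataclasses import dataclass, field


@dataclass
class Driver:
    name: str
    leading: list[str] = field(default_factory=list)


@dataclass
class MetricTree:
    north_star: str
    drivers: list[Driver] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Structural checks only: driver count and leading-indicator coverage."""
        issues = []
        if not 3 <= len(self.drivers) <= 4:
            issues.append(f"expected 3 to 4 drivers, found {len(self.drivers)}")
        for driver in self.drivers:
            if not driver.leading:
                issues.append(f"driver '{driver.name}' has no leading indicator")
        return issues

    def render(self) -> str:
        """Draw the tree as plain text, one page, top to bottom."""
        lines = [f"NORTH STAR: {self.north_star}"]
        for i, driver in enumerate(self.drivers):
            last_driver = i == len(self.drivers) - 1
            branch = "└── " if last_driver else "├── "
            lines.append(f"{branch}DRIVER {i + 1}: {driver.name}")
            indent = "    " if last_driver else "│   "
            for j, indicator in enumerate(driver.leading):
                twig = "└── " if j == len(driver.leading) - 1 else "├── "
                lines.append(f"{indent}{twig}Leading: {indicator}")
        return "\n".join(lines)


tree = MetricTree(
    north_star="% of new accounts with 3+ deals created in 14 days",
    drivers=[
        Driver("Setup checklist completion rate",
               ["Time to first deal created",
                "% of accounts inviting a 2nd user in week 1"]),
        Driver("Onboarding email engagement (open + CTA click)",
               ["Day-3 email open rate", 'Day-7 reply rate to "stuck?" email']),
        Driver("First-week support ticket volume",
               ['% of tickets tagged "setup confusion"',
                "Median time-to-resolution on setup tickets"]),
        Driver("% of accounts with sales-team intro call scheduled",
               ["Calendly link click rate from welcome email",
                "No-show rate on intro calls"]),
    ],
)
print(tree.render())
print(tree.problems())  # []
```

Keeping the tree in a structure like this makes the quarterly revisit a diff instead of a redraw.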
Tooling, by category:
- Leading indicators (event data): Amplitude or Mixpanel. The product analytics tool tracks the user-level events that compose your leading indicators.
- Outcome metrics (warehouse data): Snowflake or BigQuery for the warehouse, Looker or Metabase for the dashboard. Outcome metrics involve revenue, contracts, and customer-level joins that don't live in the product analytics tool.
- The metric definitions themselves: a single Notion doc, owned by the PM, readable by engineering. Not a wiki page that gets stale. A living document where the activation event definition lives next to the SQL that calculates it. If your eng team can't find the definition of activation in 30 seconds, your metrics are not real.
The mistake I see junior PMs make: building this tree once, presenting it at one all-hands, and never updating it. The tree should be revisited every quarter. Driver metrics that don't move the North Star get cut. New leading indicators get added when you discover them.
The QBR slide that holds up to scrutiny
One slide. Here's the structure:
PRODUCT QBR — Q3 — CRM ACTIVATION POD
NORTH STAR: % of new accounts with 3+ deals in 14 days
Q2: 38% → Q3: 51% (+13 pts)
DRIVER METRICS (Q-over-Q):
• Setup checklist completion: 44% → 67%
• Onboarding email CTA click: 12% → 19%
• Week-1 support ticket volume per account: 1.8 → 1.1
LEADING INDICATORS THAT EXPLAIN WHY:
• Time-to-first-deal: 6.2 days → 2.8 days
  (driven by new setup flow shipped wk 3 of Q3)
• Day-7 active users per account: 1.4 → 2.1
  (invite-2nd-user prompt added wk 5)
BUSINESS CONTEXT:
• NRR held at 112% (no Q-over-Q change)
• Activation→retention correlation: activated accounts
  retain 89% vs 61% for non-activated cohort
That's the whole slide. The features shipped during the quarter (the new onboarding flow, the invite-prompt, the email rewrite) go in an appendix. They're context, not the headline.
The reason this slide survives the CEO's question is that it answers the question before they ask it. "Did the work matter?" Yes. Activation went from 38% to 51%. Here are the three drivers. Here's the leading indicator that ties the move to the new flow. And by the way, the 89% vs 61% retention split tells you why activation is the right thing to be obsessed with.
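The 89% vs 61% split on the slide is a simple grouped retention rate. Here is a sketch of how you might compute it, using synthetic data built to reproduce those two numbers; the function name and the input shape are assumptions. Keep in mind it measures a correlation: accounts that activate may simply be higher-intent to begin with.

```python
def retention_split(accounts: list[tuple[bool, bool]]) -> dict[str, float]:
    """Retention rate for activated vs non-activated accounts.

    Each account is an (activated, retained) pair.
    """
    groups = {True: [0, 0], False: [0, 0]}  # activated -> [retained, total]
    for activated, retained in accounts:
        groups[activated][1] += 1
        if retained:
            groups[activated][0] += 1

    def pct(kept: int, total: int) -> float:
        return round(100 * kept / total, 1) if total else 0.0

    activated_rate = pct(*groups[True])
    other_rate = pct(*groups[False])
    return {
        "activated": activated_rate,
        "not_activated": other_rate,
        "gap_pts": round(activated_rate - other_rate, 1),
    }


# Synthetic cohort: 100 activated accounts (89 retained),
# 100 non-activated accounts (61 retained).
accounts = (
    [(True, True)] * 89 + [(True, False)] * 11
    + [(False, True)] * 61 + [(False, False)] * 39
)
split = retention_split(accounts)
print(split)  # {'activated': 89.0, 'not_activated': 61.0, 'gap_pts': 28.0}
```

A 28-point gap is the number that justifies the pod's obsession with activation, as long as you present it as a correlation and not as proof.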
If your VP can read that slide in 20 seconds and walk into the CEO's office and re-tell the story without notes, you've nailed it. If they need you in the room to translate, the slide isn't done.
The "we shipped 14 features" trap and how to climb out
Feature-count slides feel safe. They're countable. They prove you were busy. They give credit to engineering, which keeps the eng manager happy. They look like work.
They also kill PM credibility on a 12-month timeline. Here's the math: your CEO has been to maybe 20 PM QBRs in their career. By the third one, they've stopped asking "what did you ship" and started asking "what changed for the customer." If your slides keep answering the first question, you'll be the PM whose work the CEO can't quite remember when promotion conversations happen.
Climbing out is a one-conversation problem and a two-quarter execution problem.
The conversation is with your VP, ideally before the next quarter starts. It sounds like this: "I want to change how we report progress next quarter. Here's the metric tree for our pod. Here's the QBR slide template I'm going to use. The features we shipped will move to an appendix. I'll still report what we built, but the headline will be what changed for the customer. Can I get your read on the metric definitions before I take it to the team?"
Two things happen. One: your VP gets a draft of a better PM, and they remember the conversation. Two: you commit yourself in writing. You can't quietly slide back to feature-count slides three weeks later when the metric tree feels hard.
The execution problem is the next 6 months. Quarter one will be awkward. You'll show up at a QBR with an outcome slide and one of the drivers will be flat or down, and you'll have to talk through it instead of hiding behind 14 green checkmarks. Quarter two, you'll have a Q-over-Q comparison and the conversation gets easier. By quarter three, your slide is the slide other PMs are quietly screenshotting before their own QBRs.
What to do this week
Don't wait for next quarter. Do this in the next 10 days.
- Pick your pod's North Star. One sentence, one metric. If you can't do this in an afternoon, your scope is too broad and you have a different problem.
- Write the 3 driver metrics underneath it. Each one with a specific definition (not "engagement," but "% of weekly active users completing key action X").
- Find the 2 leading indicators per driver that you can pull from Amplitude or Mixpanel tomorrow. If you can't pull them tomorrow, your tooling has a gap that's worth a separate ticket.
- Put it all in a Notion doc. One page. Metric tree at the top, definitions and SQL queries below.
- Send the doc to your manager and ask for one round of pushback before the next QBR. Don't ask for approval. Ask for the one definition they'd argue with. Whatever they push back on is the metric your eng team is most likely to misinterpret too. Fix it before the QBR, not during.
The PM who reports outcomes, even imperfectly, beats the PM who reports outputs perfectly. Output reporting is the safer-feeling, career-limiting choice. You can keep building the same feature-count slide for three more years and quietly watch yourself drift out of the strategy conversation, or you can spend two awkward quarters making the switch and compound trust from there.
Start this quarter. The metric tree doesn't have to be perfect. It just has to exist.