CSM Metrics: NRR, GRR, Health Scores, Expansion Contribution

A CSM I worked with had a portfolio that looked like a Christmas tree. Eighty-three accounts, seventy-one of them green, eight yellow, four red. Her dashboard was the cleanest on the team. Her manager used her health-score discipline as a model in QBR prep.

Then a $240K renewal walked into the call and said, "We've decided not to renew."

Green. The account had been green for six months. Logins were healthy, NPS was 8, the support queue was quiet. The CSM hadn't talked to the new VP because the new VP was still figuring out the org and hadn't reached out. The previous champion had quietly left. A competitor had been running a six-week pilot inside the customer's other business unit. None of that was in the health score.

Here's what we learned from that call: a green health score is data, not safety. A metric you don't review weekly is a lagging indicator no matter what label is on it. And the same dashboard that saves your quarter can lie to you for ninety days if nobody's reading it.

This is the playbook for the metrics CSMs are graded on. What each one tells you, what it pretends to tell you, and how to design the system so you see churn coming three months out instead of three weeks late.

Why CSM Metrics Mislead

The trouble with CSM metrics isn't that they're wrong. They're often very accurate. The trouble is that accurate doesn't mean predictive, and most CSM scorecards are built out of accurate-but-late numbers.

NRR is accurate. It tells you exactly what happened to a cohort of revenue over twelve months. It also tells you about churn that's already settled and expansion that's already booked. By the time NRR moves, the account decisions that drove it were made one to three quarters ago.

Renewal rate is accurate. It tells you how many accounts said yes when the contract came up. It's also a metric that gives you 30 days of warning at most, because that's when procurement starts the redline.

Health scores are accurate to whatever inputs you put in them. If you put in login frequency and ticket volume, you get a metric that grades login frequency and ticket volume. That isn't churn risk. It's adjacent to churn risk, on a good day.

The fix isn't more metrics. It's separating the metrics that tell you what's happening now (so you can intervene) from the ones that tell you what already happened (so you can learn). Both are useful. They're useful for different things.

NRR vs GRR: What Each One Actually Tells You

Customer onboarding sets the NRR ceiling. By the time a renewal call happens, the math has been baked in for months. But to manage the math, you need to know what the two top-line retention metrics are saying.

Gross Revenue Retention (GRR) isolates retention skill. It measures how much of your starting ARR you kept, after churn and downgrades, with no expansion to hide behind. GRR is capped at 100%. There's no upside in the formula.

Net Revenue Retention (NRR) captures total account growth. It includes expansion, so it can exceed 100%, and it can mask retention problems if expansion is strong. NRR is what investors care about. GRR is what tells you whether your CS function is healthy.

Here's the math, on a portfolio that closed the year with $1M starting ARR.

Starting ARR (Jan 1):           $1,000,000
- Churn (full cancellations):     -$80,000
- Downgrades (kept but smaller):  -$50,000
+ Expansion (upsell, cross-sell): +$200,000
Ending ARR (Dec 31):           $1,070,000

GRR = (Starting ARR − Churn − Downgrades) / Starting ARR
GRR = ($1,000,000 − $80,000 − $50,000) / $1,000,000 = 87%

NRR = (Starting ARR − Churn − Downgrades + Expansion) / Starting ARR
NRR = ($1,000,000 − $80,000 − $50,000 + $200,000) / $1,000,000 = 107%
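Both formulas reduce to a few lines of arithmetic. A minimal sketch, using the worked portfolio above (the function name is illustrative):

```python
def retention_metrics(starting_arr, churn, downgrades, expansion):
    """Return (GRR, NRR) for a cohort over a period, as fractions of starting ARR."""
    grr = (starting_arr - churn - downgrades) / starting_arr  # capped at 1.0 by construction
    nrr = grr + expansion / starting_arr                      # expansion can push this past 1.0
    return grr, nrr

grr, nrr = retention_metrics(1_000_000, 80_000, 50_000, 200_000)
print(f"GRR: {grr:.0%}")  # GRR: 87%
print(f"NRR: {nrr:.0%}")  # NRR: 107%
```

Keeping both numbers as outputs of one function makes it hard to report NRR without GRR sitting next to it.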

Both numbers are true. They're saying different things.

GRR of 87% means you lost 13 cents of every starting dollar to retention failure. Best-in-class B2B SaaS GRR is 90% or higher; 87% is the line where leadership starts asking questions. NRR of 107% looks healthy and would impress a board, but it's healthy because expansion covered for retention, not because retention was strong.

If your team is reporting NRR alone, you have a blind spot. A team running 75% GRR and 115% NRR has a churn problem that aggressive expansion is papering over, which works until expansion slows down, which happens in every downturn. Always report them together.

Slice by segment. A portfolio NRR of 107% can hide an enterprise NRR of 125% and an SMB NRR of 89%. The average masks two different businesses. At minimum, slice by segment (SMB / mid-market / enterprise) and tenure (year-one customers vs renewals). The story in each slice is different and the action is different.
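The masking effect is easy to demonstrate. In this sketch the segment figures are hypothetical, chosen so the two slices reproduce the 125% / 89% split behind a 107% aggregate:

```python
# Hypothetical segment book: starting ARR and ending net ARR per segment.
segments = {
    "enterprise": {"start": 500_000, "end": 625_000},  # strong expansion
    "smb":        {"start": 500_000, "end": 445_000},  # quiet churn problem
}

for name, s in segments.items():
    print(f"{name}: NRR {s['end'] / s['start']:.0%}")  # 125% and 89%

total_start = sum(s["start"] for s in segments.values())
total_end = sum(s["end"] for s in segments.values())
print(f"portfolio: NRR {total_end / total_start:.0%}")  # 107%
```

The aggregate looks fine; only the slices show that half the book is shrinking.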

Health Scores That Predict Churn (Not Decorate Dashboards)

Most health scores fail because they grade login frequency. Logins go up when a new admin joins. They go down for two weeks because someone went on vacation. They tell you almost nothing about whether the account values the relationship.

A health score that predicts churn has four properties:

  1. Inputs that move before renewal, not at renewal
  2. A small number of inputs (4 to 6, not 17)
  3. Weights based on historical churn correlation, not gut feel
  4. A weekly refresh, with a CSM-visible delta from last week

Here's a working rubric we've run with multiple CSM teams. Five inputs, weighted, scored 1–5 per input, total score on a 100-point scale.

| Input | Weight | What "5" looks like | What "1" looks like |
| --- | --- | --- | --- |
| Product usage trend (last 30 vs prior 30 days) | 25% | Active users +10% or more, key feature adoption rising | Active users down 25%+, key feature unused for 30 days |
| Executive sponsor presence | 25% | Named sponsor, met in last 90 days, attended last QBR | No sponsor named, or sponsor left and replacement not engaged |
| Support sentiment (last 60 days) | 20% | Tickets resolved fast, no escalations, positive feedback | Multiple escalations, unresolved P1, customer-side frustration in writing |
| NPS / CSAT delta | 15% | NPS 9+ or trending up 2+ points | NPS 6 or below, or down 2+ points from last survey |
| Time since last business review | 15% | QBR or business review in last 90 days | No structured business conversation in 6+ months |

Thresholds:

  • 80–100: Green (low risk, expansion-ready)
  • 60–79: Yellow (watch: at least one input is a 2 or 3)
  • Below 60: Red (active intervention required this week)

Three things make this rubric work in practice.

One: weights based on churn correlation, not opinion. The first time you run this, weights are a guess. After two quarters of churned-account data, recalibrate. Pull your last twenty churned accounts and ask: which input was lowest 90 days before they churned? That input's weight goes up. We've seen teams discover that "executive sponsor presence" is twice as predictive as "product usage," but they wouldn't have known without the post-mortem.

Two: one red input drops the whole score below green. If your exec sponsor left and hasn't been replaced, that account is not green even if usage is still strong. Build a hard rule: any single input scoring 1 caps the total at yellow.

Three: refresh weekly, with a delta. A health score that updates monthly is a lagging indicator. The CSM should see, every Monday morning, the score and the change from last week. A drop from 82 to 71 is the signal. The score itself isn't.
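The rubric and the cap rule fit in a few lines. A minimal sketch, with weights and input names taken from the table above (the function and band thresholds are illustrative, not a product feature):

```python
# Weights from the rubric; each input is scored 1-5 by the CSM.
WEIGHTS = {
    "usage_trend": 0.25,
    "exec_sponsor": 0.25,
    "support_sentiment": 0.20,
    "nps_delta": 0.15,
    "last_business_review": 0.15,
}

def health_score(inputs):
    """inputs maps each rubric input to a 1-5 score. Returns (0-100 score, band)."""
    score = sum(WEIGHTS[name] * (value / 5) * 100 for name, value in inputs.items())
    # Hard rule: any single input scoring 1 caps the total at yellow.
    if min(inputs.values()) == 1:
        score = min(score, 79)
    band = "green" if score >= 80 else "yellow" if score >= 60 else "red"
    return round(score), band

# Usage is perfect, but the exec sponsor left (scored 1): the uncapped total
# would be 80 (green), and the cap pulls it to yellow.
print(health_score({
    "usage_trend": 5, "exec_sponsor": 1, "support_sentiment": 5,
    "nps_delta": 5, "last_business_review": 5,
}))  # → (79, 'yellow')
```

Storing each Monday's score lets you compute the week-over-week delta the same way: last week's tuple minus this week's.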

Expansion Contribution: Measure It Separately

Expansion is the part of NRR that requires the most care to measure honestly. Two CSMs with the same gross retention can have wildly different expansion contributions, and the one driving more expansion is creating more value — but only if the expansion was theirs to drive.

Track expansion ARR per CSM, separately from gross retention. GRR tells you who's keeping their accounts. Expansion ARR per CSM tells you who's growing them. Don't combine them into a single number; you'll lose the signal.

Distinguish CSM-sourced from sales-sourced expansion. Not every dollar of expansion in a CSM's portfolio is theirs. If sales found the opportunity, ran the demo, and closed it, that's sales-sourced expansion in a CSM-owned account. The CSM should get credit for being credible enough that the customer was willing to take the call. They should not get credit for running the deal. A clean rule we've used: CSM-sourced expansion is anything where the CSM identified the need and either ran the conversation or warmly handed off to a named AE with full context. Anything else is sales-sourced.

Expansion is part of the role, but the role isn't sales. The way you measure it determines whether your CSMs become trusted advisors or quota-carriers in disguise.

Cap expansion at what the customer can absorb. A CSM who doubles a customer's seat count, only to watch half of those seats cancel at renewal, didn't drive expansion. They drove a downgrade with extra steps. Some CS teams measure twelve-month expansion retention (how much of the expansion stuck after a year) rather than point-in-time expansion. That's a much harder number to game.
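Twelve-month expansion retention is a simple ratio: of the expansion ARR booked last year, how much is still live today. A minimal sketch with hypothetical accounts:

```python
# (account, expansion ARR booked, ARR of that expansion still live 12 months on)
expansions = [
    ("acme",   40_000, 40_000),  # all seats still active
    ("globex", 60_000, 30_000),  # half the added seats cancelled at renewal
]

booked = sum(e[1] for e in expansions)
retained = sum(e[2] for e in expansions)
print(f"12-month expansion retention: {retained / booked:.0%}")  # 70%
```

A 70% result here is well under the 90%+ target: on paper this CSM drove $100K of expansion, but a third of it was a downgrade in waiting.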

A simple expansion contribution scorecard, per CSM, per quarter:

| Metric | Target | Why |
| --- | --- | --- |
| CSM-sourced expansion ARR | $X | The number the CSM influenced |
| Sales-sourced expansion in portfolio | Tracked, not targeted | Context for portfolio growth |
| 12-month expansion retention | 90%+ | Did the expansion stick? |
| Expansion accounts also at green health | 100% | Don't expand at-risk accounts |

The Lagging Trio (Useful, But Not For Prevention)

Three metrics that every CS dashboard reports and that almost nobody can act on:

Renewal rate. Tells you how many accounts said yes when the contract came up. By the time renewal rate moves, the conversations that drove the outcome are over. Useful for trend spotting and quota planning. Useless for saving the next account.

Churn dollars. Tells you what you lost. Useful for sizing the problem and prioritizing which segments need investment. Useless for the CSM trying to prevent the next churn this week.

Churn count. Tells you who you lost. Useful for post-mortems and identifying patterns ("we lost six SMB accounts in Q3, all in the same vertical, all with the same complaint about feature X"). Useless as a forward signal.

Report these. Track them. Don't run your week off them. A scorecard that leads with renewal rate is a scorecard that finds out about the at-risk account from the procurement email.

A One-Page Weekly Scorecard

These are the metrics CSMs review every Monday: not the full dashboard, just the page that shapes the week.

| Section | What's on it | Why |
| --- | --- | --- |
| Portfolio health | # accounts in green / yellow / red, change from last week | Direction matters more than absolute |
| At-risk accounts (red + yellows trending down) | Account name, ARR, days since last touch, primary risk input | The intervention list for this week |
| Renewals in next 60 days | Account, ARR, current health, forecast call (commit / probable / risk) | Forecast accuracy is a separate skill |
| Expansion pipeline | CSM-sourced opportunities, stage, ARR estimate | Keeps expansion in view alongside retention |
| Three numbers to walk into 1:1 with | This is the headline | One trend, one win, one risk |

That last row is the most important. CSMs who go into their manager 1:1 with three specific numbers ("my green count dropped from 51 to 47, I saved a $90K account by getting the new VP onsite, and I'm worried about [account] because the sponsor went quiet") are running their portfolio. CSMs who go in with "things are mostly fine" are reading the dashboard, not running it.

Common Pitfalls

Vanity health scores. If your health score is mostly login frequency, you have a login-frequency dashboard, not a health score. Scores that don't include exec sponsor presence and time-since-last-business-review will mislead you, because those are the inputs that move first when an account goes cold.

NRR reported only in aggregate. A 107% NRR can hide a 125% enterprise number and an 89% SMB number. Slice by segment, slice by tenure, slice by ACV band. The aggregate is the marketing number. The slices are the operating numbers.

No leading-indicator dashboard. If every metric on your scorecard is a lagging indicator, you'll find out about the at-risk account from procurement, not from your own data. At least three of the metrics on a CSM's weekly scorecard should be inputs that move 60–90 days before renewal.

Over-engineering health scores. Seventeen-input weighted models with seasonal adjustments and machine-learned decay curves don't predict churn better than five-input models with weekly review. They predict churn worse, because nobody understands them and nobody acts on them. The CSM firefighting trap is real, and one symptom is health-score complexity that nobody has time to maintain.

Rolling weekly dashboards into monthly leadership reports. The CSM's weekly view and the leadership monthly view need to be different. Leadership wants NRR/GRR by segment. CSMs need health-trend deltas and the at-risk list. Don't force one into the other.

What Good Looks Like

A CSM team running this playbook hits three benchmarks:

Forecast accuracy within 5% on renewal calls 60 days out. If the CSM tells you on day 60 that a deal is "commit," it should renew at least 95% of the time. If forecast accuracy is below 80%, the health score isn't predictive — go fix the inputs.

Churn warning lead time of 90+ days for at-risk accounts. When an account churns, you should be able to look back at the health score and see the warning at least three months before the renewal call. If the score went red two weeks before the call, the score isn't doing its job.

Expansion contribution per CSM tracked and trending up. Not just at the portfolio level. Each CSM should have a CSM-sourced expansion number, and it should be moving in the right direction quarter over quarter. The QBR is one of the highest-leverage moments for surfacing expansion. If your QBRs aren't producing pipeline, the QBR design is the problem, not the CSM.

The CSM whose green-portfolio account told her they weren't renewing came back to the team and helped rebuild the rubric. We added executive sponsor presence as a 25% input. We made one red input cap the whole score at yellow. We added a "time since last meaningful conversation" timer. Six months later, her renewal forecast accuracy was inside 4%, and she'd caught two at-risk accounts 100+ days before the renewal call.

The metrics didn't change. The discipline of reading them did.
