
Communicating Results to Non-ML Stakeholders: How DS ICs Stop Burying the Lede

The last time I sat through a data scientist's readout to a CFO, the first slide was a confusion matrix. The CFO asked, politely, what she was looking at. Twelve minutes later, she still didn't know what to do on Monday. The model was good. The presentation was a small disaster.

This is the lede-burial pattern, and it kills more DS work than bad code ever will. You spent six weeks on a churn model. The exec gets eight minutes on the agenda. If your first slide doesn't answer the question they came to ask, the meeting ends with a polite "let's circle back," and your model goes into the drawer with the others.

This guide is the framework I use, and the one I drill into junior DS ICs on my team. It isn't about dumbing down. It's about respecting that the exec's job is to decide, not to learn ML. Your slide either helps them decide or it doesn't.

The "what changed?" lede

Every executive presentation answers one question first: what is different now versus last quarter?

Not "here's how we built the model." Not "the ROC curve looks great." What changed in the business, and what should we do about it.

The lede is the first sentence the exec hears or reads. If you have to scroll to find it, you've already lost. Compare these two openings for the same churn analysis:

Buried: "We trained a gradient-boosted model on 18 months of account telemetry, achieving an AUC of 0.87 with 5-fold cross-validation. Feature importance suggests support ticket velocity is the strongest signal."

Clear: "$4.2M of ARR is at high churn risk in Q2, up from $2.8M last quarter. The increase is concentrated in 12 accounts over $100K. We need to decide today which 5 the CSM team calls this week."

Same model. Same data. The second one starts the meeting. The first one ends it.

If you can't write the "what changed" sentence in 30 seconds, the work isn't ready to present. Go back to your laptop. The presentation can't fix a thesis you haven't named yet.

A useful drill: before you build a single slide, write the email subject line you'd send if the exec asked "what did you find?" If the subject line is "Q2 churn analysis update," that's a status report, not a finding. If it's "$4.2M ARR exposed across 47 accounts; recommend we move now on the top 12," that's a finding. Build the deck around the subject line.

The 1-slide answer

Assume the meeting gets cut to 90 seconds. It happens more often than you'd think. The CRO got pulled into a customer call. The CFO has the board prep at 2pm. You have one slide and a minute and a half.

What's on it?

Three things, every time:

  1. The headline number. One bolded line. The dollar amount, the lift, the change. Not the methodology.
  2. The decision required. What are we asking the exec to approve, fund, or change? Phrase it as a verb sentence: "Approve adding 2 CSMs to the high-risk pod" or "Pause the pricing test for the SMB segment until next week."
  3. The owner and the date. Who does the thing, by when. Without this, the meeting ends with vibes.

Everything else is appendix. The model card, the feature importance, the holdout performance, the slice analysis: those go in slides 5 through 30, and you only show them if asked.

I make my team build the 1-slide answer first, before any other slide. If you can't write it, you don't have a finding yet. You have a project status. Those are different artifacts and they go to different rooms.

A good 1-slide answer reads like a decision memo, not a research summary. The CFO can take it to her boss. The CRO can forward it to his VPs. If your slide can't be lifted out of context and still make sense, it's not finished.

When to use confidence intervals (and when to skip)

Here's the heresy: most of the time, confidence intervals don't belong on the slide.

I know. We were trained to always show uncertainty. And in a peer DS review, you should. That's how the work gets pressure-tested. But in an executive room, a confidence interval often does the opposite of what you intended. You meant "I'm being rigorous." They heard "you don't actually know." The decision freezes. Nobody acts. The model didn't change anything.

The rule I use: show the interval when the decision flips on the lower bound. Hide it when it doesn't.

Two examples.

Show it. A pricing test shows a 4% revenue lift, 95% CI [-1%, 9%]. The lower bound is negative. The decision absolutely flips on that. If the true effect is -1%, you don't roll out. The CI is the entire point. Lead with it.

Hide it. A churn model says 47 accounts are at high risk, with a prediction interval that says "between 41 and 53 will actually churn in the next 90 days." The decision (call them this week) doesn't change whether the number is 41 or 53. The interval distracts. Put it in the appendix, mention it once if asked: "the band is plus or minus 6 accounts at 90% confidence, doesn't change the action."

The false-precision tax is real, but the false-uncertainty tax is just as real. An interval that doesn't change the decision, like the 41-to-53 churn band, presented next to a directional recommendation gives the exec exactly the wrong signal: that you're hedging because you don't believe your own number. If you believe the directional call, make the directional call. In that case the interval lives in the appendix, where DS peers can interrogate it.

When in doubt, ask yourself: "if the lower bound were 20% worse, would the recommendation change?" If yes, show the interval. If no, you're showing it for yourself, not for the room.
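If it helps to see the rule as logic, here is a minimal Python sketch. It is an illustration, not a statistical procedure: `should_show_interval` and the two decision functions are hypothetical names, and the thresholds inside them (roll out only on positive lift, start outreach at 20 or more expected churners) are assumptions made up for the example.

```python
from typing import Callable


def should_show_interval(
    point_estimate: float,
    lower_bound: float,
    decide: Callable[[float], str],
) -> bool:
    """True when the recommendation at the lower bound differs from
    the recommendation at the point estimate."""
    return decide(point_estimate) != decide(lower_bound)


# Assumed rule for the pricing test: roll out only if the lift is positive.
def pricing_decision(lift_pct: float) -> str:
    return "roll out" if lift_pct > 0 else "hold"


# Assumed rule for churn outreach: act if at least 20 accounts are
# expected to churn (20 is a made-up threshold for illustration).
def churn_decision(expected_churners: float) -> str:
    return "start outreach" if expected_churners >= 20 else "monitor"


# Pricing test: 4% lift, lower bound -1%. The decision flips.
show_pricing_ci = should_show_interval(4.0, -1.0, pricing_decision)

# Churn model: 47 accounts, lower bound 41. The decision holds.
show_churn_band = should_show_interval(47, 41, churn_decision)

print(show_pricing_ci, show_churn_band)
```

The useful part is not the function. It's being forced to write down the decision rule before you decide what goes on the slide.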

The "model says X but business knows Y" tension

This is the moment that breaks junior DS ICs. The model says one thing. The Head of Sales pushes back: "that's not what I'm seeing in the field." The room turns to you. What now?

Don't fight it on the slide. You will lose, and you should. The Head of Sales has context the model didn't have.

Instead, do three things, in order:

1. Name the conflict, out loud. "The model is predicting that mid-market deals over $50K close 30% faster when we lead with the integrations demo. Mike, you're saying that's not matching what you see in the field. Let's pull on that. It matters."

This sounds simple. It is, in fact, the hardest part. Most DS ICs go quiet, or worse, defensive. Naming the conflict says: I trust my model, I trust your gut, and one of us is missing something. Let's find what.

2. Show the data the model saw. Not the algorithm. The data. "Here are the 340 deals from the last 12 months that fed this model." This often resolves the conflict instantly. The Head of Sales looks at the data and says "oh, those are mostly inbound deals, my pushback was about outbound, which the model didn't see." Now you have a real finding: the model is right for inbound, the gut is right for outbound, and the roadmap is to build a separate model for outbound or to scope this one to inbound only.

3. Ask what the business knows that the data doesn't. "Mike, what would you have to see in the field for the model's recommendation to make sense to you?" This flips the conversation from defense to collaboration. You're no longer arguing for your model. You're collecting features.

Nine times out of ten, the gut is right about something the data didn't capture: a recent strategy shift, a competitor entering the segment, a comp plan change three months ago that the data hasn't fully absorbed. Treat the pushback as free signal. Write it down. It's your next feature.

The one case where you should hold the line: when the Head of Sales' pushback is "I just don't believe it." That's not a counter-signal. That's discomfort with being measured. Stay calm, hand them the data, and let them sit with it.

Translating model probability into business action

A propensity score of 0.73 means nothing to a CFO. Stop putting probabilities on executive slides unless you've translated them.

Translate to one of three things, depending on the audience:

  • Dollars (for CFOs, CEOs, finance partners)
  • Deals or accounts (for sales leaders, CROs)
  • Headcount or hours (for COOs, ops leaders)

A churn model that says "23% of accounts are at risk" becomes:

$4.2M ARR exposed across 47 accounts. 12 of those are over $100K. If we save 5 of the top 12, we recover $1.7M.

A lead-scoring model that says "this lead has propensity 0.84" becomes:

Top-decile leads close at 31%. If we route them to senior AEs, we project 14 additional closed-won deals per quarter at our current ASP, roughly $980K in incremental ARR.

A demand forecast that says "Q3 demand will be 12% above plan" becomes:

We need 6 additional CSMs hired by end of Q2 or we miss SLA on 18% of new accounts.

The translation rule: if your number doesn't end in dollars, deals, or people, translate again. The exec's brain runs on those three units. Probability scores, lift percentages, and information gain don't trigger budget conversations. Dollars do.
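The translation itself is usually a few lines of arithmetic. The Python sketch below shows one way it might look for a churn model. The `Account` fields, the 0.7 risk cutoff, the $100K size cutoff, and the four accounts are all assumptions invented for illustration, not output from a real model.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    arr: float          # annual recurring revenue, in dollars
    churn_score: float  # model propensity, between 0 and 1


def exposure_summary(accounts, risk_threshold=0.7, large_arr=100_000):
    """Turn per-account scores into dollars and account counts."""
    at_risk = [a for a in accounts if a.churn_score >= risk_threshold]
    large = [a for a in at_risk if a.arr > large_arr]
    return {
        "accounts_at_risk": len(at_risk),
        "arr_exposed": sum(a.arr for a in at_risk),
        "large_accounts_at_risk": len(large),
    }


# Made-up accounts, for illustration only.
accounts = [
    Account("A", 250_000, 0.81),
    Account("B", 120_000, 0.74),
    Account("C", 40_000, 0.90),
    Account("D", 300_000, 0.35),
]

summary = exposure_summary(accounts)
headline = (
    f"${summary['arr_exposed'] / 1e6:.2f}M ARR exposed across "
    f"{summary['accounts_at_risk']} accounts; "
    f"{summary['large_accounts_at_risk']} are over $100K."
)
print(headline)
```

Where you set the risk cutoff is a business decision as much as a modeling one, so agree on it with the team that will act on the list.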

A useful worksheet I make my team fill out before any exec presentation:

| Model output | What it means in plain English | Dollars | Deals/Accounts | Headcount/Hours |
|---|---|---|---|---|
| 0.73 churn propensity, top 12 accounts | These 12 accounts are most likely to leave in the next 90 days | $4.2M ARR | 12 accounts, $1.7M concentrated in top 5 | 1 CSM × 6 weeks of save-motion work |
| 4% pricing lift, p<0.05 | The new pricing tier outperforms the control by 4% | +$2.1M ARR over 12 months | 340 deals affected at current run-rate | 0 incremental headcount, 1 PM week to roll out |

If you can't fill in the right-hand columns, you don't have an executive finding. You have a research output. They're not the same thing.
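If your team keeps findings in a notebook or a repo, the worksheet can double as a lightweight gate. A minimal Python sketch, assuming a hypothetical `WorksheetRow` structure whose fields mirror the worksheet columns (the class and method names are mine, not a standard):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorksheetRow:
    model_output: str
    plain_english: str
    dollars: Optional[str] = None
    deals_or_accounts: Optional[str] = None
    headcount_or_hours: Optional[str] = None

    def missing_columns(self) -> List[str]:
        """Names of the business columns that are still empty."""
        business = {
            "dollars": self.dollars,
            "deals_or_accounts": self.deals_or_accounts,
            "headcount_or_hours": self.headcount_or_hours,
        }
        return [name for name, value in business.items() if not value]

    def is_exec_ready(self) -> bool:
        """A row is an executive finding only when every column is filled."""
        return not self.missing_columns()


# First row of the worksheet, fully translated.
churn_row = WorksheetRow(
    model_output="0.73 churn propensity, top 12 accounts",
    plain_english="These 12 accounts are most likely to leave in the next 90 days",
    dollars="$4.2M ARR",
    deals_or_accounts="12 accounts, $1.7M concentrated in top 5",
    headcount_or_hours="1 CSM x 6 weeks of save-motion work",
)

# A research output that has not been translated yet.
research_row = WorksheetRow(
    model_output="AUC 0.87",
    plain_english="The model ranks churners well",
)

print(churn_row.is_exec_ready(), research_row.missing_columns())
```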

Stakeholder pre-reads

The single highest-leverage habit I've adopted in five years of DS work: send a one-page pre-read 24 hours before the meeting.

Not the deck. One page. Three bullets. The decision you're asking for.

Format I use:

Subject: [Decision needed] Q2 Churn Risk — recommend we move on top 12 accounts this week

What changed:
- $4.2M ARR at high churn risk in Q2, up from $2.8M last quarter
- 12 accounts over $100K, concentrated in 5 customers
- Top driver: support ticket velocity (3x increase in last 30 days)

What I'm asking for:
- Approve CSM save-motion outreach to top 12 this week
- Approve $40K budget for retention discount package on top 5
- Decision by EOD Wednesday so motion starts Thursday

What I'll bring to the meeting:
- The 12 accounts, scored
- The save-motion playbook draft
- The 90-day projected outcome with and without action

Three things this does:

  1. The exec walks into the meeting already 80% of the way to a decision. They've had time to think, to pull in their team's read, to come back with a sharper question.
  2. You discover objections in writing, asynchronously, where you can address them calmly. Not on the spot, where you'll fumble.
  3. If the meeting gets canceled (and in my experience, one in three executive meetings does), the decision still gets made. The pre-read is the artifact. The meeting is just the ratification.

The mistake I see DS ICs make: they send the deck as the pre-read. The exec opens slide 1, sees a confusion matrix, and closes the email. Don't send the deck. Send the page. The deck is the appendix.

What to leave out

Things that should not be on the slide for an exec audience:

  • ROC curves
  • Confusion matrices
  • Feature importance plots (unless one feature is the entire story and you're naming it)
  • Hyperparameter tables
  • Anything with "log-loss," "perplexity," "KL divergence," or "MAP" in the title
  • Cross-validation fold breakdowns
  • Loss curves
  • The architecture diagram of your model
  • A list of libraries you used

If the exec asks, you have all of this in the appendix. You should be able to flip to slide 18 and answer "yes, AUC is 0.87 under 5-fold cross-validation." Don't volunteer it.

The bar is brutal but fair: if a slide doesn't help the exec decide, it doesn't go in the front of the deck. The model artifacts go to peer DS review, which is a different meeting with a different audience. Conflating those audiences is how 30-slide decks get built that nobody reads.

I'll go further: if you find yourself wanting to show the ROC curve, ask why. Usually it's because you're proud of the model. That's fair, you should be. But pride doesn't drive decisions. The headline number does. Show the headline. The ROC curve is for your DS lead.

The "we built a model that doesn't change anything" trap

The worst outcome in DS work isn't a wrong model. It's a correct model that nobody acts on.

I've watched teams ship beautiful churn models that sit in a Looker dashboard for a year because nobody ever defined what the CSM team should do with them. I've seen lead-scoring models with 0.91 AUC that produce zero incremental closed-won, because routing logic never changed. The model was right. The action loop was missing.

This is the trap, and it's set before the first line of code is written.

The fix is upstream of communication. Before you start the project, write down the answer to this question: "if the model worked perfectly, what would change in the business on Monday?"

If you can't answer that in one sentence, do not start the project. Go back to the stakeholder. Ask them. If they can't answer it either, the project isn't ready.

A good answer: "The CSM team would have a ranked list of at-risk accounts every Monday and would call the top 10 that week."

A bad answer: "We'd understand churn better."

Understanding is not an action. It's a feeling. Nobody got promoted for understanding churn better. People get promoted for retaining $4M of ARR by calling 12 accounts.

Once you have the answer, the rest of the project becomes easier:

  • The output format is dictated by the action ("a ranked list of accounts")
  • The cadence is dictated by the workflow ("every Monday")
  • The success metric is dictated by the business outcome ("ARR retained vs control")
  • The communication strategy is dictated by who acts on it ("the CSM team, weekly, in their existing pipeline review")
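To make the first and third bullets concrete, here is a rough Python sketch of a Monday call list and an "ARR retained vs control" comparison. Everything in it is an assumption for illustration: the function names, ranking by churn score times ARR (one reasonable choice among several), and the toy numbers.

```python
def monday_call_list(accounts, top_n=10):
    """accounts: list of (name, churn_score, arr) tuples.
    Ranks by expected ARR at risk (score * ARR), an assumed criterion."""
    ranked = sorted(accounts, key=lambda a: a[1] * a[2], reverse=True)
    return [name for name, _, _ in ranked[:top_n]]


def retained_arr_rate(group):
    """group: list of (starting_arr, was_retained) tuples.
    Returns the share of starting ARR that was retained."""
    total = sum(arr for arr, _ in group)
    kept = sum(arr for arr, retained in group if retained)
    return kept / total if total else 0.0


# Toy data, for illustration only.
accounts = [("A", 0.8, 100_000), ("B", 0.5, 300_000), ("C", 0.9, 20_000)]
call_list = monday_call_list(accounts, top_n=2)

# Accounts the CSM team called vs a held-out control group.
called = [(100_000, True), (200_000, True), (100_000, False)]
control = [(100_000, True), (100_000, False), (200_000, False)]
uplift = retained_arr_rate(called) - retained_arr_rate(control)

print(call_list, uplift)
```

The comparison only means something if the control group was held out before outreach started, so set it up at the same time you define the action.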

If the action loop is real, the communication writes itself. The slide says: "here's the list. Call the top 10. We'll measure retention in 90 days."

If the action loop is missing, no amount of slide polish will save you. The model goes into the drawer. The exec stops attending your meetings. The next DS hire inherits the same model and the same dashboard, and the cycle repeats.

This is the hardest discipline in IC DS work. Almost nobody teaches it in school. The technical bar is the easy bar. The action-loop bar is what separates DS ICs who get promoted from those who don't.
