
Lead Scoring Model Decay: Why Your Scoring Model Goes Stale and How to Fix It

[Image: lead scoring model decay over time]

There's a pattern in almost every B2B marketing org that has run lead scoring for more than 18 months. The model launched well. Sales trusted the scores. MQL-to-opportunity conversion looked solid. Then, quietly, things started to slip.

Rejection rates crept up. Sales started hedging on MQL quality. The pipeline-from-marketing number softened. Nobody could point to a single change that caused it. It just... happened.

What happened is model decay.


  • Only 44% of B2B companies review or recalibrate their lead scoring models more than once per year (Forrester Research), meaning the majority of revenue teams are acting on scoring logic that no longer reflects their buyers.
  • Webinar attendance as a behavioral scoring signal can lose up to 60% of its predictive power within 12 months as attendance patterns shift, and most teams don't detect the decay until rejection rates have already climbed (Bizible/Marketo research).
  • B2B revenue teams that use closed-loop data to recalibrate scoring weights report 36% higher accuracy in predicting SQL conversion than teams that rely on MAP default configurations (MarketingProfs).

Model decay is not a bug, not bad data hygiene (though poor hygiene can accelerate it), and not a campaign problem. The scoring model itself drifted out of alignment with how your buyers actually behave today. The weights assigned to behaviors, firmographic attributes, and intent signals were built on data from 18 months ago. Back then, you had a different product, a different ICP, and a different content mix. Your buyers aren't the same. Your model didn't get the memo.

The worst part is that a decayed model gives you false confidence. A model that never existed would at least keep you honest. A model that used to work, but no longer does, generates a steady stream of "qualified" leads that sales quietly stops working. And the dashboard still looks fine.

Why Scoring Models Decay

Decay isn't one thing. It's the accumulation of several small shifts, none of which breaks the model on its own, but all of which degrade it together over time.

Market and ICP drift. Your ICP statement from 18 months ago may no longer describe your best buyers. If you've moved upmarket, added a new vertical, or shifted focus from SMB to mid-market, the firmographic attributes that once predicted fit (company size, industry, tech stack) may need different weights. The model doesn't know your strategy changed. A current shared ICP framework agreed on by both teams is the reference point for every weight recalibration.

Product and messaging changes. New features create new use cases. New use cases attract new personas. If your product added a project management layer last year, you're now attracting ops leaders who weren't in your original ICP model. Their behavioral patterns look different from the demand-gen manager the model was built for.

Behavioral signal drift. Channels and formats that drove intent signals 18 months ago may no longer carry the same meaning. Webinar attendance used to be a strong mid-funnel signal. Now, with virtual event fatigue, someone sitting through a webinar may be less engaged than someone who read three specific blog posts. If your scoring model still weights webinar attendance at 15 points, it's overstating intent for a lot of leads. Forrester's analysis of scoring model failure identifies signal drift as the most common but least-diagnosed cause of declining model accuracy.

Data quality degradation. Records go stale. Fields stop being populated. Integrations break. A score that depended on a tech stack data feed stops updating when that enrichment service lapses. You're now scoring on blank fields.

Scoring weight inertia. Nobody recalibrates because nothing is visibly broken. The model keeps running. Reports keep generating numbers. It takes a deliberate intervention to notice that the numbers are less meaningful than they used to be.

Key Facts: Lead Scoring Model Reliability

  • Only 44% of B2B companies review or recalibrate their lead scoring models more than once per year, according to Forrester Research, meaning the majority are operating on stale logic.
  • Companies that implement quarterly scoring audits see 20-30% improvement in MQL-to-pipeline rates compared to annual-review teams, per Demand Gen Report survey data.
  • Revenue teams that use closed-loop data to recalibrate scoring weights report 36% higher accuracy in predicting SQL conversion than teams that rely on MAP default scoring configurations (MarketingProfs).

Signal Decay vs. Model Decay

These are related but different problems, and they require different fixes.

Signal decay is when a specific data point loses predictive power. The signal still fires (webinar attendance is still being tracked, content downloads still get scored) but the correlation between that signal and eventual close rate has weakened. You'd find this by running a correlation analysis between individual scoring attributes and closed-won outcomes. If webinar attendance used to correlate with close at r=0.4 and now correlates at r=0.1, that signal has decayed.
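As a minimal sketch of that check, the snippet below computes the correlation between each behavioral signal and closed-won outcome, assuming a hypothetical export leads.csv with 0/1 columns per signal and a 0/1 closed_won column; the file and column names are illustrative, not a prescribed schema.

```python
# Minimal sketch: correlation between behavioral signals and closed-won outcomes.
# Assumes a hypothetical export "leads.csv" with 0/1 columns per signal and a
# 0/1 "closed_won" outcome column; adjust names to your actual export.
import pandas as pd

leads = pd.read_csv("leads.csv")

for signal in ["attended_webinar", "downloaded_whitepaper", "visited_pricing_page"]:
    # Pearson correlation on two 0/1 columns gives the r value referenced above.
    r = leads[signal].corr(leads["closed_won"])
    print(f"{signal}: r = {r:.2f}")
```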

Model decay is when the overall model's ability to predict pipeline and close rate has degraded, even if no single signal has failed dramatically. The model's aggregate score stops being predictive. A score of 80 used to mean a 40% close rate; now it means 22%. The whole system has drifted, not just one attribute.

Signal decay is fixable by removing or reweighting individual attributes. Model decay often requires a fuller rebuild: go back to closed-won data and rerun the correlation analysis from scratch. But you can catch both with the same quarterly monitoring routine. The question is what to watch.

How to Detect Decay Early

Don't wait for someone to complain. Build detection into your reporting cadence.

Rejection rate trend. Track the percentage of MQLs that sales rejects each month. A rejection rate above 25-30% is a yellow flag. Above 40% is a red flag. If the rejection rate is climbing quarter over quarter, that's the clearest early signal that what marketing is calling "qualified" no longer matches what sales recognizes as sales-ready. This is model decay showing up in human behavior.

MQL-to-opportunity conversion rate drop. Pull a rolling 90-day conversion rate for MQL-to-opportunity. If it's falling without a corresponding drop in lead volume or a change in lead source mix, the model is degrading. The leads look qualified by score but aren't converting at the rate they used to. Cross-check against your MQL-to-SQL score thresholds: a threshold that was calibrated on older data can quietly inflate conversion at the top while the real pipeline quality erodes.

High-score leads not converting. Segment your MQLs into score buckets (80-100, 60-79, 40-59) and track conversion rates by bucket. If your 80-100 bucket used to convert at 35% and is now converting at 18%, the model has lost its discriminatory power at the high end. Either the weights are wrong, or the threshold was set too low relative to the current ICP.
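Here is a rough sketch of the bucket check, assuming a hypothetical export mqls.csv with a 0-100 lead_score column and a 0/1 became_opportunity flag; adapt the names to whatever your CRM actually exports.

```python
# Sketch: MQL-to-opportunity conversion rate by score bucket.
# Assumes a hypothetical export "mqls.csv" with "lead_score" (0-100) and
# "became_opportunity" (0/1) columns.
import pandas as pd

mqls = pd.read_csv("mqls.csv")

# Same buckets as above: 40-59, 60-79, 80-100.
mqls["bucket"] = pd.cut(
    mqls["lead_score"],
    bins=[40, 60, 80, 101],
    labels=["40-59", "60-79", "80-100"],
    right=False,
)

# Compare these against the prior quarter's numbers for the same buckets.
print(mqls.groupby("bucket", observed=True)["became_opportunity"].mean().round(2))
```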

Sales feedback loops flagging quality. Listen to what sales says in the weekly lead quality call, specifically which MQL sources and attributes they're consistently rejecting. If they're saying "all the webinar leads are junk" or "the financial services leads never convert," that's signal decay hiding in anecdote form.

Decay Indicator | Yellow Flag | Red Flag
MQL rejection rate | 25-30% | 40%+
MQL-to-opportunity conversion drop | 15% decline vs. prior quarter | 30%+ decline
High-score (80+) conversion rate | Below model expectation | Below mid-score bucket conversion
Sales team feedback | Sporadic complaints | Systematic rejection of specific signals
Score-to-close correlation | Weakening but positive | No meaningful correlation

The 90-Day Decay Audit: A Quarterly Checklist

The 90-Day Decay Audit is a structured quarterly process for detecting and correcting lead scoring model drift before it becomes a pipeline crisis. The audit has five steps: (1) correlation check: for each scoring dimension, calculate the win-rate difference between leads that triggered the signal versus those that didn't; (2) signal pruning: remove or reweight signals where the win-rate difference is under 5 percentage points; (3) signal addition: test new behavioral signals that now have enough historical data; (4) ICP sync: confirm that firmographic weights still reflect the current shared ICP definition; (5) pipeline health: verify that the fields feeding the model are still being populated. Quarterly is the right cadence for most teams: often enough to catch drift early, not so frequent that it drains marketing ops bandwidth.

Check correlation of each scoring dimension against recent closed-won data.

Export closed-won deals from the past quarter. For each scoring dimension (industry, company size, job title, page visits, content downloads, email engagement, demo requests, trial starts) calculate the win rate for leads that triggered that signal versus those that didn't. If the win rate difference is less than 5 percentage points, the signal isn't doing meaningful work.
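One way to run this check is a short script over the export. The sketch below assumes a hypothetical per-lead file closed_period.csv with 0/1 columns per signal and a 0/1 won outcome column, and applies the 5-percentage-point threshold from above; the signal names are placeholders.

```python
# Sketch of the step-one correlation check: win-rate lift per scoring signal.
# Assumes a hypothetical export "closed_period.csv" with one row per lead,
# 0/1 columns for each signal, and a 0/1 "won" outcome column.
import pandas as pd

df = pd.read_csv("closed_period.csv")

for signal in ["demo_request", "trial_start", "webinar_attendance", "whitepaper_download"]:
    win_with = df.loc[df[signal] == 1, "won"].mean()
    win_without = df.loc[df[signal] == 0, "won"].mean()
    lift = (win_with - win_without) * 100  # percentage points
    verdict = "keep" if lift >= 5 else "prune or reweight"
    print(f"{signal}: {lift:+.1f} pts -> {verdict}")
```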

Remove or reweight signals that no longer predict.

If "whitepaper download" has essentially the same win rate as no download, it's adding noise, not signal. Either remove it or drop it to 1-2 points. Counterintuitively, a simpler model with fewer attributes often outperforms a complex one, because the high-signal attributes aren't diluted by low-signal noise.

Add new signals that emerged.

New content types, new product actions, new intent data sources: any behavioral signal you started tracking in the past 6-12 months may now have enough historical data to evaluate for predictive value. Run the same correlation analysis and add it to the model if it clears the threshold.

Review firmographic weighting against your current ICP definition.

If sales leadership updated the ICP in the past year (new target verticals, new headcount ranges, new revenue tiers) your firmographic scores should reflect that. If they don't, you're generating high fit scores for companies that sales will never prioritize. The lead qualification frameworks your team uses for opportunity stage entry are a useful cross-check: if the criteria there have changed but your scoring weights haven't, they're out of sync.

Check data pipeline health.

Audit the fields that feed your scoring model. Are they being populated at the same rate as six months ago? If a field that used to be 80% populated is now 40% populated, the signals dependent on it are effectively halved. Fix the data pipeline before recalibrating the weights.
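A quick way to check is to compare fill rates across two exports, this quarter's and one from roughly six months ago. The sketch below assumes hypothetical file and column names; use whatever your exports actually contain.

```python
# Sketch: fill-rate comparison for the fields that feed the scoring model.
# File and column names are assumptions about your export format.
import pandas as pd

current = pd.read_csv("scoring_fields_current.csv")
baseline = pd.read_csv("scoring_fields_6mo_ago.csv")

for field in ["industry", "employee_count", "tech_stack", "job_title"]:
    now = current[field].notna().mean()
    then = baseline[field].notna().mean()
    note = "  <- investigate the pipeline" if now < then - 0.2 else ""
    print(f"{field}: {then:.0%} -> {now:.0%}{note}")
```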

Time-Decay Rules for Behavioral Signals

Behavioral signals have a natural shelf life. A lead who visited your pricing page 14 months ago and hasn't been back since is not the same as a lead who visited yesterday. Your model should reflect that.

Half-life logic for engagement scores.

A practical implementation: behavioral scores decay by 50% every 90 days. If a lead earned 20 points for a pricing page visit, after 90 days that signal is worth 10 points, after 180 days it's worth 5, and after a year it's essentially zero. This keeps recent engagement weighted appropriately without requiring manual resets.
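As a sketch, the half-life rule itself is a one-line formula; the function below implements the 90-day default, with illustrative dates and values.

```python
# Sketch of the 90-day half-life rule: a behavioral score halves every 90 days.
from datetime import date

def decayed_score(points: float, event_date: date, as_of: date,
                  half_life_days: int = 90) -> float:
    """Current value of a behavioral score under half-life decay."""
    age_days = (as_of - event_date).days
    return points * 0.5 ** (age_days / half_life_days)

# A 20-point pricing page visit is worth ~10 after 90 days and ~5 after 180.
print(round(decayed_score(20, date(2024, 1, 1), date(2024, 6, 29)), 1))  # 5.0
```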

Most mature MAPs (Marketo, HubSpot, Pardot) support time-decay scoring natively. In Marketo, you can set score decay rules on a schedule. In HubSpot, you can use workflows to subtract points from behavioral scores on a rolling basis. If your MAP doesn't support it natively, a monthly batch job that reduces behavioral scores by a fixed percentage is a workable substitute.

Hard expiry for time-sensitive signals.

Some signals should expire entirely, not decay gradually. Trial start dates, event attendance, pricing page visits during a specific campaign window: these have a clear recency threshold. After 30 or 60 days, they no longer indicate current intent. Build hard resets into your MAP so that a trial start from 9 months ago doesn't keep inflating the lead's score indefinitely.

Example time-decay schedule:

Signal Type | Half-Life | Hard Expiry
Demo request | N/A (route immediately) | 7 days if not followed up
Pricing page visit | 30 days | 60 days
Product feature page | 45 days | 90 days
Webinar attendance | 30 days | 60 days
Content download | 60 days | 120 days
Email open/click | 30 days | 45 days
Trial / product-led signal | 14 days | 30 days
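One way to encode the schedule above is a lookup of (half-life, hard expiry) pairs applied in a single function; the sketch below uses illustrative signal keys and point values, not MAP field names.

```python
# Sketch combining half-life decay with hard expiry, following the schedule above.
from datetime import date

# signal -> (half_life_days, hard_expiry_days)
DECAY_SCHEDULE = {
    "pricing_page_visit": (30, 60),
    "product_feature_page": (45, 90),
    "webinar_attendance": (30, 60),
    "content_download": (60, 120),
    "email_click": (30, 45),
    "trial_start": (14, 30),
}

def current_value(signal: str, points: float, event_date: date, as_of: date) -> float:
    half_life, expiry = DECAY_SCHEDULE[signal]
    age = (as_of - event_date).days
    if age > expiry:
        return 0.0  # hard expiry: the signal no longer indicates current intent
    return points * 0.5 ** (age / half_life)

# A 15-point webinar attendance from 45 days ago is worth ~5.3; past 60 days it drops to 0.
print(round(current_value("webinar_attendance", 15, date(2024, 3, 1), date(2024, 4, 15)), 1))
```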

Keeping the Model Lean

The temptation when something isn't working is to add more signals. If 10 attributes aren't giving us accurate scores, surely 20 will.

But the opposite is usually true. More attributes mean more noise, more maintenance burden, and more ways for the model to produce a high score for a lead that isn't actually ready. The best scoring models tend to have 5-8 high-signal attributes, not 25 medium-signal ones.

The lean model test: can you explain your scoring model to a sales rep in two minutes? If yes, it's probably the right size. If explaining it requires a whiteboard and a half hour, it's too complex and nobody will trust it, least of all sales. Wikipedia's overview of lead scoring methodology describes how the most durable models stay focused on a small set of explicit and implicit signals rather than accumulating every available data point.

Governance: Who Owns Recalibration

Model maintenance fails when ownership is unclear. Marketing ops usually maintains the MAP and the scoring logic. But the decision about what to weight and how to weight it requires input from sales leadership, specifically whoever can validate what closed-won actually looks like.

Recommended governance structure:

  • Marketing ops owns the quarterly audit process and the technical implementation of changes.
  • Marketing leadership signs off on ICP-related firmographic weight changes.
  • Sales leadership validates changes to behavioral signals through a 30-day pilot before full rollout.
  • Both teams review the results of each audit together, not in separate meetings but in the same room (or the same video call).

Document every weight change with a timestamp, the rationale, and the data that supported it. This creates an audit trail. When someone challenges the model 8 months from now, you can show what changed and why.

Tools and Implementation Notes

Marketo: Supports time-decay scoring via Smart Campaigns on a recurring schedule. Use the "Change Score" action with a negative value. Set up a weekly batch that subtracts a percentage of behavioral scores. The Score field history view lets you audit when and why scores changed.

HubSpot: Manual score decay is less native. Use a workflow with criteria-based triggers to subtract score values when engagement date fields exceed a threshold. HubSpot's predictive lead scoring feature (Enterprise tier) incorporates some auto-decay logic, but the manual model remains more transparent.

Pardot/Marketing Cloud Account Engagement: Score decay rules are available but limited. Pardot's built-in scoring automation is useful for adding points; subtracting them on a schedule requires more custom workflow logic.

If your MAP doesn't support native decay: Run a monthly export of all leads with behavioral scores, apply a fixed reduction (e.g., 20% for scores older than 60 days), and import the updated values. It's manual, but it works. Build the process into your monthly ops calendar so it doesn't get skipped.
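Here is a minimal sketch of that monthly job, assuming a hypothetical export with behavior_score and last_activity_date columns; field names will differ by MAP.

```python
# Sketch of the manual fallback: flat 20% reduction on behavioral scores older
# than 60 days, written back out for re-import. File and column names are assumptions.
import pandas as pd

leads = pd.read_csv("behavior_scores_export.csv", parse_dates=["last_activity_date"])

cutoff = pd.Timestamp.today() - pd.Timedelta(days=60)
stale = leads["last_activity_date"] < cutoff

leads.loc[stale, "behavior_score"] = (leads.loc[stale, "behavior_score"] * 0.8).round()
leads.to_csv("behavior_scores_import.csv", index=False)
print(f"Reduced behavioral scores on {int(stale.sum())} of {len(leads)} leads")
```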

Rework Analysis: Based on the 90-Day Decay Audit framework and industry benchmarks, teams that run quarterly scoring audits using closed-loop data maintain MQL-to-opportunity conversion rates that are 20-30% higher than teams that review annually. The most common finding during an audit is that 2-3 behavioral signals (typically webinar attendance and early-stage content downloads) are no longer carrying meaningful predictive weight. Removing them simplifies the model without reducing accuracy. Rework's lead management platform surfaces rejection rate trends and score-to-close correlations in a single dashboard, making it straightforward to identify decay signals before they become pipeline problems. See rework.com/pricing for current plan details.

The Model That Nobody Trusts

The end state of unmanaged decay isn't a broken model. It's a model that still runs, still produces scores, and still mints MQLs. But sales has quietly stopped believing in it. Reps start doing their own qualification. They ignore the score and judge leads by different criteria. Marketing reports look fine. Pipeline health degrades.

By the time the conversation surfaces, usually at a quarterly business review where someone asks why pipeline generation is down, the model has been unreliable for months. And fixing it then takes six months of data collection and revalidation to get back to credibility.

The quarterly audit prevents this. Fifteen minutes a month on rejection rates, one two-hour session per quarter on correlation analysis and weight review. That's what separates a scoring model that compounds its value from one that quietly becomes infrastructure everyone works around. Forrester's research on lead scoring measurement makes clear that teams without a pre-established baseline metric cannot objectively evaluate whether their model is working, making decay invisible until it becomes a crisis.

Connect this to your closed-loop reporting cadence and your joint lead scoring framework, and you have a system that gets better over time instead of worse.

Frequently Asked Questions

What is lead scoring model decay?

Lead scoring model decay is the gradual degradation of a scoring model's ability to predict sales-readiness. It happens when the weights, signals, and thresholds in the model no longer reflect how current buyers behave. The market has shifted, the ICP has changed, product messaging has evolved, or behavioral signals that used to predict conversion no longer do. A decayed model still produces scores, but those scores have lost their correlation to pipeline and close rate.

How do I detect lead scoring model decay early?

Watch four indicators: (1) MQL rejection rate trend: above 25-30% is a yellow flag, above 40% is red; (2) MQL-to-opportunity conversion rate drop: a falling rate without a volume change suggests the model is degrading; (3) high-score leads not converting: if your 80+ score bucket is converting at the same rate as your 60-79 bucket, the model has lost discriminatory power; (4) systematic sales feedback: if reps consistently reject the same signal type (webinar leads, a specific campaign source), that's decay showing up as anecdote.

How often should a lead scoring model be recalibrated?

Quarterly audits are the standard for most teams. Monthly is unnecessary overhead unless you have high lead volume and a dedicated marketing ops resource. Annual is too infrequent: by the time annual review rolls around, the model may have been quietly misleading for 6-9 months. Build the 90-Day Decay Audit into your marketing ops calendar as a recurring commitment, not a one-time fix.

What's the difference between signal decay and model decay?

Signal decay is when one specific data point loses predictive power (for example, webinar attendance no longer correlates meaningfully with close rate). Model decay is when the model's overall ability to rank leads by conversion probability has degraded, even if no single signal has failed dramatically. Signal decay is fixed by reweighting or removing individual attributes. Model decay often requires a fuller rebuild: going back to closed-won data and rerunning the correlation analysis from scratch.

How do time-decay rules work in lead scoring?

Time-decay rules reduce the point value of behavioral signals as they age. A practical default: behavioral scores decay 50% every 90 days. A pricing page visit worth 20 points at the time of occurrence is worth 10 points after 90 days, 5 points after 180 days, and essentially zero after a year. This keeps recent engagement appropriately weighted without requiring manual resets. Most mature marketing automation platforms (Marketo, HubSpot, Pardot) support time-decay scoring natively or via workflow automation.

What signals should I remove during a scoring audit?

Remove or reweight any signal where the win-rate difference between leads that triggered the signal and those that didn't is less than 5 percentage points. If "whitepaper download" has a win rate of 12% among leads that triggered it and 10% among those that didn't, it's adding noise, not signal. The counterintuitive finding in most scoring audits is that simpler models with 5-8 high-signal attributes outperform complex models with 20+ medium-signal attributes.
