Long-Term Archiving of Legacy CRM Data: What to Keep and How
Apr. 18, 2026
A company migrated away from Salesforce in 2021. The migration went well. The new CRM was working. But nobody made a decision about what to do with the old Salesforce org.
Two years later, they were still paying for Salesforce licenses. Twenty licenses at $150/month each: $36,000 per year for read-only access to historical data that three people in the company had looked at in the past six months. The reason it hadn't been canceled: "We're not sure what's in there that we might need."
A data governance incident in 2023 (a GDPR erasure request that required finding and deleting records across all systems) finally forced the decision. They spent three weeks auditing the Salesforce org, discovered most of it was redundant with the new CRM, exported what wasn't, and canceled. The total cost of waiting: $72,000 in unnecessary license fees, three weeks of IT time, and a GDPR response that took 14 days instead of 2.
The archive decision that should have happened in 2021 happened in 2023. This guide is that decision, made at the right time. It picks up where the post-migration data audit leaves off — once the 72-hour and 30-day audits confirm the new CRM is complete, the source system's role shifts from rollback safety net to historical archive.
What You're Legally Required to Keep
Before deciding what to archive, understand what you're legally required to retain. The answer varies by region, industry, and record type.
GDPR (European Union and UK):
GDPR creates a tension that many companies don't fully resolve: the right to erasure (individuals can request their data be deleted) versus legitimate retention obligations (you may need certain data for legal or contractual reasons). The Wikipedia GDPR article covers the right-to-erasure provisions (Article 17) and the exemptions for legal obligation retention — both of which directly affect what archived CRM data you must be able to delete on request.
In practice, for CRM data:
- You can retain data you have a legitimate business interest in (active customer relationship, contractual obligation)
- You must delete data on erasure request, with limited exceptions
- Retention periods must be defined — "we keep it indefinitely" is not GDPR-compliant
- You must be able to demonstrate that archived legacy data is included in your data map (i.e., if someone submits an erasure request, you must be able to find and delete their records in the archive too)
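The last point is easier to meet if the erasure step is scripted before the first request arrives. The following is a minimal sketch, assuming the archive is a directory of CSV files and that the data subject is identified by an `email` column; the function name `erase_subject` and the column name are hypothetical, not part of any CRM's export format.

```python
import csv
from pathlib import Path


def erase_subject(archive_dir, email, key_column="email"):
    """Remove every row matching `email` from each CSV in archive_dir.

    Returns {file name: rows removed} so the erasure can be logged.
    """
    target = email.strip().lower()
    removed = {}
    for path in sorted(Path(archive_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            fieldnames = reader.fieldnames or []
            rows = list(reader)
        if key_column not in fieldnames:
            continue  # this object has no subject identifier column
        kept = [
            r for r in rows
            if (r[key_column] or "").strip().lower() != target
        ]
        if len(kept) != len(rows):
            # Rewrite the file without the matching rows.
            with path.open("w", newline="", encoding="utf-8") as f:
                writer = csv.DictWriter(f, fieldnames=fieldnames)
                writer.writeheader()
                writer.writerows(kept)
            removed[path.name] = len(rows) - len(kept)
    return removed
```

This only matches rows that carry the email directly. Related objects that reference the contact by ID (activities, notes) would need a second pass keyed on that ID, and the returned counts are worth keeping as evidence that the request was fulfilled.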
US data retention:
No single federal CRM data retention law exists, but sector-specific rules apply:
- Financial services (SEC Rule 17a-4): broker-dealers must retain certain business records for 3 to 6 years, depending on the record type, in a non-erasable format
- Healthcare (HIPAA): if your CRM contains PHI, retention rules are strict and deletion procedures must be documented
- California (CCPA): consumers have deletion rights similar to GDPR; your archive must support those deletion requests
The NIST guidelines on data retention and disposal provide a technology-neutral framework for defining retention periods and secure deletion procedures — useful as a compliance reference regardless of which sector-specific regulations apply to your business.
General commercial records:
Contract-related data (deal terms, signed agreements, customer communications around contract negotiations) often falls under general commercial records retention. For financial records that is commonly around 7 years, though the exact period varies by jurisdiction.
Who owns this decision:
Legal or compliance, with input from IT and RevOps. This is not a unilateral IT call. Get a signed memo from legal or compliance that specifies your retention periods by record type before you archive anything. The record types you're making retention decisions on here overlap directly with the scope defined in handling historical activities, notes, and emails — what you chose to migrate, archive, or discard at export time shapes which categories now need formal retention treatment.
The Retention Decision Framework
With the legal baseline understood, apply a retention policy by record type.
Retention policy template:
| Record type | Recommended retention | Rationale |
|---|---|---|
| Closed-won deals (contract data, deal terms) | 7 years minimum | Commercial records, auditable contracts |
| Closed-lost deals | 3 years | Pipeline analysis, competitive reference |
| Active contacts at migration time | 3 years post-migration | Ongoing business relationship reference |
| Inactive contacts (no engagement in 2+ years) | 1 year post-migration, then delete | No legitimate interest once the relationship has lapsed |
| Activity logs: rep-entered notes | 3 years | Relationship context, dispute resolution |
| Activity logs: system-generated events | 1 year maximum | Low signal; no retention obligation |
| Email metadata (subject, date, sender) | 2 years | Sufficient for most reference needs |
| Email body content | 3 years for deal-related emails | Contract/dispute reference |
| Custom report snapshots | 5 years | Business performance records |
| User access and audit logs | 3-5 years | Security and compliance requirements |
Who decides:
Legal or compliance signs off on the retention periods. RevOps proposes the record types and business rationale. IT owns the implementation (what format, where stored, how accessed).
Document the decision formally. A retention policy doesn't need to be a 50-page document, but it needs to exist as a written record that was reviewed and approved. If a regulator asks about your data retention practices two years from now, "we discussed it in a meeting" is not an answer.
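Once the policy is approved, it helps to encode it as data so deletion dates are computed rather than remembered. The sketch below mirrors the template table; the record-type keys, the `clock_start` parameter (the date the retention clock starts, such as the migration date or deal close date), and the choice of 5 years for audit logs are all assumptions to replace with whatever legal signs off on.

```python
from datetime import date

# Retention periods in years, keyed by hypothetical record-type names.
# Values mirror the template table; replace with the signed-off policy.
RETENTION_YEARS = {
    "closed_won_deal": 7,   # table says "7 years minimum"
    "closed_lost_deal": 3,
    "active_contact": 3,
    "inactive_contact": 1,
    "rep_note": 3,
    "system_event": 1,
    "email_metadata": 2,
    "email_body_deal": 3,
    "report_snapshot": 5,
    "audit_log": 5,         # upper end of the 3-5 year range
}


def delete_after(record_type, clock_start):
    """Return the date on which a record becomes due for deletion."""
    years = RETENTION_YEARS[record_type]
    try:
        return clock_start.replace(year=clock_start.year + years)
    except ValueError:
        # Feb 29 start date landing in a non-leap year.
        return clock_start.replace(year=clock_start.year + years, day=28)


def is_expired(record_type, clock_start, today):
    """True once the retention period for this record has run out."""
    return today >= delete_after(record_type, clock_start)
```

For example, `delete_after("inactive_contact", date(2021, 6, 1))` gives `date(2022, 6, 1)`. A scheduled job can run `is_expired` over the archive and feed the results into the deletion process.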
Choosing an Archive Format
The archive format determines what you can do with the data later. Optimize for: portability (readable without the source system), queryability (can someone search it?), and cost.
Option A: CSV + JSON export
Export all objects as flat CSV files (one file per object) plus any relationship data as JSON or separate join tables. Store with a data dictionary that maps each column to a field name and type.
Pros: Any tool can read it. No vendor lock-in. Easy to audit for erasure requests. Can be opened in Excel, Sheets, or any data tool.
Cons: Querying requires either loading into a database or using spreadsheet tools. Relationships between objects require manual joins.
Best for: Most teams. Simple, durable, portable.
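The data dictionary for Option A does not have to be written entirely by hand. A rough first draft can be generated from the export itself, as in this sketch. It assumes one CSV file per object in a single directory; `build_data_dictionary` and `guess_type` are hypothetical helper names, and the type inference is deliberately crude (integer, number, text, or empty).

```python
import csv
from pathlib import Path


def guess_type(values):
    """Very rough type inference over the non-empty sample values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    for name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "text"


def build_data_dictionary(archive_dir, sample_rows=1000):
    """Map each CSV file to {column name: inferred type}."""
    dictionary = {}
    for path in sorted(Path(archive_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            columns = reader.fieldnames or []
            # Read at most sample_rows rows per file.
            sample = [row for _, row in zip(range(sample_rows), reader)]
        dictionary[path.name] = {
            col: guess_type([(row.get(col) or "") for row in sample])
            for col in columns
        }
    return dictionary
```

The result can be serialized with `json.dump` and stored next to the CSV files. Someone who knows the source CRM should still annotate it with field meanings, since inferred types say nothing about what a column was used for.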
Option B: Database dump
A full backup of the source CRM's underlying database (if the vendor provides one).
Pros: Can be restored to a fresh instance if needed. Preserves all relationships natively.
Cons: Format is often proprietary or version-specific. Restoration requires matching infrastructure. Not human-readable without tooling.
Best for: Teams with IT capability who want the option to restore the source system temporarily (e.g., for a major legal discovery request).
Option C: Cloud data warehouse (BigQuery, Redshift, Snowflake)
Export CRM data into a structured cloud warehouse where it can be queried via SQL.
Pros: Fully queryable. Handles erasure requests via SQL DELETE. Scalable. Can be connected to BI tools.
Cons: Requires cloud infrastructure setup and ongoing maintenance. Overkill for most teams under 100,000 contacts.
Best for: Teams that already use a data warehouse and want CRM history to be part of it.
Option D: SaaS-to-cloud managed export (Salesforce Data Archive, HubSpot archiving tools)
Some vendors offer native archiving that moves older data to cheaper storage tiers while keeping it queryable through the platform.
Pros: Stays within the familiar platform interface. Minimal migration effort.
Cons: Still requires vendor licensing. Doesn't actually decommission the source system. Doesn't solve the "we're paying for a system we don't use" problem.
Best for: Companies that need a short-term archive solution while planning a full decommission.
Archive format comparison:
| Format | Portability | Queryability | Cost | Complexity |
|---|---|---|---|---|
| CSV + JSON | High | Low (manual) | Near zero | Low |
| Database dump | Low (proprietary) | Low (requires restore) | Low | Medium |
| Cloud warehouse | High | High (SQL) | Low-medium (storage plus per-query charges) | Medium |
| SaaS-to-cloud managed | Medium | Medium | Medium (licensing) | Low |
Storage Options and Cost
Once you've chosen an archive format, decide where it lives.
Cold storage (AWS Glacier, Azure Archive, Google Coldline):
- Cost: ~$0.004/GB/month (AWS S3 Glacier Flexible Retrieval; Glacier Deep Archive is closer to $0.001/GB/month)
- Retrieval time: Hours to days
- Best for: Data you might need for legal/compliance but don't expect to access regularly
- Retrieval cost: roughly $0.01-$0.03/GB depending on speed; expedited costs the most, bulk the least
For a 100GB CRM export (typical for a 5-year-old CRM with 100,000 contacts), cold storage costs about $0.40/month. A 5-year archive costs roughly $24 in storage. That's not a typo.
Standard cloud storage (AWS S3, Azure Blob, Google Cloud Storage):
- Cost: ~$0.023/GB/month (AWS S3 Standard)
- Retrieval: Immediate
- Best for: Data you access occasionally (monthly or quarterly)
- Good middle ground between cost and access speed
Cloud data warehouse (BigQuery, Redshift, Snowflake):
- Storage cost: ~$0.02/GB/month (BigQuery)
- Query cost: Additional per TB scanned
- Best for: Data you query regularly (weekly) or need to support self-serve lookups
On-premises:
- Cost: Capital expenditure for hardware + IT maintenance
- Access: Immediate
- Best for: Organizations with existing on-prem infrastructure and strong data sovereignty requirements
- Cons: Maintenance burden; hardware failure risk
For most companies migrating from a mid-market CRM:
Cold storage is the right default for data older than 2 years. Standard cloud storage for data from the last 2 years that might be referenced. Cloud warehouse only if you're already running one and the incremental cost is minimal.
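The per-GB figures in this section can be turned into a quick comparison. This sketch uses the approximate prices quoted above and covers storage only; the tier names and function names are illustrative, and retrieval or query charges are left out.

```python
# Approximate prices in USD per GB per month, taken from the figures
# quoted in this section; check current vendor pricing before budgeting.
PRICE_PER_GB_MONTH = {
    "cold": 0.004,
    "standard": 0.023,
    "warehouse": 0.02,
}


def storage_cost(size_gb, tier, months):
    """Storage-only cost; retrieval and query charges are excluded."""
    return round(size_gb * PRICE_PER_GB_MONTH[tier] * months, 2)


def default_tier(age_years):
    """Section default: cold storage for data older than 2 years."""
    return "cold" if age_years > 2 else "standard"
```

With these numbers, `storage_cost(100, "cold", 60)` returns `24.0`, matching the 5-year estimate above, while the same 100GB in standard storage comes to `138.0` over the same period.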
Building an Access Path for the Sales Team
Reps will ask for old records. Build a process before the first request arrives.
The realistic request volume:
In the first 30 days post-migration, expect 5-15 requests per 50 reps. After 90 days, it drops to 1-3 per month. After 6 months, almost nothing. Plan for high volume early, then steady-state low volume.
Access options, ranked by practicality:
Option 1: Keep source system read-only for 90 days
The simplest approach. Don't decommission immediately. Give reps direct read-only access to the old CRM for 90 days. After that, the request volume drops low enough that a manual lookup process handles it.
Cost: One quarter of license fees. For 20 users at $75/month read-only: $4,500. Worth it for most teams to avoid building a parallel access system. This 90-day read-only period is also what makes rollback planning viable — don't start the decommission process until you're confident the new CRM is fully validated and the rollback window has closed.
Option 2: Submit a request to IT
After the read-only period ends, "submit a request" becomes the access path. Rep emails ops or IT with the contact name and reason. IT queries the archive and responds within 24 hours.
This works at low volume (under 5 requests per month). It doesn't scale if requests stay high.
Option 3: Self-serve archive query (for technical teams)
A lightweight web UI or database query tool that reps can use to search the archive by contact name or email. Requires setup time but reduces IT burden at scale.
Realistic for: Teams with a BI tool already (Tableau, Looker, Metabase) where the archive can be added as a data source.
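For teams without a BI tool, even a small lookup function covers most rep requests. The sketch below assumes the archive has been loaded into a SQLite file with a `contacts` table that has `id`, `name`, and `email` columns; that schema and the function name `search_contacts` are assumptions for illustration.

```python
import sqlite3
from pathlib import Path


def search_contacts(db_path, term, limit=50):
    """Substring search on name or email against a read-only archive."""
    # mode=ro opens the database read-only so lookups cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        pattern = f"%{term}%"
        cur = conn.execute(
            "SELECT id, name, email FROM contacts "
            "WHERE name LIKE ? OR email LIKE ? "
            "ORDER BY name LIMIT ?",
            (pattern, pattern, limit),
        )
        return cur.fetchall()
    finally:
        conn.close()
```

SQLite's `LIKE` is case-insensitive for ASCII text by default, so `search_contacts("archive.db", "acme")` also matches "Acme". Parameterized queries keep rep-supplied search terms from being interpreted as SQL.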
Decommissioning the Source System Cleanly
The decommission is the last step, and the one that creates problems if it's rushed.
Decommission checklist:
- Archive export verified and validated (row counts match source, sample records readable)
- Retention policy signed off by legal/compliance
- All active integrations pointing to the source CRM identified and either redirected or decommissioned
- SSO/SAML configuration updated (remove the old CRM from the identity provider)
- Any email or calendar sync connected to the old CRM disconnected
- API keys and OAuth tokens issued by the old CRM revoked or rotated in connected systems
- Vendor contract cancellation submitted with correct notice period (typically 30 days)
- Internal documentation updated (runbooks, IT docs that reference the old system)
- Data map updated to remove the old CRM and add the archive location (especially important for GDPR compliance documentation, since erasure requests must still reach the archive)
- Final backup taken on the day of cancellation (belt-and-suspenders)
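The first checklist item, verifying row counts against the source, is worth automating. This is a minimal sketch, assuming a CSV archive and a set of per-object record counts noted from the source CRM before export; `verify_archive` and the shape of `expected_counts` are hypothetical.

```python
import csv
from pathlib import Path


def count_rows(csv_path):
    """Count data rows (excluding the header) with the csv parser,
    so quoted multi-line fields are not miscounted."""
    with Path(csv_path).open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return sum(1 for _ in reader)


def verify_archive(archive_dir, expected_counts):
    """Compare per-file row counts with counts recorded from the source.

    expected_counts looks like {"contacts.csv": 1200, ...}.
    Returns mismatches as {file name: (expected, actual)}; actual is
    None when the file is missing from the archive.
    """
    mismatches = {}
    for name, expected in expected_counts.items():
        path = Path(archive_dir) / name
        actual = count_rows(path) if path.exists() else None
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches
```

An empty result means every listed object matched. Counting with the `csv` parser rather than counting lines matters for notes and email bodies, which often contain line breaks inside a single field.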
Cancellation timing:
Most CRM vendors require 30 days' written notice before the billing period ends. Check your contract. Missing the cancellation window by one day can mean another full billing cycle.
Integrations are the most common decommission failure point. An internal tool that still makes API calls to the old CRM will start throwing errors the day after decommission. Audit all integrations before canceling. A simple search for the old CRM's domain name in your code repositories, Zapier workflows, and automation tools often surfaces connections that nobody remembered. This audit also confirms that email and calendar sync has been fully redirected to the new CRM before the legacy system goes offline.
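The repository search described above can be as simple as the following sketch. It assumes the code is checked out locally and that the old CRM is reachable under a known domain (for example the placeholder `oldcrm.example.com`); the function name and the skipped directory names are illustrative. Zapier workflows and other hosted automation tools still need to be checked in their own interfaces.

```python
import os


def find_references(root, needle, max_bytes=2_000_000):
    """Yield (path, line number, line) for each line containing needle."""
    needle_lower = needle.lower()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata and vendored dependencies.
        dirnames[:] = [d for d in dirnames if d not in {".git", "node_modules"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > max_bytes:
                    continue  # skip large binaries and data dumps
                with open(path, encoding="utf-8", errors="ignore") as f:
                    for number, line in enumerate(f, start=1):
                        if needle_lower in line.lower():
                            yield path, number, line.strip()
            except OSError:
                continue  # unreadable file; move on
```

Running `list(find_references("/path/to/repos", "oldcrm.example.com"))` produces a list of hits to review, each with the file and line where the old system is still referenced.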
Common Pitfalls
Archiving in a proprietary format only the vendor can read. If your archive is a Salesforce-formatted export that can only be interpreted by Salesforce's tools, you're paying Salesforce indefinitely. Archive in open formats: CSV, JSON, SQL, or a standard database format. PwC's data governance advisory research recommends vendor-neutral open formats for long-term data retention as part of responsible data governance — ensuring that archived data remains accessible and auditable independent of vendor relationships.
Keeping the source system active past its retention usefulness. The costs accumulate quietly. Set a decommission date before the migration is complete. Put it on the project plan. Treat it as a deliverable.
Not documenting where the archive lives and how to access it. Two years from now, the person who managed the migration might be gone. Document the archive location, access credentials, and query process in a place that survives staff turnover.
Forgetting API keys and integrations tied to the legacy system. Every tool that was connected to the old CRM (email sync, marketing automation, data enrichment) needs to be audited before decommission. One forgotten Zapier workflow that sends a confirmation email to new contacts via the old CRM's API will break the day after cancellation, and it might not be noticed for weeks.
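For the documentation pitfall above, one option is to store a small manifest alongside the archive so that its location, owner, and contents are recorded in the same place as the data. The sketch below assumes a CSV archive; the manifest fields and the `write_manifest` name are hypothetical. It records checksums rather than credentials, on the assumption that access secrets live in a password manager or secrets store that the manifest can point to.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large exports do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(archive_dir, location, owner, created=None):
    """Write MANIFEST.json describing every CSV file in the archive."""
    archive = Path(archive_dir)
    manifest = {
        "created": (created or date.today()).isoformat(),
        "location": location,  # where the archive is stored long term
        "owner": owner,        # team or role, not a single person
        "files": [
            {"name": p.name, "bytes": p.stat().st_size, "sha256": sha256_of(p)}
            for p in sorted(archive.glob("*.csv"))
        ],
    }
    (archive / "MANIFEST.json").write_text(
        json.dumps(manifest, indent=2), encoding="utf-8"
    )
    return manifest
```

The checksums also give a later reader a way to confirm that the files retrieved from storage are the ones that were originally archived.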
What to Do Next
Complete the retention policy document and archive format decision before the 90-day post-migration mark. The window between "migration complete" and "decommission complete" is typically 90-180 days. Use that time well.
The archive connects directly to handling historical activities, notes, and emails, specifically the records you decided to archive rather than migrate. Those records need to be findable in the archive when reps ask for them.
And the post-migration data audit is a good trigger for starting the archive decision: once the 72-hour and 30-day audits confirm the new CRM is complete, the source system's role shifts from "backup in case of rollback" to "archive of historical data." That's the moment to start the decommission process.
For rollback planning, note that once you've started decommissioning the source system, rollback is no longer an option. Don't start the decommission process until you're confident the new CRM is fully validated.
Learn More
- Post-migration data audit: what to verify and when
- Handling historical activities, notes, and emails
- Rollback planning: hope you don't need it
- Preparing your data before you migrate anything
- CRM workflow automation: what to build after the data is clean
- CAC payback and SaaS survival: measuring the cost of a bad migration
