We had 1.3 million Contact records that needed updating in Dataverse.
Not 1,300.
Not 13,000.
1.3 million.
At that scale, you stop asking “Can we do this in the UI?”
You start asking, “What is the safest way to do this once?”
For this project, the answer was a console app.
1. What This Tool Is For
A console app is a controlled, repeatable way to perform large-scale data updates directly against Dataverse using the SDK.
It’s built for:
- Bulk updates
- High-volume migrations
- One-time correction jobs
- Situations where UI tools and flows would struggle, throttle, or time out
It’s not flashy. It’s not point-and-click.
It’s controlled.
2. When to Use It (And When Not To)
Use a console app when:
- You’re updating hundreds of thousands or millions of records
- You need precise mapping logic (e.g., converting old option set values into a new structured field)
- You need the ability to safely rerun the job if something interrupts it
- You want full control over batching and throttling
- You need to update inactive records
- You cannot afford partial or inconsistent updates
The inactive-records point deserves emphasis: many UI-based tools won’t allow edits to inactive records. The SDK will.
Do not use a console app when:
- You’re updating a few hundred records
- The logic is simple and event-driven
- The business needs ongoing automation rather than a one-time correction
If the number makes you pause when you say it out loud, a console app should at least be considered.
3. Pre-Flight Checklist (Before You Click Run)
This is the part that protects you.
Environment Safety
- ✅ Confirm the environment URL is correct
- ✅ Confirm the application user exists in that environment
- ✅ Confirm the application user has read + write access to the table
- ✅ Print the connected organization URL at runtime to verify
Never assume. Always print the org URL.
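A minimal sketch of that verification step, assuming the `Microsoft.PowerPlatform.Dataverse.Client` package and a connection string stored in an environment variable (the variable name is a placeholder for your own setup):

```csharp
using Microsoft.PowerPlatform.Dataverse.Client;

var connectionString = Environment.GetEnvironmentVariable("DATAVERSE_CONNECTION_STRING");
using var client = new ServiceClient(connectionString);

if (!client.IsReady)
{
    Console.WriteLine($"Connection failed: {client.LastError}");
    return;
}

// Print the org you are ACTUALLY connected to -- read it before proceeding.
Console.WriteLine($"Connected to: {client.ConnectedOrgUriActual} ({client.ConnectedOrgFriendlyName})");
Console.WriteLine("Press Enter to continue, Ctrl+C to abort...");
Console.ReadLine();
```

The `ReadLine()` pause is deliberate: it forces a human to read the URL before any write happens.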
Data Safety
- ✅ Clearly define your update conditions in plain English
- ✅ Confirm you are not unintentionally overwriting populated data
- ✅ Create a view that shows exactly how many records match your intended update criteria
- ✅ Validate your mapping logic separately before bulk execution
If you cannot describe your update rule in one sentence, don’t run it yet.
Operational Readiness
- ✅ Choose a reasonable page size (e.g., 500)
- ✅ Understand it may run for hours
- ✅ Confirm the logic is idempotent (safe to rerun)
Future You will thank you for that last one.
4. Step-by-Step (Using a “Test Small First” Pattern)
Here’s the responsible pattern.
Step 1 — Write Safe Update Logic
Example logic in plain terms:
- Only retrieve records where:
- Old field has data
- New field is empty
- Map old values to new values using a dictionary
- Skip records that:
- Already have the new value populated
- Don’t have a valid mapping
Your update rule should be designed so that running it twice does not change anything the second time.
That’s what makes it production-safe.
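Here is one way that logic can look in SDK terms, assuming `using Microsoft.Xrm.Sdk.Query;` and placeholder schema names (`old_source`, `new_source`) and placeholder option set values standing in for your real ones:

```csharp
// Hypothetical mapping from legacy option set values to new grouped values.
var mapping = new Dictionary<int, int>
{
    { 100000000, 200000000 },  // e.g., "Web"        -> "Digital"
    { 100000001, 200000000 },  // e.g., "Social"     -> "Digital"
    { 100000002, 200000001 },  // e.g., "Phone Call" -> "Direct"
};

var query = new QueryExpression("contact")
{
    ColumnSet = new ColumnSet("old_source"),
};

// Only rows that still need work: old field populated, new field empty.
// This filter is what makes reruns idempotent -- updated rows stop matching.
query.Criteria.AddCondition("old_source", ConditionOperator.NotNull);
query.Criteria.AddCondition("new_source", ConditionOperator.Null);
```

Records with no valid mapping are handled in the update loop, not the query, so they can be counted and reported as skips.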
Step 2 — Run in Dry Mode First
Use a toggle like:
```csharp
private const bool DryRun = true;
```
- Log what would be updated
- Do not write anything yet
Confirm:
- The count matches expectations
- The value mappings are correct
- No unexpected values appear
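A sketch of the dry-run pass, reusing the `mapping` dictionary and query from Step 1 (field names remain placeholders; `page` is the `EntityCollection` returned by `RetrieveMultiple`):

```csharp
const bool DryRun = true;  // flip to false only after the dry run checks out

int wouldUpdate = 0, skipped = 0;
foreach (var contact in page.Entities)
{
    var oldValue = contact.GetAttributeValue<OptionSetValue>("old_source")?.Value;
    if (oldValue is null || !mapping.TryGetValue(oldValue.Value, out var newValue))
    {
        skipped++;  // no data or no valid mapping -- leave the record alone
        continue;
    }

    if (DryRun)
    {
        Console.WriteLine($"Would update {contact.Id}: {oldValue} -> {newValue}");
        wouldUpdate++;
        continue;
    }

    var update = new Entity("contact", contact.Id);
    update["new_source"] = new OptionSetValue(newValue);
    service.Update(update);
}

Console.WriteLine($"Dry run: {wouldUpdate} would update, {skipped} skipped.");
```

Sending a new `Entity` containing only the changed column (rather than updating the retrieved entity) keeps the write surface minimal, so nothing else on the record can be overwritten by accident.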
Step 3 — Test on a Small Subset
Temporarily limit your query to:
- A known segment, or
- A small sample (e.g., first 100 records)
Verify:
- Only intended records are updated
- No inactive record errors
- No permission issues
Then remove the test filter.
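One way to cap the sample, assuming the `QueryExpression` from Step 1 (note that `TopCount` cannot be combined with paging, so it must be removed before the full run):

```csharp
// Temporary safety cap for the test pass -- delete before the full run.
query.TopCount = 100;
```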
Step 4 — Run the Full Job
Switch:
```csharp
private const bool DryRun = false;
```
Run the job. At large scale, it will:
- Page through the dataset
- Update only qualifying records
- Skip everything else
If interrupted (network, power, deployment window), rerunning will continue safely because already-updated records no longer match the “new field is empty” condition.
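The paging loop can be sketched like this, using the SDK's standard paging-cookie pattern around the query from Step 1:

```csharp
query.PageInfo = new PagingInfo { Count = 500, PageNumber = 1 };

while (true)
{
    var page = service.RetrieveMultiple(query);

    foreach (var contact in page.Entities)
    {
        // ... Step 1 mapping + update logic goes here ...
    }

    if (!page.MoreRecords)
        break;

    query.PageInfo.PageNumber++;
    query.PageInfo.PagingCookie = page.PagingCookie;
}
```

One subtlety: because updated rows drop out of the filter, pages shift underneath a live run, and cookie-based paging can skip some rows on the first pass. That is fine here: a rerun picks up whatever was skipped, which is exactly the idempotency property this whole pattern relies on.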
5. Common Gotchas (The Stuff That Burns Consultants)
Overwriting Existing Data
If you forget to check whether the new field already contains a value, you can overwrite valid data.
Make overwrites intentional, not accidental.
Scanning Everything Every Time
If your query does not filter properly (for example, it retrieves all records regardless of update state), reruns will be slow and frustrating.
Your filter should naturally exclude already-updated rows.
Running Against the Wrong Environment
Always log the connected environment URL at runtime.
Bulk updates and wrong environments are not a fun combination.
Underestimating Runtime
One update per record across hundreds of thousands of rows will take time.
If performance becomes an issue:
- Use batched requests (e.g., ExecuteMultiple)
- Tune page sizes carefully
- Monitor throttling
Correctness first. Optimization second.
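If batching does become necessary, a sketch with `ExecuteMultipleRequest` (from `Microsoft.Xrm.Sdk.Messages`; `pendingUpdates` is a placeholder for the `Entity` objects your mapping loop produces, and the platform caps each batch at 1,000 requests):

```csharp
var batch = new ExecuteMultipleRequest
{
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = true,   // one bad record should not abort the batch
        ReturnResponses = false,  // skip response payloads for throughput
    },
    Requests = new OrganizationRequestCollection(),
};

foreach (var update in pendingUpdates)
{
    batch.Requests.Add(new UpdateRequest { Target = update });
    if (batch.Requests.Count == 500)
    {
        service.Execute(batch);
        batch.Requests.Clear();
    }
}

if (batch.Requests.Count > 0)
    service.Execute(batch);  // flush the final partial batch
```

With `ContinueOnError = true`, inspect the fault collection on the response if you need per-record failure details; with an idempotent filter, failed records simply remain eligible for the next rerun.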
6. Validation Checklist (How You Prove It Worked)
After the job completes:
In Dataverse
Create a view:
- Old field contains data
- AND New field is empty
If the count is zero, the migration completed successfully.
If not, investigate before re-running blindly.
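The same check can be automated from the console app itself, reusing the placeholder field names from earlier (note that `TotalRecordCount` is only accurate up to 5,000; past that, a non-zero value is still enough to tell you the job is not done):

```csharp
// Post-run validation: count rows that still match the update criteria.
var check = new QueryExpression("contact")
{
    ColumnSet = new ColumnSet("contactid"),
};
check.Criteria.AddCondition("old_source", ConditionOperator.NotNull);
check.Criteria.AddCondition("new_source", ConditionOperator.Null);
check.PageInfo = new PagingInfo
{
    Count = 1,
    PageNumber = 1,
    ReturnTotalRecordCount = true,
};

var remaining = service.RetrieveMultiple(check);
Console.WriteLine($"Records still unmigrated: {remaining.TotalRecordCount}");
```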
In Logs
Review:
- Total records scanned
- Total updated
- Total skipped
Those numbers should align with your expected impact.
7. Real Scenario
In our case, we needed to migrate from a legacy “Source” field into a redesigned “Source” field.
There were 1.3 million Contact records.
We needed to:
- Convert old option set values into new grouped categories
- Preserve existing populated values
- Include inactive records in the update
- Ensure the process could safely resume if interrupted
A console app gave us:
- Deterministic mapping logic
- Safe reruns
- Control over update behavior
- Visibility into progress
- The ability to update inactive records through the SDK
The power went out mid-run.
We reran it.
It continued updating only the remaining records because the logic excluded anything already processed.
That’s what a safe migration looks like.
Final Thought
When the dataset gets large enough, the question isn’t:
“Can we do this?”
It’s:
“What is the most controlled, repeatable way to do this without creating new problems?”
Sometimes the most professional solution is the least visible one.
And if you ever find yourself staring at a seven-digit record count, wondering how to move forward —
Now you have a pattern.