
Why We Used a Console App to Update 1.3 Million Records (And When You Should Too)

When your update count hits seven digits, the question stops being “can we do this?” and becomes “what’s the safest way to do this once?” This article explains why a console app built on the Dataverse SDK is often the most controlled option for high-volume updates: it gives you deterministic mapping logic, safe reruns, control over batching and throttling, and the ability to update inactive records. It also walks through a practical “test small first” pattern and the validation steps that prove the job worked.

February 18, 2026

We had 1.3 million Contact records that needed updating in Dataverse.
Not 1,300.
Not 13,000.
1.3 million.

At that scale, you stop asking “Can we do this in the UI?”
You start asking, “What is the safest way to do this once?”
For this project, the answer was a console app.


1. What This Tool Is For

A console app is a controlled, repeatable way to perform large-scale data updates directly against Dataverse using the SDK.

It’s built for:

  • Bulk updates
  • High-volume migrations
  • One-time correction jobs
  • Situations where UI tools and flows would struggle, throttle, or time out

It’s not flashy. It’s not point-and-click.
It’s controlled.


2. When to Use It (And When Not To)

Use a console app when:

  • You’re updating hundreds of thousands or millions of records
  • You need precise mapping logic (e.g., converting old option set values into a new structured field)
  • You need the ability to safely rerun the job if something interrupts it
  • You want full control over batching and throttling
  • You need to update inactive records
  • You cannot afford partial or inconsistent updates

That last one matters. Many UI-based tools won’t allow edits to inactive records. The SDK will.

Do not use a console app when:

  • You’re updating a few hundred records
  • The logic is simple and event-driven
  • The business needs ongoing automation rather than a one-time correction

If the number makes you pause when you say it out loud, a console app should at least be considered.


3. Pre-Flight Checklist (Before You Click Run)

This is the part that protects you.

Environment Safety

  • ✅ Confirm the environment URL is correct
  • ✅ Confirm the application user exists in that environment
  • ✅ Confirm the application user has read + write access to the table
  • ✅ Print the connected organization URL at runtime to verify

Never assume. Always print the org URL.
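A minimal connection sketch, assuming the Microsoft.PowerPlatform.Dataverse.Client NuGet package and a client-secret application user (the URL, client ID, and secret below are placeholders, not real values):

```csharp
using Microsoft.PowerPlatform.Dataverse.Client;

var connectionString =
    "AuthType=ClientSecret;" +
    "Url=https://yourorg.crm.dynamics.com;" +
    "ClientId=<app-registration-id>;" +
    "ClientSecret=<secret>";

using var service = new ServiceClient(connectionString);

if (!service.IsReady)
    throw new InvalidOperationException($"Connection failed: {service.LastError}");

// Print the org you are actually connected to before touching any data.
Console.WriteLine(
    $"Connected to: {service.ConnectedOrgUriActual} ({service.ConnectedOrgFriendlyName})");
```

Refusing to proceed unless `IsReady` is true, and printing `ConnectedOrgUriActual`, turns the “wrong environment” mistake into something you catch in the first second of the run.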

Data Safety

  • ✅ Clearly define your update conditions in plain English
  • ✅ Confirm you are not unintentionally overwriting populated data
  • ✅ Create a view that shows exactly how many records match your intended update criteria
  • ✅ Validate your mapping logic separately before bulk execution

If you cannot describe your update rule in one sentence, don’t run it yet.

Operational Readiness

  • ✅ Choose a reasonable page size (e.g., 500)
  • ✅ Understand it may run for hours
  • ✅ Confirm the logic is idempotent (safe to rerun)

Future You will thank you for that last one.


4. Step-by-Step (Using a “Test Small First” Pattern)

Here’s the responsible pattern.

Step 1 — Write Safe Update Logic

Example logic in plain terms:

  • Only retrieve records where:
    • Old field has data
    • New field is empty
  • Map old values to new values using a dictionary
  • Skip records that:
    • Already have the new value populated
    • Don’t have a valid mapping

Your update rule should be designed so that running it twice does not change anything the second time.
That’s what makes it production-safe.
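Assuming the old and new columns are option sets named `new_sourcelegacy` and `new_sourcecategory` (hypothetical names for illustration), the retrieval filter and mapping table might look like this:

```csharp
using Microsoft.Xrm.Sdk.Query;

var query = new QueryExpression("contact")
{
    ColumnSet = new ColumnSet("new_sourcelegacy", "new_sourcecategory")
};

// Only rows where the old field has data AND the new field is still empty.
// A rerun naturally re-selects nothing that was already migrated.
query.Criteria.AddCondition("new_sourcelegacy", ConditionOperator.NotNull);
query.Criteria.AddCondition("new_sourcecategory", ConditionOperator.Null);

// Deterministic mapping from old option set values to new grouped values
// (the numeric values here are placeholders).
var map = new Dictionary<int, int>
{
    { 100000000, 200000000 }, // e.g. "Web"   -> "Digital"
    { 100000001, 200000000 }, // e.g. "Email" -> "Digital"
    { 100000002, 200000001 }, // e.g. "Event" -> "In Person"
};
```

The idempotency lives in the filter, not in the loop: because updated rows stop matching `new_sourcecategory is null`, the second run finds nothing left to do.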

Step 2 — Run in Dry Mode First

Use a toggle like:

private const bool DryRun = true;

  • Log what would be updated
  • Do not write anything yet

Confirm:

  • The count matches expectations
  • The value mappings are correct
  • No unexpected values appear

Step 3 — Test on a Small Subset

Temporarily limit your query to:

  • A known segment, or
  • A small sample (e.g., first 100 records)

Verify:

  • Only intended records are updated
  • No inactive record errors
  • No permission issues

Then remove the test filter.

Step 4 — Run the Full Job

Switch:

private const bool DryRun = false;

Run the job. At large scale, it will:

  • Page through the dataset
  • Update only qualifying records
  • Skip everything else

If interrupted (network, power, deployment window), rerunning will continue safely because already-updated records no longer match the “new field is empty” condition.
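In SDK terms, the paging loop can be sketched as follows; `ProcessContact` is a placeholder for the map-and-update logic above, and `query` is the filtered `QueryExpression` from Step 1:

```csharp
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

query.PageInfo = new PagingInfo { PageNumber = 1, Count = 500 };

EntityCollection page;
do
{
    page = service.RetrieveMultiple(query);

    foreach (var contact in page.Entities)
        ProcessContact(contact);

    // Advance using the server-supplied cookie, not a manual offset.
    query.PageInfo.PageNumber++;
    query.PageInfo.PagingCookie = page.PagingCookie;
}
while (page.MoreRecords);
```

An interrupted run simply starts again at page 1; because updated rows have dropped out of the filter, the “first page” on a rerun is the next unprocessed slice of data.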


5. Common Gotchas (The Stuff That Burns Consultants)

Overwriting Existing Data

If you forget to check whether the new field already contains a value, you can overwrite valid data.
Make overwrites intentional, not accidental.

Scanning Everything Every Time

If your query does not filter properly (for example, it retrieves all records regardless of update state), reruns will be slow and frustrating.
Your filter should naturally exclude already-updated rows.

Running Against the Wrong Environment

Always log the connected environment URL at runtime.
Bulk updates and wrong environments are not a fun combination.

Underestimating Runtime

One update per record across hundreds of thousands of rows will take time.

If performance becomes an issue:

  • Use batched requests (e.g., ExecuteMultiple)
  • Tune page sizes carefully
  • Monitor throttling
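If batching becomes necessary, a sketch with `ExecuteMultipleRequest` might look like this, where `pendingUpdates` stands in for a page of `Entity` objects built as in Step 1 (note Dataverse caps a single batch at 1,000 requests):

```csharp
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Messages;

var batch = new ExecuteMultipleRequest
{
    Settings = new ExecuteMultipleSettings
    {
        ContinueOnError = true,  // one bad row should not abort the batch
        ReturnResponses = false  // only faults come back, cutting payload size
    },
    Requests = new OrganizationRequestCollection()
};

foreach (var update in pendingUpdates)
    batch.Requests.Add(new UpdateRequest { Target = update });

var response = (ExecuteMultipleResponse)service.Execute(batch);

// With ReturnResponses = false, only faulted items appear here.
foreach (var item in response.Responses)
    if (item.Fault != null)
        Console.WriteLine($"Request {item.RequestIndex} failed: {item.Fault.Message}");
```

With `ContinueOnError` and an idempotent filter, a faulted row just stays in the “not yet migrated” set and gets picked up on the next pass.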

Correctness first. Optimization second.


6. Validation Checklist (How You Prove It Worked)

After the job completes:

In Dataverse

Create a view:

  • Old field contains data
  • AND New field is empty

If the count is zero, the migration completed successfully.
If not, investigate before re-running blindly.
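The same count can be pulled from the console app with an aggregate FetchXML query, using the same hypothetical column names as earlier. One caveat: aggregate queries are capped (50,000 rows by default), so if the migration failed wholesale this query will throw rather than return a huge number:

```csharp
using Microsoft.Xrm.Sdk;
using Microsoft.Xrm.Sdk.Query;

var fetch = @"
<fetch aggregate='true'>
  <entity name='contact'>
    <attribute name='contactid' alias='remaining' aggregate='count' />
    <filter>
      <condition attribute='new_sourcelegacy' operator='not-null' />
      <condition attribute='new_sourcecategory' operator='null' />
    </filter>
  </entity>
</fetch>";

var result = service.RetrieveMultiple(new FetchExpression(fetch));
var remaining = (int)((AliasedValue)result.Entities[0]["remaining"]).Value;
Console.WriteLine($"Records still unmigrated: {remaining}"); // expect 0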

In Logs

Review:

  • Total records scanned
  • Total updated
  • Total skipped

Those numbers should align with your expected impact.


7. Real Scenario

In our case, we needed to migrate from a legacy “Source” field into a redesigned “Source” field.
There were 1.3 million Contact records.

We needed to:

  • Convert old option set values into new grouped categories
  • Preserve existing populated values
  • Include inactive records in the update
  • Ensure the process could safely resume if interrupted

A console app gave us:

  • Deterministic mapping logic
  • Safe reruns
  • Control over update behavior
  • Visibility into progress
  • The ability to update inactive records through the SDK

The power went out mid-run.
We reran it.
It continued updating only the remaining records because the logic excluded anything already processed.

That’s what a safe migration looks like.


Final Thought

When the dataset gets large enough, the question isn’t:
“Can we do this?”

It’s:
“What is the most controlled, repeatable way to do this without creating new problems?”

Sometimes the most professional solution is the least visible one.

And if you ever find yourself staring at a seven-digit record count, wondering how to move forward —
Now you have a pattern.

Tags

#data migration · #console app · #data · #data movement · #metadata