Day 1: “This will be easy, just a data migration.” Day 47: I woke up at 3 AM screaming about BSON arrays. My therapist now knows what denormalization means.
It Started With a Slack Message
Tuesday, 10:47 AM.
Our CTO drops this in #engineering:
“We’re migrating to PostgreSQL. MongoDB isn’t cutting it anymore. Should be straightforward.”
I looked at my coffee. It was still hot. In 30 seconds, my entire week just got destroyed.
“Straightforward,” he said. Like moving 300GB of nested, schema-less, inconsistently structured data from one database to another is “straightforward.”
I’ve been building systems for 8 years. I’ve survived production fires, deleted databases, and that one time we accidentally DDoS’d ourselves. But this migration? This broke me in ways I didn’t know were possible.
Here’s how a “simple data migration” turned into 3 months of hell, 2 genuine mental breakdowns, and one lesson I’ll never forget.
Why We Even Had to Migrate (Spoiler: We F*cked Up)
Let me rewind 2 years.
We were a hot new startup. MongoDB was cool. Everyone used MongoDB. “Schema-less!” they said. “Flexible!” they promised. “Just throw your data in and iterate fast!”
So we did. Boy, did we ever.
Our MongoDB setup looked like this:
// Users collection - seemed fine at first
{
  _id: ObjectId("..."),
  name: "John",
  email: "john@example.com",
  settings: {
    theme: "dark",
    notifications: {
      email: true,
      push: false,
      // Sometimes this field exists, sometimes it doesn't
      sms: true
    }
  },
  // Array of subdocuments - this is where things got messy
  orders: [
    {
      orderId: "ORDER-123",
      items: [
        { productId: "PROD-1", quantity: 2 },
        { productId: "PROD-2", quantity: 1 }
      ],
      // Sometimes we had this field
      shippingAddress: "123 Main St",
      // Sometimes we didn't
      // Sometimes it was an object instead
      // Sometimes it was null but the key existed
      // We had no idea anymore
    }
  ]
}
At first, this was great. Ship fast, iterate, move on.
Then we hit 100,000 users.
Then 500,000.
Then… problems.
What started breaking:

Queries that took 50ms now took 8 seconds
- Nested array lookups destroyed performance
- An index on orders.items.productId? Good luck with that

Data inconsistencies everywhere
- Some users had settings.notifications as an object
- Others had it as a boolean
- A few had it as a string (how?!)
- Nobody knew which version was “correct”

Reporting was impossible
- Business team: “Give us total revenue by product”
- Me: “Sure, just let me write a 200-line aggregation pipeline that might crash MongoDB” (see the sketch after this list)

Transactions didn’t exist (this was pre-MongoDB 4.0)
- We had orders that were paid but never shipped
- Inventory that decremented while the order failed
- Money disappeared into the void
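To show what that pipeline looks like in miniature: even a stripped-down “quantity sold by product” query means unwinding nested arrays across every user document. Here’s a rough Python sketch with pymongo, using the field names from the example document above; the connection string and database name are placeholders, and it only sums quantities because the example documents don’t carry prices. Joining a price source in with $lookup is exactly where these pipelines balloon toward 200 lines.

# Rough sketch (assumes pymongo and the nested "orders" structure shown earlier).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection string
db = client["app"]                                 # hypothetical database name

pipeline = [
    {"$unwind": "$orders"},        # one document per order
    {"$unwind": "$orders.items"},  # one document per order line
    {"$group": {
        "_id": "$orders.items.productId",
        "total_quantity": {"$sum": "$orders.items.quantity"},
    }},
    {"$sort": {"total_quantity": -1}},
]

for row in db.users.aggregate(pipeline, allowDiskUse=True):
    print(row["_id"], row["total_quantity"])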
The final straw? A bug where users could duplicate their orders by clicking fast enough. No transactions, no locks, just chaos.
MongoDB wasn’t the problem. We were the problem. We used it wrong. And now we had to pay the price.
The Plan (That Immediately Fell Apart)
Our plan was simple:
Week 1: Design PostgreSQL schema
Week 2: Write migration scripts
Week 3: Test migration on staging
Week 4: Migrate production
Simple, right?
Here’s what actually happened:
Week 1: Realized our data was so inconsistent we couldn’t even define a schema
Week 2: Spent 5 days just understanding what data we had
Week 3–7: Wrote increasingly desperate data cleaning scripts
Week 8: First test migration took 40 hours and crashed
Week 9–12: Everything was on fire
Mental Breakdown 1: The Schema That Made Me Cry
Designing the PostgreSQL schema should have been easy. Just normalize the data, right?
Wrong.
See, MongoDB lets you be lazy. You don’t think about structure. You just throw data in there and figure it out later.
“Later” had arrived. And we had no idea what structure we’d been throwing.
The orders field alone was a nightmare:
- 67% of users had orders as an array of objects
- 23% had it as a nested object with string keys
- 8% had it as just an order ID string
- 2% had… XML? (HOW?!)
I spent 2 weeks just cataloging the variations. My Excel sheet had 47 tabs. Each tab was a different schema variation we found in production.
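If you ever have to do this audit yourself, don’t do it by hand in Excel. A sampling script gets you most of the way there. This is a rough sketch, assuming pymongo and a users collection shaped like ours; it just tallies the Python type of the orders field across a random sample, which is exactly how the array/object/string/XML split shows up.

# Rough sketch: tally the shape of the "orders" field across a sample of users.
# Assumes pymongo; database and collection names mirror the examples in this post.
from collections import Counter
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection string
users = client["app"]["users"]                     # hypothetical names

shapes = Counter()
for doc in users.aggregate([{"$sample": {"size": 50000}}], allowDiskUse=True):
    value = doc.get("orders", "<missing>")
    shapes[type(value).__name__] += 1

for shape, count in shapes.most_common():
    print(f"{shape}: {count}")
# For our data this split roughly: list 67%, dict 23%, str 10%
# (order-ID strings plus the XML-stored-as-a-string horrors)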
One night, around 11 PM, I was staring at my screen. I’d just found our 48th schema variation — someone had stored an entire HTML invoice as base64 inside the orders.notes field.
I closed my laptop. I sat there. I didn’t move for 20 minutes.
My wife found me. “Are you okay?”
“No,” I said. “I’m not.”
She hugged me. I cried a little. Not gonna lie.
The lesson hit me like a truck: Schema-less doesn’t mean no schema. It means you have infinite schemas. And one day, you’ll have to deal with all of them.
The Migration Script From Hell
After finally defining a schema, I had to write the migration script.
How hard could it be? Read from MongoDB, write to PostgreSQL. Basic ETL, right?
Oh, sweet summer child.
Problem #1: ObjectId to UUID
MongoDB uses ObjectId. PostgreSQL uses… whatever you want, but probably UUID or integers.
Easy conversion? Nope.
# Seemed simple
def convert_object_id(obj_id):
    return str(obj_id)

# Until we realized...
# ObjectIds are 12 bytes (24 hex chars)
# UUIDs are 16 bytes (36 chars with dashes)
# Our foreign keys were now broken
# And nobody could tell which records matched anymore
We ended up creating a mapping table. 300GB of mapping table. For 2 weeks, we ran two databases in parallel, syncing IDs. It was horrible.
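For what it’s worth, the core of a mapping table is small even if the data isn’t. Here’s a sketch of the idea with psycopg2: a deterministic uuid5 per ObjectId plus an id_map table you can query from both sides during the parallel-run period. The table name, namespace, and DSN are illustrative, not our exact production code.

# Sketch: deterministic ObjectId -> UUID mapping stored in an id_map table.
import uuid
import psycopg2

OID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "mongo-objectid")  # arbitrary fixed namespace

def object_id_to_uuid(obj_id) -> uuid.UUID:
    # Same ObjectId always maps to the same UUID, so foreign keys stay
    # consistent even if the migration is re-run.
    return uuid.uuid5(OID_NAMESPACE, str(obj_id))

conn = psycopg2.connect("dbname=app")  # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS id_map (
            mongo_id  char(24) PRIMARY KEY,
            pg_id     uuid NOT NULL UNIQUE
        )
    """)
    # During migration, record every conversion so the parallel-run period
    # can translate IDs in both directions.
    cur.execute(
        "INSERT INTO id_map (mongo_id, pg_id) VALUES (%s, %s) ON CONFLICT DO NOTHING",
        ("507f1f77bcf86cd799439011", str(object_id_to_uuid("507f1f77bcf86cd799439011"))),
    )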
Problem #2: Denormalization
In MongoDB, we had:
{
  userId: "123",
  orders: [
    { orderId: "A", products: [{ id: "P1" }, { id: "P2" }] },
    { orderId: "B", products: [{ id: "P3" }] }
  ]
}
In PostgreSQL, this becomes:
-- 3 tables
users (user_id)
orders (order_id, user_id)
order_items (order_id, product_id)

-- And suddenly...
-- 1 MongoDB document = 50 PostgreSQL rows
Our 300GB MongoDB database exploded into 890GB in PostgreSQL. Nobody saw that coming.
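The flattening itself is conceptually simple, which is what makes the size explosion so sneaky. A minimal sketch, assuming the document shape above and the three tables from the snippet (the real script also had to thread through the ObjectId mapping and every type quirk described below):

# Sketch: flatten one MongoDB user document into rows for users/orders/order_items.
# Field names follow the example document above; error handling is omitted.
def flatten_user(doc):
    user_rows = [(doc["userId"],)]
    order_rows = []
    item_rows = []
    for order in doc.get("orders", []):
        order_rows.append((order["orderId"], doc["userId"]))
        for product in order.get("products", []):
            item_rows.append((order["orderId"], product["id"]))
    return user_rows, order_rows, item_rows

doc = {
    "userId": "123",
    "orders": [
        {"orderId": "A", "products": [{"id": "P1"}, {"id": "P2"}]},
        {"orderId": "B", "products": [{"id": "P3"}]},
    ],
}
print(flatten_user(doc))
# ([('123',)], [('A', '123'), ('B', '123')], [('A', 'P1'), ('A', 'P2'), ('B', 'P3')])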
Problem #3: Nested Arrays
MongoDB: “Store arrays in arrays in arrays! It’s fine!”
PostgreSQL: “lol no”
We had documents like this:
{
  teams: [
    {
      members: [
        {
          roles: ["admin", "editor", "viewer"],
          permissions: {
            canEdit: true,
            resources: ["doc1", "doc2", "doc3"]
          }
        }
      ]
    }
  ]
}
To flatten this into PostgreSQL tables, I created:
- teams
- team_members
- member_roles
- member_permissions
- permission_resources
One MongoDB document → 5 PostgreSQL tables → 200+ rows.
The script to handle this took 2 weeks to write. It was 1,800 lines of Python. It worked… mostly.
Problem #4: Type Inconsistencies
Remember how I said MongoDB is schema-less? This is where it hurts.
// In MongoDB, all of these exist simultaneously
{ age: 25 }             // number
{ age: "25" }           // string
{ age: "twenty-five" }  // oh god why
{ age: null }           // null
{ age: "" }             // empty string
{ age: undefined }      // JavaScript moment
{ }                     // field doesn't exist
My migration script had to handle all of these. For every field. Across 200+ fields.
I wrote more try-catch blocks than actual logic.
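What finally kept it manageable was one coercion helper per target column type, with bad values collected for review instead of crashing mid-run. A minimal sketch for the age mess above (the review-queue approach is how I’d frame it today, not a verbatim copy of our script):

# Sketch: coerce the "age" variants above into int-or-None, logging the rest.
# In the real script there was a helper like this for every target column type.
bad_values = []

def coerce_age(raw, doc_id):
    if raw is None or raw == "":
        return None                    # null / empty string / missing field
    if isinstance(raw, bool):          # bool is a subclass of int -- reject it explicitly
        bad_values.append((doc_id, raw))
        return None
    if isinstance(raw, (int, float)):
        return int(raw)                # number
    if isinstance(raw, str) and raw.strip().isdigit():
        return int(raw)                # "25"
    bad_values.append((doc_id, raw))   # "twenty-five" and friends go to a review queue
    return None

assert coerce_age(25, "a") == 25
assert coerce_age("25", "b") == 25
assert coerce_age("twenty-five", "c") is None and bad_values[-1] == ("c", "twenty-five")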
The Test That Took 40 Hours
First test migration on staging. Click “run.”
I go home. It should finish overnight.
Next morning: Still running.
That afternoon: Still running.
Next day: Still running.
40 hours later: Done.
“Great! Let’s check the data.”
SELECT COUNT(*) FROM users;
-- 487,234
-- Wait... we have 500,000 users in MongoDB

SELECT COUNT(*) FROM orders;
-- 89,123
-- We have 2.3 million orders in MongoDB
Where did 12,766 users go? Where did 2.2 million orders go?
3 days of debugging later:
- Script crashed on emoji in names (Unicode encoding issues)
- Script crashed on dates before 1970 (PostgreSQL doesn’t like that)
- Script crashed on documents over 16MB (BSON limit)
- Script silently skipped 40,000 documents due to a bug in error handling
I wanted to quit. I genuinely considered quitting.
Mental Breakdown #2: The Weekend That Never Ended
Week 9. We’re trying our 5th full migration test.
It’s Saturday morning. I’m supposed to be at a family thing. My phone buzzes.
“Migration crashed again. Server ran out of memory.”
I close my eyes. I breathe. I drive to the office.
The problem:
Our migration script loaded all 300GB into memory before writing to PostgreSQL. Why? Because I’m an idiot who forgot to stream the data.
I spent the entire weekend rewriting it. No sleep Saturday night. 4 hours Sunday. Back to the office Monday looking like a zombie.
My manager: “You look terrible.”
Me: “I feel terrible.”
Manager: “Take a day off.”
Me: “If I stop now, I’ll never come back.”
That week, I:
- Rewrote the entire migration pipeline to stream data
- Added progress tracking (why didn’t I do this from the start?)
- Implemented checkpointing so crashes could resume
- Learned more about PostgreSQL COPY than anyone should
The new script: 400 lines instead of 1,800. Faster. Better. Actually worked.
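For reference, the shape of the rewrite looked roughly like this: read in _id-ordered batches, bulk-load each batch with COPY, and checkpoint the last _id so a crash can resume instead of restarting. A sketch assuming pymongo and psycopg2; the batch size, staging table, and checkpoint file are placeholders, and the real version did the full flattening and coercion per batch.

# Sketch: stream MongoDB in batches, bulk-load with COPY, checkpoint by last _id.
import io
from bson import ObjectId
from pymongo import MongoClient
import psycopg2

BATCH = 5000
mongo = MongoClient("mongodb://localhost:27017")["app"]["users"]  # hypothetical names
pg = psycopg2.connect("dbname=app")                               # hypothetical DSN

def load_checkpoint():
    try:
        return ObjectId(open("checkpoint.txt").read().strip())
    except FileNotFoundError:
        return None

last_id = load_checkpoint()
while True:
    query = {"_id": {"$gt": last_id}} if last_id else {}
    batch = list(mongo.find(query).sort("_id", 1).limit(BATCH))
    if not batch:
        break

    buf = io.StringIO()
    for doc in batch:
        # Real code: flatten + coerce every field here.
        buf.write(f"{doc['_id']}\t{doc.get('email', '')}\n")
    buf.seek(0)

    with pg, pg.cursor() as cur:  # commit per batch
        cur.copy_expert("COPY users_staging (mongo_id, email) FROM STDIN", buf)

    last_id = batch[-1]["_id"]
    with open("checkpoint.txt", "w") as f:  # resume point if the process dies
        f.write(str(last_id))
    print(f"migrated through {last_id}")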
It took me 9 weeks to write what should’ve taken 1 week.
Why? Because I didn’t understand the problem. I thought it was a technical problem. It wasn’t.
It was a data problem. And data problems are always worse than code problems.
The Things That Saved Us
After 10 weeks of suffering, we finally got it working. Here’s what actually helped:
1. Staging Environment That Matched Production
We initially tested on a staging database that was 6 months old. Turns out, production had evolved. A lot.
When we synced staging to a fresh snapshot, we found 30 new data variations we’d missed.
Lesson: Test with real, current production data. Or you’re testing nothing.
2. Incremental Rollout
We didn’t flip a switch. We ran both databases in parallel for 4 weeks:
- Week 1: 5% of reads from PostgreSQL
- Week 2: 20% of reads
- Week 3: 50% of reads + 10% of writes
- Week 4: 100% reads, 50% writes
- Week 5: 100% everything, MongoDB on standby
Found bugs in production that we never saw in testing. This saved our ass.
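The routing doesn’t need a feature-flag platform, either. A deterministic hash on the user ID compared against a rollout percentage is enough to keep each user pinned to one database per step. A sketch of that idea (the constant and function names are mine, not our actual code):

# Sketch: deterministic percentage rollout for reads. A given user stays on
# one database for the whole rollout step instead of flip-flopping per request.
import zlib

READ_ROLLOUT_PCT = 20  # e.g. week 2: 20% of reads served from PostgreSQL

def reads_from_postgres(user_id: str, pct: int = READ_ROLLOUT_PCT) -> bool:
    return zlib.crc32(user_id.encode()) % 100 < pct

for uid in ("user-1", "user-42", "user-999"):
    backend = "postgres" if reads_from_postgres(uid) else "mongo"
    print(f"{uid} -> {backend}")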
3. Fallback Plan
We kept MongoDB running read-only for 2 months after full migration.
Found a bug? Fall back to MongoDB.
Caught 3 critical issues this way. Without fallback, we’d have been screwed.
4. Data Validation Scripts
After every migration, we ran validation:
-- Count everything
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM orders;
SELECT COUNT(*) FROM order_items;

-- Compare totals
-- Check random samples
-- Verify foreign keys
Found mismatches every single time. Fixed them before going live.
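Automating the comparison is worth the hour it takes. A sketch assuming pymongo and psycopg2, with the collection and table names from this migration (note the MongoDB side needs an aggregation just to count orders, because they’re buried in arrays):

# Sketch: compare MongoDB counts against PostgreSQL counts after each run.
from pymongo import MongoClient
import psycopg2

mongo = MongoClient("mongodb://localhost:27017")["app"]  # hypothetical names
pg = psycopg2.connect("dbname=app")                      # hypothetical DSN

with pg, pg.cursor() as cur:
    # users: document count vs row count
    mongo_users = mongo["users"].estimated_document_count()
    cur.execute("SELECT COUNT(*) FROM users")
    pg_users = cur.fetchone()[0]
    print(f"users: mongo={mongo_users} pg={pg_users} diff={mongo_users - pg_users}")

    # orders: nested arrays mean MongoDB needs an aggregation just to count them
    result = list(mongo["users"].aggregate([
        {"$project": {"n": {"$cond": [{"$isArray": ["$orders"]}, {"$size": "$orders"}, 0]}}},
        {"$group": {"_id": None, "total": {"$sum": "$n"}}},
    ]))
    mongo_orders = result[0]["total"] if result else 0
    cur.execute("SELECT COUNT(*) FROM orders")
    pg_orders = cur.fetchone()[0]
    print(f"orders: mongo={mongo_orders} pg={pg_orders} diff={mongo_orders - pg_orders}")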
5. Acceptance That We’d Lose Some Data
Hard truth: We lost data.
- 3,284 documents were so corrupted we couldn’t migrate them
- 147 users had invalid data we couldn’t parse
- Some legacy stuff from 2019 that nobody understood
We documented everything. Notified affected users. Moved on.
Perfect is the enemy of done. We shipped at 99.4% data integrity. Good enough.
What We Learned (The Hard Way)
Lesson 1: Schema-less is a lie
MongoDB isn’t schema-less. You still have a schema. It’s just implicit, inconsistent, and exists only in your code.
One day, you’ll need an actual schema. And you’ll pay for every shortcut you took.
Lesson 2: Migrations are never “just data”
They’re about:
- Business logic that assumes the old structure
- Code that depends on the old behavior
- Users who expect things to work the same way
- All your shortcuts coming back to haunt you
Data is the easy part. Everything else is hell.
Lesson 3: Optimize for reading your data, not writing it
MongoDB made writes easy. Reads became impossible.
PostgreSQL makes reads easy. Writes take more planning.
Which do you do more: write data or read data?
For us, it was 95% reads. We optimized for the 5%. Stupid.
Lesson 4: Your database should enforce rules
MongoDB let us store anything. So we stored everything. Badly.
PostgreSQL forces you to think. Define your types. Set your constraints. Enforce your rules.
This felt limiting at first. Then we realized: constraints are documentation that computers enforce.
Lesson 5: Ask for help earlier
I spent 6 weeks struggling alone because I didn’t want to admit I was drowning.
Week 7, I finally asked for help. Senior engineer took one look and said: “Why are you loading everything into memory?”
One question. Saved me 3 weeks.
Pride is expensive. Humility is free.
The Aftermath
We finally completed the migration. December 3rd. 3 months and 2 days after starting.
The results:
✅ Query performance improved 40x (8 seconds → 200ms)
✅ Reporting went from “impossible” to “easy”
✅ Data consistency went from “lol” to “enforced by the database”
✅ Bugs related to data races dropped 90%
✅ Business team can now write their own SQL queries (they learned SQL just for this)

❌ It took 3 months instead of 1 month
❌ Cost us ~$80K in engineering time
❌ I lost 15 pounds from stress (not the fun way)
❌ My sleep schedule is still messed up
❌ My therapist knows too much about database schemas now
Was it worth it?
Yes. 100%. We should’ve done it 2 years ago.
Would I do it again?
Only if you double my salary and give me therapy vouchers upfront.
If You’re Considering This Migration
Here’s my honest advice:
✅ Do it if:
- You’re doing more reads than writes (90%+ reads)
- You need real transactions
- Your data has consistent structure (or should have)
- You need complex queries or reporting
- You have time and resources to do it right
❌ Don’t do it if:
- You’re truly doing rapid prototyping (early startup)
- Your data is genuinely unstructured (logs, events, real-time data)
- You don’t have time to do it properly
- MongoDB is working fine (don’t fix what ain’t broke)
- You think it’ll be quick (it won’t)
📋 If you decide to do it:
Budget 3x your estimate
- If you think 1 month, plan for 3
- If you think $50K, budget $150K
- Murphy’s law is real
Understand your data first
- Export samples
- Catalog variations
- Find your edge cases
- Budget 30% of time just for this
Build in stages
- Don’t Big Bang it
- Run in parallel
- Incremental rollout
- Always have a fallback
Test with production data
- Staging with old data = useless
- Fresh snapshot or nothing
- Test your edge cases
Plan for data loss
- You will lose some data
- Document what you lose
- Get business sign-off
- Move on
Get help
- Don’t do this alone
- Hire a consultant if needed
- Ask questions early
- Your pride isn’t worth 3 months of hell
The One Lesson That Matters
Here it is. The thing I learned after 3 months of suffering:
Your database is not just a storage layer. It’s a contract between your past self and your future self.
When I write schema-less data to MongoDB, I’m making a promise:
“I’ll remember what this structure means. I’ll keep it consistent. I’ll document the implicit rules.”
I broke that promise. Past me screwed over future me.
PostgreSQL forces you to write that contract upfront. It enforces it. And yeah, it’s more work now.
But future you will thank you. Trust me.
The Tools That Saved My Life
After this nightmare, I collected everything into practical resources so nobody else has to suffer like I did:
If you’re dealing with database migrations or performance issues, these are the resources that actually helped:
📊 SQL Performance Cheatsheet — The Query Mistakes That Kill Databases in Production
After debugging this and many similar failures, I condensed the patterns into a small reference I now keep open while building and debugging. The exact queries, indexes, and patterns that saved our migration.
👉
If this saves you even one production incident, it already paid for itself.
🔧 Backend Failure Playbook — How Real Systems Break and How to Fix Them
I’ve seen too many backend systems fail for the same reasons — and too many teams learn the hard way. I turned those incidents into a practical field manual: real failures, root causes, fixes, and prevention systems.
No theory. No fluff. Just production.
👉
⚠️ Production Failures Playbook — 30 Real Incidents with Timelines, Root Causes, and Fixes
Before we ever deploy to production, we run through a brutal checklist — and when things break, we follow a structured incident process. This playbook includes 30 real incidents that cost companies between $10K and $1M.
👉
Use them if you want to avoid learning these lessons the expensive way.
Want More Real Production Stories?
I write about what actually breaks in production — no fluff, no tutorials, just real engineering.
👉 Free here:
You’ll get:
- Real production failures (and fixes)
- Lessons learned the expensive way
- Tools and cheatsheets I actually use
- No BS, no sponsored content, no AI-generated fluff
Join 5,000+ engineers who read my weekly breakdowns.
Final Thoughts
If you’re reading this because you’re considering a MongoDB to PostgreSQL migration, I have one message:
It’s going to be harder than you think. Budget for that. Plan for that. Emotionally prepare for that.
But if you do it right, if you take your time, if you ask for help, if you don’t try to be a hero…
You’ll come out the other side with a better system. Better data. Better sleep.
And maybe, just maybe, you’ll avoid the mental breakdowns.
Or at least keep it to one.
Good luck. You’re gonna need it.
Got migration horror stories? Drop them in the comments. Misery loves company, and we’ve all shipped code we couldn’t understand 3 months later.