Day 1: “This will be easy, just a data migration.” Day 47: I woke up at 3 AM screaming about BSON arrays. My therapist now knows what denormalization means.
It Started With a Slack Message
Tuesday, 10:47 AM.
Our CTO drops this in #engineering:
“We’re migrating to PostgreSQL. MongoDB isn’t cutting it anymore. Should be straightforward.”
I looked at my coffee. It was still hot. In 30 seconds, my entire week just got destroyed.
“Straightforward,” he said. Like moving 300GB of nested, schema-less, inconsistently structured data from one database to another is “straightforward.”
I’ve been building systems for 8 years. I’ve survived production fires, deleted databases, and that one time we accidentally DDoS’d ourselves. But this migration? This broke me in ways I didn’t know were possible.
Here’s how a “simple data migration” turned into 3 months of hell, 2 genuine mental breakdowns, and one lesson I’ll never forget.
Why We Even Had to Migrate (Spoiler: We F*cked Up)
Let me rewind 2 years.
We were a hot new startup. MongoDB was cool. Everyone used MongoDB. “Schema-less!” they said. “Flexible!” they promised. “Just throw your data in and iterate fast!”
So we did. Boy, did we ever.
Our MongoDB setup looked like this:
// Users collection - seemed fine at first
{
  _id: ObjectId("..."),
  name: "John",
  email: "john@example.com",
  settings: {
    theme: "dark",
    notifications: {
      email: true,
      push: false,
      // Sometimes this field exists, sometimes it doesn't
      sms: true
    }
  },
  // Array of subdocuments - this is where things got messy
  orders: [
    {
      orderId: "ORDER-123",
      items: [
        { productId: "PROD-1", quantity: 2 },
        { productId: "PROD-2", quantity: 1 }
      ],
      // Sometimes we had this field
      shippingAddress: "123 Main St",
      // Sometimes we didn't
      // Sometimes it was an object instead
      // Sometimes it was null but the key existed
      // We had no idea anymore
    }
  ]
}
At first, this was great. Ship fast, iterate, move on.
Then we hit 100,000 users.
Then 500,000.
Then… problems.
What started breaking:

Queries that took 50ms now took 8 seconds
- Nested array lookups destroyed performance
- An index on orders.items.productId? Good luck with that

Data inconsistencies everywhere
- Some users had settings.notifications as an object
- Others had it as a boolean
- A few had it as a string (how?!)
- Nobody knew which version was “correct”

Reporting was impossible
- Business team: “Give us total revenue by product”
- Me: “Sure, just let me write a 200-line aggregation pipeline that might crash MongoDB” (see the sketch after this list)

Transactions didn’t exist (this was pre-MongoDB 4.0)
- We had orders that were paid but never shipped
- Inventory that decremented while the order failed
- Money disappeared into the void
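To show what that pipeline looks like in miniature: even a stripped-down “quantity sold by product” query means unwinding nested arrays across every user document. Here’s a rough Python sketch with pymongo, using the field names from the example document above; the connection string and database name are placeholders, and it only sums quantities because the example documents don’t carry prices. Joining a price source in with $lookup is exactly where these pipelines balloon toward 200 lines.

# Rough sketch (assumes pymongo and the nested "orders" structure shown earlier).
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection string
db = client["app"]                                 # hypothetical database name

pipeline = [
    {"$unwind": "$orders"},        # one document per order
    {"$unwind": "$orders.items"},  # one document per order line
    {"$group": {
        "_id": "$orders.items.productId",
        "total_quantity": {"$sum": "$orders.items.quantity"},
    }},
    {"$sort": {"total_quantity": -1}},
]

for row in db.users.aggregate(pipeline, allowDiskUse=True):
    print(row["_id"], row["total_quantity"])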
The final straw? A bug where users could duplicate their orders by clicking fast enough. No transactions, no locks, just chaos.
MongoDB wasn’t the problem. We were the problem. We used it wrong. And now we had to pay the price.
The Plan (That Immediately Fell Apart)
Our plan was simple:
Week 1: Design PostgreSQL schema
Week 2: Write migration scripts
Week 3: Test migration on staging
Week 4: Migrate production
Simple, right?
Here’s what actually happened:
Week 1: Realized our data was so inconsistent we couldn’t even define a schema
Week 2: Spent 5 days just understanding what data we had
Week 3–7: Wrote increasingly desperate data cleaning scripts
Week 8: First test migration took 40 hours and crashed
Week 9–12: Everything was on fire
Mental Breakdown 1: The Schema That Made Me Cry
Designing the PostgreSQL schema should have been easy. Just normalize the data, right?
Wrong.
See, MongoDB lets you be lazy. You don’t think about structure. You just throw data in there and figure it out later.
“Later” had arrived. And we had no idea what structure we’d been throwing.
The orders field alone was a nightmare:
- 67% of users had orders as an array of objects
- 23% had it as a nested object with string keys
- 8% had it as just an order ID string
- 2% had… XML? (HOW?!)
I spent 2 weeks just cataloging the variations. My Excel sheet had 47 tabs. Each tab was a different schema variation we found in production.
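If you ever have to do this audit yourself, don’t do it by hand in Excel. A sampling script gets you most of the way there. This is a rough sketch, assuming pymongo and a users collection shaped like ours; it just tallies the Python type of the orders field across a random sample, which is exactly how the array/object/string/XML split shows up.

# Rough sketch: tally the shape of the "orders" field across a sample of users.
# Assumes pymongo; database and collection names mirror the examples in this post.
from collections import Counter
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection string
users = client["app"]["users"]                     # hypothetical names

shapes = Counter()
for doc in users.aggregate([{"$sample": {"size": 50000}}], allowDiskUse=True):
    value = doc.get("orders", "<missing>")
    shapes[type(value).__name__] += 1

for shape, count in shapes.most_common():
    print(f"{shape}: {count}")
# For our data this split roughly: list 67%, dict 23%, str 10%
# (order-ID strings plus the XML-stored-as-a-string horrors)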
One night, around 11 PM, I was staring at my screen. I’d just found our 48th schema variation — someone had stored an entire HTML invoice as base64 inside the orders.notes field.
I closed my laptop. I sat there. I didn’t move for 20 minutes.
My wife found me. “Are you okay?”
“No,” I said. “I’m not.”
She hugged me. I cried a little. Not gonna lie.
The lesson hit me like a truck: Schema-less doesn’t mean no schema. It means you have infinite schemas. And one day, you’ll have to deal with all of them.
The Migration Script From Hell
After finally defining a schema, I had to write the migration script.
How hard could it be? Read from MongoDB, write to PostgreSQL. Basic ETL, right?
Oh, sweet summer child.
Problem #1: ObjectId to UUID
MongoDB uses ObjectId. PostgreSQL uses… whatever you want, but probably UUID or integers.
Easy conversion? Nope.
# Seemed simple
def convert_object_id(obj_id):
    return str(obj_id)

# Until we realized...
# ObjectIds are 12 bytes (24 hex chars)
# UUIDs are 16 bytes (36 chars with dashes)
# Our foreign keys were now broken
# And nobody could tell which records matched anymore
We ended up creating a mapping table. 300GB of mapping table. For 2 weeks, we ran two databases in parallel, syncing IDs. It was horrible.
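For what it’s worth, the core of a mapping table is small even if the data isn’t. Here’s a sketch of the idea with psycopg2: a deterministic uuid5 per ObjectId plus an id_map table you can query from both sides during the parallel-run period. The table name, namespace, and DSN are illustrative, not our exact production code.

# Sketch: deterministic ObjectId -> UUID mapping stored in an id_map table.
import uuid
import psycopg2

OID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "mongo-objectid")  # arbitrary fixed namespace

def object_id_to_uuid(obj_id) -> uuid.UUID:
    # Same ObjectId always maps to the same UUID, so foreign keys stay
    # consistent even if the migration is re-run.
    return uuid.uuid5(OID_NAMESPACE, str(obj_id))

conn = psycopg2.connect("dbname=app")  # hypothetical DSN
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS id_map (
            mongo_id  char(24) PRIMARY KEY,
            pg_id     uuid NOT NULL UNIQUE
        )
    """)
    # During migration, record every conversion so the parallel-run period
    # can translate IDs in both directions.
    cur.execute(
        "INSERT INTO id_map (mongo_id, pg_id) VALUES (%s, %s) ON CONFLICT DO NOTHING",
        ("507f1f77bcf86cd799439011", str(object_id_to_uuid("507f1f77bcf86cd799439011"))),
    )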
Problem #2: Denormalization
In MongoDB, we had:
{
  userId: "123",
  orders: [
    { orderId: "A", products: [{ id: "P1" }, { id: "P2" }] },
    { orderId: "B", products: [{ id: "P3" }] }
  ]
}
In PostgreSQL, this becomes:
-- 3 tables
users (user_id)
orders (order_id, user_id)
order_items (order_id, product_id)

-- And suddenly...
-- 1 MongoDB document = 50 PostgreSQL rows
Our 300GB MongoDB database exploded into 890GB in PostgreSQL. Nobody saw that coming.
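The flattening itself is conceptually simple, which is what makes the size explosion so sneaky. A minimal sketch, assuming the document shape above and the three tables from the snippet (the real script also had to thread through the ObjectId mapping and every type quirk described below):

# Sketch: flatten one MongoDB user document into rows for users/orders/order_items.
# Field names follow the example document above; error handling is omitted.
def flatten_user(doc):
    user_rows = [(doc["userId"],)]
    order_rows = []
    item_rows = []
    for order in doc.get("orders", []):
        order_rows.append((order["orderId"], doc["userId"]))
        for product in order.get("products", []):
            item_rows.append((order["orderId"], product["id"]))
    return user_rows, order_rows, item_rows

doc = {
    "userId": "123",
    "orders": [
        {"orderId": "A", "products": [{"id": "P1"}, {"id": "P2"}]},
        {"orderId": "B", "products": [{"id": "P3"}]},
    ],
}
print(flatten_user(doc))
# ([('123',)], [('A', '123'), ('B', '123')], [('A', 'P1'), ('A', 'P2'), ('B', 'P3')])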
Problem #3: Nested Arrays
MongoDB: “Store arrays in arrays in arrays! It’s fine!”
PostgreSQL: “lol no”
We had documents like this:
{
  teams: [
    {
      members: [
        {
          roles: ["admin", "editor", "viewer"],
          permissions: {
            canEdit: true,
            resources: ["doc1", "doc2", "doc3"]
          }
        }
      ]
    }
  ]
}
To flatten this into PostgreSQL tables, I created:
- teams
- team_members
- member_roles
- member_permissions
- permission_resources
One MongoDB document → 5 PostgreSQL tables → 200+ rows.
The script to handle this took 2 weeks to write. It was 1,800 lines of Python. It worked… mostly.
Problem #4: Type Inconsistencies
Remember how I said MongoDB is schema-less? This is where it hurts.
// In MongoDB, all of these exist simultaneously
{ age: 25 }             // number
{ age: "25" }           // string
{ age: "twenty-five" }  // oh god why
{ age: null }           // null
{ age: "" }             // empty string
{ age: undefined }      // JavaScript moment
{ }                     // field doesn't exist
My migration script had to handle all of these. For every field. Across 200+ fields.
I wrote more try-catch blocks than actual logic.
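What finally kept it manageable was one coercion helper per target column type, with bad values collected for review instead of crashing mid-run. A minimal sketch for the age mess above (the review-queue approach is how I’d frame it today, not a verbatim copy of our script):

# Sketch: coerce the "age" variants above into int-or-None, logging the rest.
# In the real script there was a helper like this for every target column type.
bad_values = []

def coerce_age(raw, doc_id):
    if raw is None or raw == "":
        return None                    # null / empty string / missing field
    if isinstance(raw, bool):          # bool is a subclass of int -- reject it explicitly
        bad_values.append((doc_id, raw))
        return None
    if isinstance(raw, (int, float)):
        return int(raw)                # number
    if isinstance(raw, str) and raw.strip().isdigit():
        return int(raw)                # "25"
    bad_values.append((doc_id, raw))   # "twenty-five" and friends go to a review queue
    return None

assert coerce_age(25, "a") == 25
assert coerce_age("25", "b") == 25
assert coerce_age("twenty-five", "c") is None and bad_values[-1] == ("c", "twenty-five")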
The Test That Took 40 Hours
First test migration on staging. Click “run.”
I go home. It should finish overnight.
Next morning: Still running.
That afternoon: Still running.
Next day: Still running.
40 hours later: Done.
“Great! Let’s check the data.”
SELECT COUNT(*) FROM users;
-- 487,234
-- Wait... we have 500,000 users in MongoDB

SELECT COUNT(*) FROM orders;
-- 89,123
-- We have 2.3 million orders in MongoDB
Where did 12,766 users go? Where did 2.2 million orders go?
3 days of debugging later:
- Script crashed on emoji in names (Unicode encoding issues)
- Script crashed on dates before 1970 (PostgreSQL doesn’t like that)
- Script crashed on documents over 16MB (BSON limit)
- Script silently skipped 40,000 documents due to a bug in error handling
I wanted to quit. I genuinely considered quitting.
Mental Breakdown #2: The Weekend That Never Ended
Week 9. We’re trying our 5th full migration test.
It’s Saturday morning. I’m supposed to be at a family thing. My phone buzzes.
“Migration crashed again. Server ran out of memory.”
I close my eyes. I breathe. I drive to the office.
The problem:
Our migration script loaded all 300GB into memory before writing to PostgreSQL. Why? Because I’m an idiot who forgot to stream the data.
I spent the entire weekend rewriting it. No sleep Saturday night. 4 hours Sunday. Back to the office Monday looking like a zombie.
My manager: “You look terrible.”
Me: “I feel terrible.”
Manager: “Take a day off.”
Me: “If I stop now, I’ll never come back.”
That week, I:
- Rewrote the entire migration pipeline to stream data
- Added progress tracking (why didn’t I do this from the start?)
- Implemented checkpointing so crashes could resume
- Learned more about PostgreSQL COPY than anyone should
The new script: 400 lines instead of 1,800. Faster. Better. Actually worked.
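For reference, the shape of the rewrite looked roughly like this: read in _id-ordered batches, bulk-load each batch with COPY, and checkpoint the last _id so a crash can resume instead of restarting. A sketch assuming pymongo and psycopg2; the batch size, staging table, and checkpoint file are placeholders, and the real version did the full flattening and coercion per batch.

# Sketch: stream MongoDB in batches, bulk-load with COPY, checkpoint by last _id.
import io
from bson import ObjectId
from pymongo import MongoClient
import psycopg2

BATCH = 5000
mongo = MongoClient("mongodb://localhost:27017")["app"]["users"]  # hypothetical names
pg = psycopg2.connect("dbname=app")                               # hypothetical DSN

def load_checkpoint():
    try:
        return ObjectId(open("checkpoint.txt").read().strip())
    except FileNotFoundError:
        return None

last_id = load_checkpoint()
while True:
    query = {"_id": {"$gt": last_id}} if last_id else {}
    batch = list(mongo.find(query).sort("_id", 1).limit(BATCH))
    if not batch:
        break

    buf = io.StringIO()
    for doc in batch:
        # Real code: flatten + coerce every field here.
        buf.write(f"{doc['_id']}\t{doc.get('email', '')}\n")
    buf.seek(0)

    with pg, pg.cursor() as cur:  # commit per batch
        cur.copy_expert("COPY users_staging (mongo_id, email) FROM STDIN", buf)

    last_id = batch[-1]["_id"]
    with open("checkpoint.txt", "w") as f:  # resume point if the process dies
        f.write(str(last_id))
    print(f"migrated through {last_id}")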
It took me 9 weeks to write what should’ve taken 1 week.
Why? Because I didn’t understand the problem. I thought it was a technical problem. It wasn’t.
It was a data problem. And data problems are always worse than code problems.
The Things That Saved Us
After 10 weeks of suffering, we finally got it working. Here’s what actually helped:
1. Staging Environment That Matched Production
We initially tested on a staging database that was 6 months old. Turns out, production had evolved. A lot.
When we synced staging to a fresh snapshot, we found 30 new data variations we’d missed.
Lesson: Test with real, current production data. Or you’re testing nothing.
2. Incremental Rollout
We didn’t flip a switch. We ran both databases in parallel for 4 weeks:
- Week 1: 5% of reads from PostgreSQL
- Week 2: 20% of reads
- Week 3: 50% of reads + 10% of writes
- Week 4: 100% reads, 50% writes
- Week 5: 100% everything, MongoDB on standby
Found bugs in production that we never saw in testing. This saved our ass.
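The routing doesn’t need a feature-flag platform, either. A deterministic hash on the user ID compared against a rollout percentage is enough to keep each user pinned to one database per step. A sketch of that idea (the constant and function names are mine, not our actual code):

# Sketch: deterministic percentage rollout for reads. A given user stays on
# one database for the whole rollout step instead of flip-flopping per request.
import zlib

READ_ROLLOUT_PCT = 20  # e.g. week 2: 20% of reads served from PostgreSQL

def reads_from_postgres(user_id: str, pct: int = READ_ROLLOUT_PCT) -> bool:
    return zlib.crc32(user_id.encode()) % 100 < pct

for uid in ("user-1", "user-42", "user-999"):
    backend = "postgres" if reads_from_postgres(uid) else "mongo"
    print(f"{uid} -> {backend}")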
3. Fallback Plan
We kept MongoDB running read-only for 2 months after full migration.
Found a bug? Fall back to MongoDB.
Caught 3 critical issues this way. Without fallback, we’d have been screwed.
4. Data Validation Scripts
After every migration, we ran validation:
-- Count everything
SELECT COUNT(*) FROM users;
SELECT COUNT(*) FROM orders;
SELECT COUNT(*) FROM order_items;

-- Compare totals
-- Check random samples
-- Verify foreign keys
Found mismatches every single time. Fixed them before going live.
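Automating the comparison is worth the hour it takes. A sketch assuming pymongo and psycopg2, with the collection and table names from this migration (note the MongoDB side needs an aggregation just to count orders, because they’re buried in arrays):

# Sketch: compare MongoDB counts against PostgreSQL counts after each run.
from pymongo import MongoClient
import psycopg2

mongo = MongoClient("mongodb://localhost:27017")["app"]  # hypothetical names
pg = psycopg2.connect("dbname=app")                      # hypothetical DSN

with pg, pg.cursor() as cur:
    # users: document count vs row count
    mongo_users = mongo["users"].estimated_document_count()
    cur.execute("SELECT COUNT(*) FROM users")
    pg_users = cur.fetchone()[0]
    print(f"users: mongo={mongo_users} pg={pg_users} diff={mongo_users - pg_users}")

    # orders: nested arrays mean MongoDB needs an aggregation just to count them
    result = list(mongo["users"].aggregate([
        {"$project": {"n": {"$cond": [{"$isArray": ["$orders"]}, {"$size": "$orders"}, 0]}}},
        {"$group": {"_id": None, "total": {"$sum": "$n"}}},
    ]))
    mongo_orders = result[0]["total"] if result else 0
    cur.execute("SELECT COUNT(*) FROM orders")
    pg_orders = cur.fetchone()[0]
    print(f"orders: mongo={mongo_orders} pg={pg_orders} diff={mongo_orders - pg_orders}")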
5. Acceptance That We’d Lose Some Data
Hard truth: We lost data.
- 3,284 documents were so corrupted we couldn’t migrate them
- 147 users had invalid data we couldn’t parse
- Some legacy stuff from 2019 that nobody understood
We documented everything. Notified affected users. Moved on.
Perfect is the enemy of done. We shipped at 99.4% data integrity. Good enough.
What We Learned (The Hard Way)
Lesson 1: Schema-less is a lie
MongoDB isn’t schema-less. You still have a schema. It’s just implicit, inconsistent, and exists only in your code.
One day, you’ll need an actual schema. And you’ll pay for every shortcut you took.
Lesson 2: Migrations are never “just data”
They’re about:
- Business logic that assumes the old structure
- Code that depends on the old behavior
- Users who expect things to work the same way
- All your shortcuts coming back to haunt you
Data is the easy part. Everything else is hell.
Lesson 3: Optimize for reading your data, not writing it
MongoDB made writes easy. Reads became impossible.
PostgreSQL makes reads easy. Writes take more planning.
Which do you do more: write data or read data?
For us, it was 95% reads. We optimized for the 5%. Stupid.
Lesson 4: Your database should enforce rules
MongoDB let us store anything. So we stored everything. Badly.
PostgreSQL forces you to think. Define your types. Set your constraints. Enforce your rules.
This felt limiting at first. Then we realized: constraints are documentation that computers enforce.
Lesson 5: Ask for help earlier
I spent 6 weeks struggling alone because I didn’t want to admit I was drowning.
Week 7, I finally asked for help. Senior engineer took one look and said: “Why are you loading everything into memory?”
One question. Saved me 3 weeks.
Pride is expensive. Humility is free.
The Aftermath
We finally completed the migration. December 3rd. 3 months and 2 days after starting.
The results:
✅ Query performance improved 40x (8 seconds → 200ms)
✅ Reporting went from “impossible” to “easy”
✅ Data consistency went from “lol” to “enforced by the database”
✅ Bugs related to data races dropped 90%
✅ Business team can now write their own SQL queries (they learned SQL just for this)

❌ It took 3 months instead of 1 month
❌ Cost us ~$80K in engineering time
❌ I lost 15 pounds from stress (not the fun way)
❌ My sleep schedule is still messed up
❌ My therapist knows too much about database schemas now
Was it worth it?
Yes. 100%. We should’ve done it 2 years ago.
Would I do it again?
Only if you double my salary and give me therapy vouchers upfront.
If You’re Considering This Migration
Here’s my honest advice:
✅ Do it if:
- You’re doing more reads than writes (90%+ reads)
- You need real transactions
- Your data has consistent structure (or should have)
- You need complex queries or reporting
- You have time and resources to do it right
❌ Don’t do it if:
- You’re truly doing rapid prototyping (early startup)
- Your data is genuinely unstructured (logs, events, real-time data)
- You don’t have time to do it properly
- MongoDB is working fine (don’t fix what ain’t broke)
- You think it’ll be quick (it won’t)
📋 If you decide to do it:
Budget 3x your estimate
- If you think 1 month, plan for 3
- If you think $50K, budget $150K
- Murphy’s law is real
Understand your data first
- Export samples
- Catalog variations
- Find your edge cases
- Budget 30% of time just for this
Build in stages
- Don’t Big Bang it
- Run in parallel
- Incremental rollout
- Always have a fallback
Test with production data
- Staging with old data = useless
- Fresh snapshot or nothing
- Test your edge cases
Plan for data loss
- You will lose some data
- Document what you lose
- Get business sign-off
- Move on
Get help
- Don’t do this alone
- Hire a consultant if needed
- Ask questions early
- Your pride isn’t worth 3 months of hell
The One Lesson That Matters
Here it is. The thing I learned after 3 months of suffering:
Your database is not just a storage layer. It’s a contract between your past self and your future self.
When I write schema-less data to MongoDB, I’m making a promise:
“I’ll remember what this structure means. I’ll keep it consistent. I’ll document the implicit rules.”
I broke that promise. Past me screwed over future me.
PostgreSQL forces you to write that contract upfront. It enforces it. And yeah, it’s more work now.
But future you will thank you. Trust me.
The Tools That Saved My Life
After this nightmare, I collected everything into practical resources so nobody else has to suffer like I did:
If you’re dealing with database migrations or performance issues, these are the resources that actually helped:
📊 SQL Performance Cheatsheet — The Query Mistakes That Kill Databases in Production
After debugging this and many similar failures, I condensed the patterns into a small reference I now keep open while building and debugging. The exact queries, indexes, and patterns that saved our migration.
👉
If this saves you even one production incident, it already paid for itself.
🔧 Backend Failure Playbook — How Real Systems Break and How to Fix Them
I’ve seen too many backend systems fail for the same reasons — and too many teams learn the hard way. I turned those incidents into a practical field manual: real failures, root causes, fixes, and prevention systems.
No theory. No fluff. Just production.
👉
⚠️ Production Failures Playbook — 30 Real Incidents with Timelines, Root Causes, and Fixes
Before we ever deploy to production, we run through a brutal checklist — and when things break, we follow a structured incident process. This playbook includes 30 real incidents that cost companies between $10K and $1M.
👉
Use them if you want to avoid learning these lessons the expensive way.
Want More Real Production Stories?
I write about what actually breaks in production — no fluff, no tutorials, just real engineering.
👉 Free here:
You’ll get:
- Real production failures (and fixes)
- Lessons learned the expensive way
- Tools and cheatsheets I actually use
- No BS, no sponsored content, no AI-generated fluff
Join 5,000+ engineers who read my weekly breakdowns.
Final Thoughts
If you’re reading this because you’re considering a MongoDB to PostgreSQL migration, I have one message:
It’s going to be harder than you think. Budget for that. Plan for that. Emotionally prepare for that.
But if you do it right, if you take your time, if you ask for help, if you don’t try to be a hero…
You’ll come out the other side with a better system. Better data. Better sleep.
And maybe, just maybe, you’ll avoid the mental breakdowns.
Or at least keep it to one.
Good luck. You’re gonna need it.
Got migration horror stories? Drop them in the comments. Misery loves company, and we’ve all shipped code we couldn’t understand 3 months later.