Performance issues in backend systems rarely come from one obvious bug.
In our production system, a high-traffic API showed:
- 📉 P95 latency ~ 820ms
- 📈 Frequent database CPU spikes
- ⛔ Increased timeout rates during traffic bursts
After a structured optimization process, we achieved:
- ⚡ P95 latency: 820ms → 310ms
- 🧠 Database CPU usage: down ~40%
- 📊 Timeout rate: down ~70%
Here’s the exact breakdown of what we did.
1️⃣ Step One: Measure Before Optimizing
Before touching any code, we collected:
- P50 / P95 / P99 latency
- Slow query logs
- DB execution plans
- CPU & memory metrics
- Event loop lag (see the sampling sketch below)
We added a lightweight response-time logger:
```js
// Logs method, URL, and duration for every completed response.
app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on("finish", () => {
    const duration = Number(process.hrtime.bigint() - start) / 1_000_000;
    console.log(`${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
  });
  next();
});
```
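For the event loop lag item, here is a minimal sampler built on Node's built-in perf_hooks; the 20ms resolution and 10-second reporting window are assumed values, not from our production config:

```js
import { monitorEventLoopDelay } from "perf_hooks";

// Samples event loop delay; the histogram reports nanoseconds.
const loopDelay = monitorEventLoopDelay({ resolution: 20 });
loopDelay.enable();

setInterval(() => {
  console.log(
    `event loop lag: mean=${(loopDelay.mean / 1e6).toFixed(1)}ms, ` +
      `max=${(loopDelay.max / 1e6).toFixed(1)}ms`
  );
  loopDelay.reset(); // start a fresh window for the next interval
}, 10_000);
```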
🔎 Observation
- ~68% of latency was database time
- ~20% came from repeated identical queries
- The remainder was serialization and network overhead
Conclusion: 🧩 Primary bottleneck = Database layer.
2️⃣ Query Optimization (Biggest Impact 🔥)
❌ Problem 1: Missing Composite Index
Original query:
```sql
SELECT *
FROM orders
WHERE user_id = $1
  AND status = 'completed'
ORDER BY created_at DESC
LIMIT 20;
```
Execution plan showed:
- Sequential scan
- High disk I/O
- Large row filtering
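To get findings like these, the plan can be pulled straight from Node. A sketch assuming a node-postgres Pool named `pool` and a hard-coded example user id:

```js
// Sketch: fetch the execution plan for the hot query via node-postgres.
// `pool` and the literal user id (42) are assumptions for illustration.
const { rows } = await pool.query(`
  EXPLAIN (ANALYZE, BUFFERS)
  SELECT *
  FROM orders
  WHERE user_id = 42 AND status = 'completed'
  ORDER BY created_at DESC
  LIMIT 20
`);
rows.forEach((row) => console.log(row["QUERY PLAN"]));
```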
✅ Fix: Composite Index
```sql
CREATE INDEX idx_orders_user_status_created
  ON orders (user_id, status, created_at DESC);
```
📊 Result
- Query time: 180ms → 35ms
- Removed full table scan
- Reduced disk reads significantly
Indexing alone reduced endpoint latency by ~25%.
❌ Problem 2: Over-Fetching Data
Original:
```sql
SELECT * FROM users WHERE id = $1;
```
But API only needed:
- name
- profile_picture
✅ Optimized Query
```sql
SELECT name, profile_picture
FROM users
WHERE id = $1;
```
🎯 Impact
- Reduced payload size
- Lower memory allocation
- Faster JSON serialization
Small change. Measurable gain.
❌ Problem 3: N+1 Query Pattern
Original logic:
```js
// One query for the orders, then one more per order: the N+1 pattern.
const orders = await getOrders(userId);
for (const order of orders) {
  order.items = await getItems(order.id);
}
```
With 20 orders, that's 21 queries (1 for the orders plus 20 for their items).
✅ Optimized Join Query
```sql
SELECT o.id, o.created_at, i.product_id, i.quantity
FROM orders o
LEFT JOIN order_items i ON i.order_id = o.id
WHERE o.user_id = $1;
```
📈 Result
- Drastically reduced DB round trips
- Improved P95 latency stability
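The flat rows then have to be regrouped in application code. A minimal sketch, assuming the join above lives in a `joinQuery` string and `pool` is a node-postgres pool:

```js
const { rows } = await pool.query(joinQuery, [userId]);

// Group flat join rows back into orders, each with an items array.
const ordersById = new Map();
for (const row of rows) {
  if (!ordersById.has(row.id)) {
    ordersById.set(row.id, { id: row.id, created_at: row.created_at, items: [] });
  }
  // LEFT JOIN: an order with no items yields NULL item columns.
  if (row.product_id !== null) {
    ordersById.get(row.id).items.push({
      product_id: row.product_id,
      quantity: row.quantity,
    });
  }
}
const orders = [...ordersById.values()];
```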
3️⃣ Introducing Multi-Layer Caching 🧠
Important principle:
⚠️ Do NOT cache blindly. Cache only:
- Read-heavy endpoints
- Expensive aggregations
- Low-volatility data
🟢 Layer 1: In-Memory Cache (Short TTL)
Used for ultra-hot endpoints.
```js
import NodeCache from "node-cache";

// 30-second TTL keeps hot data fresh without manual invalidation.
const cache = new NodeCache({ stdTTL: 30 });

async function getUserProfile(userId) {
  const key = `user:${userId}`;
  const cached = cache.get(key);
  if (cached) return cached;

  const data = await fetchFromDB(userId);
  cache.set(key, data);
  return data;
}
```
Use case:
- Frequently accessed profile endpoints
- Dashboard metadata
Latency improvement: ~10–15%
🟡 Layer 2: Redis Distributed Cache
Used for:
- Aggregated stats
- Leaderboards
- Expensive computations
```js
async function getDashboardStats(userId) {
  const key = `dashboard:${userId}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const data = await computeStats(userId);
  await redis.set(key, JSON.stringify(data), "EX", 120); // 120s TTL
  return data;
}
```
📊 Impact
- Significant DB load reduction
- Improved P95 & P99 latency
- Reduced spike sensitivity
4️⃣ Cache Invalidation Strategy ⚠️
Caching without invalidation creates stale data problems.
We used:
- Event-based invalidation after writes
- Short TTL for volatile data
- Versioned keys when necessary
Example:
```js
await redis.del(`dashboard:${userId}`);
```
Triggered immediately after order creation.
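For the versioned-keys approach, a minimal sketch (key names are illustrative; `redis` is the same client as above). Bumping the version orphans old keys, which simply age out via their TTL:

```js
// Sketch: versioned cache keys. Readers always resolve the current
// version; invalidation is a single INCR, no key scanning needed.
async function dashboardKey(userId) {
  const version = (await redis.get(`dashboard:version:${userId}`)) || "0";
  return `dashboard:${userId}:v${version}`;
}

async function invalidateDashboard(userId) {
  await redis.incr(`dashboard:version:${userId}`);
}
```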
We avoided:
- Long-lived static caches
- Global flushes
- Blind TTL-only approaches
5️⃣ Connection Pool Optimization 🔌
We observed:
- DB pool exhaustion during spikes
- Increased wait time for connections
Original pool size: 10
Optimized to: 25 (after validating DB capacity)
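As a sketch, the equivalent node-postgres config; the two timeout values are assumptions, not from our production setup:

```js
import { Pool } from "pg";

const pool = new Pool({
  max: 25,                        // raised from 10 after validating DB headroom
  idleTimeoutMillis: 30_000,      // assumed: recycle idle connections
  connectionTimeoutMillis: 2_000, // assumed: fail fast instead of queuing forever
});
```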
📈 Result
- Reduced queuing delay
- Stabilized latency under burst traffic
6️⃣ JSON Serialization Optimization ⚡
Large nested objects increased serialization cost.
Instead of returning deeply populated objects, we:
- Reduced unnecessary fields
- Flattened response structure
- Avoided over-population
Serialization overhead dropped ~8–10%.
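In practice that meant mapping entities to flat DTOs at the edge. A sketch with assumed field names and an assumed `getOrdersWithItems` helper:

```js
// Sketch: return a flat DTO with only the fields the client needs,
// instead of serializing the full nested entity.
function toOrderSummary(order) {
  return {
    id: order.id,
    createdAt: order.created_at,
    itemCount: order.items.length,
  };
}

app.get("/orders", async (req, res) => {
  const orders = await getOrdersWithItems(req.user.id); // hypothetical helper
  res.json(orders.map(toOrderSummary));
});
```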
7️⃣ Final Optimized Flow 🏗️
```
Client
  ↓
Load Balancer
  ↓
Node.js (Clustered)
  ↓
In-Memory Cache
  ↓
Redis
  ↓
Optimized Indexed Queries
  ↓
Database
```
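The "Node.js (Clustered)" tier refers to running one worker per CPU core. A minimal sketch with Node's built-in cluster module; the port and the `app` reference are assumptions:

```js
import cluster from "cluster";
import os from "os";

if (cluster.isPrimary) {
  // One worker per core; workers share the listening socket.
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on("exit", () => cluster.fork()); // replace crashed workers
} else {
  // Each worker runs the Express app from the earlier snippets.
  app.listen(3000); // port is an assumed value
}
```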
Thanks for reading! :)