Part 3: The Quick Win — Measuring the Baseline and Introducing Redis Cache
In Part 1, we built the infrastructure. In Part 2, we seeded 5,000 properties into an intentionally naive database schema. Today, we face the consequences — and we fix them. Or at least, we appear to.
If you're jumping in here, start with Part 1 and Part 2 — they set up the entire stack and data layer this post builds on. If you're continuing from Part 2, you already have 5,000 properties in PostgreSQL, zero indexes beyond primary keys, and a database schema that's about to reveal exactly why caching exists.

Today we build the API layer, measure how slow it is, and introduce Redis. By the end of this post, you'll see a 95% speed improvement — and understand exactly why that improvement comes with a hidden cost.

Part A: The Foundation — Building the API Layer

Before we can measure anything, we need endpoints that actually serve data. This is the DRF (Django REST Framework) layer — the thing we test, the thing we cache, and the thing that exposes every inefficiency we built into Part 2's database.

Step 1: The Serializers

Serializers turn Django models into JSON. We're using nested serializers here — a common pattern in REST APIs, and also the primary trigger for the N+1 query problem we're about to demonstrate.

Create housing/serializers.py:
```python
"""
housing/serializers.py

Nested serializers that expose the full object graph:

    Property → Agent → Office
    Property → Location

This is intentionally naive. The nesting triggers N+1 queries because
Django fetches each related object separately instead of in a single JOIN.
"""
from rest_framework import serializers

from .models import Office, Agent, Location, Property


class OfficeSerializer(serializers.ModelSerializer):
    class Meta:
        model = Office
        fields = ['id', 'name', 'city', 'phone']


class AgentSerializer(serializers.ModelSerializer):
    office = OfficeSerializer(read_only=True)

    class Meta:
        model = Agent
        fields = ['id', 'name', 'email', 'phone', 'office']


class LocationSerializer(serializers.ModelSerializer):
    class Meta:
        model = Location
        fields = ['id', 'city', 'state', 'zip_code']


class PropertySerializer(serializers.ModelSerializer):
    location = LocationSerializer(read_only=True)
    agent = AgentSerializer(read_only=True)

    class Meta:
        model = Property
        fields = [
            'id', 'title', 'description', 'property_type', 'price',
            'bedrooms', 'bathrooms', 'location', 'agent', 'status',
            'view_count', 'created_at',
        ]
```
The nesting is the key detail here. When the serializer renders a Property, it also renders the Agent, which in turn renders the Office. Each level of nesting is a separate database query — unless we do something about it. We won't. Not yet. That's Part 4's job.
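You can watch this lazy loading happen one object at a time. Here is a minimal sketch, run inside the Django shell (`docker compose exec backend python manage.py shell`; the service name follows this series' docker-compose setup). Each attribute access below fires its own SELECT:

```python
# Each related-object access triggers a separate query. The ORM does
# not JOIN unless you ask it to (select_related, Part 4's topic).
from housing.models import Property

prop = Property.objects.first()   # query 1: the property row
agent = prop.agent                # query 2: SELECT ... FROM housing_agent
office = agent.office             # query 3: SELECT ... FROM housing_office
location = prop.location          # query 4: SELECT ... FROM housing_location
print(office.name, location.city)
```

Multiply queries 2-4 by the 20 properties on a page and you get the 61 queries we count in Part C.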
Step 2: The Views — Three Endpoints, Three Stories

We're creating three different views of the same data. This isn't just for testing — it's the scientific method applied to caching. We need a control group (naive), an experimental group (cached), and a hint at what comes next (optimized).

Create housing/views.py with three views of the same data:

```python
"""
housing/views.py
"""
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page
from rest_framework import generics

from .models import Property
from .serializers import PropertySerializer


class PropertyListView(generics.ListAPIView):
    """
    The naive baseline. No caching. No query optimization.
    This is the "before" picture.
    """
    queryset = Property.objects.all().order_by('-created_at')
    serializer_class = PropertySerializer


class CachedPropertyListView(PropertyListView):
    """
    The cached version. Same queryset as PropertyListView, but with
    @cache_page(60) applied. This caches the entire HTTP response
    (headers + JSON body) in Redis for 60 seconds.

    First request: cache miss, hits the database, saves to Redis.
    Subsequent requests: cache hit, served from Redis, zero DB queries.
    """
    @method_decorator(cache_page(60))
    def dispatch(self, *args, **kwargs):
        return super().dispatch(*args, **kwargs)


class OptimizedPropertyListView(generics.ListAPIView):
    """
    The database-optimized version. No cache, but uses select_related
    to fetch Property + Agent + Office in a single query with JOINs
    instead of 61 separate queries.

    This is a preview of Part 4. We're including it here so you can
    compare "fast cache" vs "fast database" side by side.
    """
    queryset = Property.objects.select_related(
        'agent__office', 'location'
    ).all().order_by('-created_at')
    serializer_class = PropertySerializer
```
The @method_decorator(cache_page(60)) line is the entire cache implementation. One decorator. 60 seconds. That's the "quick win" — and the reason this post exists.
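If you'd rather not subclass and override dispatch, the same decorator can be applied where the URL is wired up instead. A sketch of the equivalent URLconf approach (same views, alternative wiring; not the route this series takes):

```python
# housing/urls.py -- alternative: apply the cache decorator in the URLconf
from django.urls import path
from django.views.decorators.cache import cache_page

from .views import PropertyListView

urlpatterns = [
    # Identical effect to CachedPropertyListView: the rendered response
    # for this route is stored in the cache for 60 seconds.
    path('properties/cached/', cache_page(60)(PropertyListView.as_view()),
         name='property-cached'),
]
```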
Step 3: The URLs

Create housing/urls.py:
```python
"""
housing/urls.py

Three routes for three views. We keep them under different paths so we
can test them side-by-side without redeploying code or toggling settings.
"""
from django.urls import path

from .views import PropertyListView, CachedPropertyListView, OptimizedPropertyListView

urlpatterns = [
    path('properties/live/naive/', PropertyListView.as_view(), name='property-naive'),
    path('properties/cached/', CachedPropertyListView.as_view(), name='property-cached'),
    path('properties/live/optimized/', OptimizedPropertyListView.as_view(), name='property-optimized'),
]
```
Update core/urls.py to include the housing app's routes:

```python
"""
core/urls.py
"""
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('housing.urls')),  # ← Add this line
]
```
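To sanity-check the wiring before moving on, you can resolve the route names from the Django shell. A tiny sketch (assuming no URL namespace, which matches the include above):

```python
# docker compose exec backend python manage.py shell
from django.urls import reverse

print(reverse('property-naive'))      # /api/properties/live/naive/
print(reverse('property-cached'))     # /api/properties/cached/
print(reverse('property-optimized'))  # /api/properties/live/optimized/
```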
Step 4: Enable SQL Logging

We need to see every query Django fires. Add this to core/settings.py:

```python
# Add this anywhere in settings.py, typically near the bottom
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        },
    },
}
```

This prints every SQL query to the Docker logs. You'll see the N+1 problem in real time.

Step 5: Restart and Verify
```bash
docker compose restart backend

# Test that the endpoint exists
curl http://localhost:8000/api/properties/live/naive/ | jq '.results[0].title'
```

If you see a property title, the API is alive.
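One assumption worth double-checking before Part E: @cache_page stores responses in whatever backend CACHES points at, so Django must be configured to use the Redis container from Part 1. If it isn't, Django silently falls back to local-memory caching. A sketch of what that settings block typically looks like (Django 4+ built-in Redis backend; the `redis` hostname and database number are assumptions based on this series' docker-compose setup):

```python
# core/settings.py -- point Django's cache at the Redis container.
# Exact hostname/port/db must match your docker-compose from Part 1.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://redis:6379/1',
    },
}
```

If your keys later show up in an unexpected Redis database in redis-cli, this LOCATION line is the place to look.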
Part B: The Instrumentation — Define What We Measure

An engineer without metrics is just a person with an opinion. Before we optimize anything, we agree on what "fast" and "slow" mean in this context.

What We Measure

- Response latency (ms) — How long the HTTP request takes, end to end. Measured from the moment curl sends the request to the moment it receives the full response.
- Query count — How many SQL queries Django fires to assemble the JSON response. A well-optimized endpoint should use 1-3 queries. Our naive endpoint uses 61.
- Query time (ms) — The total time PostgreSQL spends executing those queries. This is separate from serialization time, network time, and Python overhead.
- Cache hits/misses — Did Redis serve this response, or did we go to the database? A cache hit means zero database queries. A cache miss means we pay the full cost.

What Endpoint We Measure

We're testing GET /api/properties/live/naive/ as the baseline and GET /api/properties/cached/ as the optimized version. Both return 20 results per page (DRF's default pagination). Both use the exact same serializer. The only difference is the @cache_page decorator.

What Tools We Use

curl with --write-out — Terminal-based response timing. One command, one number. No install needed.

```bash
curl -o /dev/null -s -w "Total time: %{time_total}s\n" http://localhost:8000/api/properties/live/naive/
```

Django SQL logging — We enabled this in Step 4. Watch the Docker logs during a request:

```bash
docker compose logs -f backend | grep "SELECT"
```

You'll see every query scroll past.

redis-cli monitor — Real-time stream of every command Redis receives. Open this in a separate terminal and leave it running:

```bash
docker compose exec redis redis-cli monitor
```

When you hit the cached endpoint, you'll see GET and SET commands appear.

EXPLAIN ANALYZE — PostgreSQL's query planner. Shows exactly what the database does with a query — sequential scan vs index scan, estimated cost, actual time:

```bash
docker compose exec db psql -U user -d housing_db
```

```sql
EXPLAIN ANALYZE
SELECT * FROM housing_property
ORDER BY created_at DESC
LIMIT 20;
```

Look for Seq Scan on housing_property. That's a full table scan. No index. PostgreSQL reads every single row to find the 20 most recent ones.

Locust — Load testing and visualization. This is the tool that turns numbers into graphs. We'll install it shortly.

Part C: The Baseline — Measure the Slow Path

This is the "before" picture. We hit the naive endpoint, we watch it work, and we document every inefficiency.

Test 1: The Single Request

```bash
curl -o /dev/null -s -w "Total time: %{time_total}s\n" http://localhost:8000/api/properties/live/naive/
```

Expected output:

```
Total time: 0.068s
```

Your number will vary depending on your machine, but it should be somewhere between 0.060s and 0.080s. That's 60-80 milliseconds for 20 rows of JSON.

Test 2: Count the Queries

Watch the Docker logs during the request:

```bash
docker compose logs -f backend
```

Hit the endpoint again. Scroll through the logs. Count the SELECT statements. You should see:

```
SELECT ... FROM housing_property ORDER BY created_at DESC LIMIT 20
SELECT ... FROM housing_agent WHERE id = 1
SELECT ... FROM housing_office WHERE id = 1
SELECT ... FROM housing_location WHERE id = 1
SELECT ... FROM housing_agent WHERE id = 2
SELECT ... FROM housing_office WHERE id = 2
SELECT ... FROM housing_location WHERE id = 2
... (repeat 20 times)
```

Total: 1 query for properties + 20 queries for agents + 20 queries for offices + 20 queries for locations = 61 queries to render 20 properties.

This is the N+1 problem in its purest form. The serializer asks for each property's agent. Django fetches each agent separately. The serializer asks for each agent's office. Django fetches each office separately. It's not a bug — it's the default behavior when you use nested serializers without query optimization.
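If you'd rather not count SELECTs by hand, Django can count them for you. A minimal sketch using Django's test client and CaptureQueriesContext; run it in `python manage.py shell` inside the backend container:

```python
# Count the queries behind a single request to the naive endpoint.
from django.db import connection
from django.test import Client
from django.test.utils import CaptureQueriesContext

client = Client()
with CaptureQueriesContext(connection) as ctx:
    # SERVER_NAME keeps a strict ALLOWED_HOSTS setting happy.
    response = client.get('/api/properties/live/naive/', SERVER_NAME='localhost')

print(response.status_code)        # 200
print(len(ctx.captured_queries))   # ~61: 1 list + 20x3 related
                                   # (pagination may add one COUNT query)
```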
Test 3: The Query Plan

Open a PostgreSQL shell:

```bash
docker compose exec db psql -U user -d housing_db
```

Run the main query with EXPLAIN ANALYZE:

```sql
EXPLAIN ANALYZE
SELECT * FROM housing_property
ORDER BY created_at DESC
LIMIT 20;
```

You'll see output like this:

```
Limit (cost=XXX..XXX rows=20 width=XXX) (actual time=X.XXX..X.XXX rows=20 loops=1)
  -> Sort (cost=XXX..XXX rows=5000 width=XXX) (actual time=X.XXX..X.XXX rows=20 loops=1)
       Sort Key: created_at DESC
       -> Seq Scan on housing_property (cost=0.00..XXX.XX rows=5000 width=XXX) (actual time=X.XXX..X.XXX rows=5000 loops=1)
```

The key line is Seq Scan on housing_property. That's a sequential scan — PostgreSQL is reading every single row from disk into memory, sorting them, and then taking the first 20. With 5,000 rows, this is tolerable. With 50,000 rows, it's slow. With 500,000 rows, it's a disaster.

Exit the PostgreSQL shell (\q).

Test 4: The Bombardment — Simulating Load

A single request tells you latency. Multiple concurrent requests tell you scalability. Create a simple bash script to simulate 50 users hitting the endpoint at the same time.

Create bombardment_test.sh in your project root:

```bash
#!/bin/bash
# Fires 50 requests in parallel and records each one's response time
for i in {1..50}; do
  curl -o /dev/null -s -w "Request $i: %{time_total}s\n" http://localhost:8000/api/properties/live/naive/ &
done
wait
```
Make it executable and run it:

```bash
chmod +x bombardment_test.sh
./bombardment_test.sh
```

You'll see output like this:

```
Request 2: 0.522941s
Request 6: 0.555113s
Request 5: 0.559119s
Request 1: 0.561309s
...
Request 45: 1.261981s
Request 50: 1.261836s
...
Request 32: 1.467066s
Request 25: 1.469146s
```

Notice the pattern: the first few requests complete in ~0.5 seconds. The middle batch climbs to ~1.0 seconds. The final batch hits ~1.4 seconds.

This is the multiplication effect. Each request has to wait for the previous ones to finish. PostgreSQL's connection pool is finite. When 50 requests arrive simultaneously, the 50th request waits in a queue while the first 49 execute.

The Baseline Results

| Metric | Value (No Cache) |
| --- | --- |
| Single request | 60-80ms |
| Query count | 61 queries |
| Query time (estimated) | ~40ms |
| Under load (50 concurrent users) | 500ms - 1500ms |
| Failure rate | 0% (slow, but functional) |

This is the number we beat.
Part D: The Load Testing Tool — Installing Locust

The bash script gives us numbers. Locust gives us graphs. And graphs tell stories that tables can't.

Step 1: Install Locust

```bash
pip install locust
pip freeze > requirements.txt
```

Step 2: Create the Locust Test File

Create locustfile.py in your project root:

```python
"""
locustfile.py

Locust test configuration for the housing portal API.
Simulates real users hitting both the naive and cached endpoints.

Run with:
    locust -f locustfile.py --host=http://localhost:8000

Then open http://localhost:8089 in your browser.
"""
from locust import HttpUser, task, between


class NaiveUser(HttpUser):
    """
    Simulates a user hitting the unoptimized endpoint.
    This is the baseline — no cache, no query optimization.
    """
    wait_time = between(1, 2)  # Wait 1-2 seconds between requests

    @task
    def get_properties(self):
        self.client.get("/api/properties/live/naive/", name="Naive (No Cache)")


class CachedUser(HttpUser):
    """
    Simulates a user hitting the cached endpoint.
    First request is a cache miss. Subsequent requests are cache hits.
    """
    wait_time = between(1, 2)

    @task
    def get_properties(self):
        self.client.get("/api/properties/cached/", name="Cached (Redis)")
```
Step 3: Run Locust

```bash
locust -f locustfile.py --host=http://localhost:8000
```

Open http://localhost:8089 in your browser. You'll see the Locust web UI.

Step 4: Configure the Test

In the Locust UI:

- Number of users: 50
- Spawn rate: 10 users per second
- Host: http://localhost:8000 (already set via the --host flag)

Click Start swarming.

Step 5: Understanding the Locust Interface

Locust shows you three tabs:

- Statistics — A table showing median, average, min, max, and percentile response times. The columns that matter: median, 95th, and 99th percentile.
- Charts — Live graphs of requests per second, response times, and user count over the duration of the test.
- Failures — Any HTTP errors (500, 404, timeouts). Should be empty for this test.

Step 6: What to Capture

Run the test twice — once for the naive endpoint, once for the cached endpoint. For each run, capture screenshots of:

- The Statistics tab after the test stabilizes (after all 50 users have spawned and made at least 5-10 requests each). This gives you the median, 95th, and 99th percentile numbers.
- The Charts tab showing the response time graph over the full duration of the test. You want to see the curve — flat for cached, climbing for naive.

Step 7: Stop the Test

Click Stop in the Locust UI. The test stops immediately, and the statistics remain on screen so you can review them.
Part E: The Cache — Introducing Redis

Now we flip the switch. Same data. Same serializer. One decorator. Everything changes.

How Django's @cache_page Works

The cache_page decorator does one thing: it saves the entire HTTP response — headers, status code, JSON body, everything — as a single string in Redis, keyed by the request URL.

- First request (cache miss): Django finds no entry for /api/properties/cached/ in Redis, so the view runs in full — all 61 queries — and the rendered response is written to Redis before being returned.
- Second request (cache hit): Django finds the stored response for /api/properties/cached/ and returns it directly. The view never runs. The database is never touched.
- After 60 seconds: The cache expires. The next request is a cache miss again. The cycle repeats.
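Under the hood this is just Django's low-level cache API. A rough sketch of the shape of what the decorator does, with a key scheme of our own choosing (simplified; the real implementation in django.views.decorators.cache also varies the key on headers and handles more edge cases):

```python
# Simplified model of @cache_page(60) -- not Django's actual source.
from django.core.cache import cache

def cache_page_demo(view_func, timeout=60):
    def wrapper(request):
        # Our own simplified key; Django's real key also hashes headers.
        key = f"demo.cache_page.{request.method}.{request.get_full_path()}"
        response = cache.get(key)              # Redis GET
        if response is None:                   # miss: run the view (61 queries)
            response = view_func(request)
            cache.set(key, response, timeout)  # Redis SETEX, 60s TTL
        return response                        # hit: zero DB queries
    return wrapper
```

This get-or-set shape is what produces the GET/SETEX pairs you'll see in redis-cli monitor below.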
Test It Manually

Hit the cached endpoint once to prime the cache:

```bash
curl -o /dev/null -s -w "Total time: %{time_total}s\n" http://localhost:8000/api/properties/cached/
```

First request (cache miss):

```
Total time: 0.065s
```

Hit it again immediately:

```bash
curl -o /dev/null -s -w "Total time: %{time_total}s\n" http://localhost:8000/api/properties/cached/
```

Second request (cache hit):

```
Total time: 0.004s
```

That's 4 milliseconds. The database wasn't touched. Redis served a 50KB JSON string from RAM in 4ms.
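That speed is also the hidden cost the intro promised: for up to 60 seconds, Redis happily serves a response the database has already moved past. A quick way to see it while the cache is still warm (Django shell again, then re-run the curl above):

```python
# With the cache warm, change the underlying data...
from housing.models import Property

prop = Property.objects.order_by('-created_at').first()
prop.title = 'PRICE DROP: ' + prop.title
prop.save()
# ...then hit /api/properties/cached/ again: the response still shows
# the old title until the 60-second TTL expires. The naive endpoint
# reflects the change immediately.
```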
Inspect the Cache Key

```bash
docker compose exec redis redis-cli keys "*"
```

You'll see something like:

```
1) ":1:views.decorators.cache.cache_page.GET./api/properties/cached/.d41d8cd98f00b204e9800998ecf8427e"
```

That's Django's auto-generated cache key. The components:

- :1: — the cache key version (Django's VERSION setting, which defaults to 1)
- views.decorators.cache.cache_page — the decorator that created this key
- GET./api/properties/cached/ — the HTTP method and path
- .d41d8cd98f00b204e9800998ecf8427e — a hash of the query parameters (empty in this case, but if the URL had ?page=2, the hash would be different)

Monitor Redis in Real Time

Open a second terminal and run:

```bash
docker compose exec redis redis-cli monitor
```

Leave this running. In your first terminal, hit the cached endpoint:

```bash
curl -o /dev/null -s http://localhost:8000/api/properties/cached/
```

In the monitor terminal, you'll see:

```
"GET" ":1:views.decorators.cache.cache_page.GET./api/properties/cached/..."
```

Hit it again. You'll see the same GET command. No SET — because the cache already has it.

Wait 60 seconds. Hit it again. You'll see:

```
"GET" ":1:views.decorators.cache.cache_page..."
"SETEX" ":1:views.decorators.cache.cache_page..." "60" "..."
```

The GET returned nothing (cache expired), so Django queried the database and wrote a new value with SETEX (set with expiry).

This is the cache in action. Every command is visible. This is your debugging tool when cache behavior gets weird.

Part F: The Comparison — Before vs After

Same test. Same endpoint pattern. Different results.

The Warm Cache Test

The bash bombardment script from earlier tests the "cold start" problem — what happens when the cache is empty and 50 users hit it simultaneously. Now we test the opposite: what happens when the cache is already warm?

Prime the cache:

```bash
curl -o /dev/null -s http://localhost:8000/api/properties/cached/
```

Now run the bombardment test against the cached endpoint:

```bash
#!/bin/bash
for i in {1..50}; do
  curl -o /dev/null -s -w "Request $i: %{time_total}s\n" http://localhost:8000/api/properties/cached/ &
done
wait
```

Expected output:

```
Request 1: 0.004s
Request 2: 0.003s
Request 3: 0.005s
Request 4: 0.004s
...
Request 48: 0.007s
Request 49: 0.006s
Request 50: 0.005s
```

All requests complete in under 10ms. No degradation. No queuing. Redis doesn't care how many requests hit it simultaneously — it's single-threaded and fast enough that even the 50th request feels instant.

The Locust Comparison: Mixed Users

Update locustfile.py to test both endpoints side by side:
```python
from locust import HttpUser, task, between


class CachedUser(HttpUser):
    """
    Most of your traffic should be cached. We weight this 5x higher.
    """
    weight = 5  # 5x more likely to spawn than NaiveUser
    wait_time = between(1, 2)

    @task
    def get_cached(self):
        self.client.get("/api/properties/cached/", name="Cached (Redis)")


class NaiveUser(HttpUser):
    """
    A small amount of traffic hits the naive endpoint for comparison.
    """
    weight = 1
    wait_time = between(2, 5)

    @task
    def get_naive(self):
        self.client.get("/api/properties/live/naive/", name="Naive (No Cache)")
```

Run Locust again:

```bash
locust -f locustfile.py --host=http://localhost:8000
```

Start the test with 100 users total — the weighting means ~83 users will hit the cached endpoint and ~17 the naive one. Prime the cache first by hitting the cached endpoint once manually before starting the Locust test:

```bash
curl -o /dev/null -s http://localhost:8000/api/properties/cached/
```

Let Locust run for 2-3 minutes. Capture the results.

📊 [Locust Screenshot: Side-by-Side Statistics]

📊 [Locust Screenshot: Side-by-Side Response Time Chart]