Today I experienced data loss to my SQLite database. It’s a frustrating story because of what a big noob mistake I made.
Why I Chose Containers
It all started when I decided to use Podman and containers to run my Django application. I had heard many great things about SQLite and figured “hey, I could make it work in the container setup.” I like to think I’m pretty sharp and have some modest DevOps skills.
This has been a long journey and experience for me, experimenting with running my own containers on my own architecture. My previous architecture leveraged NixOS to stand up and run the host machines. However, I was finding that maintaining the deployment pipelines to my production machine was difficult. I loved the concept of NixOS, but in practice it was way too much for a one-man shop to run.
My Current Setup
So that brings me to my current setup, where I just have a pretty bare-bones Debian machine. I installed Podman, and the machine was ready to start running containers. To orchestrate the host, I set up some pretty simple shell scripts for scaffolding and deployment. This architecture has been running three different web applications successfully.
At its core, it just consists of systemd services that run Podman containers. This setup gives me a pretty simple and tight deployment cycle: rolling out a new version of a service is just a matter of deploying a new image tag.
My systemd service file looks like the following:
# /etc/systemd/system/my-app.service
[Unit]
Description=my-app (single service)
After=network-online.target
Wants=network-online.target
[Service]
# Variables like ${PORT} and ${IMAGE_TAG} come from the env file the deploy script installs
EnvironmentFile=/etc/app/my-app/my-app.env
# Lifecycle
ExecStartPre=-/usr/bin/podman rm -f my-app
ExecStart=/usr/bin/podman run \
    --name my-app \
    -p ${PORT}:80 \
    -v /opt/my-app-data/my-app.sqlite3:/opt/db.sqlite3:Z \
    ${IMAGE_TAG}
ExecStop=/usr/bin/podman stop -t 10 my-app
ExecStopPost=/usr/bin/podman rm -f my-app
[Install]
WantedBy=multi-user.target
Variables are injected into the systemd templates using environment variables. The deployment script has a structure like this:
#!/usr/bin/env bash
set -euo pipefail
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
APPLICATION_NAME="my-app"
ENV_DIR="/etc/app/${APPLICATION_NAME}"
ENV_FILE="$ENV_DIR/$APPLICATION_NAME.env"
SERVICE="$APPLICATION_NAME.service"
UNIT_SRC="${ROOT_DIR}/systemd/${SERVICE}"
UNIT_DST="/etc/systemd/system/${SERVICE}"
# Make sure the application files exist
sudo mkdir -p "$ENV_DIR"
sudo install -m 0640 "$ROOT_DIR/env/${APPLICATION_NAME}.env" "$ENV_FILE"
sudo install -m 0644 -D "$UNIT_SRC" "$UNIT_DST"
# Restart service (brief downtime)
sudo systemctl daemon-reload
sudo systemctl enable --now "$SERVICE" || true
sudo systemctl reload-or-restart "$SERVICE"
echo "Deployed and (re)started ${SERVICE}"
Configuring SQLite for Django
So now this brings us back to the Django application. It's a fairly complicated web application that needs to persist data, and I decided SQLite was the easiest way to start and test it. So I did what any smart man would do and made sure to mount the SQLite file as a volume. There you go, persistence solved. In an effort to boost performance, I also enabled WAL mode, since that's how everyone is doing it these days.
This is how I configured my database in my Django application:
SQLITE_PRAGMAS = [
    "PRAGMA journal_mode=WAL",
    "PRAGMA busy_timeout=5000",
    "PRAGMA synchronous=NORMAL",
    "PRAGMA cache_size=1000000000",
    "PRAGMA foreign_keys=true",
    "PRAGMA temp_store=memory",
]
SQLITE_PRAGMA_STRING = "; ".join(SQLITE_PRAGMAS)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'data' / 'db.sqlite3',
        "OPTIONS": {
            "timeout": 5,
            "transaction_mode": "IMMEDIATE",
            "init_command": SQLITE_PRAGMA_STRING,
        },
    }
}
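One thing worth doing after wiring this up is confirming the journal mode actually stuck. Since WAL mode is persistent (it's recorded in the database file itself), a quick check with the sqlite3 CLI works, assuming you have it installed and point it at your own database file:
# WAL mode is stored in the database header, so any connection will report it
sqlite3 data/db.sqlite3 "PRAGMA journal_mode;"   # prints: wal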
This ended up working out pretty well and we’ve scaled to a couple hundred users a month with this current setup.
The Day Everything Went Wrong
So I went to make a deployment this morning. I ran my usual deployment job, which shipped UI changes to my main dashboard. This dashboard shows metrics about the website's current funnel statistics. That's when I realized the numbers had changed from what I'd seen right before the deploy. I did some digging, and then it hit me: I had lost data.
Finding the Root Cause
I looked through the logs and couldn't find any crashes, so I figured this must have been a configuration issue with persisting data from the container to the host machine. I did one more check and realized the big problem I was facing.
I was only persisting the main database file to the host machine. Here's what I didn't understand about WAL mode:
When you enable WAL (Write-Ahead Logging) mode in SQLite, it doesn’t just use one file anymore—it creates three:
- myapp.db - your main database file
- myapp.db-wal - the write-ahead log (where new changes are written first)
- myapp.db-shm - the shared memory file (coordinates between processes)
Think of it like writing in a notebook. With WAL mode, SQLite writes changes to the WAL file first (like a scratch pad), and only occasionally transfers them to the main database file (the permanent notebook). This happens when the WAL file gets big enough or when you explicitly checkpoint it.
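You can also force that transfer yourself. Here's a rough sketch of an explicit checkpoint using the sqlite3 CLI (assuming it's installed); the catch is that it has to run wherever the -wal file actually lives, which in my broken setup was inside the container:
# Flush everything in the WAL into the main database file and reset the WAL.
# Returns three numbers: busy flag, frames in the WAL, frames checkpointed.
sqlite3 /path/to/db.sqlite3 "PRAGMA wal_checkpoint(TRUNCATE);"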
So here’s what was happening to me:
- User makes changes to the database
- SQLite writes those changes to the .db-wal file (inside the container)
- The .db-wal file sits there, waiting to reach the threshold to flush
- I deploy a new container, destroying the old one
- The .db-wal file gets wiped out because I never mounted it as a volume
- The main .db file never got those changes
I had mounted the .db file thinking “great, database solved!” But the actual new data was living in the .db-wal file that was getting destroyed with every deployment. Classic case of solving most of the problem and thinking you’re done.
So the new systemd service looks like the following:
# /etc/systemd/system/my-app.service
[Unit]
Description=my-app (single service)
After=network-online.target
Wants=network-online.target
[Service]
EnvironmentFile=/etc/app/my-app/my-app.env
# Lifecycle
ExecStartPre=-/usr/bin/podman rm -f my-app
# Mount the whole data directory instead of the single database file, so the
# -wal and -shm files live on the host too
ExecStart=/usr/bin/podman run \
    --name my-app \
    -p ${PORT}:80 \
    -v /opt/my-app-data:/opt/my-db:Z \
    ${IMAGE_TAG}
ExecStop=/usr/bin/podman stop -t 10 my-app
ExecStopPost=/usr/bin/podman rm -f my-app
[Install]
WantedBy=multi-user.target
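After redeploying with the directory mount, a quick sanity check on the host shows all three files sitting next to each other while the app is running (exact file names depend on what your Django NAME setting points at; mine are shown as an example):
# The -wal and -shm files now live on the host and survive container teardown
ls -la /opt/my-app-data/
# my-app.sqlite3  my-app.sqlite3-shm  my-app.sqlite3-wal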
The Lesson
It was a tough lesson losing real production data with actual users. But at least now I can share and help anyone else who may be trying to configure a similar setup. So if you're running SQLite with WAL mode in containers, remember: it's not just one file. Mount the directory that holds all three (the .db, .db-wal, and .db-shm files), or you'll be learning this lesson the hard way like I did. Trust me, it's way easier to fix one volume mount than to explain to users why their data disappeared.