I wanted to track how much my Claude API usage was actually costing me. Not the billing page estimate - the real cost. Per request. Per task. Per tool call.
So I built Langley: an intercepting proxy that captures every Claude API request, extracts token usage, calculates costs, and shows it all in real time. The whole thing came together in one coding session.
The Problem
Claude’s billing shows monthly totals. That’s fine for paying the bill, but useless for:
- Debugging - "Why did this task cost $5?"
- Optimization - "Which tool is eating my context?"
- Accountability - "What’s this project actually costing?"
I needed request-level visibility. Something that sits between my code and Claude, captures everything, and gives me analytics.
The Architecture
Langley is a TLS-intercepting proxy. Traffic flows through it transparently:
Your App -> HTTPS -> Langley -> HTTPS -> Claude API
                       |
                       v
                   SQLite DB
                       |
                       v
                   Dashboard
It generates certificates on the fly, captures request/response pairs, parses Claude’s SSE streams, extracts token counts, and calculates costs using a pricing table.
The dashboard shows:
- Real-time flow list (WebSocket updates)
- Token counts and costs per request
- Analytics by task, by tool, by day
- Anomaly detection (large contexts, slow responses, retries)
What Made It Work
1. Security From the Start
Before writing code, we did a security analysis. Matt (our auditor persona) found 10 issues. The fixes included:
- Credential redaction on write (never store API keys)
- Upstream TLS validation (reject self-signed upstream certificates)
- CA key permissions (0600, not world-readable)
- Random certificate serials (not predictable)
- LRU cert cache (prevent memory exhaustion)
These weren’t afterthoughts - they shaped the design.
2. Phased Implementation
We broke the work into phases:
| Phase | Deliverable |
|---|---|
| 0 | Basic HTTP proxy that forwards requests |
| 1 | TLS interception, SQLite persistence |
| 2 | REST API, WebSocket server, basic UI |
| 3 | Token extraction, cost calculation, analytics |
| 4 | Full dashboard with filtering and charts |
| 5 | Polish, documentation, blog |
Each phase built on the last. Each had a clear deliverable.
3. Right-Sized Technology
- Go - Single binary, easy deployment, great TLS libraries
- SQLite - No server needed, WAL mode for concurrent reads
- React - Just works, Vite for fast builds
- WebSocket - Real-time without polling
No Kubernetes. No Postgres. No microservices. Just the minimum to solve the problem.
The Tricky Parts
SSE Parsing
Claude’s streaming API uses Server-Sent Events. Token counts are scattered across the stream: input tokens arrive in the message_start event, and the output count arrives in message_delta events. The parser has to accumulate them as the stream goes by:
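case "message_start":
    // Extract input tokens from the initial message.
    // Assumes msg is a map[string]any decoded from JSON, so each level
    // needs a type assertion and numbers arrive as float64.
    if usage, ok := msg["usage"].(map[string]any); ok {
        if n, ok := usage["input_tokens"].(float64); ok {
            flow.InputTokens = int(n)
        }
    }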