Building a Claude Traffic Proxy in One Session

I wanted to track how much my Claude API usage was actually costing me. Not the billing page estimate - the real cost. Per request. Per task. Per tool call.

So I built Langley: an intercepting proxy that captures every Claude API request, extracts token usage, calculates costs, and shows it all in real-time. In one coding session.

The Problem

Claude’s billing shows monthly totals. Helpful, but useless for:

  • Debugging - "Why did this task cost $5?"
  • Optimization - "Which tool is eating my context?"
  • Accountability - "What’s this project actually costing?"

I needed request-level visibility. Something that sits between my code and Claude, captures everything, and gives me analytics.

The Architecture

Langley is a TLS-intercepting proxy. Traffic flows through it transparently:

Your App -> HTTPS -> Langley -> HTTPS -> Claude API
                        |
                        v
                    SQLite DB
                        |
                        v
                    Dashboard

It generates certificates on-the-fly, captures request/response pairs, parses Claude’s SSE streams, extracts token counts, and calculates costs using a pricing table.
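To make the last step concrete, here is a minimal sketch of a cost lookup against a pricing table. The `Price` type, the `pricing` map, the model name, and the per-million-token rates are all placeholders invented for this example; they are not Langley's actual table or real prices.

```go
package main

import "fmt"

// Price holds per-million-token rates in USD. Hypothetical type for this sketch.
type Price struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// pricing maps a model name to its rates. The entry is a placeholder, not a real price.
var pricing = map[string]Price{
	"example-model": {InputPerMTok: 3.00, OutputPerMTok: 15.00},
}

// Cost returns the USD cost of one request, or false if the model is not in the table.
func Cost(model string, inputTokens, outputTokens int) (float64, bool) {
	p, ok := pricing[model]
	if !ok {
		return 0, false
	}
	cost := float64(inputTokens)/1e6*p.InputPerMTok +
		float64(outputTokens)/1e6*p.OutputPerMTok
	return cost, true
}

func main() {
	cost, ok := Cost("example-model", 12000, 800)
	fmt.Printf("%.4f %v\n", cost, ok)
}
```

Returning a boolean for unknown models lets the caller flag a request as "unpriced" rather than silently recording a cost of zero.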

The dashboard shows:

  • Real-time flow list (WebSocket updates)
  • Token counts and costs per request
  • Analytics by task, by tool, by day
  • Anomaly detection (large contexts, slow responses, retries)

What Made It Work

1. Security From the Start

Before writing code, we did a security analysis. Matt (our auditor persona) found 10 issues to address, including:

  • Credential redaction on write (never store API keys)
  • Upstream TLS validation (no self-signed upstream)
  • CA key permissions (0600, not world-readable)
  • Random certificate serials (not predictable)
  • LRU cert cache (prevent memory exhaustion)

These weren’t afterthoughts - they shaped the design.

2. Phased Implementation

We broke the work into phases:

Phase  Deliverable
0      Basic HTTP proxy that forwards requests
1      TLS interception, SQLite persistence
2      REST API, WebSocket server, basic UI
3      Token extraction, cost calculation, analytics
4      Full dashboard with filtering and charts
5      Polish, documentation, blog

Each phase built on the last. Each had a clear deliverable.

3. Right-Sized Technology

  • Go - Single binary, easy deployment, great TLS libraries
  • SQLite - No server needed, WAL mode for concurrent reads
  • React - Just works, Vite for fast builds
  • WebSocket - Real-time without polling

No Kubernetes. No Postgres. No microservices. Just the minimum to solve the problem.

The Tricky Parts

SSE Parsing

Claude’s streaming API uses Server-Sent Events. Token counts are scattered across the stream: input tokens arrive in the message_start event, and output tokens arrive in message_delta events. The parser has to collect both as the stream goes by:

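case "message_start":
	// Input tokens arrive once, in the initial message's usage object.
	// Assumes msg is the decoded "message" object as a map[string]any
	// and that flow.InputTokens is an int.
	if usage, ok := msg["usage"].(map[string]any); ok {
		if n, ok := usage["input_tokens"].(float64); ok {
			flow.InputTokens = int(n) // encoding/json decodes numbers as float64
		}
	}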
