LLMs at Oxide
Oxide pays for your LLM use through a company account. If you don’t know what to use, use Claude Code in the terminal or in VS Code. We also have an OpenAI account if you want to try their Codex product.
Getting started with Claude Code
(Logistics elided.)
What to try first?
Run Claude Code in a repo (whether you know it well or not) and ask a question about how something works. You’ll see how it looks through the files to find the answer.
The next thing to try is a code change where you know exactly what you want but it’s tedious to type. Describe it in detail and let Claude figure it out. If there is similar code that it should follow, tell it so. From there, you can build intuition about more complex changes that it might be good at.
Usage tips
Use Sonnet 4.5 or Opus 4.5
Sonnet 4.5 is the default and it is very solid. Opus 4.5 is pretty new as of Dec 2025 and it is clearly even better. They cut the price of Opus by 2/3 with 4.5 ($5/M input, $25/M output), so it is no longer absurdly priced compared to Sonnet ($3/M input, $15/M output). They claim the higher price may actually net out because it uses fewer tokens, perhaps because it is less likely to waste time on wrong directions. In practice, I’m not sure whether this is true. I have been spending more because it feels like Opus can do more.
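As a back-of-the-envelope sanity check on that claim, here is a quick sketch using the per-million-token prices above (the session token counts are hypothetical, just to make the arithmetic concrete):

```python
# Prices quoted above, in dollars per million tokens.
SONNET = {"input": 3.0, "output": 15.0}
OPUS = {"input": 5.0, "output": 25.0}

def cost(prices, input_tokens, output_tokens):
    """Dollar cost of a session with the given token counts."""
    return (prices["input"] * input_tokens
            + prices["output"] * output_tokens) / 1_000_000

# A hypothetical session: 400k input tokens, 40k output tokens.
sonnet_cost = cost(SONNET, 400_000, 40_000)   # $1.80
opus_cost = cost(OPUS, 400_000, 40_000)       # $3.00

# Both Opus prices are exactly 5/3 of Sonnet's, so Opus breaks even
# only if it uses about 40% fewer tokens for the same task.
break_even = 1 - 3 / 5
print(f"Sonnet: ${sonnet_cost:.2f}, Opus: ${opus_cost:.2f}")
print(f"Opus must use {break_even:.0%} fewer tokens to break even")
```

In other words, "fewer tokens" has to mean a lot fewer before Opus comes out cheaper on a given task.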
Claude Code will sometimes automatically use the cheaper, faster Haiku 4.5 for subtasks like exploring a codebase. You can try setting it as the main model for the chat with /model, but the speed/intelligence tradeoff isn’t worth it. There is also Sonnet 4.5 with a 1M context window available, but long context weakens performance substantially, so you are almost certainly better off paring the context to make it fit in 200k.
Prompt with as much detail as possible
We’ve learned through decades of experience with search engines to be terse and careful about what we type into the search box. That habit is the opposite of what you want with LLMs, which are capable of pulling nuance out of everything you say. So instead of hunting for the shortest prompt that will do the thing, ramble about the problem, the tradeoffs, your hopes and fears, etc.
Track cost in real time
Spending too much is a good sign that Claude is spinning its wheels and you should think about how to prompt it better. By default, the TUI does not want to show you what you’re spending in real time — you have to run /cost manually to see it. Add this to ~/.claude/settings.json for a statusline at the bottom showing real-time session cost (ccusage):
"statusLine": {
"type": "command",
"command": "npx ccusage@latest statusline"
}
Run npx ccusage in the terminal to see daily/weekly/monthly usage tables.
Don’t argue, don’t compact. Just start over.
As conversation length grows, each message gets more expensive while Claude gets dumber. That’s a bad trade! Use /context and /cost (or the statusline trick above) to keep an eye on your context window. Claude Code natively reports a context percentage, but it’s somewhat misleading: it includes a large buffer of empty space reserved for compacting.
Run /clear (or just quit and restart) to start over from scratch. If you want to carry some of the context forward, ask Claude to summarize the conversation so far and paste the summary into the next chat.
Usage-based (API key) billing
We are trying usage-based billing rather than monthly subscription plans. It’s cheaper (for comparison, the Claude Team plan with Claude Code is $150 per seat) and easier to administer, since we don’t have to worry about who gets a seat. There’s also no $20 or $50 or $100 psychological hurdle to getting started. Another advantage of pay-as-you-go is that you will virtually never hit rate limits: Anthropic are happy to sell us as much usage as we want to pay for.
Note that this setup does not include a subscription to Claude chat on web/mobile. That would be $25 per seat. You can expense a personal subscription if you want that.
API keys for other tools
Create an API key at https://platform.claude.com/ in the API keys sidebar. Most tools accept it directly or via the ANTHROPIC_API_KEY env var.
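For example, to make the key available to any tool that reads the environment variable (the key value below is a placeholder, not a real key):

```shell
# Put this in your shell profile (~/.zshrc or ~/.bashrc).
# Replace the placeholder with the key you created above.
export ANTHROPIC_API_KEY="sk-ant-..."
```

After reloading your shell, tools that honor ANTHROPIC_API_KEY will pick it up automatically.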
Other agentic coding TUIs
OpenAI Codex
GPT-5 is cheaper than Sonnet and quite good, though people seem to generally prefer Claude Code.
(Logistics elided.)
Provider-agnostic tools
OpenCode works with multiple providers, including Cerebras, Groq, DeepSeek, and Moonshot (Kimi K2). It’s fun to try other models, but I find myself coming back to Claude Code every time.
Resources
- Effective context engineering for AI agents (Anthropic, Sept 2025)
- Link to an internal talk on LLMs from September
- Here’s how I use LLMs to help me write code (Simon Willison)
- Using AI Right Now: A Quick Guide (Ethan Mollick)
- How to Build an Agent (Thorsten Ball, Amp)
- How I think about LLM prompt engineering (François Chollet)
- Interpretability: Understanding how AI models think (Anthropic)
- The future of agentic coding with Claude Code (Anthropic)
- Interview with Claude Code team (Latent Space)
- Is RL + LLMs enough for AGI? — Sholto & Trenton (Dwarkesh, May 2025)
- How LLMs actually think — Sholto & Trenton (Dwarkesh, March 2024)