LLM use varies widely, and the ramifications of those uses vary accordingly; it’s worth taking apart several of the (many) uses for LLMs.
LLMs as readers
LLMs are superlative at reading comprehension, able to process and meaningfully comprehend documents effectively instantly. This can be extraordinarily powerful for summarizing documents — or for answering more specific questions of a large document like a datasheet or specification. (Ironically, LLMs are especially good at evaluating documents to assess the degree to which an LLM assisted their creation!)
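As a concrete (if minimal) sketch of the question-asking case, the snippet below feeds a document to a model and asks it something pointed rather than requesting a general summary. It assumes the `anthropic` Python SDK, an API key in the environment, and a hypothetical "datasheet.txt"; the model name is likewise illustrative.

```python
# A minimal sketch of question-answering over a document. Assumptions: the
# `anthropic` Python SDK is installed, ANTHROPIC_API_KEY is set in the
# environment, and "datasheet.txt" is a hypothetical text export of the
# datasheet in question; the model name is illustrative.
import pathlib

import anthropic

client = anthropic.Anthropic()
datasheet = pathlib.Path("datasheet.txt").read_text()

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Here is a datasheet:\n\n"
            f"{datasheet}\n\n"
            "What is the absolute maximum supply voltage, and in which "
            "section is it specified?"
        ),
    }],
)

print(message.content[0].text)
```

Asking a specific question of the document tends to produce more useful (and more checkable) answers than asking for a summary alone.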
While use of LLMs to assist comprehension has little downside, it does come with an important caveat: when uploading a document to a hosted LLM (ChatGPT, Claude, Gemini, etc.), there must be assurance of data privacy — and specifically, assurance that the model will not use the document to train future iterations of itself. Note that this may be opt-out (that is, by default, a model may reserve the right to train on uploaded documents), but can generally be controlled via preferences — albeit occasionally via euphemism. (OpenAI shamelessly calls this checked-by-default setting "Improve the model for everyone", making anyone who doesn’t wish the model to train on their data feel as if they suffer from a kind of reactionary avarice.)
A final cautionary note: using LLMs to assist comprehension should not substitute for actually reading a document where such reading is socially expected. More concretely: while LLMs can be a useful tool to assist in evaluating candidate materials per [rfd3], their use should be restricted to that of a tool, not a substitute for human eyes (and brain!).
LLMs as editors
LLMs can be excellent editors. Engaging an LLM late in the creative process (that is, with a document already written and broadly polished) allows it to provide helpful feedback on structure, phrasing, etc. — all without danger of losing one’s own voice. A cautionary note here: LLMs are infamous pleasers — and you may find that the breathless praise from an LLM is in fact more sycophancy than analysis. This becomes more perilous the earlier one uses an LLM in the writing process: the less polish a document already has, the more likely it is that an LLM will steer it to something wholly different — at once praising your groundbreaking genius while offering to rewrite it for you.
LLMs as writers
While LLMs are adept at reading and can be terrific at editing, their writing is much more mixed. At best, writing from LLMs is hackneyed and cliché-ridden; at worst, it brims with tells that reveal that the prose is in fact automatically generated.
What’s so bad about this? First, to those who can recognize an LLM’s tells (an expanding demographic!), it’s just embarrassing — it’s as if the writer is walking around with their intellectual fly open. But there are deeper problems: LLM-generated writing undermines the authenticity of not just one’s writing but of the thinking behind it as well. If the prose is automatically generated, might the ideas be too? The reader can’t be sure — and increasingly, the hallmarks of LLM generation cause readers to tune out (or worse).
Finally, LLM-generated prose undermines a social contract of sorts: absent LLMs, it is presumed that of the reader and the writer, it is the writer that has undertaken the greater intellectual exertion. (That is, it is more work to write than to read!) For the reader, this is important: should they struggle with an idea, they can reasonably assume that the writer themselves understands it — and it is the least a reader can do to labor to make sense of it.
If, however, prose is LLM-generated, this social contract is ripped up: a reader cannot assume that the writer understands their ideas, because the writer may not have so much as read the product of the LLM they tasked with writing it. If one is lucky, the resulting errors are LLM hallucinations: obviously wrong and quickly discarded. If one is unlucky, however, the result will be a kind of LLM-induced cognitive dissonance: a puzzle in which the pieces don’t fit because there is in fact no puzzle at all. This can leave a reader frustrated: why should they spend more time reading prose than the writer spent writing it?
This can be navigated, of course, but it is truly perilous: our writing is an important vessel for building trust — and that trust can be quickly eroded if we are not speaking with our own voice. For us at Oxide, there is a more mechanical reason to be jaundiced about using LLMs to write: because our hiring process very much selects for writers, we know that everyone at Oxide can write — and we have the luxury of demanding of ourselves the kind of writing that we know we are all capable of.
So our guideline is to generally not use LLMs to write, but this shouldn’t be thought of as an absolute — and it doesn’t mean that an LLM can’t be used as part of the writing process. Just please: consider your responsibility to yourself, to your own ideas — and to the reader.
LLMs as code reviewers
As with reading comprehension and editing, LLMs can make for good code reviewers. But they can also make nonsense suggestions, or miss larger issues entirely. LLMs should be used for review (and can be very helpful when targeted to look for a particular kind of issue), but that review should not be accepted as a substitute for a human one.
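To make the "targeted" use concrete, here is a sketch of pointing an LLM at one specific class of issue in a diff rather than soliciting general feedback. As with the earlier sketch, it assumes the `anthropic` Python SDK and an API key in the environment; the model name and the particular `git diff` invocation are illustrative.

```python
# A minimal sketch of a targeted LLM code review. Assumptions: the `anthropic`
# Python SDK is installed, ANTHROPIC_API_KEY is set, and the diff of interest
# is whatever `git diff main...HEAD` produces; the model name is illustrative.
import subprocess

import anthropic

# Gather the diff under review.
diff = subprocess.run(
    ["git", "diff", "main...HEAD"],
    capture_output=True, text=True, check=True,
).stdout

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": (
            "Review the following diff, looking only for integer overflow, "
            "off-by-one errors, and unhandled error paths. Do not comment "
            "on style, and say so explicitly if you find nothing.\n\n"
            f"{diff}"
        ),
    }],
)

print(message.content[0].text)
```

Narrowing the request this way tends to reduce (though certainly not eliminate) the nonsense suggestions noted above — and it makes the review easier for the responsible engineer to evaluate.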
LLMs as debuggers
LLMs can be surprisingly helpful in debugging problems, though perhaps only because our expectations for them are so low. While LLMs shouldn’t be relied upon (clearly?) to debug a problem, they can serve as a kind of animatronic rubber duck, helping to inspire the next questions to ask. (And they can be surprising: LLMs have been known to debug I2C issues from the screenshot of a scope capture!) When debugging a vexing problem, one has little to lose by using an LLM — but perhaps also little to gain.
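If one does want to try the scope-capture trick mentioned above, a sketch along these lines gets an image in front of the model. The same assumptions apply as in the earlier sketches, with "capture.png" standing in for a hypothetical screenshot of the capture.

```python
# A minimal sketch of rubber-duck debugging with an image attached.
# Assumptions: the `anthropic` Python SDK is installed, ANTHROPIC_API_KEY is
# set, and "capture.png" is a hypothetical scope-capture screenshot; the
# model name is illustrative.
import base64
import pathlib

import anthropic

client = anthropic.Anthropic()

# Base64-encode the screenshot for inclusion as an image content block.
capture = base64.standard_b64encode(
    pathlib.Path("capture.png").read_bytes()
).decode("ascii")

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": capture,
                },
            },
            {
                "type": "text",
                "text": (
                    "This is a scope capture of an I2C transaction that is "
                    "being NAK'd. What should I look at next?"
                ),
            },
        ],
    }],
)

print(message.content[0].text)
```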
LLMs as programmers
LLMs are amazingly good at writing code — so much so that there is borderline mass hysteria about LLMs entirely eliminating software engineering as a craft. As with using an LLM to write prose, there is obvious peril here! Unlike prose, however (which really should be handed to an LLM in an already-polished form to maximize the LLM’s efficacy), code is something LLMs can be quite effective at writing de novo. This is especially valuable for code that is experimental, auxiliary, or otherwise throwaway. The closer code is to the system that we ship, the greater the care that needs to be taken when using LLMs. Even with something that seems natural for LLM contribution (e.g., writing tests), one should still be careful: it’s easy for LLMs to spiral into nonsense on even simple tasks. Still, they can be extraordinarily useful across an entire spectrum of software-writing tasks; they shouldn’t be dismissed out of hand.
Wherever LLM-generated code is used, it becomes the responsibility of the engineer. As part of taking that responsibility, self-review becomes essential: LLM-generated code should not be reviewed by others if the responsible engineer has not themselves reviewed it. Moreover, once code is in the loop of peer review, generation should more or less be taken out of that loop: if code review comments are addressed by wholesale regeneration, iterative review becomes impossible.
In short, where LLMs are used to generate code, responsibility, rigor, empathy and teamwork must remain top of mind.