A pair of MIT researchers have detailed a proposed new model for software that would help both humans and AI code generators alike create better and more transparent applications. No more vibing!
The approach is detailed in a paper authored by MIT’s Eagon Meng and Daniel Jackson, titled “What You See Is What It Does: A Structural Pattern for Legible Software”. They flag up the problem of “illegible” modern software, which lacks “direct correspondence between code and observed behavior”.
Modern software is also often “insufficiently modular”, they continue, “leading to a failure of three key requirements of robust coding”: incrementality, integrity, and transparency.
These are not purely human flaws. In fact, they argue, the growing use of LLMs has “exposed deep flaws in the practice of software development”, and a reevaluation may be needed to capitalize on the benefits of LLMs and mitigate their failings.
When LLMs are used to add code to an existing repo, “it can be hard to control which modules are modified, and to ensure that existing functionality is not broken.” Moreover, “Programmers complain that LLM coding assistants recommend patches that often break previously generated functionality.” And “whole app” builders are often unable to extend functionality beyond “certain (undefined) limits.”
This inability of LLMs to work incrementally while preserving integrity could ultimately limit their role, the researchers suggest.
That would be a crying shame, of course, for the tech giants who have poured billions into building LLMs and promised enterprises that the tech will let them rationalize their dev teams.
But it’s also a concern for the development teams that are increasingly reliant on LLMs. GitHub’s latest figures show that LLM use is not just ubiquitous, but has been instrumental in making TypeScript the number one language on the platform, in part because it plays nicer with agent-assisted coding.
Meng and Jackson’s answer is to break systems into “concepts,” separate modules of “user-facing units of functionality that have a well defined purpose and thus deliver some recognizable value.”
Less abstractly, they say that concepts in a social media app could include “post”, “comment”, “friend”, etc.
At the same time, concepts should structure the underlying implementation of the app. The authors say this makes them similar to microservices, but without the sort of dependencies, such as the ability to call or query each other’s state, that can lead to “a tangled web of connections.” They can, however, still depend on lower-level services such as databases or networking.
Concepts, in turn, would be orchestrated by an application layer. This would avoid “coupling… allowing concepts to be designed independently and then composed later into applications.”
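To make that concrete, here is a minimal TypeScript sketch of two such concepts for the social media example above. This is our own illustration rather than code from the paper, and the names are hypothetical, but it shows the key property: each concept owns its own state and actions, and neither imports or queries the other.

```typescript
// Hypothetical sketch: two self-contained concepts with no mutual dependencies.

// The Post concept's whole purpose is creating and deleting posts.
class PostConcept {
  private posts = new Map<string, { author: string; body: string }>();

  create(id: string, author: string, body: string): void {
    this.posts.set(id, { author, body });
  }

  delete(id: string): void {
    this.posts.delete(id);
  }
}

// The Comment concept attaches comments to a target it treats as an
// opaque identifier -- it knows nothing about posts or any other concept.
class CommentConcept {
  private comments = new Map<string, { target: string; body: string }>();

  add(id: string, target: string, body: string): void {
    this.comments.set(id, { target, body });
  }

  removeByTarget(target: string): void {
    for (const [id, comment] of this.comments) {
      if (comment.target === target) this.comments.delete(id);
    }
  }
}
```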
The paper works through a few existing ideas for this composition before settling on a granular approach built on synchronizations, which “act like contracts” spelling out exactly how concepts are supposed to interact.
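As a rough sketch of the idea (ours again, not the paper’s own notation), a synchronization could be expressed as a declarative rule that the application layer enforces: when one concept’s action fires, a named action on another concept follows.

```typescript
// Hypothetical sketch: a synchronization as an explicit, declarative rule.
// The app layer owns the wiring; the concepts themselves stay independent.
type Sync = {
  name: string;
  when: { concept: string; action: string };
  then: {
    concept: string;
    action: string;
    mapArgs: (args: unknown[]) => unknown[];
  };
};

// "Deleting a post deletes its comments" -- stated once, out in the open,
// rather than buried inside either concept's implementation.
const cascadeDelete: Sync = {
  name: "CascadeDeleteComments",
  when: { concept: "Post", action: "delete" },
  then: {
    concept: "Comment",
    action: "removeByTarget",
    mapArgs: ([postId]) => [postId],
  },
};
```

Because the rule is plain data rather than a buried function call, every cross-concept interaction can be enumerated without reading any concept’s internals.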
“Why can’t we read code like a book? We believe that software should be legible and written in terms of our understanding: our hope is that concepts map to familiar phenomena, and synchronizations represent our intuition about what happens when they come together,” Meng said in an MIT post about the work.
Because the synchronizations are explicit and declarative, they can be analyzed, verified, and also generated by LLMs.
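Continuing the hypothetical sketch above, that kind of analysis becomes ordinary programming: a tool could, for instance, compute the blast radius of a proposed change by listing every synchronization that touches a given concept.

```typescript
// A trimmed-down version of the hypothetical Sync type from the sketch above.
type Sync = {
  name: string;
  when: { concept: string; action: string };
  then: { concept: string; action: string };
};

// List every synchronization that involves a concept -- the contracts a
// patch to that concept must be checked against before it is accepted.
function interactions(syncs: Sync[], concept: string): Sync[] {
  return syncs.filter(
    (s) => s.when.concept === concept || s.then.concept === concept
  );
}

// e.g. interactions(allSyncs, "Comment") would surface CascadeDeleteComments,
// flagging the cascade rule before an LLM is allowed to modify Comment.
```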
In the paper, the authors suggest their approach would mean “LLM-based tools can offer more than ‘vibe coding’ in which results are unpredictable, limits of complexity are easily reached, and each new coding step risks undermining previous ones.”
“Distributed implementations might be achieved by allowing concept instances to run on different servers, with synchronizations as the mechanism to keep servers in step,” it adds.
Jackson and Meng suggest the architecture could lead to concept catalogs of “well-tested, domain-specific concepts.”
These could be incorporated by both human and AI coders. “You still have to deal with the inherent complexity of features interacting. But now it’s out in the open, not scattered and obscured.”
Which sounds like a great idea. It’s just a shame that it would make vibe coding redundant, just as the term made it into the dictionary. ®