Vibe coding needs a spec, too

Kiro is AWS’ AI IDE that brings structure to AI coding with spec-driven development.

Connect with Deepak on the Kiro Discord server and read more about spec-driven development on The New Stack.

We last spoke to Deepak Singh in March about how enterprise-ready agents are.

Congratulations to user Whymarrh for winning a Populist badge on their answer to [Git commits are duplicated in the same branch after doing a rebase](https://stackoverflow.com/questions/9264314/git-commits-are-duplicated-in-the-same-branch-after-…

Learn more about the future of software engineering in the AI age on November 3rd, when our CEO, Prashanth Chandrasekar, speaks at a virtual OpenAI Forum.

TRANSCRIPT

Ryan Donovan: What happens to junior developers in the age of AI? Our CEO, Prashanth Chandrasekar, talks with OpenAI’s Head of Developer Relations, Romain Huet, at the OpenAI Forum. Listen live on Monday, November 3rd at 3:00 PM Eastern Daylight Time. Link in the show notes.

[Intro Music]

Ryan Donovan: Hello everyone, and welcome to the Stack Overflow podcast, a place to talk all things software and technology. I am Ryan Donovan, your humble host, and today we’re talking about spec-driven development. We’ve touched on it a few times so far. It seems to be something everyone’s thinking about: how to develop software using just a spec? Maybe. My guest today is a returning customer, Deepak Singh, the VP of Developer Agents and Experiences at AWS. So, welcome back to the show, Deepak.

Deepak Singh: Thanks, Ryan. It’s great to be back.

Ryan Donovan: We’re gonna skip over the intro. People can go to the previous episode if they want to know your origin story. So, let’s dive right in. I’ve been hearing a lot of people talking about building specs for these agentic coding tools. What exactly is spec-driven development?

Deepak Singh: Well, actually, to best understand specs, it’s useful to go back and see how agentic software development has evolved over the last, I would say, 18 months. More in the last year than anything, but let’s go back 18 months. I think in the past, we’ve talked about the fact that as software development assistants became common, they were amongst the first, sort of, ‘AI tools’ that were out there. Initially, you found them useful, but they weren’t changing your life. They weren’t changing the way teams were working. They weren’t changing the way you were shipping code because they were glorified typists – you start typing and they would finish typing for you, right?

Ryan Donovan: Autocomplete Plus Plus.

Deepak Singh: Yeah. You know, autocomplete on steroids. That started changing when what some folks on my team call ‘agentic chat’ came around, where you started having these multi-turn conversations with an agent that helps you when you say, ‘I want a function that does X.’ But because you were doing it in something that had context about your code, you were able to get much more intelligent answers because it was looking at your code base, looking at a project, looking at the open files, and giving you recommendations on how to do it using that knowledge. It was thinking, and that continued to advance ’cause you had more and more advanced LLMs coming out. And then, you know, suddenly this term became really popular, called ‘vibe coding.’ You know, vibe coding’s a lot of fun. You know, when you’re a hack like me and you want to quickly create an app that you wanna show your son and have fun with, it works really, really well. And we actually found people having a lot of success with what you might call ‘vibe coding.’ But then we started diving in even deeper, and we noticed a few things. One, in this world, the code was almost exclusively being generated by AI. You were not typing the code anymore. Sometimes you went back and fixed it, but very often the way you fixed it was also by telling the agent, ‘you got this wrong. This is how I think you should think about it,’ and the agent would go fix it, and you might do final tweaks on your own. You started seeing developers going from just writing function by function, bit by bit, to generating all the code upfront, and then going and editing it using the agent as an assistant. We started digging even deeper and talking to the more senior engineers about how they approached it, ’cause we noticed that, at least in Amazon, the senior engineers were almost faster adopters, and more advanced adopters, of these agents than anybody else. 
I think Matt Garman has said publicly that 80% of developers at Amazon are now using AI agents to do their software development. And we found a few interesting things, and this is what led to us investing so heavily in what we call ‘spec-driven development.’ For simple problems, they would just assign it to an agent. That’s your vibe coding approach, where you just quickly prompt it, and it generates code. But for more advanced and more complex problems, they were actually writing down instructions and doing it the way they would do on a whiteboard, potentially, when they were working with another engineer. They were writing what we would call a ‘specification.’ ‘This is how my code should behave; this is what its outcomes should be; this is how it might be structured. Here are some of the dependencies I want it to think about. Here’s a design.’ So, there were two things that really helped us get to the design of Kiro. One was: we should have a system, an IDE, that quickly allows you to, what you might call, ‘vibe code.’ That doesn’t go away. It’s still very useful. The other was: can we use that same fun mechanism (’cause people like vibe coding for a reason) to make the whole idea of spec authoring, creating these instructions, much more interesting, and much more powerful, and also a shareable artifact? So that’s what Kiro does. It essentially brings you a user interface which is centered around creating these specifications for solving problems, and you can continue to write code for simple stuff, but for the more complex stuff, that’s just how it works.

Ryan Donovan: Technical specs, functional specs – those aren’t new, right? Those have been around as long as I’ve been in software and probably longer. So, with spec-driven development, is it the same thing as these specs that were given to humans?

Deepak Singh: Yes and no. They are similar in concept, so if you talk to any experienced engineer, they will tell you that the first thing they do when they look at a problem is to write a specification on how they would address it. Now, conceptually, it is exactly the same. You are telling an agent how to think about the problem, how it might want to deconstruct it in its head, to give it more context, the right context to do a better job. That part is exactly the same. The part that’s different is that you are not typing the spec itself. You’re being much more high-level. You are telling the agent, ‘this is how I think about the problem,’ and the agent is smart enough to convert that into a spec that it can go work on. So, you know, it could be a set of bullet points, it could be a markdown file, it could be a set of rules. There are various ways to express that specification. I know of people who, not even in an IDE, but in a more consumerish tool, just create bullet points of specifications where they’ll express their desire, or they could just type them all out, and then they’ll bring it into their IDE, or into their software development agent to go and complete it. So, we started seeing that happen a lot, but conceptually, it’s exactly the same. But the tools you have and how you express yourself are a little more high-level. You don’t have to go write down every line and draw every diagram. Your agents are smart enough to do that for you.

Ryan Donovan: I assume you’d have to put in a lot more of what is usually tacit information for a senior developer, right? You have to talk about the sort of working context of the code, and the company, and that sort of thing.

Deepak Singh: Specifications by themselves are pretty powerful, but you can also augment them by providing a set of tools. These are MCP servers, typically. Here’s a set of MCP servers that you can use along with this specification. You can create steering files. A steering file is just a set of guides, you know, which are part of your project. And they might say, ‘Hey, here are the languages you’re allowed to use. Here are the build systems you’re allowed to use. Here’s the style guide,’ and you can use this for every project. They’re not like one-offs. They can just stay with you. They get checked in with your project, and so on. And I think this pattern of combining your tools, your steering files, the context that comes with the specification – it’s a very powerful context, and I think part of the reason you’re seeing all the success you’re seeing these days is because people are getting better and better at managing and understanding how to use these tools and these concepts compared to six months ago.

Ryan Donovan: Yeah. And how to provide context, yeah. You said Kiro, which is your new agentic AI IDE, right? You said it was designed with spec-driven development in mind?

Deepak Singh: That’s correct. Its interface puts specifications, or specs, as we call them, up front and center. The user experience is built around that. So, typically, if you are inside an IDE and you want to create something, you would go into a chatbot and start typing what you want, or you might start asking it to break a problem down into smaller sets. With Kiro, what you do is you give it a problem, then Kiro might come back and ask you some questions, but what it does is it tells you, ‘Hey, I’m going to start creating a spec,’ and in the Kiro UI, you’ll see a button that says ‘spec.’ And a spec is actually three files. They’re all in markdown: a requirements doc, a design doc, and a set of tasks. So, you first create, you know, you vibe a set of requirements. And these requirements could be as simple as ‘build me a solar system with nine planets,’ but what it actually does is it breaks them down into user stories, and for more complex problems, it might break them down into user stories and some design diagrams, or whatever. Here’s how you think about the problem. You collaborate with that. It’s almost like whiteboarding, except it’s in markdown. Once you are comfortable with the set of requirements, you can say, ‘yep, I like these. Go forward,’ and it creates a design. ‘This is how I’m gonna design it, here are all the dependencies, this is a call graph.’ And again, it can keep it simple, or it could get very, very complex. Here, usually, what we find people doing is breaking it up into a set of specs, not a single spec, and once you’re comfortable with the design, you say, ‘okay, I’m good with the design.’ It creates a set of tasks. It says, ‘these are all the things I’m gonna do. I’m going to write a function that does blah. I’m going to write a set of unit tests,’ and so on, and so forth. Then, basically, you can say ‘go’ and it starts executing on those tasks. 
You can then do them one by one, or you can do them all in one go, and because it tells you what it’s doing, you can say, ‘I don’t like the direction you’re going,’ and you can sort of interrupt it and say, ‘change your thinking and do that too.’ And it will rewrite the specification, or the task list, as an example. So, it’s highly interactive, but you are just using natural language. You’re telling it things, and it’s doing them on your behalf. The fun part is, let’s say you’ve written a piece of code, and a new requirement comes up, and you want to change it. You can do it two ways: you can write a new specification, or you can go back to the original one and change it.
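To make the requirements, design, tasks flow described above concrete, here is a minimal Python sketch of a spec with approval gates between phases. It is an illustration only: the `Spec`, `Task`, and `Phase` names are hypothetical and do not reflect how Kiro stores or processes its markdown files.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Phase(Enum):
    REQUIREMENTS = "requirements"
    DESIGN = "design"
    TASKS = "tasks"
    EXECUTING = "executing"


@dataclass
class Task:
    description: str
    done: bool = False


@dataclass
class Spec:
    """Hypothetical in-memory model of a three-part spec."""
    requirements: list[str] = field(default_factory=list)
    design: list[str] = field(default_factory=list)
    tasks: list[Task] = field(default_factory=list)
    phase: Phase = Phase.REQUIREMENTS

    def approve(self) -> Phase:
        """Human sign-off: move to the next phase, in order."""
        order = list(Phase)
        index = order.index(self.phase)
        if index < len(order) - 1:
            self.phase = order[index + 1]
        return self.phase

    def run_next_task(self) -> Optional[Task]:
        """Mark the next pending task done; only allowed once executing."""
        if self.phase is not Phase.EXECUTING:
            raise RuntimeError("approve requirements, design, and tasks first")
        for task in self.tasks:
            if not task.done:
                task.done = True
                return task
        return None


spec = Spec(requirements=["As a user, I can see nine planets orbiting the sun"])
spec.approve()  # requirements -> design
spec.design.append("Planet class with orbital radius and period")
spec.approve()  # design -> tasks
spec.tasks += [Task("Write Planet class"), Task("Write unit tests for orbits")]
spec.approve()  # tasks -> executing
first = spec.run_next_task()
print(spec.phase.value, "|", first.description)
```

The point of the gate in `run_next_task` is the same as in the conversation: nothing executes until a person has accepted each earlier document.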

Ryan Donovan: The multi-step processing is interesting, ’cause you know, a lot of folks have talked about the LLM itself as like a black box. Here you’re breaking up the spec and being like, ‘well, this is the AI design doc, is this right?’ Edit it, fix it, change it. ‘Here’s the list of tasks. Is this right?’ Edit it, fix it. Are you able to provide your own context at each of those steps? And does Kiro have additional context and, like, system prompts to make those steps work?

Deepak Singh: Absolutely. So, there’s a lot of optimization in the way the system works and how it thinks about when you’re giving it prompts, how it converts them into a specification, how it uses that specification. That’s all there. You can augment it because within your Kiro project, you have your steering files and your tools, your MCP servers, which are all part of giving the right context, and they can change the way the spec is interpreted because you provided that context. Kiro itself does a lot. Then, of course, especially if it’s existing code, it’s taking into account your code base, as well. And over time, I suspect people will get smarter and smarter about how they provide that context. We have some pretty interesting ideas around everything from spec validation, using neuro-symbolic techniques, to other approaches that people have. The other thing Kiro does is it almost, to a fault, does test-driven development. Internally, it is designed to do test-driven development. That’s the way it writes code. So, it does the things that we hope every engineer does, but it does them because, you know, it doesn’t get lazy.

Ryan Donovan: Well, I mean, it’s an interesting tack to be like, ‘I’m gonna force this to do test-driven development, whether you like it or not.’ Is there a way to opt outta that?

Deepak Singh: As we keep looking at different ways people build software, we have other ideas on, sort of, how Kiro could write software. By default, it might do it test-driven, but you can choose other approaches, and it’s kind of fascinating to see, depending on who’s using these tools, how effective they are. I mean, we’ve seen people starting to build some very advanced distributed systems out there, or features, like in Kiro itself, for example. Right before we shipped, we had to write a notification system, and typically it would’ve taken a couple of engineers a week or so to write it, and here we had one engineer do it in half a day, or something. Not even one of the more senior engineers on the team, but they just wrote a spec, built it, sent the code for code review to one of their colleagues, got it checked in, and we were able to ship it. And that’s a simple example, and now you’re getting examples of very large systems that are being written by teams of, you know, 5-10 engineers using these techniques, and it’s really powerful.

Ryan Donovan: For the tasks list, can you sort of interrupt it and be like, ‘actually, for this task, go create an issue and assign it to Dave, because I want Dave coding this one.’

Deepak Singh: Not yet. That would be cool. So, here’s another feature that Kiro has where you could potentially accomplish this. It’s not doing it autonomously, but there are ways to do it. So, Kiro has this concept of ‘agent hooks,’ where essentially, a hook is a watch system. It’s an event-driven system, and when a certain change happens, it can go and do something else. So, it’s like an ‘if this, then that.’ Like, if something happens, it’ll do something. And the good part in Kiro is that even your steering files and your hooks can all be created using Kiro. So, you don’t have to manually create them. And a hook could include things like, ‘every time this happens, page so and so,’ or send it to them for code review, or whatever. You know, if you have an MCP server that allows you to talk to the issue tracking system, you could, you know, commit code. So, as the industry moves forward and some of this interactivity, et cetera, happens, all of this is going to get quite powerful. Stepping away from Kiro a little bit, to the Q CLI: today, you can create what we call ‘custom agents.’ And with a custom agent, basically, you can adopt a persona. You can say, ‘right now I want to be an operator or a DevOps person, tomorrow I want to be a Java developer.’ And what it does is it captures your steering files, and your MCP servers, and some of these things into a profile, effectively. I’m simplifying, but that’s what it does. And when you declare, ‘I’m so and so,’ it automatically puts that hat on. It’s like your classic, you know, you hear these examples of people telling an LLM, ‘think I’m a librarian,’ and then it changes the way it approaches things. Here, it’s a little more organized way of doing that.
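The ‘if this, then that’ idea behind hooks can be sketched as a small event registry in Python. This is a generic illustration of the event-driven pattern, not Kiro’s hook format: the `HookRegistry` class, the `file_saved` event name, and both handlers are assumptions made up for the example.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical handler type: takes a file path, returns a description of what it did.
Handler = Callable[[str], str]


class HookRegistry:
    """Maps an event name to the actions that should follow it."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Handler]] = defaultdict(list)

    def on(self, event: str, handler: Handler) -> None:
        """Register a handler to run whenever the event fires."""
        self._hooks[event].append(handler)

    def emit(self, event: str, path: str) -> list[str]:
        """Run every handler registered for the event, in registration order."""
        return [handler(path) for handler in self._hooks.get(event, [])]


def run_tests(path: str) -> str:
    return f"tests queued for {path}"


def request_review(path: str) -> str:
    return f"review requested for {path}"


hooks = HookRegistry()
hooks.on("file_saved", run_tests)
hooks.on("file_saved", request_review)

results = hooks.emit("file_saved", "src/notifications.py")
print(results)
```

In a real setup the handlers would call out to a test runner, a paging system, or an issue tracker; here they only return strings so the flow is easy to follow.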

Ryan Donovan: Right. You give it the tools that an SRE, or a security person, would have – ‘do this role.’ It’s interesting. You hinted at ways that you’re thinking of improving specs automatically and adding additional context. What are those sorts of things looking like?

Deepak Singh: The most obvious one is what we call neuro-symbolic AI. AWS has a history of using these mathematical verifiers and solvers to say that something is correct. So, a very simple example is: how do you know that your spec is a valid spec? There are many ways to get there, and we have already started experimenting with approaches that use these neuro-symbolic methods to say, ‘is this a valid spec? And if it does something, is that going to be a correct thing?’ There are many areas where we use it already in the AI world. For example, in the AWS console, when you’re trying to use an agent to look at a networking property, it actually uses these neuro-symbolic techniques to make sure that the network paths in the results are real, that they’re possible, and that those endpoints are actually reachable; it’s not just making them up. So, that’s where these techniques originally came from, addressing some of these kinds of challenges, and now we are applying them to the coding problem. And so that’s just one example.

Ryan Donovan: You say neuro-symbolic. Are you actually building, like, knowledge graphs, getting into cognitive psychology representations, or anything? Or is it–

Deepak Singh: Yeah, I mean, these are these automated reasoning solvers, ‘SAT solvers,’ as they call them. What neuro-symbolic AI does is marry automated reasoning, so mathematical proofs, with AI techniques. You’re using artificial intelligence, but everything is grounded in mathematical solvers, so there’s a formal verification process. I think the part that we have spent a lot of time on is – how do you make it possible to build these systems without having a PhD in applied math?
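As a rough illustration of what a satisfiability check means for a spec, the Python sketch below encodes a few requirements as boolean constraints and searches every true/false assignment for one that satisfies all of them. It is a brute-force toy, not a real SAT solver and not the technique AWS uses; the variable names and requirements are hypothetical.

```python
from itertools import product
from typing import Callable, Optional

Assignment = dict[str, bool]
Constraint = Callable[[Assignment], bool]


def find_model(variables: list[str], constraints: list[Constraint]) -> Optional[Assignment]:
    """Try every true/false assignment; return the first that satisfies all constraints."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None  # no assignment works: the requirements contradict each other


variables = ["uses_cache", "always_fresh", "offline_mode"]

# Hypothetical requirements, written as boolean constraints.
consistent_spec = [
    lambda a: a["offline_mode"],                         # must work offline
    lambda a: not a["offline_mode"] or a["uses_cache"],  # offline implies caching
]

# Requiring always-fresh data, while caching rules it out, makes the spec contradictory.
contradictory_spec = consistent_spec + [
    lambda a: a["always_fresh"],
    lambda a: not (a["uses_cache"] and a["always_fresh"]),
]

print(find_model(variables, consistent_spec))
print(find_model(variables, contradictory_spec))
```

The first call finds an assignment, so those requirements can all hold at once; the second returns `None`, which is the signal that the spec as written cannot be implemented. Production solvers reach the same kind of answer without enumerating every combination.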

Ryan Donovan: Right, right. You mentioned spotting the endpoints and dependencies that don’t exist. Does it help with overall hallucination reduction?

Deepak Singh: So, in fact (I’m going away from coding and Kiro for a bit), in Bedrock, we have a capability that just recently went GA, called Automated Reasoning Verification Checks, where you can create a model. Let’s say that you are a bank, and there are certain rules that need to be followed when you are answering questions: things that are factual, that may be part of a compliance regime, or that describe how a checking transaction system works. You can express them as guardrails in the Bedrock Guardrail system, using automated reasoning techniques. And so, when an agent or a chatbot gives you an answer, it runs against those verifiers to make sure that the answer it’s giving you is correct. A very simple example is if you ask for pricing data, how do you make sure that the pricing data is factually correct? So, you can create these models for that.
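The pricing example can be approximated with a much simpler stand-in: a deterministic check that compares a generated answer against a trusted table before it reaches the user. The Python sketch below does only that. It does not use automated reasoning or any Bedrock API, and `PRICE_TABLE`, `verify_price_claim`, and `guarded_reply` are hypothetical names invented for the illustration.

```python
import re
from decimal import Decimal

# Hypothetical source of truth; in practice this would come from a pricing system.
PRICE_TABLE = {"basic": Decimal("9.99"), "pro": Decimal("29.00")}


def verify_price_claim(plan: str, answer: str) -> bool:
    """Return True only if the first dollar amount in the answer matches the table."""
    match = re.search(r"\$(\d+(?:\.\d{1,2})?)", answer)
    if plan not in PRICE_TABLE or match is None:
        return False
    return Decimal(match.group(1)) == PRICE_TABLE[plan]


def guarded_reply(plan: str, answer: str) -> str:
    """Pass the answer through only when the check succeeds."""
    if verify_price_claim(plan, answer):
        return answer
    return "I can't confirm that price. Please check the pricing page."


print(guarded_reply("pro", "The pro plan costs $29 per month."))
print(guarded_reply("pro", "The pro plan costs $19 per month."))
```

The design choice worth noting is that the check sits outside the model: the generated text is treated as a claim to verify, and anything that fails verification is replaced rather than repaired.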

Ryan Donovan: Yeah. Getting pricing data is generally not a great use of LLMs, right?

Deepak Singh: So, that’s where you add these mathematical techniques that are factually sound. Sometimes, you can do it more simply ’cause you can talk to an MCP server that does a deterministic API call and gives you the answer back. But in a natural language setting, that gets tricky, and you can’t always just keep calling APIs. Here, you have something that can just verify for you because you have the guardrail. That’s an example. It’s a feature in Bedrock right now.

Ryan Donovan: Recently, I’ve sort of been thinking, everybody’s talking about guardrails. Then I was like, ‘wait, what is a guardrail on a technical level?’ How does that work? And to have it built into the LLM process is an interesting solve for it.

Deepak Singh: And actually, a lot of LLMs have really good guardrails, but their guardrails tend to focus on maliciousness. They don’t know your systems, they don’t know what’s right or wrong in the context of your data, your work, your transactional systems. And that’s where adding in some of these additional guardrails comes in, whether you are building in validation or, you know, adding in human guardrails. We do code reviews for a reason. Those are examples.

Ryan Donovan: So the developer of the future just writes good specs? Is everybody gonna just be an architect from here on out?

Deepak Singh: I actually think it just means they’re writing code a different way. I actually don’t think it’s architects. The most successful developers I see are senior engineers who are hands-on coders, ’cause they are the ones who really understand the systems that they’re building. So, I would say you have to really be a systems thinker. It is less important whether you are really good at writing JavaScript code; understanding how your system works makes you more effective. Another way to think about it is, you know, at Amazon, we have this leadership principle for our principal engineers called ‘Illuminate and Clarify.’ It basically means that if you’re a senior engineer, you’re often working on a team with more junior engineers. You have to explain your problem and how you might simplify it to solve it, ’cause you’re taking a complex problem, shining a light on it, then helping people understand how they might fix it. That same skill applies really well to guiding and driving the behavior of an agent. So, I think that’s how I like to think about it. And part of the reason is the people who I see being more successful and effective are these senior engineers who are really good at that. But there are people who are really, really strong systems developers, as well.

Ryan Donovan: Yeah, I agree. And I think that that skill is harder to teach, and a lot of senior engineers sort of learn that on the job. Do you envision a sort of, like, system thinking bootcamp on the various sites?

Deepak Singh: Yeah. And people often ask me, like, ‘how is teaching programming going to evolve?’ And I think it is already becoming more important to be very good at that critical thinking. How do you break down a problem? How do you express yourself so that the spec that you get is a higher-quality spec? And that’s not often how you’re taught. You’re taught a programming language, you’re taught functions, et cetera. Though they’re still important, we still need to understand what the code is doing, but that barrier is way down because your agents are so good at that, but this ‘driving the behavior of an agent’ is an earned skill. It’s not art. It’s an earned skill, and I think that the people who learn that quickly are gonna be the most effective.

Ryan Donovan: Yeah, the critical thinking piece is a hard one, and I—you know, a humanities guy—I think there’s a lot of folks I’ve seen that come out of the humanities who have really good, sort of, historical critical thinking skills as programmers. Do you think there’s a place for philosophy and various other places in new programming?

Deepak Singh: I haven’t thought that far. I’ll keep it simpler, which is: [it’s] more important to understand how a system works, how you want to instruct, and how you would construct a profile or a spec and work with an agent than it is to be ‘the world’s best JavaScript programmer,’ because that will help you, but it’s not gonna help you as much as the other side of it, at least in my experience.

Ryan Donovan: Since working on spec-driven development, have you come up with any sort of tips, ways to understand a system better? To create a better spec?

Deepak Singh: That’s right. I mean, some of the work underlying Kiro is about how it looks at a code base and tries to analyze it. How it generates a spec, what’s in the design, what’s in the specification: that’s all influenced by some of the work that we have inside Kiro to make all of that work better. And we’ll continue doing that, whether it’s in the agent, in the agent design, in our system prompts, or in the way we manage context. I think what people forget very often is: an LLM by itself is good, but it’s the scaffolding around it that makes it useful. That’s why you often hear, like, ‘I’m using the same LLM. Why am I getting different results?’ It’s because there’s a scaffolding that goes around it.

Ryan Donovan: Do you have templates for that scaffolding? Do you have, like, standard things you pass around to people to be like, ‘this is best practices?’

Deepak Singh: Well, yes and no. There’s a community inside companies (I’ve seen it inside Amazon, I’ve seen it with our customers, I’ve seen it in our Discord for Kiro) because agents have certain behavioral patterns, and as people become good at using them, they come up with, ‘hey, if you prompt it this way, it’ll give you better results.’ So, that gets shared. Our focus is very much on helping people just express their thoughts and giving them an interface that allows them to move back and forth between expressing themselves in natural language and visualizing it as a spec, which is in text or in diagrams, and then the code conversion happens with the agent. We want to make that flow delightful and easy. That’s the whole point of Kiro.

Ryan Donovan: Have you found that there are any parts of specs that are difficult for AI to translate?

Deepak Singh: If your context gets too big, LLMs tend to start doing worse, so there’s a tipping point. So, I think that’s the kind of question: what’s the right size of spec? When do you break it down? Those are things that I think people are learning. As we learn with them, we will help them do it better. We can provide them best practices. We do that. But Kiro’s our developer product, which has a very active Discord channel. People share among themselves, and our goal is to learn from that and make sure they have fewer and fewer things to spend their time thinking about, and just focus on outcomes.

Ryan Donovan: You mentioned you’re thinking about the automated improvement of specs, adding more context automatically. What else is on the roadmap that you can share for Kiro?

Deepak Singh: Because we have the CLI product, as well– you know, those custom profiles are an example of things that we are starting to think about. Right now, Kiro is very interactive. It’s very obvious that there are some things that could be done less interactively, where you have a set of tasks, you want to shut your laptop down and go to bed, but you want to assign them, you know, to run in the background. Those are obvious things that we are gonna do. I think this is more of an industry thing. Multimodality, I’ll just add, is another thing that’s somewhat – we already do it, but I think there’s more that can be done. I think over time in the industry, the interesting challenge is gonna become: as the volume of code and the speed of code generation go up, I would say potentially exponentially, what are the new bottlenecks that are gonna come up? And I think those are areas that we are actively investigating. Because writing code is one thing, then where do you get stuck after that? How can we help? It’s just a great time to be building IDEs that are designed to help anyone express their ideas and convert them to software. We definitely think that the most successful people are the folks who know, you know, the systems that they’re building, [they] understand their outcomes the best, ’cause they get the best results. But, you know, we’ve had fun building with the community with Kiro. Kiro is about making writing software fun, but also robust, shareable, and scalable, where, you know, six months after you started the project, you still have the right context, which is one of the things people tend to lose when they’re just vibe coding. And so far, we’ve had tremendous response from the developer community. You know, I think spec-driven development is here to stay. I hear it more and more often, people using that term. So, I’m excited to see where it goes.

Ryan Donovan: All right, everyone, it’s the time of the show again, where we shout out somebody who came onto Stack Overflow, dropped some knowledge, shared some curiosity, and earned themselves a badge. So, today, we’re shouting out a Populist badge winner: somebody who came to a question, dropped an answer that was so good, it outscored the existing accepted answer. So, today we are shouting out ‘Whymarrh,’ who answered ‘Git commits are duplicated in the same branch after doing a rebase.’ If you’re curious about that, we have an answer for you in the show notes. I am Ryan Donovan. I edit the blog and host the podcast here at Stack Overflow. If you have questions, concerns, topics to cover, or complaints, please send them to me at podcast@stackoverflow.com, and if you wanna reach out to me directly, you can find me on LinkedIn.

Deepak Singh: Thanks for having me. It’s always great talking to you because it’s a great way to talk to the broader Stack Overflow community, which many of us have been members of for a very, very long time. And it’ll be great to see how this world of AI-driven development evolves over the next few months. I’m Deepak Singh, I lead the Kiro team. You can find Kiro at kiro.dev, and join our Discord. Those are the best places to find me and the rest of the team. We love hanging around there with all of you. Happy spec-driven development.

Ryan Donovan: All right, well, thanks for listening, everyone, and we’ll talk to you next time.
