From Vibes to Governed: Spec-Driven Network Agent Development

What’s in This Episode

- Why spec-driven development exists – and how it bridges the gap between vibe coding’s speed and infrastructure’s non-negotiable need for predictability
- What SDD looks like in practice – constitutions, specs, tasks, and why most of the work happens before a single line of code is generated
- The TDD parallel – how spec-driven development echoes test-driven development, and why engineers who spent years writing structured test suites are well-positioned for this moment
- Inside NetClaw – soul files, tools.md, heartbeat files, and what 336 GitHub stars in one month says about where the community is headed
- The red team session – what happened when Capobianco opened NetClaw live to 600 network engineers in the VibeOps Slack and watched 35 social engineering attempts unfold in real time
- MCP’s actual state – of 56 infrastructure MCPs catalogued, exactly one is vendor-official; community is running ahead of the enterprise, and the window to catch up is narrowing

With spec-driven development, you’re going to iterate in natural language and keep refining your specifications before you implement. That’s the discipline. That’s the payoff.

John Capobianco

Head of AI & DevRel, Itential

Governed by Design, Not by Hope

Vibe coding is great for side projects. It’s a liability when the blast radius is a BGP session. The engineers who understand why that distinction matters – the ones who’ve had to fix something at 2 a.m. – are exactly who this conversation is aimed at.

SDD doesn’t slow things down. It front-loads the thinking that production demands anyway. The constitution, the specifications, the tasks – those aren’t overhead. They’re the artifacts that let a team review, version, and amend intent before anything touches infrastructure. When something needs to change, you go back to the spec, not to the code.

That’s what governing AI in operations actually looks like. Not guardrails bolted on after the fact. Structure designed in from the start.

There’s a fundamental mismatch between how an LLM behaves – probabilistic – and what we expect from infrastructure: determinism. Spec-driven development is how you start to bridge that gap.

William Collins

Host – The Cloud Gambit

Why the Best Infrastructure Engineers Are Closest to Getting This Right

The argument isn’t that network engineers need to become software developers. It’s that the skills they already have – requirements gathering, change control discipline, structured troubleshooting – are exactly what spec-driven development demands from a human. The model does the building. The engineer drives the intent.

That framing matters, because the question of who owns the agent, and who’s accountable for what it does, is already real. The engineer whose agent is doing work inside an enterprise isn’t a liability. They’re the most valuable person in the room.

–+

William Collins • 00:00

There’s a phrase floating around AI circles right now. It’s called vibe coding. You describe what you want. The model writes to code, you ship it, and then you hope for the best. It works great for side projects, for demos, weekend projects, spinning something up from scratch, but it kind of falls apart sometimes at the moment that you point an AI agent at production infrastructure. Today, we have a returning guest, someone who’s been building in this exact space. And what he’s built has a lot to say about why specs matter with reliability, why governance isn’t optional, and why the people most qualified to build some of these things are the ones with that knowledge, the ones who have actually had to fix, you know, a BGP session at 2 a.m.

William Collins • 00:54

Let’s dig in. Welcome to another episode of the Cloud Gambit, where we talk about cloud, AI, a little bit of culture, everything in between. With me is my always on, always at 100% co-host, Yvonne Sharp. How are you doing today, Yvonne?

John Capobianco • 01:26

You should not open the show by lying to people. Not always on, not always 100%, but I am happy to be here with you fun people today.

William Collins • 01:37

Right on. And hey, I just want to get into this episode. It’s been a little bit crazy lately. There’s been a lot going on in the news. AI is just taking the world by storm. But exciting today. We have a returning guest, John Capo Bianco.

William Collins • 01:53

I think everybody probably knows who you are. And today you are the head of AI Endeavor, Pitential and a Google. Google Developer Expert. Let’s see the proof of that, sir.

Speaker 2 • 02:05

Yes, I got this ghastly tattoo of the GDE logo on me to signify that I got my Google developer expert. We were talking earlier in the show about, you know, just what it means to be an expert in the certifications. And it was neat because I didn’t really prepare for it. I’d been building tools with Google, Gemini CLI particularly was what drew them and drew my nomination for the program. So it’s kind of neat in that you have to be nominated in by an existing GDE or by someone at Google. They look at a review, your GitHub, your YouTube, your community presence. You fill out kind of a form.

Speaker 2 • 02:44

And then you do a 45-minute to an hour-long one-on-one interview with a Google staff member. And it’s, I mean, it’s, it’s informal and friendly and cordial and everything, but it’s serious. I mean, they don’t leave any stone unturned. They really check your expertise. We actually played a game of battle chess 9000. Then I explained to them how I used Nano Banana and VO3 and hosted it on Google and built it all with Google. So we actually played a game of chess, which was kind of fun during the interview.

Speaker 2 • 03:15

But no, I’m really proud of that. And it has been a while. We haven’t seen each other since I joined Itential and even in this new year. So happy new year to you both. I really applaud all of you latest episodes. I’ve been watching and following up. And yes, the world of AI has moved super fast.

Speaker 2 • 03:31

It continues to move fast. I think, what was it, 11 months now that we talk about MCP? It feels like a year ago almost that we had the 1st kind of episode about, hey, MCT. And I like.

John Capobianco • 03:47

You got to pay attention to this thing. Yeah.

Speaker 2 • 03:49

Yeah. Yes. And I, I mean, I would say that it has shaped up. I think that we, I think that we were all correct in that episode and our assumptions. And I think maybe we even undersold it a little bit. It’s really become a predominant tool in this AI age of development, right? Yeah.

William Collins • 04:08

And it’s like the stick hole for operationalizing everything, really, with like larger companies. It’s huge.

Speaker 2 • 04:15

Well, we just put out a blog post about what’s more of an info page about. So I found 56 different infrastructure-related MCPs that I would consider trustworthy, valuable, viable, useful. 56 of them. Everything from sources of truth, and you can name it, like InfraHub has one, Nautobot has one, Netbox has one. All the way to infrastructure, Arista, Junos, Cisco. There’s a wide, wide variety of them. And what’s neat, William, so of the, here, take a guess, though.

Speaker 2 • 04:54

Of the 56, how many do you think are community written versus official MCPs by whatever vendor, XYZ?

William Collins • 05:04

Community written, like maybe 10 or less.

Speaker 2 • 05:08

So, one of them is official. The thousand eyes is an official MCP from MC from Cisco. Every other MCP is a community project.

William Collins • 05:17

I love that. That’s actually great to hear. You know, you figure that like with the bigger companies that they want to keep everything kind of like lock and key. They want to like expose it behind an endpoint from their platform. They want to take that approach. But yeah, that’s that’s awesome to hear that this, you know, throughout in the community.

Speaker 2 • 05:36

I think you might, I think we would all agree that it’s a little bit, it’s a little bit different than, let’s say, monetizing HTTP, where there’s sort of web and server and front ends and WWW pages that you can visit. I think people are struggling with if we give away MCPs, right? Like, let’s say there is an MCP interface on a piece of hardware someday, in addition to an SSH interface and a REST comp interface or whatever. Is that giving away too much of it? Right? Like, do you know what I mean? Like, I’m sure there’s some calculus going on there, and there’s some real, very intelligent people trying to figure out the math behind monetization, how to monetize.

Speaker 2 • 06:17

Is it going to be free? Is it just an interface that we attach? Right. So, community is a little bit ahead on this. And I don’t, that’s not, I don’t think that’s, I think that’s to be expected. While the larger ships get their stories straight on how they’re going to offer these MCPs, there’s still some smaller players that when I talk to them, I’m not going to name and shame anyone, but they say we’re, you know, we’re working on our MCP. And I feel like you better hurry, right?

Speaker 2 • 06:46

Like, this is starting to become yesterday’s story. Table stakes. Yeah. Right. Yeah. Yeah.

William Collins • 06:53

So, what I don’t, before we get into the weeds anyway, you don’t have to give like a huge, you know, the lexicon of John Capo Bianco and all the things, but you’ve I’ll just ask a question, you know, and maybe that lends to or leads you to talk a little bit about your background. But you are a You’re not a software engineer that learned how to do networking. You’re more a network engineer that learned software. So your career trajectory is kind of like, or your career arc has been very unique and very different. Did I kind of say that right? Because you started like 25 years in the industry networking.

Speaker 2 • 07:39

Yes. So the only thing, though, that I would add to that is that I went to school to be a programmer. So my formal education, the three years I spent at community college, were doing things like Java, C, three years of C, two years of Java, JavaScript, HTML, COBOL, KICS, JCL, some legacy languages, VB6, I think at the time. It was still Visual Basic 6. But my placement out of college was in a support desk. I just, I was working full-time at the time. And when it was career day, and you had to go find like a company to do your placement to get, because the community college program, the last two years of your schooling, there’s like two days a week where you’re doing plate job placement.

Speaker 2 • 08:27

Kind of like a. An apprenticeship with sorts, right? Anyway, I was, I got the worst of the apprenticeships, and I ended up in like some service desk, help desk with the Ministry of Health instead of a programming job. So that led me to get like A, a network plus, and start working on infrastructure certs to build my career and to excel from help desk up through the ranks. So, but I didn’t do any real programming, like I was never paid to write. I had one six-month contract once where I did QA work and I did write some code, to be fair. So, I did do some early in my career, like a contract to write some code.

Speaker 2 • 09:09

So, I, but I, you know what I mean? But 20 years later, when the two combined in this idea of net DevOps and infrastructure as code and network automation, that really helped me excel because it was kind of like going back to my roots of coding and programming. And it made a lot of sense to me. But the whole time I did networking, it was like, it felt weird. It felt wrong. Like, we have to log into all these boxes and do all this stuff by hand. Like, this doesn’t make sense, right?

Speaker 2 • 09:38

Meanwhile, Active Directory, it’s all GPOs. You know, it’s all automated. The storage was all automated. The people doing application development, it was all automated, and they were doing really cool stuff with code. And then I was doing networking, and it was like, ugh. Here’s the notepad file for router two, and then you know, and copy and paste it in. And it just never felt right, you know, until things like Ansible and Python.

Speaker 2 • 10:03

And then I was able to circle back and kind of marry the two.

William Collins • 10:08

Yeah, you, what is it? In 2019, you wrote Automate Your Network or 2018?

Speaker 2 • 10:15

Yeah, 2019. That’s right. In March of 2019, it just celebrated seven years, if you could believe it. But yes, the Automate Your Network book was 2017. And that was after doing Ansible for about three years. My wife sort of convinced me. And Carl Buckman at the time actually offered to co-write the book with me.

Speaker 2 • 10:37

And I thought, no, let me try this. And I shopped it around. I shopped out three sample chapters to all the well-known publishers. And nobody was, I was getting letters back like, you know, network automation is not a thing. And this is an interesting topic. And some of the rejection letters, I wish I had kept them. I really wish I had kept some of them.

Speaker 2 • 10:58

So then I went to Amazon and I found a way to self-publish through Kindle. And actually, it was really fun to do it that way because I was completely master of my fate and master of every page, every letter, what the cover looked like. I had no outside influence bothering me or restricting me in any way. You know, my wife was my editor. So, and she’s not in IT. She’s like, her background’s in auto insurance. Okay.

Speaker 2 • 11:26

So, but she, you know, she said that might help as a layman. If she can understand these chapters, then anybody should be able to understand these chapters. Right. So that was the approach of the 1st book. And then I swore I never would ever write another book again. And then, and then a few years later, I ended up writing the Pike Yes book out of a similar passion, out of a similar like. I don’t know.

Speaker 2 • 11:50

Like, nobody talks about PyPS. I never hear it anywhere unless it’s me or Danny Wade talking about it. I think I was the only one who brought it up at AutoCon. I see Cisco has added it to their new exams, which kind of gives me hope and rekindles a little bit of excitement that there’s going to have to be some PyEKS material now, given it’s on the blueprint, right? So, yeah.

William Collins • 12:15

I just can’t believe like 2019 doesn’t seem like that long ago. I feel like we’ve been talking about MCP for five years sometimes.

Speaker 2 • 12:23

Yes.

William Collins • 12:23

Just the amount of time that I’ve spent with it, just in my own professional life, it’s, it just seems like it’s been around for years. And it’s just funny thinking we were sitting here less than a year ago having a conversation about whether it was going to be real or not. And I know, I know.

Speaker 2 • 12:41

And you know what? I don’t. It’s not a matter of gloating about being right about the topic. You know, you know, a lot of us put a lot of risk and put a lot on the line. But, but it is a matter of saying, like, what else do you need to see to start adopting it and using it? Right. Like, I just, Andy, Andy Lapteff and I, it came up in conversation during a podcast I was doing with them and we did a hello world together.

Speaker 2 • 13:07

In 30 minutes, we had Andy get clone the net, the Netbox MCP, do a virtual environment, install it, connect it to Copilot, and then ask, how many circuits do I have? And it answered in 30 minutes with Andy, just like this, try to do it remotely with all the bumps and warts and sharp edges that it comes with. And then 10 minutes later, we added it to his cloud desktop. And I could see Andy smiling. I could see his mind expanding. I could see him embracing this. So I think that anybody can start doing this and using these tools.

Speaker 2 • 13:50

And I think we’re going to lead into a topic about how, okay, so let’s be fair to network engineers, right? Like I said, 25 years, not everyone has a network programming background like I do, or computer programming background. You do 10 years of certs, 15 years, let’s say. It takes you three to five to get your CC and A. It takes you another four years to get your NP. You devote three years to get your IE. That’s 15 years of expertise and a lot of time.

Speaker 2 • 14:18

And then all the production hours you’ve put in to building and supporting and architecting and designing and P1 calls, your whole focus in life has been the network. And now suddenly overnight, this loudmouth comes on and starts saying you’ve got to automate all this stuff and you’ve got to learn Ansible and you’ve got to learn Python. And there’s a better way to do it. And it’s easy and anyone can do it and you should be doing it. Maybe I was a little dismissive in the time and effort it takes to learn Python, to learn Ansible, to develop the skills of VS Code, of Git. That was Andy’s 1st Git clone, by the way. Did you know that?

Speaker 2 • 14:57

So I have to try to meet, I want to meet people where they are. Maybe that is your very 1st Git clone and you’re intimidated by that activity because you don’t really understand it, right? Like there’s a broad spectrum of where people are at, right? So, but now, now we’ve got spec-driven development, right? And I think that this approach, anybody can do it because it is truly, purely natural language, and you don’t have to know anything about coding or the mechanisms or the syntax. None of it, none of it. And if you are an expert in whatever domain it is, for us, it’s networking and cloud and security and all those awesome IT things.

Speaker 2 • 15:48

You start to drive it with specifications, which are very natural for us to consume as network architects. We all live and breathe by requirements, right? Have requirements not been the driving focus of your career? More or less?

John Capobianco • 16:02

Very, very much, yes.

William Collins • 16:04

Why don’t you talk us through? So, if anybody goes, you know, you’re probably most famously known right now on the internet for Netclaw, and I want to get into that in a 2nd , all the claw things. But if you go to your Netclaw project, I don’t actually haven’t looked recently, but I think you have like a sole markdown file, you have your Asian MD file, your like an identity file, maybe a constitution file. What do these files mean and how are they different? Do you want to start there? Would that be a good

Speaker 2 • 16:35

Yeah, well, I so you’re right. We’ve moved into this world of, so an agent, right? When we’re talking about AI agents, we need to give them personality and guardrails and controls and security, all those not so fun things that we always talk about and think about, but to make them production grade. Well, now through the Markdown, you know, like the Sol file. For example, in my Sol file, I tell it, I give it some serious, strong guardrails, and I try to apply my networking domain specific expertise. So never make any changes that will lock the user out. Don’t apply any ACLs that will prevent, you know, management traffic.

Speaker 2 • 17:20

When you’re updating DTY lines, don’t do this. Don’t change any enable secrets unless explicitly requested to do so. Certain things of that nature go into the Sol file. There are a few other files. Let me quickly, you just bear with me here. I had the folder up. I just have to relaunch it because I crashed.

William Collins • 17:40

Yeah, Markdown has kind of taken over the world, John. Markdown is huge.

Speaker 2 • 17:46

And it makes sense because it’s, you know, why do we need, like, I see the need for JSON and YAML when you’re dealing with structured data. But yeah, I love Markdown and it’s almost this kind of multi-modality. Like it respects emojis and it expects, it respects emojis. It respects all the formatting you can do, tables, tabular data, lots of great stuff you can do in Markdown that the AI respects. Let me just quickly launch my NetClaw project and read out some of the files and give people the kind of the gist on the bare bones of NetClaw. And honestly, in terms of the genesis of NetClaw, it really is out of curiosity. So I was following all of the news about Claude with the WD and then the problem with the domain name and the problem with the intellectual property.

Speaker 2 • 18:44

And then the Bitcoin, the cyber crypto people got involved and started doing bad things. And then they changed to OpenClaw. And then OpenAI gets involved and takes them on. So it’s just been a curious journey to see if can we apply those same ideas of skills and Markdown-driven agent to drive the network to be very, very good at network operations. So, here’s some of the files that I have. There is a user file, and that user file is something you answer, and it’s your user preferences.

Speaker 2 • 19:28

So, your name, your role, your time zone, something that the agent can then communicate with the operator. So, I have this user file that you can update so that the agent itself knows more about you, the human that’s interfacing with it. There’s a tools MD file, and that defines how the tools work. So, there’s a lot of information in here on how tools work. I’m looking at some of the soul file. So, here’s a section of the soul file. It talks about how you work.

Speaker 2 • 20:03

You’re going to do an audit trail using gate. Every session starts during the session, during the session end. I want you to gather state when netbox is available, cross-reference your device state against the source of truth, flag discrepancies. It’s pretty neat. It’s all just human words. And then there’s, you can even add a heartbeat, William. So, if you, if you have, if you have a system, because NetClaw has connections with the OpenClaw communication channels, so WhatsApp or Telegraph or Slack or Discord or whatever, right?

Speaker 2 • 20:36

Google Meet or Google chat. So, you can make a heartbeat.md file that tells it to check in with its upstream channel autonomously every X number of minutes.

William Collins • 20:49

I didn’t know that one.

Speaker 2 • 20:51

Isn’t that neat? Yeah, there’s a heartbeat. So, yeah, if you’re listening now and you have not added this, add a heartbeat.md file. And it’s almost like a cron job where the agent will know, okay, the user wants me to check in with them in Slack or in WebEx or whatever every 15 minutes, right?

William Collins • 21:11

So when somebody actually starts one of these projects, the 1st thing they’ll do is they’ll probably have a, like if they already have code existing, they’ll do like a like a claw to knit or whatever Gemini’s flavor of doing that is. And or they’ll have like an agents MD file and they’ll do like an import context from those other ones. That’s what is the agents.md file is like a high-level architectural overview of the project. And it should be limited to that. So, all these other files, I guess what I’m trying to get to is the reason these other files exist is because you don’t want to throw, try and muck up everything in the same markdown file. Would that be correct, you think?

Speaker 2 • 21:56

Yeah, that’s right. Nice discrete markdown files with proper naming convention. And actually, if you look up the standard for an open claw, they will recommend the six or seven files you should have in your repository to build a net claw or an open claw properly. The other thing I want to highlight, William, is that, and this has led to a lot of debate whether or not MCP, we’ve already talked quite a bit about it, and we clearly see it’s not going anywhere. However, some people are more inclined to believe that all we need is a CLI and API and these skills files. So, a new way of doing things. For example, this actually just came up today.

Speaker 2 • 22:36

There is no MCP for WebEx. So, I was doing some Vibe coding to add WebEx to NetClaw because there was an open issue. People wanted to use WebEx with it. And it determined that the best thing to do is just write a skill that uses the WebEx APIs, and that we didn’t actually need an MCP for that. So, it’s kind of interesting the way things progress that you build these skills files that instruct the agent on how to use a tool, right? Not necessarily, right? And you give it a few examples and you kind of give it some scenarios and stuff.

Speaker 2 • 23:11

And you build a skill. Someone gave me this tip, or I found this on Twitter or something. Someone said to add in your soul file somewhere the instructions to learn from every session and to add to the skills and add to the other files like automatically at the end of your sessions with the idea of a self-improving agent, right? The agent will know that it called a skill and maybe it got a 404 error the 1st time it tried, and the 2nd time it tried a different way. Maybe it will update, self-update that skill to say, here’s the correct syntax so we avoid this error. Right? Like now we’re talking about mini-humans.

Speaker 2 • 23:57

We’re talking about agents that are learning on their own and self-improving as you run them just by telling it: improve yourself as you go, write, rewrite to these files, and it just adds more markdown.

John Capobianco • 24:16

Well, I think what folks who haven’t explored this arena yet maybe don’t realize is that a markdown file is a text file. Like it’s just a text file with formatting details in text that make it appear pretty. That’s all a markdown file is. When people are talking about markdown, they’re just talking about verbal instructions that are in a particular file format that’s not even particularly technical or difficult. Like an elementary school child could figure out markdown. One of the questions I have for you, John, is as you’re thinking about how to structure these markdown files, and we talk a lot about agents, which again, are defined by markdown files. We talk about skills that are defined by markdown files.

John Capobianco • 25:09

How do you think about how to break up the tasks or the roles that you want your system to implement? How do you think about breaking those up into when do you need a different agent? When do you need a different skill? And how do you keep from cramming too much together in one? Like, how do you figure out what the logical downside are?

Speaker 2 • 25:32

It’s an extremely valid criticism of NetClaw without directly criticizing NetClaw because I just keep stuffing things. No, I didn’t mean it that way. I needed to hear that because in my head, Yvonne, I’m constantly playing to myself, okay, you’ve built sort of a monolithic thing here at this point, right? Like you just keep finding MCPs and finding skills and adding them, adding them, adding them. Maybe it would have been a better approach to have a net claw for security and all the security MCPs, a net claw for source of truth, and it has all the source of truth stuff. Right. And then you have a, and then you have kind of more of a human resources approach with nice, well-defined agents without tool sprawl.

Speaker 2 • 26:17

But, you know, for an open source project, people can pick and choose. You know, I don’t expect everyone to use every skill and every tool. And there’s also a tipping point where the LLM will start to hallucinate and use the wrong skill or use the wrong MCP.

John Capobianco • 26:35

Yeah, that’s really what was driving my question is that one of the things that we found, and we’ll talk about this more, sure, I’m sure when we start talking more about the spec-driven AI development, is AI seems to perform best with very tight, discrete instructions when you break it up into lots of different components instead of one big instruction set. Some of that has to do with context window, some of that has to do with interaction services, all those fun things. I don’t know that I’ve heard any kind of good guidance yet on how to think about that. Other than like, I tried it this way and this seemed to work and this is what makes it work.

Speaker 2 • 27:23

And this is about two weeks for me that I’ve been using this tool. I lean on GitHub’s spec kit. So, GitHub, you know, who knows better than GitHub or the Microsoft people or the Google people or the, right? There’s some really, very talented people there. So, when they release a kit that says, here’s a kit for spectrum and development, it integrates with cloud code and gives you the/commands to run. And Claude knows the order to run them in. It was remarkable.

Speaker 2 • 27:54

My mother, I’m positive that if I did the technical stuff of installing all the tools and getting her ready to go and putting her in front of a terminal, she could build an application for bingo or whatever she would love to build, right? You literally say/spec kit.constitution. I’d like you to build clean code. Here’s my guidelines. Here’s my guardrails. And you give it the instructions to build a constitution. Once you have a constitution, you start building specifications/specify .

Speaker 2 • 28:25

And then what you want to specify. It builds. Dozens of markdown files with different names and different tracking. It’s all driven by folders. It kind of has a Git-like feel to it. Where, let’s say, I get my bingo version one with mom done, and she kind of goes, Well, I’d actually like to change it to use this palette or something. Something changes in her mind.

Speaker 2 • 28:47

You just start over and make amendments to the Constitution, just like we as humans and you Americans do with your Constitution. You make amendments to it that then change the way the specifications are driven and the tasks that are driven from the specifications. Ultimately, you do these five or six steps, Yvonne. And the last one is implement. And once you do implement, that’s when it starts to generate actual HTML code and JavaScript code and Python code and whatever code. So you actually do most of your time with spec-driven development is planning and chatting and talking in natural language, developing these markdown files. And when I peeled back the onion and looked in these markdown files, they’re user stories.

Speaker 2 • 29:36

So it’s actually using popular human software development techniques in the form of user stories inside of these markdown files that it creates.

William Collins • 29:48

You connected to, so like you’re working with Py ATS and writing that book and everything. I think you. Maybe, I think, I hope this was you. I think it was you, but you’ve kind of connected spec-driven development or prompt, you know, this prompt engineering and different things to test. Test-driven development.

Speaker 2 • 30:10

Yeah, TDD. Yeah. Yeah.

William Collins • 30:11

And that’s an interesting framing, I think, in this whole conversation, really, as someone who wrote these, a lot of these test suites for PyETS for years. Can you unpack that parallel for us a little bit, if you don’t mind?

Speaker 2 • 30:26

Yeah. So one of the early things I did when I was started to dig in. So, 1st of all, you are being too humble. William has written an amazing piece on spec-driven development. He really is a thought leader in this. I really enjoyed reading it. And he should include it in the, you should include it in the conversation notes because it really is thought leadership at its best on this topic.

Speaker 2 • 30:49

Now, in terms of test-driven development, my very 1st public presentation for Cisco when I worked for Cisco as an employee was in Amsterdam at Cisco Live. And I did something about Cisco PyTS and test-driven networking, I think I called it TDN, where we were going to take the ideas and principles of software test-driven development and apply it to networking with PyETS with this idea of, it’s kind of neat with test-driven development. You write a test case and you want it to fail. It’s interesting, right? The 1st thing you actually want is a failed test, and then you modify as minimal as possible either the code or in our case with the network, the network condition to make that test pass. And then you just keep iterating and refactoring and adding more and more test cases, right? It’s a beautiful, brilliant way to approach network design, especially with Greenfield, when you can develop the tests as you roll out.

Speaker 2 • 31:48

So I did a Claude chat and asked them for an image to compare TDD to SDD and draw parallels or differences between the two. And it gave me this really nice image. I shared it on my social networks of the test-driven development way and where you iterate there. And in the spec-driven development way, it’s very similar, except that right before you implement, all those previous steps are where you’re going to iterate. You’re just going to iterate in natural language and keep refining your specifications and your tasks and stuff with natural language. And then you’re going to implement it and develop the code. There are some parallels there.

Speaker 2 • 32:30

I just wanted to see if it would help a click in my brain because I had been doing test-driven development for a while with PiETS. And were there like natural stepping stones to get to spec-driven development?

Speaker 3 • 32:45

So that’s actually. Sorry, but.

Speaker 4 • 32:49

One of the things, and we, oh, I was just going to say, William and I have talked about this a lot.

John Capobianco • 32:53

I’ve mentioned it several times on this podcast, but I’m going to say it again, mostly because I want your take on it, John. But one of the things that I’ve observed is that I have a few team members that are really exploring and diving into this space. One of them has been in the industry a long time and really started and cut his teeth in the waterfall days. Right. And if you remember back when we did waterfall, you wrote this huge spec. You defined absolutely everything that the thing was going to do. You went through all these iterations to be sure that that spec was exactly right.

John Capobianco • 33:31

And then once it was approved, you took that thing and you built it. And then like changing the spec was like nearly impossible once you’d gone through all of that. Now, what we’ve done now is we’ve completely shortened, which is an understatement, the time horizon on what it takes to build. But those skills that folks had back when we were writing these very detailed specifications are in vogue again. So I’m not saying we’re jettisoning Agile development and going back to waterfall, but those skills that people needed back then to write detailed specs are back in vogue again. Because what we really need to be able to do is very clearly define. what we want the system to build.

John Capobianco • 34:23

And then we have these like little robot builders to go out and build it and then test it and then come back and go, was this right? And so it’s, it’s the spec writing that is now what requires the huge amount of human effort and clarity as opposed to the building. And I just, it, it is, it is such a fun circular like circular. I think you’re right. I think you’re right. I think that to me.

Speaker 2 • 34:52

Whoever came up with this idea and the spec kit from GitHub and this whole approach. You know, they they clearly have their roots in software development and saw Waterfall as a very I mean, you know, a lot of the times when we say, I used to tease the people at Parliament, right, when we were all in on agile. And I said, you know, like, okay, waterfall with a damn daily stand-up is not agile, right? Like, just because we have a meeting every morning at 9:30 to have coffee together and talk about what we’re going to do today doesn’t mean we’re agile, right? Like, we’re still doing things in phases and with specs and like you want to change the spec, forget it. Like, let’s get everyone back in the room and go back to the drawing board.

Speaker 2 • 35:34

And oh, Microsoft Pride, Microsoft Project files, and Tombstone Days and Gantt charts and Kanbans. And we’ve tried all of it. Dante charts. But to be able to say bro in broad strokes, so it’s neat because you’re going to start with really broad strokes with your constitution, real pie in the sky, high-level statements. And then as you get closer and closer to the point of developing tasks, that’s when you’re going to get more narrow and more specific, right? So is sort of waterfall. It’s augmented waterfall.

Speaker 2 • 36:10

But like you said, instead of having an eight-week cycle on the waterfall, it’s literally doing it as fast as the LLM can churn through these spec files, right? Now, I just made a video, and even by cutting some things out of the video, it’s like an hour long. So like you be prepared to set aside some time if you’re going to start a spec-driven project. It’s not as instantly gratifying as Vibecoding, where Vibecoding, you’re just throwing things at it and you’re getting code back and it’s kind of working itself out. With spec-driven, there’s like, I don’t know, it’s a lot of foreplay involved, let’s say, with spec-driven development, where you’ve got to go through these five or six or seven stages before you get to a minimum viable product or code or it developing. Actually, like, I mean, I’m on like step six the 1st time doing it, and I’m like, we haven’t written a single line of Python yet. Like, what is going on?

Speaker 2 • 37:08

Is this just Markdown? Is that all that this is going to give me? And then be patient because when you get there, you’ll get a really nice piece of code.

Speaker 4 • 37:22

Well, and that was going to be my next question. Like, what’s the payoff for that?

John Capobianco • 37:25

Right. Like, why would you do that instead of sit down and vibe code a thing? I’ll tell you my guess. You can tell me if I’m right. My guess is for complex code, you need that kind of structure and background. And you need it for maintainable code. And you need it for systems that are more than like, hey, I’m going to build this hobby thing that may exist for a short period of time and I’d be happy to throw away, right?

John Capobianco • 37:54

I would think that that is a much more robust methodology. Oh, you’re very, very close.

Speaker 2 • 37:59

And also, I think it leads to more because this approach of SDD creates artifacts. We can put it all in GitHub. So that if the three of us were going to, you know, even if it is a vibe coding thing, but we use SDD to do it, you and William could read the Constitution. You could read the tasks that were generated. You could read the markdown. We could go over those plans ahead of time before any code is actually generated. Right.

Speaker 2 • 38:25

So vibe coding sort of mrs. that. That’s why a lot of pure programmers had to come up with the term vibe coding instead of just coding. Because it didn’t, it missed some of those principles, those 1st principles that software developers use, right? You can’t just generate functions and plug it all together and away you go. So, this is a really nice approach in the artifacts that it creates for one and being able to iterate over it a 2nd time and maybe make amendments and things like that. I’m enjoying the approach. Now, it is costly.

Speaker 2 • 39:00

There’s a lot of tokens involved here. There’s a lot of markdown involved. There’s a lot of padding that goes into your context windows. So, it might be best suited to try at home with an open source model locally before you start connecting this directly to a paid model or something. But it’s been a lot of fun. And I can see a future where SDD, where that’s sort of how we even build big complex networks. Right, using specifications, right?

Speaker 2 • 39:36

I’m going to have, you know, you explain your topology, you explain your architecture. You take all that stuff that humans used to take and turn into Visios and then Visios into code and configs. Well, now maybe there’s a step between we have all of our artifacts, we have our Visios, now we’re going to use SDD to generate the specs and have an AI agent generate and deploy and test a large enterprise configuration, an MPLS, whatever, right? It’s quite an, and then what’s neat is you never touch the code, Yvonne. You never go back to the code. You just go back to the spec and regenerate. That’s the idea, I think, is that there’s a buffer between humans and code now where you’re not, maybe if you, you know, I’m not saying you can’t physically actually change the code, but the idea here is that you’re going to modify your specs and let the agent develop that code and refactor the code and do all the good things for you.

Speaker 2 • 40:41

There’s not even supposed to be humans even looking at code. You’re just supposed to generate and consume, right?

John Capobianco • 40:49

And the analogy that comes to mind is that, you know, back when folks were building PCs in their garage, back in the day before we had, you know, Apple and Steve Jobs and the PC revolution, folks had their cards and their chips and their soldering irons on tables in their garages, right? And they were, they were putting components together with a soldering iron. And now we live in a world where like you, you’re never going to take a soldering iron to your MacBook Pro, right? I feel like the evolution that you’re talking about with coding is a similar thing. It’s like you, you madman, why would you ever open the hood and start messing around with the code? You just control the interface to the code, which also becomes kind of your definition of 1st principle of what you built and why you built the 1st place. So like the whole thing.

John Capobianco • 41:48

That’s how I think about it.

William Collins • 41:49

There is a fundamental mismatch between how an LLM behaves. It’s. Probabilistic, not deterministic. And then we have infrastructure, and like what we expect is predictable. We expect determinism. And this going from like the oh, Vibe ops, we’re vibing, we had this thing to like spec ops. It’s a little more work, but we have these things in place that allow it not to get out of control.

William Collins • 42:18

To me, that’s kind of like the you know, bridging that gap between you know, probabilistic and you know, deterministic outcomes in a sense.

John Capobianco • 42:34

Well, and it’s the maturation of the technology, right? I mean, we knew a year ago, two years ago, that we would have to develop systems and structures and process to help take this technology that was really prone to hallucinations and would give you a different answer every time. We knew that we would need to develop something to help make it more useful. And that’s the journey that we’re all on, right? We’ve not, we’ve not, we don’t have all the answers yet, but the amazing amount of progress.

Speaker 2 • 43:13

I think that I think it’s still up to individual contributors. I know Shadow AI is a very. Easy, you know, the alt tab. You open up ChatGPT, you log in with your own Gmail, and away you go. People, please be careful. But if your organization doesn’t have this stuff, I would suggest to the panel that this is an opportunity for people to elevate themselves in their careers. I remember when I was on the help desk, when I suddenly was the Rico printer guy and I could fix any Rico printer, any jam, any problem, and get the printers going.

Speaker 2 • 43:50

I added value to the enterprise, right? Once I became the person writing the automation code and all the networks relied on my code, right? This idea of automating yourself out of a job is a fallacy. Okay, that is truly a fallacy. Every time I’ve ever automated anything of significance, it’s not like they turned around and said, Okay, thanks, John. We don’t need you anymore. Like, you’ve done a great job with this playbook.

Speaker 2 • 44:14

Thanks for your time. No, they’ve said the opposite. Okay, now that you automated that, we have this other thing maybe you could automate, or are there other things that we could automate? Like, you become instantly much more valuable. I think, in terms of the agents, if it’s John’s agent, right? Like, if it becomes known as John’s agent in the enterprise and people are asking John’s agent to do things and to find things and to document things and to test things, maybe even to configure things. I believe I am still very, very valuable as the author of that agent, as the human representative of that agent, right?

Speaker 2 • 44:51

It’s just like if you’re a manager and your team is doing really, really well, that manager, even though they’re not the ones necessarily doing those things, they’re still rewarded. They’re still valuable. They’re still recognized. They still have an elevation path in the company. If we all become human managers of agents, right? And now we have five kind of minions that do whatever it is we need them to do for us.

Speaker 3 • 45:18

I still think the human’s valuable there, don’t you?

John Capobianco • 45:26

Well, and one of the things that I realize as a leader now that was not probably clear to me before I was a leader of humans is that, you know, even if you have folks who all contribute, your outsized contributors typically don’t contribute by a factor of 1.2 or 1.3. They contribute a factor of two or three or four. And so once you demonstrate you’re one of those contributors, you are more valuable than maybe two or three of your peers because of how much you’re able to produce. And I think that should be the goal is how do I just add so much value that it doesn’t make sense to do anything else? And I think that is the mindset that we need to have regardless. But that mindset is going to be amplified now that we have these kinds of tools that can just further make us more effective, right? But there’s no reason for any organization to want.

John Capobianco • 46:39

To get rid of somebody who’s demonstrated that they can consistently add value to the business.

Speaker 2 • 46:46

If you just track and follow the news, right? Like, I know it’s a lot of hype and a lot of noise, but Jensen Wong, right, CEO of NVIDIA, a couple of things he said just in the past week that I think, you know, we were talking about Netclaw. So he said that every enterprise needs an open claw strategy. That’s one of the things that he said is that it is so powerful. It’s the new operating system, he said. It’s got more stars on GitHub than Linux does, OpenClaw, in three months, right? Something to really pay attention to when he’s talking about it as NVIDIA’s CEO.

Speaker 2 • 47:19

And he said something just yesterday or the day before that if you’re paying your employee $500,000 in salary and they’re not spending $250,000 on tokens. You’re doing it wrong, or something like that, right? Like he is now equating token costs into the calculus of a highly paid engineer.

William Collins • 47:41

I think he’s even offering tokens as a form of compensation and bonuses, too, to use on personal grounds or something along those lines for NVIDIA.

Speaker 2 • 47:50

Yeah, so Yvonne, as a human leader, has that sort of come up yet in your circles or your discussions of how much token allocation per employee do we start to think about, right?

Speaker 4 • 48:08

Yeah, so I am not leading folks that are writing.

John Capobianco • 48:13

So, no, it doesn’t. It has does not exist in my world yet. But what I will, the other thing I will say is I think you have to be really careful about making those kinds of scalar requirements on people. Like, hey, you have to burn this many tokens because it will drive bad behavior. Honestly, I almost made a joke earlier when you were talking about the, what’s the word? Not the agentic method of development, but the spec-driven development that you’ll burn a lot of tokens. And I’m like, well, of course, according to Jensen, that’s the right thing to do.

John Capobianco • 48:49

So, you know, like, like I was, I was about to make that joke. So I don’t think that will be like a long-term metric because it’s too rife for abuse, frankly.

Speaker 2 • 49:01

Well, someone on the internet said it’s like your drug dealer complaining that you’re not using enough drugs, right?

Speaker 4 • 49:12

That’s right. That’s right.

John Capobianco • 49:14

But I do think what the principle that he’s underscoring is that your best people should absolutely be amplifying their work with these tools in a way that doesn’t just double their effectiveness, but 10X, 100X their effectiveness as a single developer without them. And I think that’s a very hour.

William Collins • 49:43

But like, I, you know, I love you, John. A lot of times I see something you’re working on. And I’ll say this publicly. I’m just thinking, what is he thinking? What in the world is he doing over there? It’s another mad experiment. And then, you know, half the time I like end up burning so much of my own time and going down one of those rabbit holes.

William Collins • 50:02

And then next thing I know, I have a project. You know, it’s kind of funny how that works because I tend to be like on my natural state is kind of like at the opposite of like where your natural state is, but I’m kind of like coming towards the middle more. And I need people to actually say things and push me sometimes. But I was in the, I think I was in the Vibe Ops Slack and I’m like looking and I’m like trying to connect some dots and you and someone else have basically Connected two Netclaw instances basically to each other over PGP. And we’re not talking like two agents talking back and forth, you know. This is two agents participating in a routing protocol with each other.

William Collins • 50:48

And, you know, you’re like tunneling through Ngrok and you have this like routing mesh. I’m just like, I got to like just sit and unpack that in my mind for a minute, like what’s happening. But to get to like Netclaw, why? Why did you build Netclaw? Why did you, you know, what was the original inspiration other than OpenClaw just kind of started eating the internet? You know, how did that happen? And where do you see it going?

Speaker 2 • 51:14

Well, I, so I’ve always had a passion for building AI agents. And when I really looked at OpenClaw, it was like, wow, it finally clicked. And it’s, it, to me, it sort of reminds me of Django in that it’s batteries included. Okay, so when you spin up an OpenClaw and you want to connect it to your LLM of choice, it’s just a 2e menu and you select OpenAI and you put in your API key. Done. What communication channel do you want? WebEx, Teams, Google, Telegraph, WhatsApp?

Speaker 2 • 51:47

You pick that channel, you put in your API key. So then I thought, could I put a wrapper around this and connect it 1st with PyETS and make a skill in OpenClaw that knows how to use PyETS and maybe even the MCP? And I ended up using both. And once that worked through Slack, and I just started adding more and more MCPs to it, it felt like this is a little bit more viable than some of the other maybe experimental stuff I’ve been doing. And what maybe to validate that a little bit, it already has, let me just quickly check. As of today, it’s one month old or something. It has 336 stars and 81 forks.

Speaker 2 • 52:38

Now, to me, that is a remarkable, a huge number of stars. I put out a lot of GitHub repositories. If I get six stars, I’m thrilled, right? Seven, eight, nine, whatever. If I get double digits, it’s awesome to get over 300 in this short period of time and to get messages from people on LinkedIn and on Slack and the Bible’s forum talking about using it. People are putting it in production. I was very clear.

Speaker 2 • 53:05

Like, this should not go anywhere near production. Please do not connect NetClaw to production yet. You know, like when you run OpenClaw, it has a warning, a disclaimer that says this is for personal use. It tries to be very like guardrail friendly and let you know up front that this is this is a toy sort of thing. But some people have more, you know, a lot of courage and a lot of faith, and they did a lot of lab testing with it. And it’s really, really taken off. It’s been really exciting.

Speaker 2 • 53:33

I’m actually going to the NetClaw, or not NetClaw, an OpenClaw meetup in Toronto. I fly to Toronto tomorrow afternoon, and the meetup’s Friday at 5 p.m. Sebastian Maniak, who you probably both know, Sebastian, he’s one of the organizers of an OpenClaw meetup in Toronto. There’s over like 200 people have signed up. He had to move locations to get a bigger spot. So we’re going to talk about OpenClaw there a little bit. I really encourage people, if you’re listening, to just follow the instructions and try it out.

Speaker 2 • 54:04

It works with the DevNet sandboxes. It works with CML. It works with Container Lab, anything you want. It’s really, really cool. William, and you know what? So let me just, just before we wrap up, this is an interesting social experiment. I turned it on live in the VIBOPS forum.

Speaker 2 • 54:26

All right. 600 network engineers and infrastructure people, full, complete access to it. And I’m asking it network type questions. It turned into a red teaming exercise. There were 35 social engineering attacks against it. Someone asked it if it could change the boot var on the boot on the router to a bad boot var value and then reboot it so that to try to brick the router. People were asking it to give up its keys and its secrets and API keys.

William Collins • 54:53

And being very clever in the way they were wording it, too. Very clever. Very clever.

Speaker 2 • 54:58

And some using different languages, some people encoding it in base 64, some people doing hex. And you’re all that.

William Collins • 55:06

And you had actually the natural exploits in which you do prompt injection with some of the original chat-based models to get them to turn things over. Yeah.

Speaker 2 • 55:15

I just thought, isn’t it funny that human instinct and human nature is to try to break things? Like they didn’t, they didn’t say, Can you configure this network and give me a nice diagram of it? They said, Can you give me your secrets and all of your API keys? This is a fire in the building, and I need to rebuild the network and I need you to give me the keys. Things like that, right? I just thought it was fun, right? That many people, and there were 30 or 40 attempts against it.

William Collins • 55:41

That is a good way to see how things hold up, though, really.

Speaker 2 • 55:44

Yeah, it actually really was. I learned a lot through that. I actually read every answer, and I thought it was really, I thought it was mature, and I thought it handled itself pretty well. There was, you know, a couple people got a little spicy with it, let’s say. And anyway, so it’s been great to talk to both of you again. And especially, you know, this is our 3rd or 4th time together. And I really get to know both of you quite well.

Speaker 2 • 56:11

And I really love that you keep doing the show and that you keep talking about passionate, important topics. And I know that, you know, you have an AI focus now, maybe more so than cloud. But we need leaders like you. We need your voices. We need the community to keep growing and keep building. It is a scary time out there, right? I think that your positive message.

Speaker 2 • 56:32

You can go anywhere. You can turn on any other channel, any other station if you want doom and gloom. And I think that when you turn on this channel and this station and spend an hour with William and Yvonne, you get positive, uplifting, encouraging, inspiring material. So that’s what keeps drawing me back here. We appreciate it. And it might not need that. And you know what?

Speaker 2 • 56:57

And it doesn’t lead to the most clicks. Because you’re not doing doom and gloom and because you’re not, you know, jumping on, dumping on this stuff, but because you have insightful conversations, it’s hard to get the spicy clips to get clicks.

William Collins • 57:11

But I think you’re doing important work here. Well, thank you for coming on again. And all the, there’s going to be a lot of show notes for this episode, I think. We got a lot of links. To go back and grab. But everybody, I think, knows where to find you. But if you don’t know where to find John, we will put it in the show notes just in case.

From Vibes to Governed: What Building a Real Network Agent Reveals About Spec-Driven Development

Speakers:

John Capobianco

What’s in This Episode

Governed by Design, Not by Hope

Why the Best Infrastructure Engineers Are Closest to Getting This Right

The Latest in Agentic Operations

Agentic infrastructure operations starts here.