@simonw
Created April 27, 2026 23:32
{
"text": "[{\"Start\":0,\"End\":13.85,\"Speaker\":0,\"Content\":\"A lot of people woke up in January and February and started realizing, oh wow, I can churn out 10,000 lines of code in a day. It used to be you'd ask ChatGPT for some code and it would spit out some code and you had to run it and test it. The coding agents, they take that step for you.\"},{\"Start\":13.85,\"End\":19.5,\"Speaker\":0,\"Content\":\"And an open question for me is how many other knowledge work fields are actually prone to these agent loops?\"},{\"Start\":19.5,\"End\":22.78,\"Speaker\":1,\"Content\":\"Now that we have this power, people almost underestimate what they can do with it.\"},{\"Start\":22.78,\"End\":30.0,\"Speaker\":0,\"Content\":\"Today, probably 95% of the code that I produce, I didn't type it myself. I write so much of my code on my phone. It's wild.\"},{\"Start\":30.0,\"End\":43.12,\"Speaker\":0,\"Content\":\"I can get good work done walking the dog along the beach. My New Year's resolution, every previous year I've always told myself this year I'm going to focus more, I'm going to take on less things. This year, my ambition was take on more stuff and be more ambitious.\"},{\"Start\":43.12,\"End\":50.14,\"Speaker\":1,\"Content\":\"such an interesting contradiction. AI is supposed to make us more productive. It feels like the people that are most AI builder are working harder than they've ever worked.\"},{\"Start\":50.14,\"End\":62.57,\"Speaker\":0,\"Content\":\"Using coding agents well is taking every inch of my 25 years of experience as a software engineer. I can fire up four agents in parallel and have them work on four different problems. By 11 a.m., I am wiped out.\"},{\"Start\":62.57,\"End\":67.77,\"Speaker\":1,\"Content\":\"You have this prediction that we're going to have a massive disaster at some point. 
You call it the Challenger disaster of AI.\"},{\"Start\":67.77,\"End\":86.64,\"Speaker\":0,\"Content\":\"Lots of people knew that those little O-rings were unreliable, but every single time you get away with launching a space shuttle without the O-rings failing, you institutionally feel more confident in what you're doing. We've been using these systems in increasingly unsafe ways. This is going to catch up with us. My prediction is that we're going to see a Challenger disaster.\"},{\"Start\":87.12,\"End\":127.65,\"Speaker\":2,\"Content\":\"Today my guest is Simon Willison. Simon, in my opinion, is one of the most important and useful voices right now on how AI is changing the way that we build software and how professional work is changing broadly. What I love about Simon is that he doesn't just pontificate in the clouds. He's been what you'd call a 10x engineer for over 20 years. He co-created Django, the web framework that powers Instagram, Pinterest, Spotify, and thousands of other platforms. He coined the term prompt injection, popularized the ideas of AI slop and agentic engineering, and amongst his hundred plus open source projects, he created Datasette, a data analysis tool that has become a staple of investigative journalism.\"},{\"Start\":127.65,\"End\":160.78,\"Speaker\":2,\"Content\":\"What makes Simon rare is that very few engineers have made the leap from the old way of building to the new way as fully and visibly as he has. And as he's leaned into this new way of building, he's been sharing everything he's learning in real time through his incredible blog, simonwillison.net. Simon does not do a lot of podcasts, and this conversation opened my mind up in a bunch of new ways. I am so excited for you to get to learn from Simon. Don't forget to check out lennysproductpass.com for an incredible set of deals available exclusively to Lenny's newsletter subscribers. 
With that, I bring you Simon Willison.\"},{\"Start\":160.78,\"End\":163.44,\"Content\":\"[Music]\"},{\"Start\":163.44,\"End\":167.49,\"Speaker\":1,\"Content\":\"Simon, thank you so much for being here and welcome to the podcast.\"},{\"Start\":167.49,\"End\":169.24,\"Speaker\":0,\"Content\":\"Hey, Lenny, it's really great to be here.\"},{\"Start\":169.24,\"End\":201.9,\"Speaker\":1,\"Content\":\"I am so excited to have you here. I've been such a fan of yours from afar for so long. I've learned so much from your blog. And even though every guest I have on this podcast is my favorite guest, you're my favorite kind of guest because you're on the ground building with the latest tools, using it for real. You're very good at articulating what you experience. So we're going to get a lot of ROI out of this, out of your brain from from this time that we have together. What I want to start with is essentially a an AI state of the union. You've written about this November inflection.\"},{\"Start\":201.9,\"End\":202.9,\"Speaker\":0,\"Content\":\"Yes.\"},{\"Start\":202.9,\"End\":210.82,\"Speaker\":1,\"Content\":\"So what I'm thinking is we start just kind of give us like a brief history lesson of just like what happened in November and where are we today? What's possible now?\"},{\"Start\":210.82,\"End\":245.64,\"Speaker\":0,\"Content\":\"Well, let's let's talk about all of 2025 very briefly. Um, 2025 was the year that especially Anthropic and OpenAI realized that code is the application. Like being able to have things generate code. I think partly because um Anthropic came up with Claude Code back in in sort of February of 2025 and it took off like crazy. And a bunch of people started signing up for $200 a month accounts. And so suddenly, wow, it turns out people are willing to pay a lot of money for this stuff, for that specific field. 
Both Anthropic and OpenAI spent the whole of 2025 focusing all of their training efforts on coding.\"},{\"Start\":245.64,\"End\":283.09,\"Speaker\":0,\"Content\":\"If you look at what they were doing, it was all the reinforcement learning stuff, the reasoning trick, the thing where the models say they're thinking. That was new in late 2024. Like OpenAI's o1 was the first model to exhibit that. And now all of the models do it. So that was the other big trend of last year was these reasoning models. Turns out reasoning is great for code. It can reason through code and figure out the root of bugs and all of that. And so the end result of this, the end result of these two labs throwing everything they had at making their models better at code, is in November we had what I call the inflection point where GPT 5.1 and Claude Opus 4.5 came along. And they were both just\"},{\"Start\":283.09,\"End\":314.52,\"Speaker\":0,\"Content\":\"they were incrementally better than the previous models, but in a way that crossed a threshold where previously, if you had these coding agents, you could get them to write you some code, and most of the time it would mostly work. But you had to pay very close attention to it. And suddenly we went from that to almost all of the time it does what you told it to do, which makes all of the difference in the world. Now you can spin up a coding agent and say, \\\"Hey, build me a Mac application that does this thing,\\\" and you'll get something back which still needs some back and forth, but it won't just be a buggy pile of rubbish that doesn't do anything.\"},{\"Start\":314.52,\"End\":354.6,\"Speaker\":0,\"Content\":\"That was fascinating because all of the software engineers who took time off over the over the holidays and started tinkering with this stuff got this moment of realization where it's like, oh wow, this stuff actually works now. 
I can tell it to build code and if I describe that code well enough, it'll follow the instructions and it'll build the thing that I asked it to build. I think the reverberations of that are still shaking us to to the software engineering. A lot of people woke up in January and February and started realizing, oh wow, this technology which I'd been kind of paying attention to, suddenly it's got really, really good. And what does that mean? Like what does the fact like I can churn out 10,000 lines of code in a day and most of it works.\"},{\"Start\":354.6,\"End\":355.55,\"Speaker\":0,\"Content\":\"is that good?\"},{\"Start\":355.55,\"End\":390.94,\"Speaker\":0,\"Content\":\"like how do we get from most of it works to all of it works? There are so many new questions that we're facing, which I think makes us a bellwether for other information workers. Like code is easier than almost every other problem that you pose these agents because code is obviously right or wrong. Like it produces code, you run the code, either it works or it doesn't work. There might be a few subtle hidden hidden bugs, but generally you can tell if the thing actually works. If it writes you an essay or if it writes you a law like prepares a lawsuit for you, there are so it's so much harder to derive if it's actually done a good job, to figure out if it got things right or wrong.\"},{\"Start\":390.94,\"End\":391.96,\"Speaker\":0,\"Content\":\"But\"},{\"Start\":391.96,\"End\":411.08,\"Speaker\":0,\"Content\":\"It's kind of happening to us. So software engineers, it came for us first and we're figuring out, okay, what do our careers look like? How do we work as teams when part of what we did that used to take most of the time doesn't take most of the time anymore? What does that look like? 
And it's going to be very interesting seeing how this rolls out to to other information work in the future.\"},{\"Start\":411.08,\"End\":451.08,\"Speaker\":2,\"Content\":\"This episode is brought to you by our season's presenting sponsor, WorkOS. What do OpenAI, Anthropic, Cursor, Vercel, Replit, Sierra, Clay, and hundreds of other winning companies all have in common? They are all powered by WorkOS. If you're building a product for the enterprise, you've felt the pain of integrating single sign-on, SCIM, RBAC, audit logs, and other features required by large companies. WorkOS turns those deal blockers into drop-in APIs with a modern developer platform built specifically for B2B SaaS. Literally every startup that I'm an investor in that starts to expand upmarket ends up working with WorkOS. And that's because they are the best.\"},{\"Start\":451.08,\"End\":480.82,\"Speaker\":2,\"Content\":\"Whether you're a seed-stage startup trying to land your first enterprise customer or a unicorn expanding globally, WorkOS is the fastest path to becoming enterprise-ready in an unblocking world. It's essentially Stripe for enterprise features. Visit workos.com to get started or just hit up their Slack where they have actual engineers waiting to answer your questions. WorkOS allows you to build faster with delightful APIs, comprehensive docs, and a smooth developer experience. Go to workos.com to make your app enterprise-ready today.\"},{\"Start\":480.82,\"End\":499.78,\"Speaker\":1,\"Content\":\"I want to come back to just like what is possible now. So just to give a little context, it's like insane how far we've come. I don't know, like a couple years ago, all code was human-written. Then it's like tab complete. Then it's like, okay, now the best engineers are 100% AI code. Now it's like, uh, uh, I'm like coding for my phone. Like I'm not even looking at my code anymore.\"},{\"Start\":499.78,\"End\":508.8,\"Speaker\":0,\"Content\":\"I write so much of my code on my phone. 
it's it's wild. Like I I can get good work done walking the dog along the beach, which is delightful, you know?\"},{\"Start\":508.8,\"End\":529.84,\"Speaker\":1,\"Content\":\"Yeah, I had Boris Cherny on the podcast and he's doing the same thing. And I was just like, is that even coding anymore? He's like, yeah, it's just another level of abstraction. Just like engineering has always gone. Talk about maybe just like what else is there around just like what is possible now with AI in terms of building that people may not fully recognize? And where do you think what's like the next leap? Is there anything beyond this?\"},{\"Start\":529.84,\"End\":560.73,\"Speaker\":0,\"Content\":\"Let's talk about the two, the sort of, there's the vibe coding side of things, and then there's the, and and I like Andrej Karpathy's original definition of vibe coding, which is, um, when you don't even look at code and you basically just go on the vibes. You say, \\\"Build me something that does X,\\\" and it builds it, and you play with it, and if it looks good, then great, and if it doesn't quite do it, you you you keep on going back and forth with it. But it's very hands-off. You're not looking at code. It's so he he originally said this is great for having fun and prototyping, and it then expand exploded way out of that.\"},{\"Start\":560.73,\"End\":592.17,\"Speaker\":0,\"Content\":\"And I think today, vibe coding is effectively it's the the definition I use is it's when you're not looking at the code, you don't care about the code, and maybe you don't understand the code. Like non-programmers can now tell Claude what to build and it can build them a little app. And I love that. I absolutely love that we're sort of democratizing the art of getting a computer to do stuff for you, of automating tedious things in your life by knocking out these little tools. 
Of course, the problem is that there is a limit on how much you can do without responsibly.\"},{\"Start\":592.17,\"End\":632.02,\"Speaker\":0,\"Content\":\"Uh, like, I like to tell people, if you're vibe coding something for yourself where the only person who gets hurt if it has bugs is you, go wild. That's completely fine. The moment you're you're vibe coding code for other people to use where your bugs might actually harm somebody else, that's when you need to take a step back and say, hang on a second, this is not a responsible way of using the the these tools. The challenge is that understanding what's responsible and what isn't is in itself a sort of expert level skill. So, knowing that once you start dealing with like scraping other people's websites, maybe you'll damage their websites by hitting them too hard. There are so many ways that you can cause damage if you don't know what you're doing.\"},{\"Start\":632.02,\"End\":666.82,\"Speaker\":0,\"Content\":\"But I love that liberation and I love that people can come to meetings with a prototype that they knocked up of their idea that illustrates the idea. I think those things are wonderful. The big debate, the ongoing debate has been, what do we call it when a professional software engineer uses his tools to write real code that's production ready that they've reviewed and they've checked all of the details of? A lot of people call that vibe coding as well. I think that devalues vibe coding as a term because it's useful to say, I vibe coded this, as in I haven't even looked at how it works, it's not production ready, but it's kind of a cool prototype.\"},{\"Start\":666.82,\"End\":694.85,\"Speaker\":0,\"Content\":\"The moment vibe coding means everything involved that touches AI, it effectively ends up meaning programming because we're all moving in the direction where our code is mediated through AI at some point. So, what do we call it for professionals? 
I've gone with agentic engineering because I think the thing to emphasize is these coding agents, right? If you're asking ChatGPT to knock out some code, that's a different thing from if you're running Codex and having it write the code, debug the code, test the code, all of that.\"},{\"Start\":694.85,\"End\":732.63,\"Speaker\":0,\"Content\":\"And I think that agentic engineering is such a deep and fascinating discipline because the art of getting really good results out of this, like the art of having them help you build software you could deploy to a million people, that's not that's never going to be easy. That's never going to be trivial. That's always going to require a great deal of depth of experience in what software and how software works and how um how these agents work. And I love that. That's I'm I'm kind of writing a book about it now that I'm publishing a chapter at a time on my blog. The best form of writing because I don't have an editor or any pressure from a publisher is just when I feel like writing another chapter I can I can do that. But there's so much to discuss.\"},{\"Start\":732.63,\"End\":770.75,\"Speaker\":0,\"Content\":\"But yeah, so I think right now the frontier is how do we build professional software using coding agents? How do we build software that is... I don't just want to build software that's that's good. I want us to build software that is better than we were building before. Like, if the agents let us move a bit faster but we're still churning out the same quality of software, that's less interesting to me than if the software we're producing has less bugs, more features, it's higher quality, it's better software because we're harnessing these tools. The really interesting future is something which some people have been calling the dark factory pattern or software factories. 
This is the idea where...\"},{\"Start\":770.75,\"End\":809.85,\"Speaker\":0,\"Content\":\"Right now, if you're a professional using these tools, the way you do it is you tell them what to build and then you look at the code and you review that code really carefully and make sure it's doing the right thing. What does it look like if you're not reviewing the code? If you're not looking at that code, but you're also not vibe coding, you're not throwing everything to the wind and seeing what happens. You're applying professional practices and quality expectations to code that you're not directly reviewing. The reason it's called the dark factory is there's this idea idea in factory automation that if your factory is so automated that you don't need any people there, you can turn the lights off. Like the machines can operate in complete darkness if you don't need people on the factory floor.\"},{\"Start\":809.85,\"End\":837.02,\"Speaker\":0,\"Content\":\"What does that look like for software? And there's some very... this company called Strong DM has been pushing this and doing some really interesting experiments around this. That I think is the... that's that's futuristic. Like that's we're trying to figure out what that looks like and how we can responsibly build software in that way right now. And making some quite interesting like discoveries about things that work and things that don't work. But that to me is is the next the next sort of barrier.\"},{\"Start\":837.02,\"End\":849.42,\"Speaker\":1,\"Content\":\"Let's follow that thread. So what is what is this factory doing? So there's an element of no one's looking at the code really, but what how does that change how software is built? Are they are are people still coming up with the ideas and telling you this factory build this thing for me? 
Oh exactly.\"},{\"Start\":849.42,\"End\":857.62,\"Speaker\":0,\"Content\":\"So this is the fascinating thing is um so there's a policy of nobody writes any code and quite a few companies are beginning to introduce that now because\"},{\"Start\":857.62,\"End\":859.98,\"Speaker\":1,\"Content\":\"Just to be clear, the policy is you cannot write code.\"},{\"Start\":859.98,\"End\":862.64,\"Speaker\":0,\"Content\":\"You cannot type code into a computer.\"},{\"Start\":863.38,\"End\":896.96,\"Speaker\":0,\"Content\":\"Yeah. Um, and honestly, like I thought six months ago, I thought that was crazy. And today, probably 95% of the code that I produce, I didn't type it myself. So that world is is is is practical already because these the latest models are good enough that you can tell them, oh, no, rename that variable and refactor that and and add this line there and they'll just do it. And it's faster than you typing on the keyboard yourself. The next rule though is nobody reads the code. And this is the thing which StrongDM started doing back in, I think it was August last year. They said, okay, we're not going to read the code.\"},{\"Start\":896.96,\"End\":934.79,\"Speaker\":0,\"Content\":\"So what does that mean? How do you produce software that works and is good if you're not reading the code? And they've come up with a whole bunch of answers. Um, one of the most interesting was the way they did testing where in traditional software, some companies will have a QA department. Like the engineers write a bunch of software and then you throw it over the wall to the QA department and they sort of test it furiously to figure out if it's working or not. That, I think, went out of fashion a bit over the past sort of five to ten years from what I've seen in Silicon Valley because you kind of want your engineers to take responsibility for the code they're writing being good. 
But what if you can simulate that QA department?\"},{\"Start\":934.79,\"End\":968.06,\"Speaker\":0,\"Content\":\"So what StrongDM were doing is, um, they had a swarm of agent testers who were actually simulating cust- simulating end users. So the software that they were building, this is crazy, the software is security software for access management. So when you sign in, when you start as a company and somebody needs to assign you access to Jira and then give you access to Slack and all of that kind of thing, they were building software for that. That's very security like adjacent. That's not the kind of thing that you should be vibe coding at all, based on most people's understanding of how the world works.\"},{\"Start\":968.06,\"End\":969.73,\"Speaker\":0,\"Content\":\"But that's and there was so there were\"},{\"Start\":969.73,\"End\":985.4,\"Speaker\":0,\"Content\":\"legitimate security company who've been doing this stuff without AI for years, so it's not like they didn't understand the risks. So, the way they did their testing is they had this swarm of simulated employees all in a simulated Slack channel saying things like, \\\"Hey, could somebody give me access to Jira?\\\" The Slack channel itself is simulated.\"},{\"Start\":985.4,\"End\":999.4,\"Speaker\":0,\"Content\":\"we'll talk about that in a moment. And they 24 hours a day they're making requests and saying, \\\"Hey, I need access to Jira and all of those kinds of things\\\" at enormous cost. Like they were spending $10,000 a day on tokens, I think, simulating all of these end users. I believe so.\"},{\"Start\":999.4,\"End\":1036.4,\"Speaker\":0,\"Content\":\"But it meant that their software was being very robustly tested in all of these different ways. And yeah, it's kind of similar to having a similar to having a manual QA team, except one that never sleeps. 
And I thought that was fascinating as a sort of example of thinking outside of the box, taking this question, how do we tell our software is good if we're not reviewing the code, and trying to find creative answers to it. The other thing that was interesting is that the Slack channel itself wasn't actually Slack. Because it turns out if you test against real software like Slack and so forth, they all have rate limits and like they they they they they won't let you just run 10,000 simulated people at a time.\"},{\"Start\":1036.4,\"End\":1056.93,\"Speaker\":0,\"Content\":\"So what they did is they built their own simulation of Slack and Jira and Okta and all of this software they were integrating with. And the way they did that is they basically took the API documentation for the public APIs for Slack and the client libraries, the open source client libraries, and they told their coding agents, build this. Build build me a simulation of this API.\"},{\"Start\":1056.93,\"End\":1057.88,\"Speaker\":0,\"Content\":\"and they did\"},{\"Start\":1057.88,\"End\":1087.4,\"Speaker\":0,\"Content\":\"So this company is, and this is one of the things that they I went to a demo that they gave back in October. One of the things that really sat with me is that they had their own simulated version of Slack and Jira and all of these different package different systems that they could then build their software against, which cost them nothing because once they spun it up, it was a little Go binary that sat there. And they even had interfaces. They had like a fake version of the Slack interface that they'd code like vibe coded up that let them see what was going on. Absolutely fascinating.\"},{\"Start\":1087.4,\"End\":1103.54,\"Speaker\":1,\"Content\":\"That is such a cool story and I love these stories of just companies at the bleeding edge trying to see what's possible, uh and have an advantage essentially. So what I'm hearing here is the QA piece is like the new piece in this factory. 
So we, you know, we already have Codex, Claude Code, they can go off and build stuff.\"},{\"Start\":1103.66,\"End\":1112.42,\"Speaker\":1,\"Content\":\"is the innovation here? Okay, now you've built all this stuff, is it actually any good? Is there a reason like Codex and Claude Code couldn't do this themselves? Why do you need kind of this factory concept?\"},{\"Start\":1112.42,\"End\":1124.55,\"Speaker\":0,\"Content\":\"I think they can, like you can tell Claude Code, fire up a sub-agent that uses Playwright to simulate a browser and all of that kind of thing. You'd have trouble getting it to run 24 hours a day. I mean, maybe it would work.\"},{\"Start\":1124.55,\"End\":1163.75,\"Speaker\":0,\"Content\":\"Um, but certainly I think that what's interesting to me isn't so much the software you're using, it is these these big ideas, these these these techniques that you're using to try and answer these questions. Because even if your QA team, your virtual QA team says this is good, doesn't mean it's secure, right? It doesn't mean that you've got all of those other um characteristics you care about. At the same time, the agents are getting really good at security penetration testing now. And this is a new thing, I think in the past, again, in the past sort of three to six months, they've started being credible as security researchers, which is sending shockwaves through the security research industry. They're like, wow, we didn't think that they'd get to this point.\"},{\"Start\":1163.75,\"End\":1195.72,\"Speaker\":0,\"Content\":\"What's interesting there is both OpenAI and Anthropic have specialist security models that they will not release to the general public because they can be used to break into websites. So they have like invite-only, like registered security researchers can apply for access and they've been producing um vulnerability reports against popular open source software. 
I think Firefox just a few days ago, maybe last week, said that they'd they'd done a release which was assisted by Anthropic. Anthropic had\"},{\"Start\":1195.86,\"End\":1235.27,\"Speaker\":0,\"Content\":\"discovered a hundred like potential vulnerabilities in Firefox and responsibly reported them to Mozilla, who then fixed them. That's an interesting one as well because we're seeing a lot of this in the wild and it's it's just incredibly frustrating for maintainers because there are these people who don't know what they're doing who are asking ChatGPT to find a security hole and then reporting it to the maintainer. And the report looks good. Like ChatGPT can produce a very well-formatted report of a vulnerability. It's a total waste of time. Like it's not actually verified as being a real problem. The difference with Anthropic and Firefox is that Anthropic's security team actually did do the work.\"},{\"Start\":1235.27,\"End\":1241.38,\"Speaker\":0,\"Content\":\"They didn't report whatever the agent said, they actually verified that it was a good quality report before before they handed it over.\"},{\"Start\":1241.38,\"End\":1280.12,\"Speaker\":1,\"Content\":\"There's going to be a lot to talk about on the security side. You've done a lot of thinking and writing about the dangers there, but I want to follow this thread. So, in terms of what AI has been doing for teams, if you think about it's like it's kind of going on the middle and expanding. So it's like writing, you know, it's it's taking on more and more of the building components. It's doing code reviews now, QA as you've been describing, constantly building. And it feels like the front of that is the big now gap and opportunity, which is coming up with the idea, what the heck should we build? Because then once you tell the AI, build this thing, as you're describing, it's getting better and better at building something great. 
Have you had any luck yet with\"},{\"Start\":1280.12,\"End\":1287.0,\"Speaker\":1,\"Content\":\"using AI there and do you think it starts to eat that and just becomes the strategy, you know, PM basically.\"},{\"Start\":1287.0,\"End\":1327.16,\"Speaker\":0,\"Content\":\"So this is one of the most interesting problems we're having with all of this is we've taken the writing code bit and we've massively accelerated that. Now the bottlenecks are everywhere else, right? Like how do we redesign our processes now that the bit that used to take the longest, right? It used to be you'd come up with the spec and you hand it to your engineering team and three weeks later if you're lucky they'd come back with an implementation for you to then start. And now that that maybe that takes three hours, depending on how well established the coding agents are for that kind of thing. So now what, right? Now where else are the bottlenecks? I don't think it's, I mean, as coming with the initial ideas, um anyone who's done any product work knows that your initial ideas are always wrong. What matters is is proving them, right?\"},{\"Start\":1327.16,\"End\":1363.35,\"Speaker\":0,\"Content\":\"it's it's it's it's testing them. We can test things so much faster now because we can build workable prototypes so much quicker. So there's an interesting thing I've been doing in my own work where any sort of feature that I want to design, I'll often prototype three different ways it could work, because that takes very little time and then I can start experimenting them and trying them and seeing which ones I like. And that that feels to me like the really transformational step here is that when you get AI involved in your ideation phase, it's much more about the prototypes. 
It's about, okay, we can see like a a a UI prototype is free.\"},{\"Start\":1363.35,\"End\":1401.75,\"Speaker\":0,\"Content\":\"Now, ChatGPT and Claude will just build you a very convincing UI for anything that you describe, and that's how you should be working. I think anyone who's doing sort of product design and isn't vibe coding little prototypes is missing out on the the the latest, but like the the most powerful sort of boost that we get in that step. But then what do you do, right? How do you, given your three options now that you have instead of one option, how do you prove to yourself which one of those is the best? I don't have a confident answer to that. I expect this is where the good old-fashioned usability testing comes in. Like, get somebody on Zoom, screen shared, using your software, see what happens.\"},{\"Start\":1401.75,\"End\":1416.42,\"Speaker\":0,\"Content\":\"that's you can tell the AI to do it and you can simulate your users with the AI. I don't think that's credible. I don't think you're going to get as good results from ChatGPT pretending to click around on your prototype than you would from an actual human being.\"},{\"Start\":1416.42,\"End\":1454.45,\"Speaker\":1,\"Content\":\"This is so interesting. A question I've been tackling is just where are human brains going to continue to be valuable? And what I'm hearing here is there's like the initial idea. You made such a good point here. It's like the initial idea is often not the actual winning idea, it's just the beginning of an idea. So there's like the idea for the feature, then there's the try it out, prototype it, help you narrow on the direction, build it, make it awesome, get it out into the world. And it feels to me like AI is going to be really good at suggesting ideas and coming up with initial ideas. And I wonder if the human brain, like, it's not like maybe someday we don't need human brains at all and that's a whole other discussion. 
But maybe the next phase is...\"},{\"Start\":1454.45,\"End\":1456.74,\"Speaker\":1,\"Content\":\"AI will help us come up with great ideas.\"},{\"Start\":1456.74,\"End\":1483.84,\"Speaker\":0,\"Content\":\"I mean, that's been the case for probably a couple of years now. They've been strong enough to do really good brainstorming. And I like to compare it to the thing where when you've got a group brainstorming exercise, you book a meeting room for an hour, you've got a whiteboard, you get a dozen people in, and the first two-thirds of that brainstorming session, honestly, it's kind of just everyone going through the most obvious basic ideas, right? And you get them all out on the whiteboard, you get them all out, and then things get interesting when you start saying, okay, well, let's talk about these...\"},{\"Start\":1483.84,\"End\":1523.0,\"Speaker\":0,\"Content\":\"let's start combining them. But AI is so good at that first two-thirds of the ideas. Like, I brainstorm with them all the time where I just get them to spit out all of the obvious stuff and they'll come up with 20 things and they'll all be kind of done. Like they're they won't they won't be they just won't be very interesting. What gets interesting is when if you ask them for 20 more and now they by the sort of end of that list you're beginning to get things which are not good ideas but they point you in interesting directions. There are so many other tricks like this. Like, um, you can tell you can you can tell AI to combine weird fields. You can say, okay, I want ideas for marketing my new SaaS platform inspired by marine biology.\"},{\"Start\":1523.0,\"End\":1532.55,\"Speaker\":0,\"Content\":\"and you see what happens. And most of it will be complete junk, but there might be a spark that gets you to the good idea. So I love them as as brainstorming companions on that front.\"},{\"Start\":1532.55,\"End\":1554.24,\"Speaker\":1,\"Content\":\"That reminds me of a chat I had with David Plask. 
He's an expert naming person. He helps companies come up with names for products. And one of the things that he does at his company is he creates three teams to come to brainstorm names. One team, so for example, let's say uh Windsurf was a product they named. Um, so the first team is okay, this is an AI IDE thing. That's that's exactly what it is.\"},{\"Start\":1554.24,\"End\":1559.42,\"Speaker\":1,\"Content\":\"Second team is, okay, this is a, this is a boat. You're naming a boat. and here's the constraints\"},{\"Start\":1559.42,\"End\":1577.45,\"Speaker\":1,\"Content\":\"And then here, this is a a spaceship. So name it from that perspective. And he finds the best names come from those other directions where it's a different metaphor with the same sort of uh benefits. Um, okay. So what I'm hearing here is this is good. This is good for humans right now that there's still opportunity for us to contribute to the process.\"},{\"Start\":1577.45,\"End\":1617.0,\"Speaker\":0,\"Content\":\"And actually, I want to stand in defense of software engineers for a bit because on the one hand, these things can write code. That used to be our thing, right? I'm finding that using coding agents well is taking every inch of my 25 years of experience as a software engineer and it is mentally exhausting. Like this is something which people are talking a lot more about now. I can fire up like four agents in parallel and have them work on four different problems and by like 11:00 a.m. I am wiped out for the day. Like I have because there is a limit on human cognition in how much, even if you're not reviewing everything they're doing, just how much you can hold in your head at one time.\"},{\"Start\":1617.0,\"End\":1654.41,\"Speaker\":0,\"Content\":\"and it's very easy to pop that stack at the moment. Like, there's a sort of personal skill that we have to learn which is finding our new limits. 
Like, what is what is a responsible way for us to to to not burn out and for us to to use the time that we have. And I I've I've talked to a lot of people who are losing sleep because they're like, my coding agents could my agents could be doing work for me. I'm just going to stay up extra half hour and and set off a bunch of extra things and they're waking up at four in the morning. That's obviously unsustainable. I hope that that's a novelty thing. That agents only really got good in the past sort of four to five months. We're all learning what that looks like and what that lets us do. But it's it's it's concerning.\"},{\"Start\":1654.41,\"End\":1686.58,\"Speaker\":0,\"Content\":\"there's an element of sort of gambling and addiction to to how we're using some of these tools. But to stand in defense of software engineers, I get great results out of these things because they are amplifiers of existing skills and experience. And I have 25 years of existing like pre-AI experience, which I can now amplify because I can talk to the agents at a very high level. I can use very I can use um sophisticated engineering like language that I've mastered over the years, which they appear to know as well. And we can collaborate incredibly effectively.\"},{\"Start\":1686.58,\"End\":1723.86,\"Speaker\":0,\"Content\":\"And it means I can look at a problem and I can say, this problem is a one-sentence prompt and I know it'll find that bug and fix that bug, as opposed to this other problem which is, who knows how how big a problem. There is a flip side to this, which is that I've got 25 years of experience in how long it takes to build something, and that's all completely gone. Like that doesn't work anymore because I can look at a problem and say, okay, well, this is going to take two weeks, it's not worth it. 
And now it's like, yeah, but maybe it's going to take 20 minutes because the reason it was taking two weeks was all of the the sort of crafty coding things that the AI is now covering for us. And that I've been finding really interesting and challenging.\"},{\"Start\":1723.86,\"End\":1752.44,\"Speaker\":0,\"Content\":\"I I constantly throw tasks at AI that I don't think it'll be able to do because every now and then it does it. And when it doesn't do it, you learn, right? You learn, okay, Opus 4.6 still can't do this particular thing, but when it does do something, especially something the previous models couldn't do, that's actually cutting-edge AI research. You can be the first person in the world to spot that AI can now do X just because you were the person you you found it couldn't do it and you've you've been keeping that sort of backlog of of interesting tasks for it.\"},{\"Start\":1752.44,\"End\":1769.22,\"Speaker\":1,\"Content\":\"This is such an interesting line of discussion. This idea that, let's say 10x engineers, to use that phrase, are going to be more valuable is what you're describing here because you can work with these tools much more effectively. What do you think of junior engineers? Just like what's happening there, what's their future?\"},{\"Start\":1769.22,\"End\":1789.36,\"Speaker\":0,\"Content\":\"So, there's an interest. So Thoughtworks, um the big um like uh IT consultancy did a offsite a few uh about a month ago and they produced they got a whole bunch of engineering VPs in from different companies to talk about this stuff. And one of the interesting theories they came up with is they think this stuff is really good for experienced engineers. Like it amplifies their skills.\"},{\"Start\":1789.36,\"End\":1789.86,\"Speaker\":0,\"Content\":\"That's great.\"},{\"Start\":1789.86,\"End\":1829.09,\"Speaker\":0,\"Content\":\"It's really good for new engineers because it solves so many of those onboarding problems. 
Like, if you talk to um Cloudflare and Shopify, both said they were hiring a thousand interns over the course of 2025 because the intern onboarding costs, it used to be takes a month before your intern can do anything useful. Now they're doing something useful within like a week because the the AI assistant helps them get up and running faster. The problem is the people in the middle. Like if you're mid-career, if you haven't made it to sort of super senior engineer yet, but you're not sort of new either, that's the that's the group which Thoughtworks which Thoughtworks resolved were probably in the most trouble right now.\"},{\"Start\":1829.09,\"End\":1847.79,\"Speaker\":0,\"Content\":\"like that's the open question because they don't have that expertise to to to to amplify and and use with these tools. And it's not as benefit like they've got all of the the boosts that the beginners were getting they've got already. So that's an interesting open question right now for me is it's more the the the sort of mid mid-level as opposed to the beginners or the the advanced people.\"},{\"Start\":1847.79,\"End\":1880.16,\"Speaker\":1,\"Content\":\"It's so interesting how AI is coming at the middle of so many things. It's coming at the middle of the product development process, it's coming at the middle of seniority. It's probably other examples. And I'm guessing this is true for all functions, like PMs, designers too, just new PMs, designers, maybe because being AI native basically is what you're describing. And and ramping up much more quickly. I guess while we're on this topic, say you are a lot of listeners here are just like those people in the middle. What would your advice be to them to help them avoid becoming a part of the permanent underclass?\"},{\"Start\":1881.32,\"End\":1920.99,\"Speaker\":0,\"Content\":\"That's a big responsibility you're putting on me there. 
Um, I think, I think the way forward is to lean into this stuff and figure out how do, how do I help this make me better? Right? Like, a lot of people worry about skill atrophy, you know, if the AI's doing it for you, you're not learning anything. I think if you're worried about that, you push back at it. Like, you have to be mindful about how you're applying the technology and think, okay, I've been given this thing that can answer any question and often gets it right, doesn't always get it gets it right. How can I use this to amplify my own skills, to to learn new things, to take on much more ambitious projects?\"},{\"Start\":1920.99,\"End\":1959.3,\"Speaker\":0,\"Content\":\"Something I've been enjoying, I think the thing I've enjoyed most about this as a software engineer is that my level of ambition has shot right up. Because now, I used to like never, I never used AppleScript because AppleScript is a whole programming language you have to learn. And I've been using AppleScript for like two and a half years now because ChatGPT knows AppleScript and I don't have to. And so now I can automate things on my Mac. And that's great, you know. Um, and previously, the fact that it would take me like two or three months to learn basic AppleScript was enough for me never to use it. And now I've got all of these technologies that I'm using because that two to three month initial learning curve has been shaved right down. I think that applies to everything else.\"},{\"Start\":1959.3,\"End\":1991.51,\"Speaker\":0,\"Content\":\"like, I'm getting much better at cooking. I've been using Claude, it turns out, excellent chef, which doesn't make sense because it can't, it doesn't have taste buds. But it does, it can give you the global average of the world's guacamole recipes, which turns out is good guacamole. So, that's been really interesting, like trying to apply this stuff just to for sort of self-improvement. I think that's a really useful skill to have. 
Because honestly, everything is changing so fast right now, the only universal skill is being able to roll with the changes. Right? That's the thing that we all need.\"},{\"Start\":1991.51,\"End\":2023.38,\"Speaker\":0,\"Content\":\"Weirdly, um the term that comes up most in these conversations about how you can be great with AI is agency, right? People, human beings have agency and we use that agency to decide what problems to take on and where to go. I think agents have no agency at all. Like, I would argue that the one thing AI can never have is agency because it doesn't have human motivations. Like, sure you can tell it make more money or whatever, but it's never going to be able to decide on its, like what makes sense for it to act on next.\"},{\"Start\":2023.38,\"End\":2032.57,\"Speaker\":0,\"Content\":\"So I'd say that's the thing is to invest in your own agency and invest in how do I use this technology to get better at what I do and to do new things.\"},{\"Start\":2032.57,\"End\":2072.02,\"Speaker\":1,\"Content\":\"And also to your point, be ambitious, think big. Yeah, there's an interview with Jensen that just came out yesterday where people asked him about layoffs, there's all these layoffs happening, uh is AI actually taking jobs? And he's like, the reason a lot of these companies are not are letting people go is they don't have enough creativity or ambition for what they can do with all of these resources they're because they're not letting people go, they have so much they want to do. You know, obviously easier said than done and it's not always the case, but I think that's an interesting way of approaching it. Now that we have this power, people almost underestimate what they can do with it and don't fully lean into it. 
So I love this advice of just try to be a little more ambitious, try to stuff that you think is impossible and see you might be actually possible.\"},{\"Start\":2072.02,\"End\":2086.09,\"Speaker\":0,\"Content\":\"My New Year's resolution this year was the opposite. Every previous year I've always told myself this year I'm going to focus more, I'm going to take on less things. This year, my ambition was take on more stuff and be more ambitious. Like we've got these tools...\"},{\"Start\":2086.09,\"End\":2092.55,\"Speaker\":0,\"Content\":\"Bring it all in. Let's try and do everything. I don't know if that was a good New Year's resolution, but that's what I went with.\"},{\"Start\":2092.55,\"End\":2095.35,\"Speaker\":1,\"Content\":\"How's it going so far? How do you feel about this decision?\"},{\"Start\":2095.35,\"End\":2106.77,\"Speaker\":0,\"Content\":\"I'm enjoying myself. I think I'll probably get to the end of the year and I'll be like, wow, the thing the most important things that I should have been focusing on did not get done. But that's that's the case when it is my ambition to do them. So, you know.\"},{\"Start\":2106.77,\"End\":2111.48,\"Speaker\":1,\"Content\":\"It's a a converge diverge sort of situation, you know? next year could be refocused.\"},{\"Start\":2111.48,\"End\":2112.72,\"Speaker\":0,\"Content\":\"Absolutely, yeah.\"},{\"Start\":2112.72,\"End\":2140.91,\"Speaker\":1,\"Content\":\"Kind of along those lines though, I want to come back to this point you made about how you're you're working harder and you're like fried early in the day. This is such an interesting, uh, I don't know, contradiction almost. Uh, people, you know, AI is supposed to make us more productive. It's supposed to give us more time off, it's supposed to let us sit around and watch Netflix and do all the create wealth and productivity in the world. It feels like the people that are most AI filled are working harder than they've ever worked. 
There's this anxiety you described of my agents aren't running, I got to stay on top of them.\"},{\"Start\":2140.91,\"End\":2148.2,\"Speaker\":1,\"Content\":\"What do you think is going on there? Is this just like you said, maybe it's like a temporary novelty thing and then we'll be like, all right, I don't need to be this productive. Is there anything else there?\"},{\"Start\":2148.2,\"End\":2155.85,\"Speaker\":0,\"Content\":\"I think I I really hope it's a novelty thing. And I am actually getting much more, I'm getting more time, but I'm I'm exhausted.\"},{\"Start\":2155.85,\"End\":2157.41,\"Speaker\":1,\"Content\":\"Like your brain is exhausted.\"},{\"Start\":2157.41,\"End\":2194.44,\"Speaker\":0,\"Content\":\"like my brain is exhausted. I've got I've got more time to go and do things and I do things and it's great, but it's it is that the exhaustion from that sort of intensity of work has been a really big surprise for me. Like that that's been been something which I've I've I've I've been observing especially since November, like as as all of the stuff stuff started ramping up. And yeah, I think that's um the concern there comes down, it's always expectations from other people, you know, if you work for a company that's that's expecting you to get five times more done, that's going to be exhausting. And um and maybe we'll see, I think the good companies with good management are paying attention to this.\"},{\"Start\":2194.44,\"End\":2209.75,\"Speaker\":0,\"Content\":\"they don't want to burn out their best employees for the sort of the short-term gain but but lose people over it. But yeah, it's it's a big tension. I think we're we're those of us on the sort of leading edge of the AI boom are feeling it first. I imagine it's going to come for everyone else as well.\"},{\"Start\":2209.75,\"End\":2215.32,\"Speaker\":1,\"Content\":\"The other element of this though that we haven't mentioned is, and you've mentioned a couple times, it's actually really fun. 
The drive here is not\"},{\"Start\":2215.32,\"End\":2219.51,\"Speaker\":0,\"Content\":\"I have enjoyed myself so much. Absolutely. It's so fun. It's um\"},{\"Start\":2219.51,\"End\":2243.88,\"Speaker\":0,\"Content\":\"A lot of my friends have been talking about how they have this backlog of side projects, right? For the last 10, 15 years, they've got projects they never quite finished and ideas they thought would be cool. And some of them are like, \\\"Well, I've done them all now.\\\" Like last couple of months, I just went through and every evening I'm like, \\\"Let's take that project and finish it, and that one and that one and that one and that one.\\\" And they almost feel a sort of sense of loss at the end where they're like, \\\"Well, okay, my backlog's gone. Now what am I going to build?\\\"\"},{\"Start\":2243.88,\"End\":2262.84,\"Speaker\":1,\"Content\":\"Yeah, it comes back to that factory. I was talking to the founder of Linear the other day and this idea of the factory, and we were just like, like a factory doesn't sound like a place that'll create amazing products. It feels like, you know, like what are the chances that'll create something beautiful and innovative? So either that's the wrong word or it's just this will lead to bad stuff probably.\"},{\"Start\":2262.84,\"End\":2301.4,\"Speaker\":0,\"Content\":\"I feel like the word artisanal does like like artisanal to handcrafted software, I think is going to be valued more. Something I've noticed in my own work is sometimes I'll have an idea for a piece of software, a Python library or whatever, and I can knock it out in like an hour and get to a point where it's got documentation and tests and all of those things and it looks like the kind of software the previous I would have spent several weeks on. And I can stick it up on GitHub and everything. And yet, I don't believe in it. And the reason I don't believe in it is that I I got to rush through all of those things. 
I think the quality is probably good, but I haven't spent enough time with it to to feel confident in that quality.\"},{\"Start\":2301.4,\"End\":2303.48,\"Speaker\":0,\"Content\":\"Most importantly, I haven't used it yet.\"},{\"Start\":2303.48,\"End\":2332.85,\"Speaker\":0,\"Content\":\"Like it turns out, when I'm using somebody else's software, the thing I care most about is I want them to have used it for for months, right? I want other people to have put that software into practice. So I've got some very cool software that I built that I've never used. Like it was so it was quicker to build it than to actually try and use it. So the way I've been dealing with that is I always put alpha on it. Like if you see my software and it says it's an alpha, that probably means I haven't actually used it yet for most of my projects, which is a bit of a cheat code, you know, um alpha alpha this.\"},{\"Start\":2332.85,\"End\":2341.86,\"Speaker\":0,\"Content\":\"But isn't that interesting? like like like it used to be if you looked at software and it had high quality tests and documentation and everything it meant it was good and now that signal is gone.\"},{\"Start\":2341.86,\"End\":2346.42,\"Speaker\":1,\"Content\":\"It's almost like we need a proof of work for this versus the blockchain. A proof of usage.\"},{\"Start\":2346.42,\"End\":2347.82,\"Speaker\":0,\"Content\":\"Yes, exactly.\"},{\"Start\":2347.82,\"End\":2348.82,\"Speaker\":1,\"Content\":\"Oh man.\"},{\"Start\":2348.82,\"End\":2364.59,\"Speaker\":1,\"Content\":\"On this note of handcrafted code, I don't know if you know this, this is so interesting. Data labeling companies are buying old GitHub repos of handwritten code to train their models on, and they're paying a lot of money for like artisanal human-written code.\"},{\"Start\":2364.59,\"End\":2382.12,\"Speaker\":0,\"Content\":\"Oh, that's fascinating. 
That's the um, uh, the the pre-um World War Two ab- uh, the the the metal that you can dig up from old shipwrecks, which is before the nuclear, the first nuclear explosions and so it's it's not got like the the the the radiation baked into the metal. It's that whole thing.\"},{\"Start\":2382.12,\"End\":2382.68,\"Speaker\":1,\"Content\":\"Wow\"},{\"Start\":2382.68,\"End\":2383.72,\"Speaker\":0,\"Content\":\"That's fascinating.\"},{\"Start\":2383.72,\"End\":2388.78,\"Speaker\":1,\"Content\":\"Yeah, so they're looking for code pre-2022, I think, whenever ChatGPT kind of emerged.\"},{\"Start\":2388.78,\"End\":2389.9,\"Speaker\":0,\"Content\":\"Wow.\"},{\"Start\":2390.2,\"End\":2393.26,\"Speaker\":1,\"Content\":\"So if you've got some, you can make a you can make a fortune.\"},{\"Start\":2393.26,\"End\":2399.36,\"Speaker\":0,\"Content\":\"Uh, problem is I open source all my stuff, so it's already out there. It's it's in the training, it's it's been used to train the models already.\"},{\"Start\":2399.36,\"End\":2401.68,\"Speaker\":1,\"Content\":\"a lot of stuff already Yep. Oh man.\"},{\"Start\":2401.68,\"End\":2417.26,\"Speaker\":1,\"Content\":\"Okay, let me ask you this question. I'm just curious about this prediction. I know you're not like a prediction person, although you do make predictions and you seem to be right often. When do you think 50% of engineers in the world will be AI will be writing 100% of their code? How close to that do you think we are?\"},{\"Start\":2417.26,\"End\":2422.96,\"Speaker\":0,\"Content\":\"So, I'm going to refactor that to 95% of their code. I don't think yeah, we'll get to that, but yeah.\"},{\"Start\":2423.79,\"End\":2448.72,\"Speaker\":0,\"Content\":\"It's very difficult to say worldwide because partly because there are cultural differences. 
Um, I have spent way too much time on Hacker News and something I've noticed about Hacker News is a conversation that starts at midnight Pacific time and goes until 8:00 a.m., very different tone because it's the Europeans. Right? If you'll get the and the Europeans are a lot more AI skeptic than the Americans are generally.\"},{\"Start\":2448.72,\"End\":2449.76,\"Speaker\":0,\"Content\":\"So\"},{\"Start\":2449.76,\"End\":2473.7,\"Speaker\":0,\"Content\":\"I think different countries are going to have different sort of um different cultures around this. At the same time, I think it's become undeniable this year that this stuff produces good code. Like it used to be that you could say, I don't use this stuff because the code is bad. And that was a a justifiable position. That's not justifiable anymore. The code is now good. It's good code for for the my for my definition of good code at least.\"},{\"Start\":2473.7,\"End\":2474.72,\"Speaker\":0,\"Content\":\"So\"},{\"Start\":2474.72,\"End\":2496.63,\"Speaker\":0,\"Content\":\"So we're saying 50% of engineers, majority, let's say 50% of engineers' majority of their code, it could happen by the end of this year. It could. Because the the the the technology is good enough now and I feel like the the challenge now is getting people to learn how to use this stuff, which is difficult because using the stuff, everyone's like, oh, it must be easy, it's just a chatbot.\"},{\"Start\":2496.63,\"End\":2515.58,\"Speaker\":0,\"Content\":\"It's not easy. Like that's one of the great misconceptions in AI is that using these tools effectively is is is easy. It takes a lot of practice and it takes a lot of trying things that didn't work and trying things that did work. But yeah, I I I expect by the end of this year it will not be uncommon to have an engineer say that almost all of their code is written by AI.\"},{\"Start\":2515.58,\"End\":2534.1,\"Speaker\":1,\"Content\":\"That was the same rough idea I had and how crazy is that? 
How quickly this job has changed and what is possible. And I think people, this is a good example of people underestimate how quickly things can change. Like, we would not have, like, I think Dario was predicting this a year or two ago, just that 100% of code is going to be written by AI and we're just like...\"},{\"Start\":2534.1,\"End\":2535.42,\"Speaker\":0,\"Content\":\"We we all laughed at him. Yeah.\"},{\"Start\":2535.42,\"End\":2546.45,\"Speaker\":1,\"Content\":\"Right? Exactly. What are you talking about? so bad, so bad at writing code and and this might come for other jobs that people don't see coming which is scary and interesting and exciting.\"},{\"Start\":2546.45,\"End\":2566.14,\"Speaker\":0,\"Content\":\"It's honestly the the I'm I'm not an AI doomer in the slightest. The economics of it do make me nervous. Like, are we really going to wipe out like a tenth of white-collar knowledge work jobs in the next few years? I really hope not because I don't know how the economy adapts that, you know. So yeah, that's complicated.\"},{\"Start\":2566.14,\"End\":2580.57,\"Speaker\":1,\"Content\":\"Yeah, I'm actually I'm doing a report that's coming out, it'll come out ahead of this episode, uh looking at the job market in tech. And surprisingly, just at tech companies, we're at the highest number of open engineering roles, open PM roles.\"},{\"Start\":2580.57,\"End\":2581.79,\"Speaker\":0,\"Content\":\"Interesting.\"},{\"Start\":2582.11,\"End\":2594.66,\"Speaker\":1,\"Content\":\"except for during the crazy peak during COVID. So it's kind of like coming back to that. Basically, it's the highest number of open roles in three and a half ish years for engineers and PMs at tech companies globally.\"},{\"Start\":2595.34,\"End\":2596.97,\"Speaker\":1,\"Content\":\"That's very interesting.\"},{\"Start\":2596.97,\"End\":2608.61,\"Speaker\":0,\"Content\":\"It's funny, isn't it? because um you get all of these headline-grabbing like um layoffs. 
Uh yeah, um was it was it Block that laid off 4,000 people recently? Yeah, yeah. But the the the the the\"},{\"Start\":2608.61,\"End\":2647.8,\"Speaker\":0,\"Content\":\"Question there is always how much of that is AI and how much of it is um overhiring during COVID and re-corrections and all of that kind of thing. And it's always very difficult to tell. So that the the number of open jobs, on the one hand, maybe that's a better signal, but on the other hand, the recruitment market has been driven completely crazy by all of this stuff, right? Like all of the job ads are written by AI, the um the the the resumes are AI. People people in recruitment are saying that this is it's never been this hard to filter through and hire people. And people who are hiring jobs say they they applied to 200 things and got nobody hearing back. So it's hard, right? The the the the macroeconomic indicators for this stuff are lagging.\"},{\"Start\":2647.8,\"End\":2653.27,\"Speaker\":0,\"Content\":\"And at some point we should start getting more confident numbers about what the impact actually is.\"},{\"Start\":2653.27,\"End\":2658.87,\"Speaker\":1,\"Content\":\"Yeah, interestingly the number of recruiter open roles is also approaching like record numbers.\"},{\"Start\":2658.87,\"End\":2659.77,\"Speaker\":0,\"Content\":\"Hilarious\"},{\"Start\":2659.77,\"End\":2674.68,\"Speaker\":1,\"Content\":\"which is an interesting leading indicator of demand for hiring. So there's interesting trends in spite of the layoffs. So yeah, what a what a wild world. Um, so you've mentioned this uh book you're working on. This is the agentic engineering patterns stuff, right? Yes. Okay, cool.\"},{\"Start\":2674.68,\"End\":2681.25,\"Speaker\":1,\"Content\":\"So I want to talk about this. So you pointed out, people think it's easy to build with AI. It's like, oh, it's going to do all these things for us. 
What are we going to do all day?\"},{\"Start\":2681.25,\"End\":2700.98,\"Speaker\":1,\"Content\":\"To your point, it's actually not. There's a lot of very specific skills you need to do this well. And you're putting them together on your blog. We'll point to it. I want to talk through a few of them to help people do this better. So one is this idea of just writing code is cheap now. You talked touched on this a bit. Maybe just share why this is such an important thing to know and and keep in mind.\"},{\"Start\":2700.98,\"End\":2735.0,\"Speaker\":0,\"Content\":\"So I think this is the single biggest shock in all of this. The reason that we have to rethink how we build, how we work as software engineers, is that the thing that used to take the time takes way less time. Like it's it's never been the case that programmers spend 90% of their day typing code into a computer. There's always there's so much additional work around that. But it still used to be like people talk about how important it is not to interrupt your coders, right? Your coders need to have like solid two to four hour blocks of uninterrupted work so they can spin up their mental model and and churn out the code. It's so that that's changed completely.\"},{\"Start\":2735.0,\"End\":2770.65,\"Speaker\":0,\"Content\":\"like I my my programming work, I need two minutes every now and then to prompt my agent about what to do next and then I can do the other stuff and I can go back. I'm much more interruptible than I used to be. But yeah, so the thing that used to take the time is now the thing that takes way way less time. What does that mean for everything else that we do? And that doesn't just affect programmers, it affects entire like teams of teams around around software development. But as an individual programmer, you have to start thinking, okay, I can churn out 10,000 lines of code now in the time that would take me to write a hundred. 
How do I make that code good, right?\"},{\"Start\":2770.65,\"End\":2794.59,\"Speaker\":0,\"Content\":\"How do I make sure that I'm not just turning out total slop that that adds up to technical debt that slows me down? How do I take the fact that code is now cheap and use that to produce better code? Because I don't don't just want cheap code, I want really good code that does what I need it to do, that I can extend in the future, that's got all of those um those characteristics of of of of code that that's that's useful and and can be used in production.\"},{\"Start\":2794.59,\"End\":2805.42,\"Speaker\":1,\"Content\":\"The point you made earlier, I think, is a really important one along these lines, which is when you start a project, you fire off three different versions of it, and that helps you pick a direction, and that's only possible because code is so cheap now, right?\"},{\"Start\":2805.48,\"End\":2825.72,\"Speaker\":0,\"Content\":\"Right, prototyping is almost free. I think. And that really impacts me because throughout my entire career, my superpower has been prototyping. Like, I am very, I've been very quick at knocking out working prototypes of things. I'm the person who can show up at a meeting and say, look, here's how it could work. And that's that was kind of my my unique selling point. And that's gone.\"},{\"Start\":2825.72,\"End\":2838.34,\"Speaker\":0,\"Content\":\"now anyone can do what I could do, you know, it's like but but but it does but you still have to learn when it's appropriate to prototype, how to think about prototyping, how to get the tools to build useful prototypes that you can you can use to explore things.\"},{\"Start\":2838.34,\"End\":2872.06,\"Speaker\":2,\"Content\":\"I am so excited to tell you about this season's supporting sponsor, Vanta. Vanta helps over 15,000 companies like Cursor, Ramp, Duolingo, Snowflake, and Atlassian earn and prove trust with their customers. 
Teams are building and shipping products faster than ever thanks to AI. But as a result, the amount of risk being introduced into your product and your business is higher than it's ever been. Every security leader that I talk to is feeling the increasing weight of protecting their organization, their business, and not to mention their customer data.\"},{\"Start\":2872.06,\"End\":2906.76,\"Speaker\":2,\"Content\":\"Because things are moving so fast, they are constantly reacting, having to guess at priorities, and having to make do with outdated solutions. Vanta automates compliance and risk management with over 35 security and privacy frameworks, including SOC 2, ISO 27001, and HIPAA. This helps companies get compliant fast and stay compliant. More than ever before, trust has the power to make or break your business. Learn more at vanta.com/lenny. And as a listener of this podcast, you get $1,000 off Vanta. That's vanta.com/lenny.\"},{\"Start\":2906.76,\"End\":2913.28,\"Speaker\":1,\"Content\":\"I'm going to take a tangent. What's kind of in your stack, your AI stack? What models are you using most? What tools do you find useful?\"},{\"Start\":2913.28,\"End\":2952.59,\"Speaker\":0,\"Content\":\"So right now, I'm mostly Claude. Um, I do a huge amount of work using Claude Code. Well, I'm mainly still a Claude Code person, but there are two sides of Claude Code that I use. There's the Claude Code that runs on your computer, and then there's Claude Code for web, which is their hosted version of Claude Code. And I use that one more than the one on my own computer. Partly because that's the one you can access through your phone. If you've got the Anthropic Claude app installed on an iPhone, there's a code tab and you can go in there and you can tell it to write you things. And that is running on their servers. 
Um, you need to give it a GitHub repository of yours that it can work within.\"},{\"Start\":2952.59,\"End\":2966.72,\"Speaker\":0,\"Content\":\"But it's also great from a security point of view because if you're running Claude Code on your laptop, there's risks that bad things can happen. It might accidentally delete things. If I'm running on an Anthropic server, I couldn't care less. Like, it's their computer, it's not my computer.\"},{\"Start\":2966.72,\"End\":2977.19,\"Speaker\":0,\"Content\":\"Go wild. So this means that you can run these things in YOLO mode. Claude calls it dangerously skip permissions. OpenAI actually do call it YOLO.\"},{\"Start\":2977.19,\"End\":3015.75,\"Speaker\":0,\"Content\":\"they've got an option for that. And that's the mode where the agent doesn't ask you if it should do something all the time. And that is a different product. I think a lot of people who haven't got on board with coding agents yet haven't tried them in the unsafe mode. They're using a coding agent where it's like, oh, can I run this piece of code? Can I edit this file? And that means you have to pay complete attention to it the whole time. And it's like working with a really frustrating toddler that's constantly nagging you about what it wants to do. The moment you take the safeties off, now I can run four of them and go and have a cup of tea and come back and they've achieved something useful for me. But it's inherently unsafe.\"},{\"Start\":3015.75,\"End\":3052.22,\"Speaker\":0,\"Content\":\"If it's running in Claude Code for Web, the only bad thing that can happen is maybe it accidentally leaks your private source code. And my code is all open source, so I don't care. But that's a useful trick there. But yeah, so I use that on my phone. I often have two or three of those running. A lot of my major projects are done mostly prompting on my phone. 
If it's security adjacent or super important, I might pull it down to my laptop to do a thorough review later on. But most of the review you can do through GitHub. Like these things will file pull requests and then you use the same tools you'd use to review code from other people to review the code from the agents.\"},{\"Start\":3052.22,\"End\":3067.9,\"Speaker\":0,\"Content\":\"That said, OpenAI came out with GPT 5.4 about three weeks ago. It's very, very, very good. I think it's on par with Claude Opus 4.6 and possibly even better. These companies are constantly leapfrogging each other. So I have been using leaning back...\"},{\"Start\":3067.9,\"End\":3081.74,\"Speaker\":0,\"Content\":\"It's also cheaper. So I've been leaning on GPT-5.4 a lot more this month. Um, and OpenAI Codex. And OpenAI Codex and Claude Code are almost almost indistinguishable from each other now. They're both very, very good pieces of software.\"},{\"Start\":3082.28,\"End\":3117.22,\"Speaker\":0,\"Content\":\"And I kind of expect this to happen, like the next Gemini model comes out might become the best coding model for a couple of months, in which case I might switch myself into that ecosystem. Partly because I write about this stuff as well, I like to stay familiar with as many of the the offerings as possible. But I keep on coming back to Claude Code, mainly because it fits my taste. Like there's this weird thing where I've got a very specific taste in how I like code to work, which coincidentally happens to map to how Claude Code likes to work. Which is kind of interesting. And GPT 5.4, it's almost matches my taste, but not quite.\"},{\"Start\":3117.22,\"End\":3126.44,\"Speaker\":0,\"Content\":\"And maybe that's because I've just spent more time with Claude, so my prompting style has evolved more to fit the Claude way of thinking. I don't know. This stuff's all so weird. 
It's vibes all the way down.\"},{\"Start\":3126.44,\"End\":3134.02,\"Speaker\":1,\"Content\":\"That is so interesting. So the taste is the code, the quality of the code it puts out is is what you're talking about, not like the conversation and the the\"},{\"Start\":3134.02,\"End\":3139.16,\"Speaker\":0,\"Content\":\"Absolutely don't care about how they talk to me. Like I'm I'm I'm I'm using them to to get stuff done. Yeah.\"},{\"Start\":3139.32,\"End\":3152.95,\"Speaker\":1,\"Content\":\"Yeah, because I was thinking as you're talking, what is the thing that will get someone to stick with a model? And it could be what you're describing, the qual- the way it writes code, it could be the UX, it could be the conversation, the vibes.\"},{\"Start\":3152.95,\"End\":3187.54,\"Speaker\":0,\"Content\":\"The stickiest thing is meant to be memory. Like the the all of the they they all have these features where they will remember things about you and and I hate those features and I turn them off wherever I can because mainly because as an AI researcher, I need to see what everyone else sees when I'm prompting. Like I don't want to say to the world, oh my goodness, look, this thing works now and it turns out it only works for me because it's based on previous like previous conversations that I've had. And maybe I'm missing out on something really important there. But the um the memory feature is is is that thing that all of the labs are trying to be more sticky with.\"},{\"Start\":3187.54,\"End\":3223.57,\"Speaker\":0,\"Content\":\"That said, um when the whole the the OpenAI military stuff happened a few weeks ago, Anthropic tried took advantage by saying, hey, why don't you move to Claude? And the way they did that is they had a Claude onboarding page that said, transfer your memories from ChatGPT by clicking this button and then pasting it into ChatGPT. And it was just a prompt. They had a prompt which was, hey, ChatGPT, tell me everything that you've remembered about me. 
And so you paste that prompt into ChatGPT and it gives you all of your the the the the memories and then you paste them into Claude. And I thought that was hilarious. Like a a whole...\"},{\"Start\":3223.57,\"End\":3228.39,\"Speaker\":0,\"Content\":\"export like move from one to the other just by prompting it to to give you the information you needed.\"},{\"Start\":3228.39,\"End\":3249.65,\"Speaker\":1,\"Content\":\"Yeah, that was like it always felt like that was hard to extract and they made it so easy. And that was such a moment for Anthropic. They went they were like the number one app in the App Store, such a interesting, not what you'd expect when they were being banned by the government, essentially. Right. Um, is there any any other AI tools that you find really useful just kind of along the side, like Jasper Flow, anything along those lines?\"},{\"Start\":3249.65,\"End\":3252.89,\"Speaker\":0,\"Content\":\"So I use Claude for code for the code stuff.\"},{\"Start\":3252.89,\"End\":3290.44,\"Speaker\":0,\"Content\":\"The other thing that I use a lot of is for research. Like, and this is this thing where a couple of years ago, if you told me that you were replacing use of Google with ChatGPT, I'd assume that you just didn't understand how this technology works and its limitations because that was a terrible idea. Now that all of the major models have really good search integration, they're just better at searching than I am. I can ask them a question and watch them fire off five searches in parallel for like aspects of answering that question, pull the data back. And I'll, if it's something I'm going to publish, I always double-check, make sure it didn't hallucinate a detail because that would be embarrassing. But honestly, most of like I hardly use Google search directly at all.\"},{\"Start\":3290.44,\"End\":3312.02,\"Speaker\":0,\"Content\":\"I'm always using it via, I'm doing searches via Claude or via ChatGPT or sometimes via the Gemini app. 
Like, that's a good option as well. And then I mean, for image generation, I'm using Gemini because of Nano Banana, but I only use that for fun. Like, I don't publish images I generate, I use them for pranks. And that's great. Like that's deeply entertaining.\"},{\"Start\":3312.02,\"End\":3321.84,\"Speaker\":1,\"Content\":\"I wasn't planning to go here, but you famously created the pelican riding a bike benchmark for the quality of imagery. Yeah. Uh, anything there that might be worth sharing?\"},{\"Start\":3321.84,\"End\":3361.82,\"Speaker\":0,\"Content\":\"So this one's fascinating. Like, it was about a year and a half ago, I started benchmarks. So there were lots of benchmarks of these models and there were all these numeric things, like it scored 72% on Terminal Bench or whatever. And those always frustrated me because they don't really tell you anything interesting. Like, if this one got 74 and this one got 72, does that actually mean that one of them's better at something than the other? And so, basically to make fun of the benchmarks, I started my own benchmark which was generate an SVG of a pelican riding a bicycle. And it's an SVG. This isn't a test of the image models. This is a test of the text models because they can all output SVG code. And if you ask them to draw you an SVG of something,\"},{\"Start\":3361.82,\"End\":3398.85,\"Speaker\":0,\"Content\":\"They're almost universally terrible because they don't have good spatial reasoning and like drawing things by plotting out vectors is difficult anyway. So I started getting the models to generate an SVG of a pelican on a bicycle because then you can look at them. You can say, here's one model, here's the other, which is best? And the weirdest thing happened where there appears to be a very strong correlation between how good their drawing of a pelican riding a bicycle is and how good they are at everything else. 
And nobody can explain to me why that is, but as I started looking at these things, I realized, wow, the better models really do draw better pelicans riding a bicycle. Because it's got to the point now, it's a meme.\"},{\"Start\":3398.85,\"End\":3427.52,\"Speaker\":0,\"Content\":\"the the the the AI labs are all very aware of this and they they they relish in how good their pelicans riding a bicycle are. The other day, OpenAI released GPT 5.4 Mini and Nano at five different thinking levels that you can have them do low thinking, medium thinking, high thinking. So I did a grid of 15 pelicans riding bicycles for the three GPT 5.4 models across the things. And sure enough, GPT 5.4 running at X high did draw the best pelican.\"},{\"Start\":3427.66,\"End\":3431.38,\"Speaker\":0,\"Content\":\"Why? I don't know. I don't know why that was, but it but it did.\"},{\"Start\":3431.38,\"End\":3439.0,\"Speaker\":1,\"Content\":\"First of all, I didn't realize this was a test of the LLM because you'd think an image would be a test of the imaging model, but uh but now it is.\"},{\"Start\":3439.0,\"End\":3440.02,\"Speaker\":0,\"Content\":\"It's all about the code generation.\"},{\"Start\":3440.02,\"End\":3440.8,\"Speaker\":1,\"Content\":\"That is so funny.\"},{\"Start\":3440.8,\"End\":3452.35,\"Speaker\":0,\"Content\":\"thing is um they're generating SVG and it has comments in. So you can see little code comments that say things like making sure the pelicans' legs are hitting the pedals and added added added a fish for whimsy. And that's really fun.\"},{\"Start\":3452.35,\"End\":3465.55,\"Speaker\":0,\"Content\":\"The Chinese AI models, I love playing with the Chinese like open weight models. Some of those have drawn quite good pelicans and they run on my laptop. 
So I have my laptop drawing these pictures of pelicans with these little comments about what it's trying to do.\"},{\"Start\":3465.55,\"End\":3470.35,\"Speaker\":1,\"Content\":\"I think with Gemini when they released one of their models, I think that was like their tweet was the the image of their code.\"},{\"Start\":3470.35,\"End\":3507.45,\"Speaker\":0,\"Content\":\"Gemini 3.1 just a few weeks ago, they had a video which featured a pelican riding a bicycle like animated. And I was like, oh my god, it's my pelican. But I thought it's okay because the way my benchmark works is I've actually got a bunch of secret, um, alternatives in my pocket because obviously what happens if the AI labs train them to draw really good pelicans riding bicycles? And I'm like, well then I'll get it to do an ocelot on a moped, and if the ocelot on the moped sucks, but the pelicans are really good, I can prove that they cheated on the benchmark. And that would be amazing, right? That would be a great thing to be able to say, hey, look, they cheated. Except that when Gemini 3.1 came out, they did all of the other combinations.\"},{\"Start\":3507.45,\"End\":3516.68,\"Speaker\":0,\"Content\":\"They were like, and here's a giraffe and a little tiny car and so on. And I'm like, wow, they they they they they they've beaten me. They've beat they're doing all of the animals and all of the modes of transport.\"},{\"Start\":3516.68,\"End\":3518.97,\"Speaker\":1,\"Content\":\"And they didn't know that you had this in your back pocket.\"},{\"Start\":3518.97,\"End\":3522.07,\"Speaker\":0,\"Content\":\"I don't know if they knew or not. I I I\"},{\"Start\":3522.77,\"End\":3540.0,\"Speaker\":0,\"Content\":\"People kept on asking me for like the past year, they've been saying, \\\"What if the labs cheat on the on the benchmark?\\\" And my answer has always been, \\\"Really, all I want from life is a really good picture of a pelican riding a bicycle. 
And if I can trick every AI lab in the world into into cheating on benchmarks...\\\"\"}]",
"segments": [
{
"text": "A lot of people woke up in January and February and started realizing, oh wow, I can churn out 10,000 lines of code in a day. It used to be you'd ask ChatGPT for some code and it would spit out some code and you had to run it and test it. The coding agents, they take that step for you.",
"start": 0,
"end": 13.85,
"duration": 13.85,
"speaker_id": 0
},
{
"text": "And an open question for me is how many other knowledge work fields are actually prone to these agent loops?",
"start": 13.85,
"end": 19.5,
"duration": 5.65,
"speaker_id": 0
},
{
"text": "Now that we have this power, people almost underestimate what they can do with it.",
"start": 19.5,
"end": 22.78,
"duration": 3.280000000000001,
"speaker_id": 1
},
{
"text": "Today, probably 95% of the code that I produce, I didn't type it myself. I write so much of my code on my phone. It's wild.",
"start": 22.78,
"end": 30.0,
"duration": 7.219999999999999,
"speaker_id": 0
},
{
"text": "I can get good work done walking the dog along the beach. My New Year's resolution, every previous year I've always told myself this year I'm going to focus more, I'm going to take on less things. This year, my ambition was take on more stuff and be more ambitious.",
"start": 30.0,
"end": 43.12,
"duration": 13.119999999999997,
"speaker_id": 0
},
{
"text": "such an interesting contradiction. AI is supposed to make us more productive. It feels like the people that are most AI builder are working harder than they've ever worked.",
"start": 43.12,
"end": 50.14,
"duration": 7.020000000000003,
"speaker_id": 1
},
{
"text": "Using coding agents well is taking every inch of my 25 years of experience as a software engineer. I can fire up four agents in parallel and have them work on four different problems. By 11 a.m., I am wiped out.",
"start": 50.14,
"end": 62.57,
"duration": 12.43,
"speaker_id": 0
},
{
"text": "You have this prediction that we're going to have a massive disaster at some point. You call it the Challenger disaster of AI.",
"start": 62.57,
"end": 67.77,
"duration": 5.199999999999996,
"speaker_id": 1
},
{
"text": "Lots of people knew that those little O-rings were unreliable, but every single time you get away with launching a space shuttle without the O-rings failing, you institutionally feel more confident in what you're doing. We've been using these systems in increasingly unsafe ways. This is going to catch up with us. My prediction is that we're going to see a Challenger disaster.",
"start": 67.77,
"end": 86.64,
"duration": 18.870000000000005,
"speaker_id": 0
},
{
"text": "Today my guest is Simon Willison. Simon, in my opinion, is one of the most important and useful voices right now on how AI is changing the way that we build software and how professional work is changing broadly. What I love about Simon is that he doesn't just pontificate in the clouds. He's been what you'd call a 10x engineer for over 20 years. He co-created Django, the web framework that powers Instagram, Pinterest, Spotify, and thousands of other platforms. He coined the term prompt injection, popularized the ideas of AI slop and agentic engineering, and amongst his hundred plus open source projects, he created Datasette, a data analysis tool that has become a staple of investigative journalism.",
"start": 87.12,
"end": 127.65,
"duration": 40.53,
"speaker_id": 2
},
{
"text": "What makes Simon rare is that very few engineers have made the leap from the old way of building to the new way as fully and visibly as he has. And as he's leaned into this new way of building, he's been sharing everything he's learning in real time through his incredible blog, simonwillison.net. Simon does not do a lot of podcasts, and this conversation opened my mind up in a bunch of new ways. I am so excited for you to get to learn from Simon. Don't forget to check out lennysproductpass.com for an incredible set of deals available exclusively to Lenny's newsletter subscribers. With that, I bring you Simon Willison.",
"start": 127.65,
"end": 160.78,
"duration": 33.129999999999995,
"speaker_id": 2
},
{
"text": "[Music]",
"start": 160.78,
"end": 163.44,
"duration": 2.6599999999999966
},
{
"text": "Simon, thank you so much for being here and welcome to the podcast.",
"start": 163.44,
"end": 167.49,
"duration": 4.050000000000011,
"speaker_id": 1
},
{
"text": "Hey, Lenny, it's really great to be here.",
"start": 167.49,
"end": 169.24,
"duration": 1.75,
"speaker_id": 0
},
{
"text": "I am so excited to have you here. I've been such a fan of yours from afar for so long. I've learned so much from your blog. And even though every guest I have on this podcast is my favorite guest, you're my favorite kind of guest because you're on the ground building with the latest tools, using it for real. You're very good at articulating what you experience. So we're going to get a lot of ROI out of your brain from this time that we have together. What I want to start with is essentially an AI state of the union. You've written about this November inflection.",
"start": 169.24,
"end": 201.9,
"duration": 32.66,
"speaker_id": 1
},
{
"text": "Yes.",
"start": 201.9,
"end": 202.9,
"duration": 1.0,
"speaker_id": 0
},
{
"text": "So what I'm thinking is we start just kind of give us like a brief history lesson of just like what happened in November and where are we today? What's possible now?",
"start": 202.9,
"end": 210.82,
"duration": 7.9199999999999875,
"speaker_id": 1
},
{
"text": "Well, let's talk about all of 2025 very briefly. Um, 2025 was the year that especially Anthropic and OpenAI realized that code is the application. Like being able to have things generate code. I think partly because Anthropic came up with Claude Code back in sort of February of 2025 and it took off like crazy. And a bunch of people started signing up for $200 a month accounts. And so suddenly, wow, it turns out people are willing to pay a lot of money for this stuff, for that specific field. Both Anthropic and OpenAI spent the whole of 2025 focusing all of their training efforts on coding.",
"start": 210.82,
"end": 245.64,
"duration": 34.81999999999999,
"speaker_id": 0
},
{
"text": "If you look at what they were doing, it was all the reinforcement learning stuff, the reasoning trick, the thing where the models say they're thinking. That was new in late 2024. Like OpenAI's o1 was the first model to exhibit that. And now all of the models do it. So that was the other big trend of last year was these reasoning models. Turns out reasoning is great for code. It can reason through code and figure out the root of bugs and all of that. And so the end result of this, the end result of these two labs throwing everything they had at making their models better at code, is in November we had what I call the inflection point where GPT 5.1 and Claude Opus 4.5 came along. And they were both just",
"start": 245.64,
"end": 283.09,
"duration": 37.44999999999999,
"speaker_id": 0
},
{
"text": "they were incrementally better than the previous models, but in a way that crossed a threshold where previously, if you had these coding agents, you could get them to write you some code, and most of the time it would mostly work. But you had to pay very close attention to it. And suddenly we went from that to almost all of the time it does what you told it to do, which makes all of the difference in the world. Now you can spin up a coding agent and say, \"Hey, build me a Mac application that does this thing,\" and you'll get something back which still needs some back and forth, but it won't just be a buggy pile of rubbish that doesn't do anything.",
"start": 283.09,
"end": 314.52,
"duration": 31.430000000000007,
"speaker_id": 0
},
{
"text": "That was fascinating because all of the software engineers who took time off over the holidays and started tinkering with this stuff got this moment of realization where it's like, oh wow, this stuff actually works now. I can tell it to build code and if I describe that code well enough, it'll follow the instructions and it'll build the thing that I asked it to build. I think the reverberations of that are still shaking us in software engineering. A lot of people woke up in January and February and started realizing, oh wow, this technology which I'd been kind of paying attention to, suddenly it's got really, really good. And what does that mean? Like, what does the fact that I can churn out 10,000 lines of code in a day and most of it works,",
"start": 314.52,
"end": 354.6,
"duration": 40.08000000000004,
"speaker_id": 0
},
{
"text": "is that good?",
"start": 354.6,
"end": 355.55,
"duration": 0.9499999999999886,
"speaker_id": 0
},
{
"text": "like how do we get from most of it works to all of it works? There are so many new questions that we're facing, which I think makes us a bellwether for other information workers. Like code is easier than almost every other problem that you pose these agents because code is obviously right or wrong. Like it produces code, you run the code, either it works or it doesn't work. There might be a few subtle hidden bugs, but generally you can tell if the thing actually works. If it writes you an essay or if it prepares a lawsuit for you, it's so much harder to derive if it's actually done a good job, to figure out if it got things right or wrong.",
"start": 355.55,
"end": 390.94,
"duration": 35.389999999999986,
"speaker_id": 0
},
{
"text": "But",
"start": 390.94,
"end": 391.96,
"duration": 1.0199999999999818,
"speaker_id": 0
},
{
"text": "It's kind of happening to us. So software engineers, it came for us first and we're figuring out, okay, what do our careers look like? How do we work as teams when part of what we did that used to take most of the time doesn't take most of the time anymore? What does that look like? And it's going to be very interesting seeing how this rolls out to to other information work in the future.",
"start": 391.96,
"end": 411.08,
"duration": 19.120000000000005,
"speaker_id": 0
},
{
"text": "This episode is brought to you by our season's presenting sponsor, WorkOS. What do OpenAI, Anthropic, Cursor, Vercel, Replit, Sierra, Clay, and hundreds of other winning companies all have in common? They are all powered by WorkOS. If you're building a product for the enterprise, you've felt the pain of integrating single sign-on, SCIM, RBAC, audit logs, and other features required by large companies. WorkOS turns those deal blockers into drop-in APIs with a modern developer platform built specifically for B2B SaaS. Literally every startup that I'm an investor in that starts to expand upmarket ends up working with WorkOS. And that's because they are the best.",
"start": 411.08,
"end": 451.08,
"duration": 40.0,
"speaker_id": 2
},
{
"text": "Whether you're a seed-stage startup trying to land your first enterprise customer or a unicorn expanding globally, WorkOS is the fastest path to becoming enterprise-ready in an unblocking world. It's essentially Stripe for enterprise features. Visit workos.com to get started or just hit up their Slack where they have actual engineers waiting to answer your questions. WorkOS allows you to build faster with delightful APIs, comprehensive docs, and a smooth developer experience. Go to workos.com to make your app enterprise-ready today.",
"start": 451.08,
"end": 480.82,
"duration": 29.74000000000001,
"speaker_id": 2
},
{
"text": "I want to come back to just like what is possible now. So just to give a little context, it's like insane how far we've come. I don't know, like a couple years ago, all code was human-written. Then it's like tab complete. Then it's like, okay, now the best engineers are 100% AI code. Now it's like, uh, I'm like coding from my phone. Like I'm not even looking at my code anymore.",
"start": 480.82,
"end": 499.78,
"duration": 18.95999999999998,
"speaker_id": 1
},
{
"text": "I write so much of my code on my phone. It's wild. Like, I can get good work done walking the dog along the beach, which is delightful, you know?",
"start": 499.78,
"end": 508.8,
"duration": 9.020000000000039,
"speaker_id": 0
},
{
"text": "Yeah, I had Boris Cherny on the podcast and he's doing the same thing. And I was just like, is that even coding anymore? He's like, yeah, it's just another level of abstraction. Just like engineering has always gone. Talk about maybe just like what else is there around just like what is possible now with AI in terms of building that people may not fully recognize? And where do you think what's like the next leap? Is there anything beyond this?",
"start": 508.8,
"end": 529.84,
"duration": 21.04000000000002,
"speaker_id": 1
},
{
"text": "Let's talk about the two, the sort of, there's the vibe coding side of things, and I like Andrej Karpathy's original definition of vibe coding, which is, um, when you don't even look at code and you basically just go on the vibes. You say, \"Build me something that does X,\" and it builds it, and you play with it, and if it looks good, then great, and if it doesn't quite do it, you keep on going back and forth with it. But it's very hands-off. You're not looking at code. So he originally said this is great for having fun and prototyping, and it then exploded way out of that.",
"start": 529.84,
"end": 560.73,
"duration": 30.889999999999986,
"speaker_id": 0
},
{
"text": "And I think today, vibe coding is effectively, the definition I use is it's when you're not looking at the code, you don't care about the code, and maybe you don't understand the code. Like non-programmers can now tell Claude what to build and it can build them a little app. And I love that. I absolutely love that we're sort of democratizing the art of getting a computer to do stuff for you, of automating tedious things in your life by knocking out these little tools. Of course, the problem is that there is a limit on how much you can do responsibly.",
"start": 560.73,
"end": 592.17,
"duration": 31.43999999999994,
"speaker_id": 0
},
{
"text": "Uh, like, I like to tell people, if you're vibe coding something for yourself where the only person who gets hurt if it has bugs is you, go wild. That's completely fine. The moment you're vibe coding code for other people to use where your bugs might actually harm somebody else, that's when you need to take a step back and say, hang on a second, this is not a responsible way of using these tools. The challenge is that understanding what's responsible and what isn't is in itself a sort of expert level skill. So, knowing that once you start dealing with like scraping other people's websites, maybe you'll damage their websites by hitting them too hard. There are so many ways that you can cause damage if you don't know what you're doing.",
"start": 592.17,
"end": 632.02,
"duration": 39.85000000000002,
"speaker_id": 0
},
{
"text": "But I love that liberation and I love that people can come to meetings with a prototype that they knocked up of their idea that illustrates the idea. I think those things are wonderful. The big debate, the ongoing debate has been, what do we call it when a professional software engineer uses these tools to write real code that's production ready that they've reviewed and they've checked all of the details of? A lot of people call that vibe coding as well. I think that devalues vibe coding as a term because it's useful to say, I vibe coded this, as in I haven't even looked at how it works, it's not production ready, but it's kind of a cool prototype.",
"start": 632.02,
"end": 666.82,
"duration": 34.80000000000007,
"speaker_id": 0
},
{
"text": "The moment vibe coding means everything involved that touches AI, it effectively ends up meaning programming because we're all moving in the direction where our code is mediated through AI at some point. So, what do we call it for professionals? I've gone with agentic engineering because I think the thing to emphasize is these coding agents, right? If you're asking ChatGPT to knock out some code, that's a different thing from if you're running Codex and having it write the code, debug the code, test the code, all of that.",
"start": 666.82,
"end": 694.85,
"duration": 28.029999999999973,
"speaker_id": 0
},
{
"text": "And I think that agentic engineering is such a deep and fascinating discipline because the art of getting really good results out of this, like the art of having them help you build software you could deploy to a million people, that's never going to be easy. That's never going to be trivial. That's always going to require a great deal of depth of experience in how software works and how these agents work. And I love that. I'm kind of writing a book about it now that I'm publishing a chapter at a time on my blog. The best form of writing because I don't have an editor or any pressure from a publisher is just when I feel like writing another chapter I can do that. But there's so much to discuss.",
"start": 694.85,
"end": 732.63,
"duration": 37.77999999999997,
"speaker_id": 0
},
{
"text": "But yeah, so I think right now the frontier is how do we build professional software using coding agents? How do we build software that is... I don't just want to build software that's that's good. I want us to build software that is better than we were building before. Like, if the agents let us move a bit faster but we're still churning out the same quality of software, that's less interesting to me than if the software we're producing has less bugs, more features, it's higher quality, it's better software because we're harnessing these tools. The really interesting future is something which some people have been calling the dark factory pattern or software factories. This is the idea where...",
"start": 732.63,
"end": 770.75,
"duration": 38.120000000000005,
"speaker_id": 0
},
{
"text": "Right now, if you're a professional using these tools, the way you do it is you tell them what to build and then you look at the code and you review that code really carefully and make sure it's doing the right thing. What does it look like if you're not reviewing the code? If you're not looking at that code, but you're also not vibe coding, you're not throwing everything to the wind and seeing what happens. You're applying professional practices and quality expectations to code that you're not directly reviewing. The reason it's called the dark factory is there's this idea idea in factory automation that if your factory is so automated that you don't need any people there, you can turn the lights off. Like the machines can operate in complete darkness if you don't need people on the factory floor.",
"start": 770.75,
"end": 809.85,
"duration": 39.10000000000002,
"speaker_id": 0
},
{
"text": "What does that look like for software? And there's some very... this company called Strong DM has been pushing this and doing some really interesting experiments around this. That I think is the... that's that's futuristic. Like that's we're trying to figure out what that looks like and how we can responsibly build software in that way right now. And making some quite interesting like discoveries about things that work and things that don't work. But that to me is is the next the next sort of barrier.",
"start": 809.85,
"end": 837.02,
"duration": 27.16999999999996,
"speaker_id": 0
},
{
"text": "Let's follow that thread. So what is what is this factory doing? So there's an element of no one's looking at the code really, but what how does that change how software is built? Are they are are people still coming up with the ideas and telling you this factory build this thing for me? Oh exactly.",
"start": 837.02,
"end": 849.42,
"duration": 12.399999999999977,
"speaker_id": 1
},
{
"text": "So this is the fascinating thing is um so there's a policy of nobody writes any code and quite a few companies are beginning to introduce that now because",
"start": 849.42,
"end": 857.62,
"duration": 8.200000000000045,
"speaker_id": 0
},
{
"text": "Just to be clear, the policy is you cannot write code.",
"start": 857.62,
"end": 859.98,
"duration": 2.3600000000000136,
"speaker_id": 1
},
{
"text": "You cannot type code into a computer.",
"start": 859.98,
"end": 862.64,
"duration": 2.659999999999968,
"speaker_id": 0
},
{
"text": "Yeah. Um, and honestly, like I thought six months ago, I thought that was crazy. And today, probably 95% of the code that I produce, I didn't type it myself. So that world is is is is practical already because these the latest models are good enough that you can tell them, oh, no, rename that variable and refactor that and and add this line there and they'll just do it. And it's faster than you typing on the keyboard yourself. The next rule though is nobody reads the code. And this is the thing which StrongDM started doing back in, I think it was August last year. They said, okay, we're not going to read the code.",
"start": 863.38,
"end": 896.96,
"duration": 33.58000000000004,
"speaker_id": 0
},
{
"text": "So what does that mean? How do you produce software that works and is good if you're not reading the code? And they've come up with a whole bunch of answers. Um, one of the most interesting was the way they did testing where in traditional software, some companies will have a QA department. Like the engineers write a bunch of software and then you throw it over the wall to the QA department and they sort of test it furiously to figure out if it's working or not. That, I think, went out of fashion a bit over the past sort of five to ten years from what I've seen in Silicon Valley because you kind of want your engineers to take responsibility for the code they're writing being good. But what if you can simulate that QA department?",
"start": 896.96,
"end": 934.79,
"duration": 37.82999999999993,
"speaker_id": 0
},
{
"text": "So what StrongDM were doing is, um, they had a swarm of agent testers who were actually simulating cust- simulating end users. So the software that they were building, this is crazy, the software is security software for access management. So when you sign in, when you start as a company and somebody needs to assign you access to Jira and then give you access to Slack and all of that kind of thing, they were building software for that. That's very security like adjacent. That's not the kind of thing that you should be vibe coding at all, based on most people's understanding of how the world works.",
"start": 934.79,
"end": 968.06,
"duration": 33.26999999999998,
"speaker_id": 0
},
{
"text": "But that's and there was so there were",
"start": 968.06,
"end": 969.73,
"duration": 1.6700000000000728,
"speaker_id": 0
},
{
"text": "legitimate security company who've been doing this stuff without AI for years, so it's not like they didn't understand the risks. So, the way they did their testing is they had this swarm of simulated employees all in a simulated Slack channel saying things like, \"Hey, could somebody give me access to Jira?\" The Slack channel itself is simulated.",
"start": 969.73,
"end": 985.4,
"duration": 15.669999999999959,
"speaker_id": 0
},
{
"text": "we'll talk about that in a moment. And they 24 hours a day they're making requests and saying, \"Hey, I need access to Jira and all of those kinds of things\" at enormous cost. Like they were spending $10,000 a day on tokens, I think, simulating all of these end users. I believe so.",
"start": 985.4,
"end": 999.4,
"duration": 14.0,
"speaker_id": 0
},
{
"text": "But it meant that their software was being very robustly tested in all of these different ways. And yeah, it's kind of similar to having a similar to having a manual QA team, except one that never sleeps. And I thought that was fascinating as a sort of example of thinking outside of the box, taking this question, how do we tell our software is good if we're not reviewing the code, and trying to find creative answers to it. The other thing that was interesting is that the Slack channel itself wasn't actually Slack. Because it turns out if you test against real software like Slack and so forth, they all have rate limits and like they they they they they won't let you just run 10,000 simulated people at a time.",
"start": 999.4,
"end": 1036.4,
"duration": 37.000000000000114,
"speaker_id": 0
},
{
"text": "So what they did is they built their own simulation of Slack and Jira and Okta and all of this software they were integrating with. And the way they did that is they basically took the API documentation for the public APIs for Slack and the client libraries, the open source client libraries, and they told their coding agents, build this. Build build me a simulation of this API.",
"start": 1036.4,
"end": 1056.93,
"duration": 20.529999999999973,
"speaker_id": 0
},
{
"text": "and they did",
"start": 1056.93,
"end": 1057.88,
"duration": 0.9500000000000455,
"speaker_id": 0
},
{
"text": "So this company is, and this is one of the things that they I went to a demo that they gave back in October. One of the things that really sat with me is that they had their own simulated version of Slack and Jira and all of these different package different systems that they could then build their software against, which cost them nothing because once they spun it up, it was a little Go binary that sat there. And they even had interfaces. They had like a fake version of the Slack interface that they'd code like vibe coded up that let them see what was going on. Absolutely fascinating.",
"start": 1057.88,
"end": 1087.4,
"duration": 29.519999999999982,
"speaker_id": 0
},
{
"text": "That is such a cool story and I love these stories of just companies at the bleeding edge trying to see what's possible, uh and have an advantage essentially. So what I'm hearing here is the QA piece is like the new piece in this factory. So we, you know, we already have Codex, Cloud Code, they can go off and build stuff.",
"start": 1087.4,
"end": 1103.54,
"duration": 16.139999999999873,
"speaker_id": 1
},
{
"text": "is the innovation here? Okay, now you've built all this stuff, is it actually any good? Is there a reason like Codex and CloudCo couldn't do this themselves? Why do you need kind of this factory concept?",
"start": 1103.66,
"end": 1112.42,
"duration": 8.759999999999991,
"speaker_id": 1
},
{
"text": "I think they can, like you can tell Claude code, fire up a sub-agent that uses Playwright to simulate a browser and all of that kind of thing. You'd have trouble getting it to run 24 hours a day. I mean, maybe it would work.",
"start": 1112.42,
"end": 1124.55,
"duration": 12.129999999999882,
"speaker_id": 0
},
{
"text": "Um, but certainly I think that what's interesting to me isn't so much the software you're using, it is these these big ideas, these these these techniques that you're using to try and answer these questions. Because even if your QA team, your virtual QA team says this is good, doesn't mean it's secure, right? It doesn't mean that you've got all of those other um characteristics you care about. At the same time, the agents are getting really good at security penetration testing now. And this is a new thing, I think in the past, again, in the past sort of three to six months, they've started being credible as security researchers, which is sending shockwaves through the security research industry. They're like, wow, we didn't think that they'd get to this point.",
"start": 1124.55,
"end": 1163.75,
"duration": 39.200000000000045,
"speaker_id": 0
},
{
"text": "What's interesting there is both OpenAI and Anthropic have specialist security models that they will not release to the general public because they can be used to break into websites. So they have like invite-only, like registered security researchers can apply for access and they've been producing um vulnerability reports against popular open source software. I think Firefox just a few days ago, maybe last week, said that they'd they'd done a release which was assisted by Anthropic. Anthropic had",
"start": 1163.75,
"end": 1195.72,
"duration": 31.970000000000027,
"speaker_id": 0
},
{
"text": "discovered a hundred like potential vulnerabilities in Firefox and responsibly reported them to Mozilla, who then fixed them. That's an interesting one as well because we're seeing a lot of this in the wild and it's it's just incredibly frustrating for maintainers because there are these people who don't know what they're doing who are asking ChatGPT to find a security hole and then reporting it to the maintainer. And the report looks good. Like ChatGPT can produce a very well-formatted report of a vulnerability. It's a total waste of time. Like it's not actually verified as being a real problem. The difference with Anthropic and Firefox is that Anthropic's security team actually did do the work.",
"start": 1195.86,
"end": 1235.27,
"duration": 39.41000000000008,
"speaker_id": 0
},
{
"text": "They didn't report whatever the agent said, they actually verified that it was a good quality report before before they handed it over.",
"start": 1235.27,
"end": 1241.38,
"duration": 6.110000000000127,
"speaker_id": 0
},
{
"text": "There's going to be a lot to talk about on the security side. You've done a lot of thinking and writing about the dangers there, but I want to follow this thread. So, in terms of what AI has been doing for teams, if you think about it's like it's kind of going on the middle and expanding. So it's like writing, you know, it's it's taking on more and more of the building components. It's doing code reviews now, QA as you've been describing, constantly building. And it feels like the front of that is the big now gap and opportunity, which is coming up with the idea, what the heck should we build? Because then once you tell the AI, build this thing, as you're describing, it's getting better and better at building something great. Have you had any luck yet with",
"start": 1241.38,
"end": 1280.12,
"duration": 38.73999999999978,
"speaker_id": 1
},
{
"text": "using AI there and do you think it starts to eat that and just becomes the strategy, you know, PM basically.",
"start": 1280.12,
"end": 1287.0,
"duration": 6.880000000000109,
"speaker_id": 1
},
{
"text": "So this is one of the most interesting problems we're having with all of this is we've taken the writing code bit and we've massively accelerated that. Now the bottlenecks are everywhere else, right? Like how do we redesign our processes now that the bit that used to take the longest, right? It used to be you'd come up with the spec and you hand it to your engineering team and three weeks later if you're lucky they'd come back with an implementation for you to then start. And now that that maybe that takes three hours, depending on how well established the coding agents are for that kind of thing. So now what, right? Now where else are the bottlenecks? I don't think it's, I mean, as coming with the initial ideas, um anyone who's done any product work knows that your initial ideas are always wrong. What matters is is proving them, right?",
"start": 1287.0,
"end": 1327.16,
"duration": 40.16000000000008,
"speaker_id": 0
},
{
"text": "it's it's it's it's testing them. We can test things so much faster now because we can build workable prototypes so much quicker. So there's an interesting thing I've been doing in my own work where any sort of feature that I want to design, I'll often prototype three different ways it could work, because that takes very little time and then I can start experimenting them and trying them and seeing which ones I like. And that that feels to me like the really transformational step here is that when you get AI involved in your ideation phase, it's much more about the prototypes. It's about, okay, we can see like a a a UI prototype is free.",
"start": 1327.16,
"end": 1363.35,
"duration": 36.18999999999983,
"speaker_id": 0
},
{
"text": "Now, ChatGPT and Claude will just build you a very convincing UI for anything that you describe, and that's how you should be working. I think anyone who's doing sort of product design and isn't vibe coding little prototypes is missing out on the the the latest, but like the the most powerful sort of boost that we get in that step. But then what do you do, right? How do you, given your three options now that you have instead of one option, how do you prove to yourself which one of those is the best? I don't have a confident answer to that. I expect this is where the good old-fashioned usability testing comes in. Like, get somebody on Zoom, screen shared, using your software, see what happens.",
"start": 1363.35,
"end": 1401.75,
"duration": 38.40000000000009,
"speaker_id": 0
},
{
"text": "that's you can tell the AI to do it and you can simulate your users with the AI. I don't think that's credible. I don't think you're going to get as good results from ChatGPT pretending to click around on your prototype than you would from an actual human being.",
"start": 1401.75,
"end": 1416.42,
"duration": 14.670000000000073,
"speaker_id": 0
},
{
"text": "This is so interesting. A question I've been tackling is just where are human brains going to continue to be valuable? And what I'm hearing here is there's like the initial idea. You made such a good point here. It's like the initial idea is often not the actual winning idea, it's just the beginning of an idea. So there's like the idea for the feature, then there's the try it out, prototype it, help you narrow on the direction, build it, make it awesome, get it out into the world. And it feels to me like AI is going to be really good at suggesting ideas and coming up with initial ideas. And I wonder if the human brain, like, it's not like maybe someday we don't need human brains at all and that's a whole other discussion. But maybe the next phase is...",
"start": 1416.42,
"end": 1454.45,
"duration": 38.02999999999997,
"speaker_id": 1
},
{
"text": "AI will help us come up with great ideas.",
"start": 1454.45,
"end": 1456.74,
"duration": 2.2899999999999636,
"speaker_id": 1
},
{
"text": "I mean, that's been the case for probably a couple of years now. They've been strong enough to do really good brainstorming. And I like to compare it to the thing where when you've got a group brainstorming exercise, you book a meeting room for an hour, you've got a whiteboard, you get a dozen people in, and the first two-thirds of that brainstorming session, honestly, it's kind of just everyone going through the most obvious basic ideas, right? And you get them all out on the whiteboard, you get them all out, and then things get interesting when you start saying, okay, well, let's talk about these...",
"start": 1456.74,
"end": 1483.84,
"duration": 27.09999999999991,
"speaker_id": 0
},
{
"text": "let's start combining them. But AI is so good at that first two-thirds of the ideas. Like, I brainstorm with them all the time where I just get them to spit out all of the obvious stuff and they'll come up with 20 things and they'll all be kind of done. Like they're they won't they won't be they just won't be very interesting. What gets interesting is when if you ask them for 20 more and now they by the sort of end of that list you're beginning to get things which are not good ideas but they point you in interesting directions. There are so many other tricks like this. Like, um, you can tell you can you can tell AI to combine weird fields. You can say, okay, I want ideas for marketing my new SaaS platform inspired by marine biology.",
"start": 1483.84,
"end": 1523.0,
"duration": 39.16000000000008,
"speaker_id": 0
},
{
"text": "and you see what happens. And most of it will be complete junk, but there might be a spark that gets you to the good idea. So I love them as as brainstorming companions on that front.",
"start": 1523.0,
"end": 1532.55,
"duration": 9.549999999999955,
"speaker_id": 0
},
{
"text": "That reminds me of a chat I had with David Plask. He's an expert naming person. He helps companies come up with names for products. And one of the things that he does at his company is he creates three teams to come to brainstorm names. One team, so for example, let's say uh Windsurf was a product they named. Um, so the first team is okay, this is an AI IDE thing. That's that's exactly what it is.",
"start": 1532.55,
"end": 1554.24,
"duration": 21.690000000000055,
"speaker_id": 1
},
{
"text": "Second team is, okay, this is a, this is a boat. You're naming a boat. and here's the constraints",
"start": 1554.24,
"end": 1559.42,
"duration": 5.180000000000064,
"speaker_id": 1
},
{
"text": "And then here, this is a a spaceship. So name it from that perspective. And he finds the best names come from those other directions where it's a different metaphor with the same sort of uh benefits. Um, okay. So what I'm hearing here is this is good. This is good for humans right now that there's still opportunity for us to contribute to the process.",
"start": 1559.42,
"end": 1577.45,
"duration": 18.029999999999973,
"speaker_id": 1
},
{
"text": "And actually, I want to stand in defense of software engineers for a bit because on the one hand, these things can write code. That used to be our thing, right? I'm finding that using coding agents well is taking every inch of my 25 years of experience as a software engineer and it is mentally exhausting. Like this is something which people are talking a lot more about now. I can fire up like four agents in parallel and have them work on four different problems and by like 11:00 a.m. I am wiped out for the day. Like I have because there is a limit on human cognition in how much, even if you're not reviewing everything they're doing, just how much you can hold in your head at one time.",
"start": 1577.45,
"end": 1617.0,
"duration": 39.549999999999955,
"speaker_id": 0
},
{
"text": "and it's very easy to pop that stack at the moment. Like, there's a sort of personal skill that we have to learn which is finding our new limits. Like, what is what is a responsible way for us to to to not burn out and for us to to use the time that we have. And I I've I've talked to a lot of people who are losing sleep because they're like, my coding agents could my agents could be doing work for me. I'm just going to stay up extra half hour and and set off a bunch of extra things and they're waking up at four in the morning. That's obviously unsustainable. I hope that that's a novelty thing. That agents only really got good in the past sort of four to five months. We're all learning what that looks like and what that lets us do. But it's it's it's concerning.",
"start": 1617.0,
"end": 1654.41,
"duration": 37.41000000000008,
"speaker_id": 0
},
{
"text": "there's an element of sort of gambling and addiction to to how we're using some of these tools. But to stand in defense of software engineers, I get great results out of these things because they are amplifiers of existing skills and experience. And I have 25 years of existing like pre-AI experience, which I can now amplify because I can talk to the agents at a very high level. I can use very I can use um sophisticated engineering like language that I've mastered over the years, which they appear to know as well. And we can collaborate incredibly effectively.",
"start": 1654.41,
"end": 1686.58,
"duration": 32.169999999999845,
"speaker_id": 0
},
{
"text": "And it means I can look at a problem and I can say, this problem is a one-sentence prompt and I know it'll find that bug and fix that bug, as opposed to this other problem which is, who knows how how big a problem. There is a flip side to this, which is that I've got 25 years of experience in how long it takes to build something, and that's all completely gone. Like that doesn't work anymore because I can look at a problem and say, okay, well, this is going to take two weeks, it's not worth it. And now it's like, yeah, but maybe it's going to take 20 minutes because the reason it was taking two weeks was all of the the sort of crafty coding things that the AI is now covering for us. And that I've been finding really interesting and challenging.",
"start": 1686.58,
"end": 1723.86,
"duration": 37.27999999999997,
"speaker_id": 0
},
{
"text": "I I constantly throw tasks at AI that I don't think it'll be able to do because every now and then it does it. And when it doesn't do it, you learn, right? You learn, okay, Opus 4.6 still can't do this particular thing, but when it does do something, especially something the previous models couldn't do, that's actually cutting-edge AI research. You can be the first person in the world to spot that AI can now do X just because you were the person you you found it couldn't do it and you've you've been keeping that sort of backlog of of interesting tasks for it.",
"start": 1723.86,
"end": 1752.44,
"duration": 28.580000000000155,
"speaker_id": 0
},
{
"text": "This is such an interesting line of discussion. This idea that, let's say 10x engineers, to use that phrase, are going to be more valuable is what you're describing here because you can work with these tools much more effectively. What do you think of junior engineers? Just like what's happening there, what's their future?",
"start": 1752.44,
"end": 1769.22,
"duration": 16.779999999999973,
"speaker_id": 1
},
{
"text": "So, there's an interest. So Thoughtworks, um the big um like uh IT consultancy did a offsite a few uh about a month ago and they produced they got a whole bunch of engineering VPs in from different companies to talk about this stuff. And one of the interesting theories they came up with is they think this stuff is really good for experienced engineers. Like it amplifies their skills.",
"start": 1769.22,
"end": 1789.36,
"duration": 20.139999999999873,
"speaker_id": 0
},
{
"text": "That's great.",
"start": 1789.36,
"end": 1789.86,
"duration": 0.5,
"speaker_id": 0
},
{
"text": "It's really good for new engineers because it solves so many of those onboarding problems. Like, if you talk to um Cloudflare and Shopify, both said they were hiring a thousand interns over the course of 2025 because the intern onboarding costs, it used to be takes a month before you enter can do anything useful. Now they're doing something useful within like a week because the the AI assistant helps them get up and running faster. The problem is the people in the middle. Like if you're mid-career, if you haven't made it to sort of super senior engineer yet, but you're not sort of new either, that's the that's the group which Thoughtworks which Thoughtworks resolved were probably in the most trouble right now.",
"start": 1789.86,
"end": 1829.09,
"duration": 39.23000000000002,
"speaker_id": 0
},
{
"text": "like that's the open question because they don't have that expertise to to to to amplify and and use with these tools. And it's not as benefit like they've got all of the the boosts that the beginners were getting they've got already. So that's an interesting open question right now for me is it's more the the the sort of mid mid-level as opposed to the beginners or the the advanced people.",
"start": 1829.09,
"end": 1847.79,
"duration": 18.700000000000045,
"speaker_id": 0
},
{
"text": "It's so interesting how AI is coming at the middle of so many things. It's coming at the middle of the product development process, it's coming at the middle of seniority. It's probably other examples. And I'm guessing this is true for all functions, like PMs, designers too, just new PMs, designers, maybe because being AI native basically is what you're describing. And and ramping up much more quickly. I guess while we're on this topic, say you are a lot of listeners here are just like those people in the middle. What would your advice be to them to help them avoid becoming a part of the permanent underclass?",
"start": 1847.79,
"end": 1880.16,
"duration": 32.37000000000012,
"speaker_id": 1
},
{
"text": "That's a big responsibility you're putting on me there. Um, I think, I think the way forward is to lean into this stuff and figure out how do, how do I help this make me better? Right? Like, a lot of people worry about skill atrophy, you know, if the AI's doing it for you, you're not learning anything. I think if you're worried about that, you push back at it. Like, you have to be mindful about how you're applying the technology and think, okay, I've been given this thing that can answer any question and often gets it right, doesn't always get it gets it right. How can I use this to amplify my own skills, to to learn new things, to take on much more ambitious projects?",
"start": 1881.32,
"end": 1920.99,
"duration": 39.67000000000007,
"speaker_id": 0
},
{
"text": "Something I've been enjoying, I think the thing I've enjoyed most about this as a software engineer is that my level of ambition has shot right up. Because now, I used to like never, I never used AppleScript because AppleScript is a whole programming language you have to learn. And I've been using AppleScript for like two and a half years now because ChatGPT knows AppleScript and I don't have to. And so now I can automate things on my Mac. And that's great, you know. Um, and previously, the fact that it would take me like two or three months to learn basic AppleScript was enough for me never to use it. And now I've got all of these technologies that I'm using because that two to three month initial learning curve has been shaved right down. I think that applies to everything else.",
"start": 1920.99,
"end": 1959.3,
"duration": 38.309999999999945,
"speaker_id": 0
},
{
"text": "like, I'm getting much better at cooking. I've been using Claude, it turns out, excellent chef, which doesn't make sense because it can't, it doesn't have taste buds. But it does, it can give you the global average of the world's guacamole recipes, which turns out is good guacamole. So, that's been really interesting, like trying to apply this stuff just to for sort of self-improvement. I think that's a really useful skill to have. Because honestly, everything is changing so fast right now, the only universal skill is being able to roll with the changes. Right? That's the thing that we all need.",
"start": 1959.3,
"end": 1991.51,
"duration": 32.210000000000036,
"speaker_id": 0
},
{
"text": "Weirdly, um the term that comes up most in these conversations about how you can be great with AI is agency, right? People, human beings have agency and we use that agency to decide what problems to take on and where to go. I think agents have no agency at all. Like, I would argue that the one thing AI can never have is agency because it doesn't have human motivations. Like, sure you can tell it make more money or whatever, but it's never going to be able to decide on its, like what makes sense for it to act on next.",
"start": 1991.51,
"end": 2023.38,
"duration": 31.87000000000012,
"speaker_id": 0
},
{
"text": "So I'd say that's the thing is to invest in your own agency and invest in how do I use this technology to get better at what I do and to do new things.",
"start": 2023.38,
"end": 2032.57,
"duration": 9.189999999999827,
"speaker_id": 0
},
{
"text": "And also to your point, be ambitious, think big. Yeah, there's an interview with Jensen that just came out yesterday where people asked him about layoffs, there's all these layoffs happening, uh is AI actually taking jobs? And he's like, the reason a lot of these companies are not are letting people go is they don't have enough creativity or ambition for what they can do with all of these resources they're because they're not letting people go, they have so much they want to do. You know, obviously easier said than done and it's not always the case, but I think that's an interesting way of approaching it. Now that we have this power, people almost underestimate what they can do with it and don't fully lean into it. So I love this advice of just try to be a little more ambitious, try to stuff that you think is impossible and see you might be actually possible.",
"start": 2032.57,
"end": 2072.02,
"duration": 39.450000000000045,
"speaker_id": 1
},
{
"text": "My New Year's resolution this year was the opposite. Every previous year I've always told myself this year I'm going to focus more, I'm going to take on less things. This year, my ambition was take on more stuff and be more ambitious. Like we've got these tools...",
"start": 2072.02,
"end": 2086.09,
"duration": 14.070000000000164,
"speaker_id": 0
},
{
"text": "Bring it all in. Let's try and do everything. I don't know if that was a good New Year's resolution, but that's what I went with.",
"start": 2086.09,
"end": 2092.55,
"duration": 6.460000000000036,
"speaker_id": 0
},
{
"text": "How's it going so far? How do you feel about this decision?",
"start": 2092.55,
"end": 2095.35,
"duration": 2.799999999999727,
"speaker_id": 1
},
{
"text": "I'm enjoying myself. I think I'll probably get to the end of the year and I'll be like, wow, the thing the most important things that I should have been focusing on did not get done. But that's that's the case when it is my ambition to do them. So, you know.",
"start": 2095.35,
"end": 2106.77,
"duration": 11.420000000000073,
"speaker_id": 0
},
{
"text": "It's a a converge diverge sort of situation, you know? next year could be refocused.",
"start": 2106.77,
"end": 2111.48,
"duration": 4.710000000000036,
"speaker_id": 1
},
{
"text": "Absolutely, yeah.",
"start": 2111.48,
"end": 2112.72,
"duration": 1.2399999999997817,
"speaker_id": 0
},
{
"text": "Kind of along those lines though, I want to come back to this point you made about how you're you're working harder and you're like fried early in the day. This is such an interesting, uh, I don't know, contradiction almost. Uh, people, you know, AI is supposed to make us more productive. It's supposed to give us more time off, it's supposed to let us sit around and watch Netflix and do all the create wealth and productivity in the world. It feels like the people that are most AI filled are working harder than they've ever worked. There's this anxiety you described of my agents aren't running, I got to stay on top of them.",
"start": 2112.72,
"end": 2140.91,
"duration": 28.190000000000055,
"speaker_id": 1
},
{
"text": "What do you think is going on there? Is this just like you said, maybe it's like a temporary novelty thing and then we'll be like, all right, I don't need to be this productive. Is there anything else there?",
"start": 2140.91,
"end": 2148.2,
"duration": 7.289999999999964,
"speaker_id": 1
},
{
"text": "I think I I really hope it's a novelty thing. And I am actually getting much more, I'm getting more time, but I'm I'm exhausted.",
"start": 2148.2,
"end": 2155.85,
"duration": 7.650000000000091,
"speaker_id": 0
},
{
"text": "Like your brain is exhausted.",
"start": 2155.85,
"end": 2157.41,
"duration": 1.5599999999999454,
"speaker_id": 1
},
{
"text": "like my brain is exhausted. I've got I've got more time to go and do things and I do things and it's great, but it's it is that the exhaustion from that sort of intensity of work has been a really big surprise for me. Like that that's been been something which I've I've I've I've been observing especially since November, like as as all of the stuff stuff started ramping up. And yeah, I think that's um the concern there comes down, it's always expectations from other people, you know, if you work for a company that's that's expecting you to get five times more done, that's going to be exhausting. And um and maybe we'll see, I think the good companies with good management are paying attention to this.",
"start": 2157.41,
"end": 2194.44,
"duration": 37.0300000000002,
"speaker_id": 0
},
{
"text": "they don't want to burn out their best employees for the sort of the short-term gain but but lose people over it. But yeah, it's it's a big tension. I think we're we're those of us on the sort of leading edge of the AI boom are feeling it first. I imagine it's going to come for everyone else as well.",
"start": 2194.44,
"end": 2209.75,
"duration": 15.309999999999945,
"speaker_id": 0
},
{
"text": "The other element of this though that we haven't mentioned is, and you've mentioned a couple times, it's actually really fun. The drive here is not",
"start": 2209.75,
"end": 2215.32,
"duration": 5.570000000000164,
"speaker_id": 1
},
{
"text": "I have enjoyed myself so much. Absolutely. It's so fun. It's um",
"start": 2215.32,
"end": 2219.51,
"duration": 4.190000000000055,
"speaker_id": 0
},
{
"text": "A lot of my friends have been talking about how they have this backlog of side projects, right? For the last 10, 15 years, they've got projects they never quite finished and ideas they thought would be cool. And some of them are like, \"Well, I've done them all now.\" Like last couple of months, I just went through and every evening I'm like, \"Let's take that project and finish it, and that one and that one and that one and that one.\" And they almost feel a sort of sense of loss at the end where they're like, \"Well, okay, my backlog's gone. Now what am I going to build?\"",
"start": 2219.51,
"end": 2243.88,
"duration": 24.36999999999989,
"speaker_id": 0
},
{
"text": "Yeah, it comes back to that factory. I was talking to the founder of Linear the other day and this idea of the factory, and we were just like, like a factory doesn't sound like a place that'll create amazing products. It feels like, you know, like what are the chances that'll create something beautiful and innovative? So either that's the wrong word or it's just this will lead to bad stuff probably.",
"start": 2243.88,
"end": 2262.84,
"duration": 18.960000000000036,
"speaker_id": 1
},
{
"text": "I feel like the word artisanal does like like artisanal to handcrafted software, I think is going to be valued more. Something I've noticed in my own work is sometimes I'll have an idea for a piece of software, a Python library or whatever, and I can knock it out in like an hour and get to a point where it's got documentation and tests and all of those things and it looks like the kind of software the previous I would have spent several weeks on. And I can stick it up on GitHub and everything. And yet, I don't believe in it. And the reason I don't believe in it is that I I got to rush through all of those things. I think the quality is probably good, but I haven't spent enough time with it to to feel confident in that quality.",
"start": 2262.84,
"end": 2301.4,
"duration": 38.559999999999945,
"speaker_id": 0
},
{
"text": "Most importantly, I haven't used it yet.",
"start": 2301.4,
"end": 2303.48,
"duration": 2.0799999999999272,
"speaker_id": 0
},
{
"text": "Like it turns out, when I'm using somebody else's software, the thing I care most about is I want them to have used it for for months, right? I want other people to have put that software into practice. So I've got some very cool software that I built that I've never used. Like it was so it was quicker to build it than to actually try and use it. So the way I've been dealing with that is I always put alpha on it. Like if you see my software and it says it's an alpha, that probably means I haven't actually used it yet for most of my projects, which is a bit of a cheat code, you know, um alpha alpha this.",
"start": 2303.48,
"end": 2332.85,
"duration": 29.36999999999989,
"speaker_id": 0
},
{
"text": "But isn't that interesting? like like like it used to be if you looked at software and it had high quality tests and documentation and everything it meant it was good and now that signal is gone.",
"start": 2332.85,
"end": 2341.86,
"duration": 9.010000000000218,
"speaker_id": 0
},
{
"text": "It's almost like we need a proof of work for this versus the blockchain. A proof of usage.",
"start": 2341.86,
"end": 2346.42,
"duration": 4.559999999999945,
"speaker_id": 1
},
{
"text": "Yes, exactly.",
"start": 2346.42,
"end": 2347.82,
"duration": 1.400000000000091,
"speaker_id": 0
},
{
"text": "Oh man.",
"start": 2347.82,
"end": 2348.82,
"duration": 1.0,
"speaker_id": 1
},
{
"text": "On this note of handcrafted code, I don't know if you know this, this is so interesting. Data labeling companies are buying old GitHub repos of handwritten code to train their models on, and they're paying a lot of money for like artisanal human-written code.",
"start": 2348.82,
"end": 2364.59,
"duration": 15.769999999999982,
"speaker_id": 1
},
{
"text": "Oh, that's fascinating. That's the um, uh, the the pre-um World War Two ab- uh, the the the metal that you can dig up from old shipwrecks, which is before the nuclear, the first nuclear explosions and so it's it's not got like the the the the radiation baked into the metal. It's that whole thing.",
"start": 2364.59,
"end": 2382.12,
"duration": 17.529999999999745,
"speaker_id": 0
},
{
"text": "Wow",
"start": 2382.12,
"end": 2382.68,
"duration": 0.5599999999999454,
"speaker_id": 1
},
{
"text": "That's fascinating.",
"start": 2382.68,
"end": 2383.72,
"duration": 1.0399999999999636,
"speaker_id": 0
},
{
"text": "Yeah, so they're looking for code pre-2022, I think, whenever ChatGPT kind of emerged.",
"start": 2383.72,
"end": 2388.78,
"duration": 5.0600000000004,
"speaker_id": 1
},
{
"text": "Wow.",
"start": 2388.78,
"end": 2389.9,
"duration": 1.1199999999998909,
"speaker_id": 0
},
{
"text": "So if you've got some, you can make a you can make a fortune.",
"start": 2390.2,
"end": 2393.26,
"duration": 3.0600000000004,
"speaker_id": 1
},
{
"text": "Uh, problem is I open source all my stuff, so it's already out there. It's it's in the training, it's it's been used to train the models already.",
"start": 2393.26,
"end": 2399.36,
"duration": 6.099999999999909,
"speaker_id": 0
},
{
"text": "a lot of stuff already Yep. Oh man.",
"start": 2399.36,
"end": 2401.68,
"duration": 2.319999999999709,
"speaker_id": 1
},
{
"text": "Okay, let me ask you this question. I'm just curious about this prediction. I know you're not like a prediction person, although you do make predictions and you seem to be right often. When do you think 50% of engineers in the world will be AI will be writing 100% of their code? How close to that do you think we are?",
"start": 2401.68,
"end": 2417.26,
"duration": 15.580000000000382,
"speaker_id": 1
},
{
"text": "So, I'm going to refactor that to 95% of their code. I don't think yeah, we'll get to that, but yeah.",
"start": 2417.26,
"end": 2422.96,
"duration": 5.699999999999818,
"speaker_id": 0
},
{
"text": "It's very difficult to say worldwide because partly because there are cultural differences. Um, I have spent way too much time on Hacker News and something I've noticed about Hacker News is a conversation that starts at midnight Pacific time and goes until 8:00 a.m., very different tone because it's the Europeans. Right? If you'll get the and the Europeans are a lot more AI skeptic than the Americans are generally.",
"start": 2423.79,
"end": 2448.72,
"duration": 24.929999999999836,
"speaker_id": 0
},
{
"text": "So",
"start": 2448.72,
"end": 2449.76,
"duration": 1.0400000000004184,
"speaker_id": 0
},
{
"text": "I think different countries are going to have different sort of um different cultures around this. At the same time, I think it's become undeniable this year that this stuff produces good code. Like it used to be that you could say, I don't use this stuff because the code is bad. And that was a a justifiable position. That's not justifiable anymore. The code is now good. It's good code for for the my for my definition of good code at least.",
"start": 2449.76,
"end": 2473.7,
"duration": 23.9399999999996,
"speaker_id": 0
},
{
"text": "So",
"start": 2473.7,
"end": 2474.72,
"duration": 1.0199999999999818,
"speaker_id": 0
},
{
"text": "So we're saying 50% of engineers, majority, let's say 50% of engineers' majority of their code, it could happen by the end of this year. It could. Because the the the the technology is good enough now and I feel like the the challenge now is getting people to learn how to use this stuff, which is difficult because using the stuff, everyone's like, oh, it must be easy, it's just a chatbot.",
"start": 2474.72,
"end": 2496.63,
"duration": 21.91000000000031,
"speaker_id": 0
},
{
"text": "It's not easy. Like that's one of the great misconceptions in AI is that using these tools effectively is is is easy. It takes a lot of practice and it takes a lot of trying things that didn't work and trying things that did work. But yeah, I I I expect by the end of this year it will not be uncommon to have an engineer say that almost all of their code is written by AI.",
"start": 2496.63,
"end": 2515.58,
"duration": 18.949999999999818,
"speaker_id": 0
},
{
"text": "That was the same rough idea I had and how crazy is that? How quickly this job has changed and what is possible. And I think people, this is a good example of people underestimate how quickly things can change. Like, we would not have, like, I think Dario was predicting this a year or two ago, just that 100% of code is going to be written by AI and we're just like...",
"start": 2515.58,
"end": 2534.1,
"duration": 18.519999999999982,
"speaker_id": 1
},
{
"text": "We we all laughed at him. Yeah.",
"start": 2534.1,
"end": 2535.42,
"duration": 1.3200000000001637,
"speaker_id": 0
},
{
"text": "Right? Exactly. What are you talking about? so bad, so bad at writing code and and this might come for other jobs that people don't see coming which is scary and interesting and exciting.",
"start": 2535.42,
"end": 2546.45,
"duration": 11.029999999999745,
"speaker_id": 1
},
{
"text": "It's honestly the the I'm I'm not an AI doomer in the slightest. The economics of it do make me nervous. Like, are we really going to wipe out like a tenth of white-collar knowledge work jobs in the next few years? I really hope not because I don't know how the economy adapts that, you know. So yeah, that's complicated.",
"start": 2546.45,
"end": 2566.14,
"duration": 19.690000000000055,
"speaker_id": 0
},
{
"text": "Yeah, I'm actually I'm doing a report that's coming out, it'll come out ahead of this episode, uh looking at the job market in tech. And surprisingly, just at tech companies, we're at the highest number of open engineering roles, open PM roles.",
"start": 2566.14,
"end": 2580.57,
"duration": 14.430000000000291,
"speaker_id": 1
},
{
"text": "Interesting.",
"start": 2580.57,
"end": 2581.79,
"duration": 1.2199999999998,
"speaker_id": 0
},
{
"text": "except for during the crazy peak during COVID. So it's kind of like coming back to that. Basically, it's the highest number of open roles in three and a half ish years for engineers and PMs at tech companies globally.",
"start": 2582.11,
"end": 2594.66,
"duration": 12.549999999999727,
"speaker_id": 1
},
{
"text": "That's very interesting.",
"start": 2595.34,
"end": 2596.97,
"duration": 1.6299999999996544,
"speaker_id": 1
},
{
"text": "It's funny, isn't it? because um you get all of these headline-grabbing like um layoffs. Uh yeah, um was it was it Block that laid off 4,000 people recently? Yeah, yeah. But the the the the the",
"start": 2596.97,
"end": 2608.61,
"duration": 11.640000000000327,
"speaker_id": 0
},
{
"text": "Question there is always how much of that is AI and how much of it is um overhiring during COVID and re-corrections and all of that kind of thing. And it's always very difficult to tell. So that the the number of open jobs, on the one hand, maybe that's a better signal, but on the other hand, the recruitment market has been driven completely crazy by all of this stuff, right? Like all of the job ads are written by AI, the um the the the resumes are AI. People people in recruitment are saying that this is it's never been this hard to filter through and hire people. And people who are hiring jobs say they they applied to 200 things and got nobody hearing back. So it's hard, right? The the the the macroeconomic indicators for this stuff are lagging.",
"start": 2608.61,
"end": 2647.8,
"duration": 39.190000000000055,
"speaker_id": 0
},
{
"text": "And at some point we should start getting more confident numbers about what the impact actually is.",
"start": 2647.8,
"end": 2653.27,
"duration": 5.4699999999998,
"speaker_id": 0
},
{
"text": "Yeah, interestingly the number of recruiter open roles is also approaching like record numbers.",
"start": 2653.27,
"end": 2658.87,
"duration": 5.599999999999909,
"speaker_id": 1
},
{
"text": "Hilarious",
"start": 2658.87,
"end": 2659.77,
"duration": 0.900000000000091,
"speaker_id": 0
},
{
"text": "which is an interesting leading indicator of demand for hiring. So there's interesting trends in spite of the layoffs. So yeah, what a what a wild world. Um, so you've mentioned this uh book you're working on. This is the agentic engineering patterns stuff, right? Yes. Okay, cool.",
"start": 2659.77,
"end": 2674.68,
"duration": 14.909999999999854,
"speaker_id": 1
},
{
"text": "So I want to talk about this. So you pointed out, people think it's easy to build with AI. It's like, oh, it's going to do all these things for us. What are we going to do all day?",
"start": 2674.68,
"end": 2681.25,
"duration": 6.570000000000164,
"speaker_id": 1
},
{
"text": "To your point, it's actually not. There's a lot of very specific skills you need to do this well. And you're putting them together on your blog. We'll point to it. I want to talk through a few of them to help people do this better. So one is this idea of just writing code is cheap now. You talked touched on this a bit. Maybe just share why this is such an important thing to know and and keep in mind.",
"start": 2681.25,
"end": 2700.98,
"duration": 19.730000000000018,
"speaker_id": 1
},
{
"text": "So I think this is the single biggest shock in all of this. The reason that we have to rethink how we build, how we work as software engineers, is that the thing that used to take the time takes way less time. Like it's it's never been the case that programmers spend 90% of their day typing code into a computer. There's always there's so much additional work around that. But it still used to be like people talk about how important it is not to interrupt your coders, right? Your coders need to have like solid two to four hour blocks of uninterrupted work so they can spin up their mental model and and churn out the code. It's so that that's changed completely.",
"start": 2700.98,
"end": 2735.0,
"duration": 34.01999999999998,
"speaker_id": 0
},
{
"text": "like I my my programming work, I need two minutes every now and then to prompt my agent about what to do next and then I can do the other stuff and I can go back. I'm much more interruptible than I used to be. But yeah, so the thing that used to take the time is now the thing that takes way way less time. What does that mean for everything else that we do? And that doesn't just affect programmers, it affects entire like teams of teams around around software development. But as an individual programmer, you have to start thinking, okay, I can churn out 10,000 lines of code now in the time that would take me to write a hundred. How do I make that code good, right?",
"start": 2735.0,
"end": 2770.65,
"duration": 35.65000000000009,
"speaker_id": 0
},
{
"text": "How do I make sure that I'm not just turning out total slop that that adds up to technical debt that slows me down? How do I take the fact that code is now cheap and use that to produce better code? Because I don't don't just want cheap code, I want really good code that does what I need it to do, that I can extend in the future, that's got all of those um those characteristics of of of of code that that's that's useful and and can be used in production.",
"start": 2770.65,
"end": 2794.59,
"duration": 23.940000000000055,
"speaker_id": 0
},
{
"text": "The point you made earlier, I think, is a really important one along these lines, which is when you start a project, you fire off three different versions of it, and that helps you pick a direction, and that's only possible because code is so cheap now, right?",
"start": 2794.59,
"end": 2805.42,
"duration": 10.829999999999927,
"speaker_id": 1
},
{
"text": "Right, prototyping is almost free. I think. And that really impacts me because throughout my entire career, my superpower has been prototyping. Like, I am very, I've been very quick at knocking out working prototypes of things. I'm the person who can show up at a meeting and say, look, here's how it could work. And that's that was kind of my my unique selling point. And that's gone.",
"start": 2805.48,
"end": 2825.72,
"duration": 20.23999999999978,
"speaker_id": 0
},
{
"text": "now anyone can do what I could do, you know, it's like but but but it does but you still have to learn when it's appropriate to prototype, how to think about prototyping, how to get the tools to build useful prototypes that you can you can use to explore things.",
"start": 2825.72,
"end": 2838.34,
"duration": 12.620000000000346,
"speaker_id": 0
},
{
"text": "I am so excited to tell you about this season's supporting sponsor, Vanta. Vanta helps over 15,000 companies like Cursor, Ramp, Duolingo, Snowflake, and Atlassian earn and prove trust with their customers. Teams are building and shipping products faster than ever thanks to AI. But as a result, the amount of risk being introduced into your product and your business is higher than it's ever been. Every security leader that I talk to is feeling the increasing weight of protecting their organization, their business, and not to mention their customer data.",
"start": 2838.34,
"end": 2872.06,
"duration": 33.7199999999998,
"speaker_id": 2
},
{
"text": "Because things are moving so fast, they are constantly reacting, having to guess at priorities, and having to make do with outdated solutions. Vanta automates compliance and risk management with over 35 security and privacy frameworks, including SOC 2, ISO 27001, and HIPAA. This helps companies get compliant fast and stay compliant. More than ever before, trust has the power to make or break your business. Learn more at vanta.com/leni. And as a listener of this podcast, you get $1,000 off Vanta. That's vanta.com/leni.",
"start": 2872.06,
"end": 2906.76,
"duration": 34.70000000000027,
"speaker_id": 2
},
{
"text": "I'm going to take a tangent. What's what's kind of in your stack, your AI stack? What models are you using most? What tools do you find useful?",
"start": 2906.76,
"end": 2913.28,
"duration": 6.519999999999982,
"speaker_id": 1
},
{
"text": "So right now, I'm mostly Claude. Um, I do a huge amount of work using Claude Code. Well, I'm I'm mainly still a Claude Code person, but there are two sides of Claude Code that I use. There's the Claude Code that runs on your computer, and then there's Claude Code for web, which is their hosted version of Claude Code. And I use that one more than the one on my own computer. Partly because that's the one you can access through your phone. If you've got the Anthropic Claude app installed on an iPhone, there's a code tab and you can go in there and you can tell it to write you things. And that is running on their servers. Um, you need to give it a GitHub repository of yours that it can work within.",
"start": 2913.28,
"end": 2952.59,
"duration": 39.309999999999945,
"speaker_id": 0
},
{
"text": "But it's also great from a security point of view because if you're running cloud code on your laptop, there's risks that bad things can happen. It might accidentally delete things. If I'm running on an Anthropic server, I couldn't care less. Like, it's their computer, it's not my computer.",
"start": 2952.59,
"end": 2966.72,
"duration": 14.129999999999654,
"speaker_id": 0
},
{
"text": "Go wild. So this means that you can run these things in uh in YOLO mode. This is uh Claude calls it dangerously skip permissions. OpenAI actually do call it YOLO.",
"start": 2966.72,
"end": 2977.19,
"duration": 10.470000000000255,
"speaker_id": 0
},
{
"text": "they've got an option for that. And that's the mode where the agent doesn't ask you if it should do something all the time. And that is a different product. I think a lot of people who haven't got on board with coding agents yet haven't tried them in the unsafe mode. They're using a coding agent where it's like, oh, can I run this piece of code? Can I edit this file? And that means you have to pay complete attention to it the whole time. And it's like working with a really frustrating toddler that's constantly nagging you about what it wants to do. The moment you take the safeties off, now I can run four of them and go and have like go and go and have a cup of tea and come back and they've they've achieved something useful for me. But it's inherently unsafe.",
"start": 2977.19,
"end": 3015.75,
"duration": 38.559999999999945,
"speaker_id": 0
},
{
"text": "If it's running in Claude Code for Web, the only bad thing that can happen is maybe it accidentally leaks your private source code. And my code is all open source, so I don't care. But that's that's a useful trick there. But yeah, so I use that on my phone. I often have two or three of those running. A lot of my major projects are done mostly prompting on my phone. If it's security adjacent or super important, I might pull it down to my laptop to do a thorough review later on. But most of the review you can do through GitHub. Like these things will file pull requests and then you use the same tools you'd use to review code from other people to review the code from the agents.",
"start": 3015.75,
"end": 3052.22,
"duration": 36.4699999999998,
"speaker_id": 0
},
{
"text": "That said, OpenAI came out with GPT 5.4 about three weeks ago. It's very, very, very good. I think it's on par with Claude Opus 4.6 and possibly even better. These companies are constantly leapfrogging each other. So I have been using leaning back...",
"start": 3052.22,
"end": 3067.9,
"duration": 15.680000000000291,
"speaker_id": 0
},
{
"text": "It's also cheaper. So I've been leaning on GPT-5.4 a lot more this month. Um, and OpenAI Codex. And OpenAI Codex and Claude Code are almost almost indistinguishable from each other now. They're both very, very good pieces of software.",
"start": 3067.9,
"end": 3081.74,
"duration": 13.83999999999969,
"speaker_id": 0
},
{
"text": "And I kind of expect this to happen, like the next Gemini model comes out might become the best coding model for a couple of months, in which case I might switch myself into that ecosystem. Partly because I write about this stuff as well, I like to stay familiar with as many of the the offerings as possible. But I keep on coming back to Claude Code, mainly because it fits my taste. Like there's this weird thing where I've got a very specific taste in how I like code to work, which coincidentally happens to map to how Claude Code likes to work. Which is kind of interesting. And GPT 5.4, it's almost matches my taste, but not quite.",
"start": 3082.28,
"end": 3117.22,
"duration": 34.9399999999996,
"speaker_id": 0
},
{
"text": "And maybe that's because I've just spent more time with Claude, so my prompting style has evolved more to fit the Claude way of thinking. I don't know. This stuff's all so weird. It's vibes all the way down.",
"start": 3117.22,
"end": 3126.44,
"duration": 9.220000000000255,
"speaker_id": 0
},
{
"text": "That is so interesting. So the taste is the code, the quality of the code it puts out is is what you're talking about, not like the conversation and the the",
"start": 3126.44,
"end": 3134.02,
"duration": 7.579999999999927,
"speaker_id": 1
},
{
"text": "Absolutely don't care about how they talk to me. Like I'm I'm I'm I'm using them to to get stuff done. Yeah.",
"start": 3134.02,
"end": 3139.16,
"duration": 5.139999999999873,
"speaker_id": 0
},
{
"text": "Yeah, because I was thinking as you're talking, what is the thing that will get someone to stick with a model? And it could be what you're describing, the qual- the way it writes code, it could be the UX, it could be the conversation, the vibes.",
"start": 3139.32,
"end": 3152.95,
"duration": 13.629999999999654,
"speaker_id": 1
},
{
"text": "The stickiest thing is meant to be memory. Like the the all of the they they all have these features where they will remember things about you and and I hate those features and I turn them off wherever I can because mainly because as an AI researcher, I need to see what everyone else sees when I'm prompting. Like I don't want to say to the world, oh my goodness, look, this thing works now and it turns out it only works for me because it's based on previous like previous conversations that I've had. And maybe I'm missing out on something really important there. But the um the memory feature is is is that thing that all of the labs are trying to be more sticky with.",
"start": 3152.95,
"end": 3187.54,
"duration": 34.590000000000146,
"speaker_id": 0
},
{
"text": "That said, um when the whole the the OpenAI military stuff happened a few weeks ago, Anthropic tried took advantage by saying, hey, why don't you move to Claude? And the way they did that is they had a Claude onboarding page that said, transfer your memories from ChatGPT by clicking this button and then pasting it into ChatGPT. And it was just a prompt. They had a prompt which was, hey, ChatGPT, tell me everything that you've remembered about me. And so you paste that prompt into ChatGPT and it gives you all of your the the the the memories and then you paste them into Claude. And I thought that was hilarious. Like a a whole...",
"start": 3187.54,
"end": 3223.57,
"duration": 36.0300000000002,
"speaker_id": 0
},
{
"text": "export like move from one to the other just by prompting it to to give you the information you needed.",
"start": 3223.57,
"end": 3228.39,
"duration": 4.819999999999709,
"speaker_id": 0
},
{
"text": "Yeah, that was like it always felt like that was hard to extract and they made it so easy. And that was such a moment for Anthropic. They went they were like the number one app in the App Store, such a interesting, not what you'd expect when they were being banned by the government, essentially. Right. Um, is there any any other AI tools that you find really useful just kind of along the side, like Jasper Flow, anything along those lines?",
"start": 3228.39,
"end": 3249.65,
"duration": 21.26000000000022,
"speaker_id": 1
},
{
"text": "So I use Claude for code for the code stuff.",
"start": 3249.65,
"end": 3252.89,
"duration": 3.2399999999997817,
"speaker_id": 0
},
{
"text": "The other thing that I use a lot of is for research. Like, and this is this thing where a couple of years ago, if you told me that you were replacing use of Google with ChatGPT, I'd assume that you just didn't understand how this technology works and its limitations because that was a terrible idea. Now that all of the major models have really good search integration, they're just better at searching than I am. I can ask them a question and watch them fire off five searches in parallel for like aspects of answering that question, pull the data back. And I'll, if it's something I'm going to publish, I always double-check, make sure it didn't hallucinate a detail because that would be embarrassing. But honestly, most of like I hardly use Google search directly at all.",
"start": 3252.89,
"end": 3290.44,
"duration": 37.55000000000018,
"speaker_id": 0
},
{
"text": "I'm always using it via, I'm doing searches via Claude or via ChatGPT or sometimes via the Gemini app. Like that that's that's a good option as well. And then I mean, for image generation, I'm using Gemini because of Nano Banana, but I only use that for fun. Like I I I don't publish images I generate, I use them for cranks. And that's great. Like that's deeply entertaining.",
"start": 3290.44,
"end": 3312.02,
"duration": 21.579999999999927,
"speaker_id": 0
},
{
"text": "I wasn't planning to go here, but you're you famously created the uh pelican riding a bike benchmark for the quality of imagery. Yeah. Uh, anything there that might be worth sharing?",
"start": 3312.02,
"end": 3321.84,
"duration": 9.820000000000164,
"speaker_id": 1
},
{
"text": "So this one's fascinating. Like, it was about a year and a half ago, I started benchmarks. So there were lots of benchmarks of these models and there were all these numeric things, like it scored 72% on Terminal Bench or whatever. And those always frustrated me because they don't really tell you anything interesting. Like, if this one got 74 and this one got 72, does that actually mean that one of them's better at something than the other? And so, basically to make fun of the benchmarks, I started my own benchmark which was generate an SVG of a pelican riding a bicycle. And it's an SVG. This isn't a test of the image models. This is a test of the text models because they can all output SVG code. And if you ask them to draw you an SVG of something,",
"start": 3321.84,
"end": 3361.82,
"duration": 39.98000000000002,
"speaker_id": 0
},
{
"text": "They're almost universally terrible because they don't have good spatial reasoning and like drawing things by plotting out vectors is difficult anyway. So I started getting the models to render generate an SVG of a pelican on a bicycle because then you can look at them. You can say, here's one, here's one model, here's the other, which is best? And the weirdest thing happened where there appears to be a very strong correlation between how good their drawing of a pelican riding a bicycle is and how good they are at everything else. And nobody can explain to me why that is, but as I started looking at these things, I realized, wow, the better models really do draw better pelicans riding a bicycle. Because it's got to the point now, it's a meme.",
"start": 3361.82,
"end": 3398.85,
"duration": 37.029999999999745,
"speaker_id": 0
},
{
"text": "the the the the AI labs are all very aware of this and they they they relish in how good their pelicans riding a bicycle are. The other day, OpenAI released GPT 5.4 Mini and Nano at five different thinking levels that you can have them do low thinking, medium thinking, high thinking. So I did a grid of 15 pelicans riding bicycles for the three GPT 5.4 models across the things. And sure enough, GPT 5.4 running at X high did draw the best pelican.",
"start": 3398.85,
"end": 3427.52,
"duration": 28.670000000000073,
"speaker_id": 0
},
{
"text": "Why? I don't know. I don't know why that was, but it but it did.",
"start": 3427.66,
"end": 3431.38,
"duration": 3.7200000000002547,
"speaker_id": 0
},
{
"text": "First of all, I didn't realize this was a test of the LLM because you'd think an image would be a test of the imaging model, but uh but now it is.",
"start": 3431.38,
"end": 3439.0,
"duration": 7.619999999999891,
"speaker_id": 1
},
{
"text": "It's all about the code generation.",
"start": 3439.0,
"end": 3440.02,
"duration": 1.0199999999999818,
"speaker_id": 0
},
{
"text": "That is so funny.",
"start": 3440.02,
"end": 3440.8,
"duration": 0.7800000000002001,
"speaker_id": 1
},
{
"text": "thing is um they're generating SVG and it has comments in. So you can see little code comments that say things like making sure the pelicans' legs are hitting the pedals and added added added a fish for whimsy. And that's really fun.",
"start": 3440.8,
"end": 3452.35,
"duration": 11.549999999999727,
"speaker_id": 0
},
{
"text": "The Chinese AI models, I love playing with the Chinese like open weight models. Some of those have drawn quite good pelicans and they run on my laptop. So I have my laptop drawing these pictures of pelicans with these little comments about what it's trying to do.",
"start": 3452.35,
"end": 3465.55,
"duration": 13.200000000000273,
"speaker_id": 0
},
{
"text": "I think with Gemini when they released one of their models, I think that was like their tweet was the the image of their code.",
"start": 3465.55,
"end": 3470.35,
"duration": 4.799999999999727,
"speaker_id": 1
},
{
"text": "Gemini 3.1 just a few weeks ago, they had a video which featured a pelican riding a bicycle like animated. And I was like, oh my god, it's my pelican. But I thought it's okay because the way my benchmark works is I've actually got a bunch of secret, um, alternatives in my pocket because obviously what happens if the AI labs train them to draw really good pelicans riding bicycles? And I'm like, well then I'll get it to do an ocelot on a moped, and if the ocelot on the moped sucks, but the pelicans are really good, I can prove that they cheated on the benchmark. And that would be amazing, right? That would be a great thing to be able to say, hey, look, they cheated. Except that when Gemini 3.1 came out, they did all of the other combinations.",
"start": 3470.35,
"end": 3507.45,
"duration": 37.09999999999991,
"speaker_id": 0
},
{
"text": "They were like, and here's a giraffe and a little tiny car and so on. And I'm like, wow, they they they they they they've beaten me. They've beat they're doing all of the animals and all of the modes of transport.",
"start": 3507.45,
"end": 3516.68,
"duration": 9.230000000000018,
"speaker_id": 0
},
{
"text": "And they didn't know that you had this in your back pocket.",
"start": 3516.68,
"end": 3518.97,
"duration": 2.2899999999999636,
"speaker_id": 1
},
{
"text": "I don't know if they knew or not. I I I",
"start": 3518.97,
"end": 3522.07,
"duration": 3.100000000000364,
"speaker_id": 0
},
{
"text": "People kept on asking me for like the past year, they've been saying, \"What if the labs cheat on the on the benchmark?\" And my answer has always been, \"Really, all I want from life is a really good picture of a pelican riding a bicycle. And if I can trick every AI lab in the world into into cheating on benchmarks...\"",
"start": 3522.77,
"end": 3540.0,
"duration": 17.230000000000018,
"speaker_id": 0
}
]
}