How Anthropic Uses Claude Fable 5 With Mike Krieger

Overview

This episode is about what changes after the launch hype of a new AI model fades and people start using it every day. Mike describes Fable 5 less as a better chatbot and more as a system you can hand real work to, especially in software and internal tool building, while Dan pushes on the harder question of what that means for skills, engineering, and trust.

Key Takeaways

Mike says the biggest shift is not raw model quality but the need for new working habits. His old approach - breaking work into small prompts and tightly steering each step - stopped making sense once the model could hold more intent, reason across a larger context window, and keep going through setbacks. He now gives broader goals, richer context, and longer time horizons.

The strongest claim in the conversation is about delegation. Mike says he can set up a complex task at night, go to sleep, and wake up to finished work. What impressed him was not just output quality, but the model's ability to recover when something breaks: if a service goes down, it can stub a backend, document what happened, and return to finish the job later. That changes the model from assistant to teammate.
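
The recovery behavior described above (stub the unavailable backend, write down what happened, come back later) can be sketched in a few lines. This is only an illustration of the pattern, not how Fable or Claude does it internally; `call_with_stub`, `FallbackLog`, and the inventory service functions are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FallbackLog:
    """Notes left behind so a human (or a later run) knows what was stubbed."""
    entries: List[str] = field(default_factory=list)


def call_with_stub(name: str,
                   real: Callable[[], dict],
                   stub: Callable[[], dict],
                   log: FallbackLog) -> dict:
    """Try the real service; on a connection failure, use the stub and record it."""
    try:
        return real()
    except ConnectionError as exc:
        log.entries.append(f"{name}: used stub because the real call failed ({exc}); revisit")
        return stub()


def real_inventory_service() -> dict:
    # Hypothetical remote call that is currently down.
    raise ConnectionError("inventory service unreachable")


def stub_inventory_service() -> dict:
    # Canned response so downstream work can continue in the meantime.
    return {"items": [], "source": "stub"}


log = FallbackLog()
result = call_with_stub("inventory", real_inventory_service, stub_inventory_service, log)
print(result["source"])
print(log.entries)
```

The log matters as much as the stub: it is what lets the reviewer see in the morning which parts of the finished work still rest on placeholder behavior.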

He also argues that these systems narrow the gap between "the thing in my head" and "the thing that exists in the world." For non-engineers, that may matter more than any coding benchmark. He gives an example of someone in recruiting who could finally build an internal tool herself instead of waiting on overloaded internal engineering teams.

On software engineering, Mike's view is that the job is not gone, but its shape has changed dramatically. Writing code by hand and solving low-level implementation details matters less than product judgment, system ownership, coordination, and incident response. He says the PM-engineering boundary is already getting blurrier inside Anthropic. Humans still hold intent, make tradeoffs, and stay accountable for what reaches production.

A big bottleneck now is verification. If a model can do hours of work on its own, a person still has to know whether the result is right. Mike's answer is to tighten the verification loop: screenshots on every pull request, video captures of UI behavior, regression tests for known paths, and direct questioning of the model about why it made certain decisions. The standard is not "the model wrote it." The standard is whether a human is willing to stand behind it.
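
One way to picture that standard is as a checklist applied to every pull request before a human signs off. The sketch below is an assumption about how such a gate could look, not a description of Anthropic's tooling; the `PullRequest` fields and the `missing_evidence` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PullRequest:
    """Hypothetical record of what an agent attached to its pull request."""
    title: str
    touches_ui: bool
    screenshots: List[str] = field(default_factory=list)
    recordings: List[str] = field(default_factory=list)
    regression_tests_passed: bool = False
    tradeoffs_explained: bool = False
    human_owner: Optional[str] = None


def missing_evidence(pr: PullRequest) -> List[str]:
    """Return the reasons this PR is not yet ready for a human to stand behind."""
    problems = []
    if pr.touches_ui and not (pr.screenshots or pr.recordings):
        problems.append("UI change has no screenshot or recording")
    if not pr.regression_tests_passed:
        problems.append("regression tests have not passed")
    if not pr.tradeoffs_explained:
        problems.append("no explanation of tradeoffs")
    if pr.human_owner is None:
        problems.append("no human owner assigned")
    return problems


draft = PullRequest(title="Redesign error state", touches_ui=True)
ready = PullRequest(
    title="Redesign error state",
    touches_ui=True,
    screenshots=["error_state.png"],
    regression_tests_passed=True,
    tradeoffs_explained=True,
    human_owner="mike",
)
print(missing_evidence(draft))
print(missing_evidence(ready))
```

The last check is the one the conversation stresses most: evidence can be generated by the model, but the owner field has to be a person.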

Practical Steps

Give the model intent, not just tasks. Describe the goal, constraints, and likely evolution of the project instead of asking for one isolated step.
Use longer task horizons. For work that can run unattended, set context clearly, define success conditions, and let the model keep going without constant intervention.
Build a verification layer around autonomous work:
- require screenshots or screen recordings for UI changes
- run tests against both standard flows and the exact feature being changed
- ask the model to explain tradeoffs after it finishes
Keep ownership clear. Even if multiple agents are doing work, assign a human DRI for each area who reviews, approves, and answers for the result.
Turn repeatable processes into workflows. Mike says he started by asking Claude to design a workflow for a complex task, then added extra verification steps and reused smaller follow-up workflows after that.
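
Mike notes that the workflow was expressed in code, so he could see what it was about to do. The episode does not show that code, so the following is a minimal sketch under stated assumptions: `Step`, `run_workflow`, and the toy porting steps are hypothetical and do not represent Claude Code's actual workflow format. The idea it illustrates is pairing each step with an explicit verification and stopping when a check fails.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    name: str
    run: Callable[[Dict], Dict]      # takes the shared state, returns the updated state
    verify: Callable[[Dict], bool]   # True if the step's result looks right


def run_workflow(steps: List[Step], state: Dict) -> Tuple[Dict, List[str]]:
    """Run steps in order and stop at the first failed verification."""
    completed: List[str] = []
    for step in steps:
        state = step.run(state)
        if not step.verify(state):
            return state, completed  # hand back to a human at this point
        completed.append(step.name)
    return state, completed


# Toy port: two features get ported, and a second step checks for missed ones.
steps = [
    Step(
        name="port",
        run=lambda s: {**s, "ported": ["login", "feed"]},
        verify=lambda s: len(s["ported"]) > 0,
    ),
    Step(
        name="check_missed_features",
        run=lambda s: {**s, "missing": sorted(set(s["expected"]) - set(s["ported"]))},
        verify=lambda s: not s["missing"],
    ),
]

final_state, completed = run_workflow(steps, {"expected": ["login", "feed", "search"]})
print(completed)
print(final_state["missing"])
```

Here the second step fails its check because "search" was never ported, so the run stops and reports what is missing instead of declaring the port done. Follow-up "mini workflows" of the kind Mike describes would be shorter step lists run against the same state.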

Notable Quotes

"I feel like a total newbie again because I feel like the way that I am prompting or even thinking about decomposing a task is really out of date now with this model." - Mike
"It is the first time in my life where I feel like the thing that's in my head and the thing that exists in the world is now right next to each other." - Mike, quoting a colleague who works in recruiting
"You ultimately as a person still need to stand behind the work that you are doing, especially if you're putting it into a production system." - Mike


Full Transcript

Source: openai 52m runtime

Mike, welcome to the show.
Great to be here, Dan. Good to see you.
So for people who don't know you, you're the head of Anthropic Labs and you're the co-founder of Instagram. And today what I want to talk to you about is Fable 5. So Fable 5 is dropping tomorrow. We're recording this a day before. This will come out after it drops. But what I really wanted to do is bring you on the show to tell me about what it's like to use this model beyond the first day. I think when a model this powerful drops, it's so useful to have someone who's using it day in and day out to tell you, this is where it's powerful, this is what it actually changes, this is what it doesn't change. So that you're, you kind of like don't, you kind of don't get the same AI psychosis type thing. You can actually think about, okay, like this is how it fits into my life.
Yeah, absolutely. And, and it's also just been interesting, you know, we've had some, you know, models in this, you know, Mythos class leading up to the Fable release, you know, for a couple of months now. And it's, I think it's very exciting to see how people will build with this externally. But I think you're also right that day one impressions, I think it really comes from getting to use this over a couple of weeks. I think we've seen that even with previous models, like the December into January usage, I think it was Opus 4.5 or Opus 4.6, was really important because people spent extended time with the model and then figured out, oh, actually, I wasn't pushing it hard enough. I got to go further and I got to rethink what's even possible with this generation.
Totally. I mean, I don't know. I feel like there are people internally at Every who have been using it who have been like, oh my god, I think I kind of need a new set of skills to use this model. 
And I think you can especially see this with people who are maybe more non-technical internally and who are more on the knowledge work side of things where they're like, I don't even know what I would use this for. And the people who are orchestrating agents are like, holy shit, I feel like there's so many new things I need to learn. So I'm curious for you, tell us about the difference between your impression when you first tried it and now.
Yeah, I think that your point on adopting workflows is a really good one. Quite literally workflows, I'll talk about that in a second, but also just in terms of like, how do I think about usage of the model? Because at first, the timing was interesting because it kind of coincided with me transitioning from CPO into Labs and going really back into builder mode. And I think it was about a month and a half or two months into that that we first had one of these models available internally. And I sat there and I was like, I feel like a total newbie again because I feel like the way that I am prompting or even thinking about decomposing a task is really out of date now with this model. Like, it's no longer, and it's even thinking about the time horizon or the sort of like interactivity model, I think has to evolve as well. Like going from, I think early on would be like, I have an idea for this feature. Can we start by doing like, absolutely not, right? To, great, like, let me express more of the intent. And then just being, you know, I remember like, you know, March, April, be like, wow, on the one shot, it's already incredibly impressive. But then it also understands the intent around how we're going to evolve this and understands like the global context as well. 
So I think that's been a really interesting evolution till now where, you know, it was funny, I was talking to somebody this morning where, you know, I think about doing work, I had a flight and I was like, okay, I can do most of this work remotely and I don't even worry that like the Wi-Fi is going to drop out because I know that if I set up the right, you know, context instructions, like flush loop, you know, I'll see it, it'll see it through. And, and I think my last few months have been full of a lot of times where I will, you know, wish Claude a good night, set it off on like a pretty complex task of something of this like model class and wake up to, you know, actually it's usually done by like 2 in the morning and I guess it just fiddles its thumbs for the next four hours. But like the really impressive ability to like complete the swing, get itself out of the situation where it's like, okay, all right, well, Mike asked me to do this complex task overnight. I got stuck because this remote service went down. I'm going to write a like scaffolded like backend for it for now. So I'll, you know, I'll document that. I'll, you know, go all the way through. I have like a good mental model of like how far that's going to get me. And then when it comes back online, I'll fix it. I'll keep track of that fact. It's just like, it is, I think the most impressive thing for me is like, you're just being able to like delegate that kind of level of task and just trust that the right thing will happen by the end. And of course, like you'll review the result and there's still like a whole verification thing that we can and should talk about because I think that's an important part of still completing the swing there. But it's really forced me to rethink, like, what does being productive with one of these models look like? And it is much more, like we've talked for a while about, you know, like what is it like when these models are more of like a companion or a coworker? 
And it really feels like now it's like a teammate that I can delegate like a lot of work to.
Keep the site up, or we were just trying to like add the one incremental feature. And, you know, hashtags take a week to build, but then there's like all the things that you want to continue doing on it as well. 
And so I think it's both that shortening of time, like, there's still the time required for the idea and the concept and the iteration. And then the other piece, which is the way you can then iterate on what you have in, I think, a really, I think really fun, but also like very, you know, sort of in the flow kind of way. And then, you know, if now this is me as a sort of professional software engineer, sort of startup founder, beyond that, if you had that idea, you know, and I saw multiple people go through this, like, it was like, well, I'll try to find maybe a consultancy that will take this on, but like now there's like, it's a really lossy process of like what I wanted. You know, the government's money for it. And I think that the thing that I think is like the most exciting part about these models getting not just more autonomous, but again, closing that gap between intent and execution is what I've seen it do to people's ability to build who are not like builders. And the trajectory of these models has been, you know, if something like Fable, you know, of this general Mythos class is like in that class of models and eventually, you know, models of, you know, that are cheaper and more accessible to other folks become available too. And like as that process happens, like, I just think it is just opening up so many, like, I got a thing the other day, I get very excited about this stuff, if you can't tell, from somebody internally. And we had built them an internal tool that kind of combined Fable and like access to some internal MCPs. And she said like, it is the first time in my life, and she works in recruiting, she's like, the first time in my life where like, I feel like the thing that's in my head and the thing that exists in the world is now like, right next to each other, like I can just do it. 
And it was like very like a meaningful moment to her because prior to that, like, I remember these days, these days were five years ago or four years ago where that person, if they wanted a tool, would have to either make do or try to get an internal tools engineer that probably was overloaded with 50 other, you know, requirements. But instead now they like are just having the time of their lives building. And I think that is, I think that's cause for a lot of like hope. Because I think that human capacity for creativity and what's possible is enormous. And I think like at our best, we are basically expanding the number of people who can then see that through to something that feels real.
I totally agree, but I do think that there's a question in the back of my mind and I think it's probably going to be in the back of the minds of some of the people listening. So I want to ask you, given everything you just said, is software engineering over?
Yeah, I think software engineering is different. It is like dramatically changed. And as I, as I probably would have defined it if you had asked me around the Instagram time, like what is software engineering? I'd probably say like, all right, like thinking through the hard problems and like thinking about an architecture and then like spending a lot of time in, you know, like TextMate. I don't know where that came from, but like, you know, like the text editor you're going to edit with. Or Xcode, you know. Watching Railscasts, you know. Yeah, exactly. Right. Exactly. And understanding the intricacies of Django's like ORM layer and then like 15 bugs after you deploy it. Like so much of that is radically different and collapsing into other parts of like product management. I think that sort of like PM-eng split, I think if I see it even in our teams has become much more diffused. That's radically changed. 
But I think the overall, like, like maybe zoom out from software engineering and think about like software production or, you know, software development, but not in like just the pure developer case. I think that is like alive and well and essential still. So I think that it, that is the moment that I feel like we are in. I think Fable is another step in the direction of, and I'm not going to call it the final step. Of course, a lot will still happen. But like, I think a pretty significant step in terms of, like, the trust, at least, I end up placing in the model in terms of its capacity to see things through and even, you know, architect things reasonably is quite high. So that part feels like it is, it is not ever going to be done, but it is pretty, pretty done, right? Like it's gone really far. But I think that the overall sort of craft of the, what needs do you have? Like, what are you putting out? Like, is it actually good? I think still a very human endeavor. But I also sort of can see that that is not a transition that is sort of pain-free in a way. Like, I think there are plenty of people who love the craft of like actually putting, I used to love the stuff like I solved that problem so elegantly. You would dream about code. And if you've had the experience of like you would dream about the thing that you're working on, then like wake up in the morning and be like, I figured out how to solve this thing really elegantly. And that for sure has, has, has passed. And I think that there's, you know, there, there is a feeling of loss, I think, in some of the like better engineers that I talk to, as well as the feeling of, oh my God, but I can do insane amounts of work now at the same time. So we're holding both ideas in our heads at once, I guess. Which I think is the most important part of this. Like it's normal to feel sadness for that kind of thing and excitement. But I'm curious, let's just take the thesis of software engineering is alive and well. 
What does that actually look like inside of Anthropic?
Yeah, I think there's, there's a few pieces. I think there's still the, the crafting of, well, I got to take it off from like the full software development cycle or like maybe what I see on a day to day. Maybe I'll do a little bit of both, but I think there's still a lot of, you know, we all got together. We, we talked about the next way we want to, you know, evolve Cowork. And now we've kind of broken it down into areas of ownership. I think that ends up still being quite important because there's still context that you hold as a person that is sort of beyond Claude, right? Like what is the actual intent of this product? How's it going? What do we need to know about the sort of other products that are coming down the pipeline that are going to be integrated in some interesting way? So I think that aspect is really important still. And so, you know, though we have many Claudes to each human, each human, at least the way we've been working at Anthropic, still kind of has, you know, we call them DRIs, like directly responsible individuals, still has like a DRI-ship over some part of the product or some area. I think that'll be the case for a while because I think there is value in not just this distributed, like we should all make Cowork better, but instead like, all right, I'm thinking through how Cowork does this particular task. And there's still a lot of, you know, the, you try to keep meetings minimal, but they, they still emerge and you still have these kind of alignment conversations. Then like a lot of that sort of asynchronous delegation. I think what many engineers here have now found is they've, they've all built. 
And I think we should solve this at some point at like a broader product level, but they've all built some version of, all right, I'm going to now like create a dashboard of what all my Claudes are doing and what's waiting for me and what pull requests like need my attention because, you know, either a human or a Claude code reviewer got back to me. So there's a lot of that sort of meta maintenance of the, of the work that I think, again, I think we'll standardize some, but I think some of it will always be a little bit bespoke to the way each individual likes to work just in the way that people organize their windows, how they organize their work. And then there is, I think also the understanding how things work in production. And I think that is another, like, there's a few like next frontiers, I think for the models. And I think one of them that Fable does, you know, make significant strides in, but I think there's, there's more work needed here is understanding what happens to code after it gets deployed, you know, because there's incidents, there's, you know, this was all working well, but like this network link got cut, which is not in your usual failure mode. And like it manifested like so much of Instagram, like 2012 to 2016 was like dealing with that and scaling things up. And so that role of the engineer still remains really key. And I think getting the reps in around incident response and understanding how to stay calm, gather data, like remediate what's immediate, but then like go off and, and work on, on longer term fixes, like still a necessary part of it. And I'm trying to think if there's any like other pieces that are, that are notable
I don't think the fundamental, you are sending messages and it is giving you a message back is totally wrong. I think that there are ways we need to evolve with like, one is maybe like three that come to mind. Like one is, is your laptop the right place for it? 
So I think that's number one where I mentioned with the side project I was working on how useful it was to have the mobile side. Boris, the creator of Claude Code, he's always like, you know, ahead of the curve on how these models get used. Almost a year ago, maybe nine months, I was talking to him, he's like, "Yeah, I've moved a lot of my Claude Code work to mobile." I was like, "No way." And like, it took me a while to get there, but especially with the Fable class, like there's oftentimes where, you know, because it can keep the session going and we use like kind of remote dev boxes at Anthropic, like it is a thought I'd be like, OK, I need, can you keep, keep up and doing that. So, I mean, number one is like decoupling the where the work is happening from where I'm talking to about the work. The second one touches a little bit on what I was mentioning earlier around like, what are, how do you take everything that Fable has sort of discussed or decided or proposed about something and make it comprehensible? And that's an area that we're thinking a lot about. Like there are some skills that are out there that we've used around, like, all right, can you diagram this? Can you do that? So that's a place where the current chat UI, I think is insufficient, where like it will, you experience this with Fable, it will give you like a lot of text. You're like, this, I need to like take a walk before I fully understand this. And I think that that is a piece of prompting I sometimes will do with Fable. It's like, okay, like, you have a lot more context on this than I do. Can you like back it up? Like, like, let's do like more progressive disclosure of the complexity here. So I think that that piece is interesting. 
The last one that I think is we're still early in pulling on is thinking through multiplayer, where, you know, at some level, like these, the abstraction levels and like, because we have this sort of DRI and like ownership area, usually like a chunk of significant work, a human and a couple of Claudes, like that is still floating together. But in other cases, that is less the case, right, where it's, you know, maybe it's an incident response where multiple people are thinking about it. Maybe it's, you know, a project where there's multiple competing or not competing, but like conjoining areas that are coming together. And thinking through like, what would it mean for, you know, and we have like chat sharing, which gets you a little bit of the way there, but I think there is going to be a need for more like, all right, you've got an independent Claude that's doing a lot of work that was, you know, kicked off by somebody, but can it be keeping up with all the other work happening on the team? I think that is an interesting and underexplored sort of next frontier about how this work ends up happening. But I think it's really exciting because I think, again, it's, it's the, it's the level of teammate collaborator that, that the models are now capable of and we're almost holding them back by not having the right abstractions around them for that to happen.
Yeah, it makes me think I've, I've mostly been using this for my own vibe-coded stuff. So, so I haven't really had to, I haven't really had to think about this, but there's, there's a problem when you're using this inside of an organization, which is, do I really understand every part of this? And, and therefore, how do I transfer the context of what the model just did into my brain? Like that's, that's one of the big bottlenecks. 
How do you, how do you think about drawing the line, especially with a model like this, around how much you actually need to understand and how to make sure that you have enough context on what it's done to feel comfortable?
I think there's like two big pieces here. The first is verification, where I, I became like fully verification-pilled earlier this year and now, like, almost in the same way, it actually, it connects to how I think I used to do when I was sort of typing code more full time, which is try to find the sort of tightest dev loop that you can around the idea that you're trying to develop. And like sometimes with Instagram, that meant, like, you know, actually making a new build target in Xcode that was just that screen with some sort of synthetic data and just doing that dev loop. And I'm not, and I would mentor newer engineers of like, if there's one thing that I can impart on you, like it is, try to get that for any project you're working on and things will go much more quickly. I think that is no longer exactly the case here, but I think what is the case now is, anytime I set it up, like, how do I get, like, for every pull request that Claude is putting out, that there is an attached, you know, photo or video, whether that's an iOS PR or whether that's, you know, something in the UI. And that's, I think that that, that helps you gain a lot of confidence because even now, you know, you might have like, you know, Fable go off and do work for a couple of hours and be like, it's, I'm done. And it's really useful to say like, and here's the like full screenshot gallery of the full UI. Cause you might say like, oh, you know what, on screenshot eight, that error state, I've never actually seen it, but I could see how, you know, a person might hit it. Let's actually make that different. 
And so getting that comprehensive verification, I think, is something we've been working on a lot internally and like sort of publishing more and more skills and knowledge about. But I think is, is a really key piece there. Um, and then the second one is, I think you ultimately as a person still need to stand behind the work that you are doing, especially if you're putting it into a production system. Like a lot of people use Claude every day. There's still the accountability of like, oh, that's still Claude that wrote it, but like, you need to understand, you know, the, the, the, at least the, the general decisions that were made on these pieces as well. And so I have seen, uh, a fair amount of engineers actually adopt this practice where like, Claude will have done the work, but then there is like the follow-up conversation around, well, can you like, can I make sure I deeply understand, like, all the tradeoffs that you made and, and that, and whatever lowercase-a artifacts need to be produced in order to make that comprehensible, um, is important. It is really interesting though, to be in meetings where somebody will say like, oh yeah, and I have this, this PR ready and somebody else asks, like, oh, that's interesting. Like, did you do X or Y? And have that moment of pause. And they're like, you know what? I'm not entirely sure I will find out before we merge this PR. And that's, you know, I think that adapting to that norm and figuring out how to work with that is something we'll have to do.
Tell me more about the verification loop. Such a, it's such a hot topic right now. Sounds like one way that you do that is with screenshots and screen shares, but what are the other ways that you think about that?
I think part of it, it starts in, can you get to a place where you are exercising real, like, uh, sort of real flows that aren't just like a static injected piece. And as the system gets more complex, that gets more and more complicated. Um, so we've invested a bunch. 
It's like even just getting it so that the, you know, the iOS app can log in to staging on a real account and like have real data. But then you don't want it to then go through like an eight-stage onboarding process every time, but you're just trying to test like the second part of the screen. Um, so there's a lot of work around, like, how do you, you know, is there a special affordance? Is there like some shared secret, whatever that is around getting the, the, the, like app, you know, to really feel as human, you know, using the product as possible. So that's one, one aspect of it. Um, the second is like this mix of like well-known paths versus the things you're exercising in the exact moment, like the former being really useful for regression testing. And so we have places where we've expressed like, uh, sort of ideal workflows in X basically, and Claude can repeatedly check that. And then there's also, and Claude does a really good job of this, sort of expressing the intent of the current change at hand. So that gets really, really deeply exercised. So I think that the combination of those two things is important. The visual verification that I mentioned as well. Um, video has been really cool to see. Actually, video is a very underexplored tool to give Claude as well. Like a thing I've been prototyping is just giving Claude, uh, video captures of the thing that it has built and then giving it just basically FFmpeg. And you'll watch it scrub through and it's like, oh, this animation has some jank in it. I'm going to go fix that. And
With Victor, the specific implementation, it wasn't worth porting, and I do not think you could have done that, A, with previous models at that level of success, and B, without the kind of scaffolding that workflows provide. 
So I think that is extremely exciting kind of combination of model capabilities and then our own ability to orchestrate them over longer and longer time horizons with that feeling of like, you had a goal, you broke it down effectively, and then you were able to make it work. The other piece is, I think over time we'll be able to also make some of those subtasks sort of tuned to the, have the model be tuned to the level of complexity of it. So you can imagine that some parts of the dynamic workflow don't need extra high thinking. They could use, you know, a medium thinking to get it done, or even a smaller model. And I think that's really the future of where these things are going. So yeah, I'm a huge workflows DAU.
For people who haven't used it before, tell me about how did you get that workflow made? How did you design it? How did you make sure it was good?
Yeah, it was pretty iterative, but sort of just started with Claude Code, like, hey, I have this complex, you know, kind of task, like, let's design a workflow to go and do it. It kind of showed me the plan. I was like, oh, this is like close to what I want. I wanna make sure that you do these three or four levels of like additional verification for missed features. And here's what you have. You're ready to go. And it expressed the workflows in code, which I think is really valuable too. You kind of see what it was about to do. And then what was interesting is it did the full port. And then I had like a couple of like follow-up kind of questions that I had or like little tweaks. And I did those as sort of like mini workflows that built off the previous one as well. But I think that's like, you know, we talked a little bit about whether chat was the right interface. So we've had that conversation over the last year. And I think workflows are a good middle ground of you can compose them using chat, but they're expressed using code. 
And then they're executed with like, I think a nice clean UI around what's happening at every stage. And like, I think we'll start bridging longer horizon work with chat in ways like that over time.
Mike, this is such a great conversation. Thank you so much for joining and telling us all about this new model. I'm really excited to get to spend time with you and really, really look forward to what people think outside too.