Why AI Makes Things Worse for Enterprise Teams

The Story

Paul Ford and Rich Ziade start with a sales pitch for their own company, but it turns into a setup for the larger point: using AI in software is not the same as using it well. They frame Aboard as a team that helps companies adopt AI without wrecking their existing systems, and that leads neatly into a report from CircleCI and ThoughtWorks about what is actually happening inside engineering teams.

The report drew on 28 million workflows run through CircleCI, and the headline is messy. Teams are producing a lot more code. Paul says throughput is up 59 percent year over year. But that extra output is not landing evenly. A small slice of teams, roughly the top 5 percent by the report's throughput measure, nearly doubled their daily workflow runs, while the median team's throughput rose just 4 percent. Everyone else is dealing with more bugs, more failed checks, and a lot of cleanup.

That gap becomes the center of the conversation. Rich argues that AI code generation is a harsh filter. If a team already has strong engineering habits, clear review practices, and people who can tell good output from bad, the tools can make them much faster. If they do not, AI turns into a machine for generating work they then have to untangle. The real problem is not that the code appears from nowhere. It is that it enters testing and integration pipelines without enough human understanding behind it.

They turn to an open source example to sharpen the point. The maintainers of the Zig programming language have a no-LLM rule for contributions. Paul finds that stance credible, even if he would not want to run a business that way. The maintainers' argument is that they are trying to build human contributors, not just accumulate more code. A messy human submission is still an investment in a person who might learn the system. AI output adds text to the repo, but not ownership.

From there the episode shifts from engineering to management. The hosts describe a familiar pattern: leaders play with AI, get dazzled by polished output, and then push teams to "use more AI" without changing the process around quality control. That creates a false metric. People are rewarded for producing more, while the real bottleneck moves to testing, review, and repair. The best teams, they argue, are not succeeding because the model is smarter. They are succeeding because they have built systems around it, including automated checks and strict ways of working.

By the end, they land on a sober view. AI has already changed individual work at the desk. It helps people get unstuck, sketch ideas, and write first drafts. But at the organizational level, the change is barely underway. Big companies will not reshape themselves around software overnight. The software, even AI software, will have to fit the organization.

Main Themes

The episode keeps returning to one idea: AI amplifies the quality of the process around it. That is why a few teams are flying and many are stumbling. The tool is the same. The surrounding discipline is not.

Another thread is the difference between individual productivity and organizational performance. On your own, AI can feel amazing because it is fast, helpful, and flattering. It gives answers, code, images, and plans with very little friction. Inside a company, though, that personal feeling runs into shared systems, testing pipelines, and other people who have to maintain what got produced. The hosts are blunt that this is where the fantasy breaks down.

They also connect AI adoption to talent and experience, though they are careful not to reduce everything to genius engineers. Rich talks about top performers being able to direct these tools with far more control, while Paul pushes the focus back toward team process. Put together, their point is that success comes from judgment plus structure. Without both, AI makes the mess bigger, faster.

The advantages of this technology are not equally distributed: somebody picking it up and going with it might have a lot of failure states, and only a relatively small number of people is actually really successful with AI coding. — From the episode

Full Transcript

Source: openai 27m runtime

Hi, I'm Paul Ford. And I'm Rich Ziade. And this is the Aboard podcast, the podcast about how AI is changing the world of software. And Rich, how are you today? I'm doing well. I want to share some thought leadership with you from our industry. I want to bring it on in and we can discuss it. Good news or bad news? Interesting news. Turns out that AI isn't good for a lot of engineering teams. They're struggling. Whoa, okay. Let's do it. But it's really good for some. So we'll talk about that. We'll play a theme song and let's go. Okay, we're with a company called Aboard. You and I, we're the co-founders. Right? We sure are. Aboard is a partner. We use AI. But basically we are old software pros and we have a set of really good custom tools for delivering software with AI in a very low risk way. You come to us and you say, I want to get in on this revolution. I heard you can get things a lot faster and cheaper, but I still want it to be really good. And we take you seriously. We bring you in. Yep. We can turn around your old legacy tools. We can build something new in greenfield. It's not just a matter of like strapping an LLM in front of something and crossing your fingers. It's, there's ways to use this stuff that are really good and really productive. So that's what we do all day. And we build and we ship things for large firms, small firms, not-for-profits, all the people who need software. Yep. The thing I'd add is that we don't lead with a bunch of tech. We come in, we listen, we get to know your business, see where we can be helpful. And then we go from there. Sounds great. Reach out. We're working in insurance, helping with policy management. We're working in health, helping people make better dashboards, like really grisly stuff, but we like to do it and we like to do it fast. That's all you need to know about us right now. But you know what's funny is that actually ties into today's conversation. Oh, talk to me. I have in my hands a piece of thought leadership.
Oh, it looks chunky. It is chunky. It's from CircleCI. Do you know what the CI stands for? No. Continuous integration. I know what CI says. It's a hell of a thing to put it in the name. It really is. It really is. So tell the people what CI is or I can't. Continuous integration is a style of building software where it's less ceremony, less chapters in a book, and you just kind of keep going as progress gets made. Just keep pushing that code out, right? Keep pushing code out. So this is a company, CircleCI, and it's important to know why. So they work, and it's done with another company called ThoughtWorks, it's kind of like a big consultancy. ThoughtWorks is a big consultancy. And so what CircleCI has access to, what they do is they provide services to all sorts of programming teams to ship code in a more reliable- Streamlined, rapid way to get code out. Lots of testing, lots of good stuff. And it's been around for a while. So what they have is insight into how people are actually deploying code these days. They have the data. And they had 28 million workflow deploys that they were able to look at and see sort of what's going on. And so I'll give you some interesting stats. So I think clearly people are writing more code. So I'll give you the number, 59% year over year throughput has increased, but just throughput is throughput. Like it's code. More lines of code. That's right. And so- That's a huge increase. Very short. Yeah. I mean, because it's not a lot more engineers. Right. Right. So every word I'm saying to you now, I would be saying 59% more words. That sounds terrifying. Nobody can, I mean, we wouldn't be able to get this done in 20 minutes. So more code is being written, but what they're finding is that the number of bugs is going up and the number of issues is going up. And what they're finding is that there's this huge split. The 95th percentile and above of really productive teams, they're off to the races. They are pushing thousands of changes. 
They are just all in and they're moving so fast. Okay. When you say 95th percentile, what of what? Of sort of high velocity teams. No, but are they higher quality? Well, I mean- Is it the top 5% of quality or top 5% of velocity? We're kind of working back from velocity here. I mean, they're not really reading every line of code. Okay. But what they're seeing is like, you know, things fail. Things have bugs. Things have issues. So what they're seeing is that if you are a team that's like all in on this and you're kind of in that cream of the crop, it's like an order of magnitude how much more you're getting done according to their metrics of getting things done. Okay. Everybody else, the tail gets really, really long. There's more bugs. Things slow down. Projects stall. And what's happening, and they make a really good point in this, which is just like, because, okay. So you use the AI code. It's magic. It writes some code for you. Right? And now you bring it into your continuous integration pipeline, which means that you're going to be tested. Yeah. You're going to have all of these things going on where we automate the quality. Yeah. Right? But that produces a whole lot of issues. And now nobody's seen the code because it was AI generated. Yeah. There's no like magical way to like get through this. And so all these hours are being spent cleaning up AI mess because you've been told you got to use these tools. And it is fast. It gets you done quicker, but it's leaving you a big mess. And so I think like the really good teams are the ones that can automate the mess cleanup, do lots of, you know, they have like a real specific policy, like smaller changes, whatever. And so it's an interesting time because what we're learning is that, and I think we keep learning this in a million different ways, the advantages of this technology are not equally distributed. 
Somebody picking it up and going with it might have a lot of failure states, and only a relatively small number of people is actually really successful with AI coding. And it's not even relatively, you're saying 5% are really cashing in all this productivity and all these capabilities. I think it's really confusing to us because we're cashing in the capabilities. Yeah. And I'm going to say something that can sound maybe a little arrogant. I don't mean it to be. Well, okay. High quality talent can really assert their knowledge and their ability to assess what's being produced on these tools in a more deliberate way than mid-level talent. And I don't mean that mid-level talent is, I don't mean that to sound elitist. The challenge you have with this stuff is that the tools don't just produce stuff because you push a button at the beginning of the day. You do a lot of things to guide how these things work. And if you don't have a really thorough and intimate understanding of A, good practices in general, right? And the truth is you can be productive without tons of good practice, right? You can be productive. Python, bless its heart, is incredibly forgiving as a programming language. It lets you do stuff. It's not going to bat you over the head with all kinds of rules. It's kind of its strength. It's so purposefully simple that it actually drives a lot of very serious engineers batty. That's right. Because they're like, no, no, this is nowhere near as complicated as it needs to be. Exactly. Now you have these just incredible weapons-grade tools that can be massively productive. And if you're not really asserting expertise and high level concepts of how code should be structured, like all the things that you, all the boring things about good practice, right? It's a runaway train. And here's the other reality is you're producing stuff that you've not reviewed and you're handing it into the CI process.
And unless you have absolutely airtight confidence in how you got there because of your skills, you're really rolling the dice. And I think that's what we're seeing here, right? What we're seeing is productivity is not code by the pound here. Like it doesn't work. It just doesn't work. It's hitting that wall for tools like CI. By the way, it's worth saying out loud, CI isn't just a cool way to like all the colors blend together. It's a process that actually applies rigor to the quality of what's being integrated into the code base. Like it's not, it's the whole point of it is that yes, you're supposed to go faster, but there are a lot of toll booths that are going to stop the process. So there's a few things that come to mind. First of all, we are veterans of process in this industry. We have CI, we have CD, we have Agile, we have, you know, Agile with Scrum and so on. They all end up failing because no process can capture everything. No abstraction is perfect, right? And so I think there's a little bit of that, which is just, there isn't a really good established process for working with all these new tools. There's another point I want to make too, obviously we've, we've talked a lot about Simon Willison. He had a link to, he's a, you should go check out Simon Willison's website. Just type it into Google, but he's really capturing the industry and he had a link to, there's a programming language that's called Zig. It's a relatively low-level programming language, it's open source. It came in the news recently because Anthropic bought a company that uses Zig heavily. And then the company, they make a product called Bun. It's a faster JavaScript. Now you're making up words. I know. It's just horrible. This part's horrible, but just stay with me. Okay. So the Bun folks were like, Hey, we actually improved Zig and we made this one part like four times faster, but just so everybody knows, like, we'll go get the code. It's all good.
We're still open source, but we can't put it back into Zig because they have an absolutely no LLM rule. Hmm. Okay. And at first that sounds like maybe it's open source people just being their open source selves. Yeah. But the Zig maintainers made a very interesting point and I think it's a point that like the whole industry should internalize. They're like, look, if you're going to, you're going to do what you want to do, we're going to say no. And here's why. Our job is to develop and create contributors. We need to create an ecosystem around our code where people take ownership and do things. We will invest in contributors even if their early contributions are really messy. But investing in an LLMs output doesn't create contributors. It just adds more code. Yeah. We got plenty of code. Yeah. I got, I got code all over the place. Yeah. It's more important for us to draw this very clear line and only let human work in and build people up so that they can be part of this community and less important for us to just have the best product as quickly as possible. So go to, there's no rules. You go live your life over there, but don't expect it to be upstream here in the main branch if you're going to just have a robot do all the work for you. Now literally the other companies inside of Anthropic, so like it's a funny split, but I did hear that and I was like, look, I may not agree with that top to bottom, but it's very credible. Yeah. Right. And what they're saying is, look, I don't want this in my process because my process is to build up humans to be contributors who understand the code base completely. Yes. And I get that. Like I really do. I don't, I don't want to build that myself. I don't ever want to live in that world again. But I actually do understand those boundaries. Let me read you just a tiny section from this. Okay. So, cause let's focus on what's working. Cause you know, what's funny is I read through this report and everybody should go read it. 
It's fine. But when you read the report, everything fails in the same way. Like it's just sort of like too many bugs. The top 5% of teams nearly doubled their throughput year over year from 6.8 to 13.4 daily workflow runs. The top 10 and 25% of the teams saw smaller, but still significant increases. So the median team increased throughput by just 4%. So what we're seeing is like, boy, the advantages of this are just going to a very small number of people. And then the next one is the year's most productive team delivered roughly 10 times the throughput of 2024. Organizations running just, if it's an AI focused company, they're running thousands of workflows. Just kind of, just absolutely shooting code out. Do you have any company names that are in the top 5%? Nah, they're being fuzzy. Yeah. I get that. Yeah. Look, I'm going to punctuate the point I made earlier. AI generated code is an absolutely devastating vetting process is what's happening here. There's something, you know, known as- Explain what you mean there. And actually I'll give you a stat, which I think is relevant. Each of the top 10 teams on CircleCI, which are probably mostly AI companies themselves, validate more than 10,000 changes a day. Okay. So this is like huge numbers of new code releases per programmer per day. Yeah. So back to your point. Yeah. What I mean by vetting is that there's always been something known as the 10X engineer in engineering. But if you distribute out the most junior to the best engineers, right, the best engineers are not 20% or 30% better than the junior or the beginner engineer or the weak engineer. They are 10X as productive as others in their cohort, right? People get very upset about this concept. It is very real. But I'm going to tell you, you know, I'll tell you what's real about it. Yeah. I don't know if I've ever met a truly 10X completely everything engineer. 
But I've definitely met 10X in terms of like, oh yeah, I do low level, you know, streaming databases that take into account the rotation of the hard drive. The next person standing to their left cannot do that. But I also think I've seen it in terms of raw output. The person next to them is one tenth as productive and they are a very good, credible engineer. Like I have seen it. We've been fortunate enough to hire a few of them. And what you have in some companies, especially AI companies who have just like thrown so much money at the top talent, they've kind of gathered all those people, is that their productivity is there because they know how to bring a tool this powerful to heel in a way that the others don't. They simply do not. And that is, that is what I mean by a vetting process. These tools are not making the 1X or 2X engineer two to three times more productive. I would not. I wouldn't even focus on the individual engineer because I think these tools could really help the individual. Yes. I would focus on the, let's call it the 1X process. Yeah. I think that's more relevant than the individual because, okay, here's how the AI companies are going. They're saying, yeah, of course we're going to use these tools. Of course, all day long. Let's go. What do you need? Do you need more tokens? Have more tokens. Anthropic says this all the time that they're using Claude to come up with the next version of Claude Code or Claude work. What is it? Claude Cowork. Yeah. Or whatever. And it's like, look how awesome it is. And it's like, you have literally some of the best engineers walking the earth in your walls. Right. And so it is the kind of tool that if you know how to gain control over its output, you're going to be incredibly productive. And if you don't. So now we have a problem.
The problem is that results and the value, if this thesis is correct and it's only that top percentage and they tend to cluster around sort of, then are you out of luck if you don't hire a bunch of million dollar a year programmers who are really good at this one specific thing? Because it's an order of magnitude. We're back to the, it's, it's 10X productivity by these metrics. Yes. If they're using these tools wisely, followed by this incredible long tail of not that productive. Yes. Or even less productive in some ways. Yeah. I have two thoughts about that. One is I think the tooling around AI will get better and people don't talk about this a lot. But the usability around these tools, how you can be productive with them, everyone like no one has put, it's so early in terms of the maturity of these tool sets such that the mid-level engineer isn't being empowered in a way where their outputs can be more productive. Also the LLMs are getting better. It's just early. Like I think you can get, will you get 10X out of the, the mid-level engineer? Maybe not, but you'll get two, three, four as these tools get better and smarter about how they can help people. There's more code and there's more velocity overall. There's just a lot more. Yeah. I think if you're pulling the lever and just letting, you know, the fire hose of code come out and you don't know what's coming out and then you're sort of saying a prayer and putting it into the integration workflow, best of luck, right? You just got to know what's going on there. Let's be, let's be management consultants for a minute because it's, here's what this feels like. For a minute. For a minute. Here's what this feels like. Top tier. Hey guys, figure out the process that works. You're having a blast. Go to it. Ladies and gentlemen, you are free, but no bugs, no defects. Use this thing as much as you want and just get 10x results and if you need money, you let me know. Okay. So that's top. Okay.
Second tier, lower tiers are this. We got to use more AI. Who's saying that? The boss. Well, there we go. The boss. And then everybody's like, and people, and so the metric gets wrong. The metric is immediately wrong, which is, I think everybody's saying, you got to put the AI in here to get us the results so we can be like those AI companies. The boss is playing with the tools. That's part of the problem. That is part of it. And because he gets it to make them a plan for like a new piece of software. Not a plan. It's just a bunch of pretty colors. They're like, oh, look at this. I'm almost done. Just finish it. So I had this experience. I'm working on a little side project to do a make sort of climate analysis. Climate analysis is very tricky and I had to make a little document that would explain how to climate proof your house and I showed it to my wife who's in construction and I really almost didn't survive the next five minutes because what I thought was fine regarding backwater valves outside in your sewer was not accurately presented. Sounds like you had a really fun weekend. Everybody was having a great time. And so like the subtlety gets lost, but it looks so real. It's so confident that you assume that among all its many other things, Claude is obviously a master planner. And we're seeing that, right? So these mandates are coming down from oftentimes non-engineers who are saying, look, it drew me a picture of the interface. It looks pretty good to me. I mean, those managers should look at this paper and realize that the cliff of diminishing returns here, and the truth is this, and I'm going to say another thing out loud. It's a podcast, so you should. I should. Yeah. Is look, certain industries are just not going to attract the best people. Like this is the reality of it. Like a top shelf engineer wants to work on the coolest stuff. 
And there's a lot of industries that need straight up just people to handle the workflow of reset your password at the bank in the Midwest. This is why client services exists, my friend. There's that too. So this is the other piece of advice I would give people is there is expertise that's going to cluster around this. I think the professional services industry around this stuff can take off if you have the right people. If we don't get that 5%, what happens to us? We should show them the door. Yeah, that's it. So it's like you need to get people who are just like, okay, here's how we use it. Here's the process. I'm going to be frank. Process should not be secret with this stuff. Don't trust anyone who's like, I have magic AI tools that will guarantee success. We will sit with you and anyone good will sit with you and be like, here's the 12 things we do. Here's the internal tools that we use, and here's how it's very, very likely that I will be able to get you a bad version of your software followed by a good version of your software in the next six weeks. Yeah. Yeah. Yeah. Right. Look, I think if you're going to give some advice to someone that's in the middle of the pack here, like an engineer that's in the middle of the pack is, this is going to be my catchphrase, like AI punishes laziness. Like it just does. And you could see that in an image that got spit out in 10 minutes rather than, I've seen art. I've seen videos. I've seen music videos where clearly someone spent hundreds of hours using AI, but they applied all their creative thinking to it and use it as more of a tool. Do not let it come up with what looks like really neatly, really tidy code and just pass it along. Like you're going to have to do the work of understanding what it's outputting. And that's work. You know what everybody needs to do is pick something where they're truly an expert. Could be a hobby. Sure. 
And you want it to write in very, very clear, discrete, like footnoted terms about that hobby. Yeah. And you will find. You learn where the edges are. Yeah. And then realize that that hobby is everything. It's really good for a lot of stuff, but without the refinement and verification steps, which are what those companies have, that's, I'm going to tell you, that is what. They've built software to read through their software. Like the best shops do that. Right. And that, that is why they're getting those results. They've automated a lot of that. That's right. Formally verifying. They're following these very strict processes. And so they always get good stuff on the other side. Whereas your guys, your people are in there just kind of like making do the best they can with Cursor and hoping it works. Yeah. Frankly, when you hit the wall at the test phase, these are people that are using sophisticated tools. There's a lot of businesses out there that are pushing code and just hoping for the best because they look pretty, pretty darn good and we're just going to run with it. And that's scary too. I gotta say the worst part of all of this is I think what's going to happen is everybody's going to go looking for that one tool that'll solve it. Yeah. And it's actually process and learning and accepting. The boring is creeping up on all of this stuff. Yeah. Like, it's hard. Writing well takes work. Creating art with these tools takes real work. Boring just feels like home to me though. Oh, you're an exciting guy. I am. I am. In other ways. Yeah. But I like boring software. Any other juicy stats out of this paper? I mean, no, no. This is not a, I wouldn't say it's a very juicy stat. It's not a, yeah. No. What it, what it is telling us though, and I think it's really good to be, to remember is like for all the narrative about how the revolution is here and God knows I'm, I'm part of that narrative. Most people are not experiencing it the same way.
I think I, it's funny, I want to close with this thought. I think the revolution is complete at the desk. What does that mean? Everyone's got it at their desk. Like everyone is like, if there's a function that just can't seem to run right cause you're a coder or if you want, or you're stuck on a, on a tagline for a slogan, you're, it's at your desk. It's, it's stack overflow for everything. Kind of answers all your questions. It's kind of there to sort of maybe if you're stuck and you want to get unstuck or you're just need a spark to keep going or you need to review a legal document. I'm not, I think that, that has been, that is complete, right? That, that part of it is complete. I think at the organizational level where the vetting process and the testing and here it happens to be a very clear process, which is like, we're going to have to test this code before it goes out or whether it be we're going to change the way we work because we have AI now it, that is at the, we're at the like 3% mark of that change. I want to, too early, I want to close this with something that kind of to think on, right? Which I think that's important and I, we're trying to work on that too. Everybody is. Yeah. Everyone has experienced this technology as an individual. Exactly. And it's one of the ways that it really has blown up in our face because that's actually how people sit there and they're like, it told me my scientific theories are my God. Yeah. It's incredibly affirming. It's sycophantic. And the organization, remember when you start working and you kind of get that boss who sits you down and is like, this isn't it buddy. Yeah. Right. And it's like, AI is never going to do that. It's never going to let, so everybody's having this experience where they're getting their narcissistic supply fed, they're writing code, they're drawing pictures and nobody is swooping in and going, Hey, that's kind of garbage. Yeah. Right. That's what the organization needs. Yeah. 
Like the organization doesn't need all these little tiny people like in cubes going like, I came up with everything. It's also when you put forward the work to an org and it falls on its face, it's very embarrassing. Like it's not good. Right. Like you can try it at your desk and fail a bunch of times so you feel good about it. But a tool like this, it's just going to light up everything red. And we're going to see now that we have the giant AI companies, which are all worth like nearly a trillion dollars, they are real, they really want to make the enterprise work. Yeah. But their, their view of this as a, is as a one to one thing that kind of scales up. Yeah. I don't know how that's going to go. It's going to go. We, we should, we'll have conversations about that. It's where we are, are swimming right now, which is how does an org internally metabolize all this? Right. So it's useful. I'm going to tell you, orgs don't bend to software. No they don't. Big orgs. Small orgs have to. Yeah. Big orgs. The software must bend to them. Yes. Even if it's AI and it's worth a trillion dollars. That's right. That'll be really wild to see and I'm going to enjoy that. There's another way to bypass all of this and get incredible transformation into your organization. That's right. That's right. And I, let me, let me tell you, it's to call, uh, is it 1-800 ThoughtWorks? Is it CircleCI? No. No. Jesus. Ah, no. That was going to be a smooth exit. No, there's no such thing. You made a joke. That's right. We are a New York City shop. We're a growing group of people. We are solution engineers and you say, Hey, I need to do this thing. So I'll give you, I'll give you a couple of examples. Right now, somebody called, they're like, Hey, we built this really cool app and it's a scientific app, but we're having trouble getting people to use it and we're like, Oh, this is great. Academic professors are using it. Yeah. And we're like, we are going to make this really pretty for you.
We're going to make it pretty and we're going to make it easy for people to walk in. And when I say pretty, I mean good UX and so on. We're going to do that as just like a prototype to see what happens next. Great. Okay. And then we got other ones which are much bigger, which are sort of like, Hey, can you replatform? I need everything. I need like, I need a new CRM and I need a new this and I need a new that. And we're like, yeah, we're going to put it all together for you in one nice, one nice package. So that's, that's who we are. That's what we're about. If you want to talk to us, you just send an email to hello@aboard.com. We're on that. We're on the list. We'll always take the call. Always, always. You know, the other thing, Richard, so we would love to hear from you, not just about business. Obviously we love business, but we want to also hear about what worries you about this technology, what you're learning, what you think we should be paying attention to. We'd love to hear about guests that we should have on. Maybe not. You don't have to send us like yourself as a guest. Sometimes that happens a lot. It's pretty awkward. But beyond that, unless you're awesome. Yeah, that too. Maybe we do want to hear from you anyway. Please get in touch. Please ask us any question and we would love to do more advice. How did they do that? Paul? Oh my God. Thank you. Well, they can just send an email to hello@aboard.com. They can also check us out on YouTube. They can give us that beautiful subscribe and thumbs up. They can subscribe to the podcast and they can give us five stars. So there's lots of ways to interact in really positive ways with us. And that's what we're all about. Have a great day. Bye.