How to design AI agent loops: schedules, goals, and subagents in Claude Code and Codex

The Big Idea

This episode explains "loops" for AI agents in plain English. A prompt is the instruction you give an AI. A loop is what happens when that instruction gets triggered automatically instead of waiting for you to type it every time.

The host’s main point is simple: stop thinking of AI as something that only responds in a chat box. You can set it up more like a dishwasher timer or a Roomba. At a certain time, or when something happens, it wakes up, does a job, checks whether the job is finished, and keeps going until it is or until it gets stuck.

The episode focuses on coding agents like Codex and Claude Code, but the idea applies more broadly. An agent can review pull requests, check Jira tickets, validate tools, send Slack updates, and even start smaller helper agents to do parts of the work.

Why It Matters

If you only use AI by typing one prompt at a time, you are basically hiring a smart assistant and then making it sit still until you tap it on the shoulder. Loops let the assistant watch the inbox, check the calendar, and handle repeat jobs on its own.

That matters because a lot of useful work is repetitive. Checking for new bug tickets, reviewing old pull requests, and testing whether a new internal tool still works are all chores that pile up. A loop can take care of them in the background.

The host also argues that this is where AI becomes more than a toy. Once the agent has access to your code, tickets, docs, and chat tools, it can do real work. But that also means you need to set it up carefully, because a bad loop can waste money, create messy output, or run longer than you expected.

Key Concepts

A good starting point is the difference between manual prompts and automated prompts.

A manual prompt is the normal chat setup: you type, the AI replies, you type again.

A loop is an automated prompt. The host describes a few common kinds:

Heartbeat: run at a fixed interval, such as every five minutes, and check something.
Cron: run at a set time, like every Friday at 10 a.m.
Hook: run when an event happens, like a new email or a tool call.
Goal: keep working until a clear result is reached or the agent gets blocked.
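
The three trigger types in this list can be sketched in a few lines of Python. This is only an illustration of how each one decides when to fire, not how Codex or Claude Code implement them. The class names, the `due` method, and the event name "email.received" are all made up for the example. The goal type is left out because it is a stop condition rather than a trigger.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Heartbeat:
    """Fires whenever a fixed interval has passed since the last run."""
    every: timedelta

    def due(self, now: datetime, last_run: datetime) -> bool:
        return now - last_run >= self.every


@dataclass
class Cron:
    """Fires at one set weekday and time (Monday is 0, Sunday is 6).

    Assumes something checks it once a minute.
    """
    weekday: int
    hour: int
    minute: int

    def due(self, now: datetime) -> bool:
        return (now.weekday(), now.hour, now.minute) == (
            self.weekday, self.hour, self.minute)


@dataclass
class Hook:
    """Fires when a named event arrives (event names are hypothetical)."""
    event: str

    def due(self, incoming_event: str) -> bool:
        return incoming_event == self.event


# "Every Friday at 10 a.m." expressed as a Cron trigger.
friday_at_ten = Cron(weekday=4, hour=10, minute=0)
```

The difference shows up in what each trigger needs to know: a heartbeat only needs the time of the last run, a cron only needs the clock, and a hook needs neither because it waits for an event.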

The "goal" idea is the newer piece. It is less like setting an alarm clock and more like giving someone a job with a finish line: "Keep going until the tests pass."
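
A goal loop can be pictured as a plain loop with a finish line and two escape hatches. The sketch below is a rough illustration, not the actual behavior of the goal feature in Codex or Claude Code. The function name `run_goal_loop`, its parameters, and the `max_iterations` cap are assumptions added for the example.

```python
from typing import Callable


def run_goal_loop(
    step: Callable[[], None],
    goal_met: Callable[[], bool],
    is_blocked: Callable[[], bool],
    max_iterations: int = 10,
) -> str:
    """Repeat `step` until the goal is met, the agent is blocked,
    or the iteration cap (a hypothetical safety limit) is reached."""
    for _ in range(max_iterations):
        if goal_met():
            return "done"
        if is_blocked():
            return "blocked"
        step()
    # One last check after the final step before giving up.
    return "done" if goal_met() else "gave_up"


# Toy stand-in for "keep going until the tests pass":
# each step fixes one failing test.
failing_tests = [3]


def fix_one_test() -> None:
    failing_tests[0] -= 1


outcome = run_goal_loop(
    step=fix_one_test,
    goal_met=lambda: failing_tests[0] == 0,
    is_blocked=lambda: False,
)
```

The important design choice is that `goal_met` is a check the loop can run by itself. If the finish line cannot be measured, the loop has no reliable way to know when to stop.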

The episode also covers what helps loops work well:

Worktrees: separate sandboxes so agents do not step on each other’s code
Skills: reusable instructions for common jobs
Connectors and plugins: the tools the agent can access, like GitHub or Slack, plus instructions for using them
Sub-agents: helper workers for smaller tasks
State tracking: a running to-do list so the agent remembers what it is doing
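
State tracking is the easiest of these to picture in code. The sketch below treats a markdown checklist as the agent's memory, which is one of the options mentioned in the episode. The helper names `next_open_task` and `mark_done` and the sample tasks are invented for illustration.

```python
from typing import Optional

OPEN = "- [ ] "
DONE = "- [x] "


def next_open_task(todo: str) -> Optional[str]:
    """Return the first unchecked task in a markdown checklist, if any."""
    for line in todo.splitlines():
        if line.startswith(OPEN):
            return line[len(OPEN):]
    return None


def mark_done(todo: str, task: str) -> str:
    """Return the checklist with the first matching open task checked off."""
    lines = todo.splitlines()
    for i, line in enumerate(lines):
        if line == OPEN + task:
            lines[i] = DONE + task
            break
    return "\n".join(lines)


# Hypothetical to-do list an agent might keep between runs.
todo = "- [ ] review PRs\n- [ ] send Slack update"
todo = mark_done(todo, "review PRs")
```

Because the list lives in a file rather than in the chat history, a loop that wakes up tomorrow can read it and pick up where the last run stopped.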

One example in the episode is a daily loop that checks for pull requests older than 12 hours, watches their merge checks, and sends a Slack message when they are ready. Another is a weekly loop that reviews recent code changes, suggests missing team "skills," then starts sub-agents to test whether those skills actually work.
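
The decision at the heart of the daily pull request loop is small enough to sketch. The code below assumes the PR data has already been fetched and only shows the sorting step. The `PullRequest` class, the function names, and the PR numbers are made up, and no real GitHub or Slack API is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class PullRequest:
    number: int
    opened_at: datetime
    checks_green: bool


def older_than(prs: List[PullRequest], now: datetime,
               max_age: timedelta = timedelta(hours=12)) -> List[PullRequest]:
    """Keep only PRs that have been open longer than max_age."""
    return [pr for pr in prs if now - pr.opened_at > max_age]


def summary_message(ready: List[PullRequest]) -> str:
    """Build the text a Slack connector could send (sending is not shown)."""
    return "PRs ready to merge: " + ", ".join(f"#{pr.number}" for pr in ready)


now = datetime(2024, 1, 5, 10, 15, tzinfo=timezone.utc)
prs = [
    PullRequest(1, now - timedelta(hours=20), checks_green=True),
    PullRequest(2, now - timedelta(hours=30), checks_green=False),
    PullRequest(3, now - timedelta(hours=2), checks_green=True),
]

stale = older_than(prs, now)
ready = [pr for pr in stale if pr.checks_green]          # report these
to_babysit = [pr for pr in stale if not pr.checks_green]  # keep watching these
message = summary_message(ready)
```

In the episode's setup, the PRs in the "keep watching" group are the ones handed to a separate thread that waits for the merge checks to go green.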

The Bottom Line

A loop is just AI on autopilot: a prompt that runs on a schedule, on an event, or until a goal is met.

Used well, loops turn AI from a chatbot into a worker that handles repeat tasks without being asked every time. Used badly, they burn tokens, run too long, and make a mess. The host’s advice is to start small, give the agent clear jobs, clear stop rules, and the right tools. That is what makes self-prompting useful instead of chaotic.
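
A stop rule can be as simple as a budget the loop checks on every pass. The sketch below is one possible shape for that, not a feature of either tool. The `StopRules` class, its field names, and the limits are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopRules:
    """Hard limits a loop checks before doing more work."""
    max_tokens: int
    max_minutes: float

    def reason_to_stop(self, tokens_used: int,
                       minutes_elapsed: float) -> Optional[str]:
        """Return why the loop should stop, or None if it may continue."""
        if tokens_used >= self.max_tokens:
            return "token budget exhausted"
        if minutes_elapsed >= self.max_minutes:
            return "time limit reached"
        return None


# Hypothetical limits for a small weekly loop.
rules = StopRules(max_tokens=200_000, max_minutes=30)
```

Checking a limit like this outside the agent's own judgment means a thin success test cannot keep the loop spending forever.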

Full Transcript

Source: openai 29m runtime

Prompts are out and loops are in. If your agent isn't able to prompt itself through an automation, what are you even doing? In today's episode, I'm going to teach you what a loop is in normal person speak, how to write one, when it's useful, and some pitfalls to watch out for. We will be doing this in Codex and in Claude Code. And at the end of this episode, you'll be one of the cool kids whose agents prompt themselves. Let's get to it. This episode is brought to you by WorkOS. AI has already changed how we work. Tools are helping teams write better code, analyze customer data, and even handle support tickets automatically. But there's a catch. These tools only work well when they have deep access to company systems. Your copilot needs to see your entire code base. Your chatbot needs to search across internal docs. And for enterprise buyers, that raises serious security concerns. That's why these apps face intense IT scrutiny from day one. To pass, they need secure authentication, access controls, audit logs, the whole suite of enterprise features. Building all that from scratch, it's a massive lift. That's where WorkOS comes in. WorkOS gives you drop-in APIs for enterprise features, so your app can become enterprise-ready and scale upmarket faster. Think of it like Stripe for enterprise features. OpenAI, Perplexity, and Cursor are already using WorkOS to move faster and meet enterprise demands. Join them and hundreds of other industry leaders at workos.com. Start building today. Okay, so why are we all prompt maxing? Of course, it's Pete at OpenClaw who told us we are old news if we are prompting and we really need to be designing loops where our agents can prompt themselves. Now, this one tweet spun off tons of content about what is a loop, how to use a loop. And to be honest, I don't think any of them explained it very well.
So I am here to answer your safe space questions about what is a loop, how do I get one set up, is it really that useful, and should I really be letting my agents prompt themselves? I think the answer is yes, and yes, there are tons of great use cases for loops, and we're going to talk about how you can use those and how they can be beneficial, especially with software engineering. But there are some reasons why you wouldn't want to use loops, and honestly, I still do a little prompting. So don't worry if you are not loop maxing, you're in good company, and you can still get a lot done with AI. So to answer what a loop is, I'm just going to make this super simple for you all. And this goes back to one of the earliest articles I wrote on OpenClaw, which was this article about why OpenClaw feels alive even though it's not. And the core of this article was explaining that there are many ways you can prompt an AI agent. And often we only think about one way to prompt an agent, but actually there are many ways an agent like Claude Code, like Codex, like ChatGPT, like name your favorite agent here, can be prompted. And I want to go over what those ways are. First, there are messages. This is a human triggered input. This is probably how most of us are prompting our agents. We are going to a chatbot and we are typing in some sort of prompt, waiting for a response, and then typing another response. That is a message turn-based prompting strategy. I still think there's use for this kind of prompting. I use it all the time, but that is not what we're talking about when we're talking about loops. Instead, when we're talking about loops, we're talking about automated prompting of an agent. And there are a couple form factors that can take, and I just want to remind you of what those are. And I'm using OpenClaw because I think it demonstrates these types of prompt loops, but it is not the only system that does them. So the first one is a heartbeat.
You can set a schedule, like every 30 minutes, every hour, every five minutes, and on that schedule, it's going to kick off a task. And so you're going to say, every five minutes, check if I have a new Jira ticket. And if so, start a coding agent to triage and fix that Jira ticket. That's sort of like on a heartbeat. Every five minutes, I want it to do that. Then there is a cron. A cron is at this time or on this schedule, do this. So it can be at 9am. It can be at a specific time. It can be every Sunday night. These crons are a little bit more scheduled. A heartbeat is kind of on a regular basis. Crons are more on a set defined schedule. And then the last thing that I've talked about are hooks. So you can prompt an agent based on an internal lifecycle, like a tool was called, a session was started, a session was reset, or an external hook, like a webhook from an external session. Every time I receive an email, I want to get a webhook and kick off some sort of agent. And I only remind you of these things because these are common ways to do automations outside of AI. So we were doing automations on heartbeats, on crons, on hooks way before AI even happened. But now you can do that in order to prompt your AI. And so I think this whole concept of a loop is really just reminding people you do not have to use your human fingers to type in a prompt in order for your agent to do work on your behalf. Now, what's different between when I wrote this article and now is a new type of loop has been shipped as a first-class citizen of both Claude Code and Codex, which is a goal. A goal is a type of loop that sets an outcome and runs an agent against that outcome until the outcome can be measured and validated or the agent is blocked. And so I'd say there's one more loop type that's becoming pretty common in AI coding in particular, although I think there's lots of use cases for it.
But again, pretty simply, a loop is a scheduled or kind of semi-autonomous automation that allows an agent to instruct itself what to do, prompt itself and get that work done. Now, what do you need to write an effective loop? I like this article by Addy Osmani about loop engineering. I think it's really good. It does break this down pretty well. You can see it's fairly recent. It's from this month. But my favorite part about this article is what you need to write a good loop. To write an effective loop, you need these five things. I like how this is written out in this block in that it tells you what the thing is. It's an automation, kind of what its job is. So it's like triage of a task to be done or a prompt to be set on a schedule. And then it shows you how Codex and Claude Code do this. And so for Codex, your automations can come out of the automations tab, and you can actually define your automation in the schedule there. And then in Claude Code, you have scheduled tasks. Both of them have slash goal, and then they all have different hooks and integrations. Claude Code has the benefit of GitHub Actions, which I think is nice for engineers. But both of them are basically at parity in terms of the types of automations that you can run. And then a couple other foundational things that I think are helpful when you're running loops. And why are these things helpful before we get into what they are? They just keep the work clean. If you are going to be YOLO-ing loops all over the place, you're going to want some consistency in execution. You're going to want clean workspaces. You're going to want conflicts resolved and avoided. And so all these things are really to make those loops effective. And so what are those things? They are worktrees. I feel like this entire podcast could be Git 101, but worktrees are just basically a way to isolate the work, especially the coding work of an agent, away from other agents' work in a sandbox.
There are skills, repeated ways to do common tasks. We have a full episode on what skills are from earlier last year when they came out. Plugins and connectors. These are just the tools that your agent has access to. And so those can be like GitHub connectors, connectors to Google Docs and Google Calendar, plus plugins, which are some instructions on how to use those tools. Sub-agents. Both Codex and Claude Code allow you to kick off sub-agents. This is just a way to federate out work from the main thread so that sub-agents can do specific tasks, especially validation. And then there's some way to track state. And essentially, just think of this as like a to-do list. So you can save it in a markdown to-do list. You could use Linear as a task tracker. Both Claude Code and Codex use this. And so if you put all this together, basically what you have is a way to kick off an automatic prompt, a way to keep that prompt going until the job is done. And the way you keep it going is you can keep it scheduled or you can give it a goal and it can't exit the loop until it's hit that goal. And then you empower this agent that has been kicked off autonomously with the isolation it needs not to get in each other's way and the tools it needs to get the job done, including its little army of sub-agents. That's it. So again, I promised you I'd explain it to you very basically what a loop is. A loop is a way to autonomously kick off an agent with a prompt or set of prompts on a schedule or on kind of a recurring basis until it's done. It could be done because the time's up or it could be done because the job's done. So I am putting in those instructions, and I'm gonna say I want this to run, let's see, daily at 9 a.m. That's fine. Actually, let's have it daily at 10:15 a.m. because it's about to be 10:15. Okay, and then I am going to have it work in my ChatPRD branch on base is fine. And I'm gonna create that automation. Now, there's a loop.
And a couple things I want to call out about this loop. It happens every day at 10:15, so it happens on a schedule. I don't have to come in and say what PRs do I need to review. And it's gonna tell me the next time it's gonna run. And then one thing that I want to call out is, if you remember I said your agent can have agents, I called out here that if there are any PRs that need to be babysat, you can spin up a thread to babysit that PR until all merge checks are green. So not all the work has to happen in the one master thread. It can actually kick off sub-agents or other threads to watch the work. And so I'm gonna go ahead and not wait the four minutes and run this now. Okay, so once this kicked off, yes, it's gonna prompt with that original prompt that I put in the routine or automation, but then it's gonna be pretty autonomous and work itself. And you know, I've given it basically two outcomes it needs to go after. It needs to identify anything over 12 hours that it can watch and actually monitor and make sure all the merge checks are green itself. That's success criteria one. And then success criteria two is it would use our Slack connector to send us a message. Again, I'm not gonna make you watch this, but you can see here it's gonna work all by itself. I am not gonna have to monitor it and all I'm gonna get at the end of the day is a good set of PRs that are ready to merge and some mean messages about how we're ignoring good PRs and not putting awesome product in the hands of our customers. So again, I wanted to demystify what a loop is. It is just something that happens on a schedule. Now this is a very simple loop and it has access to a bunch of connectors. It has access to GitHub. It's gonna have access to Slack. That's already set up. So I feel like this agent or this like pseudo employee with a job is well set up to be successful, but this is a perfect use case for a loop and a very, very simple time-based one. Okay, and it says no Slack MCP surfaced. 
I am going to make sure that Slack is turned on. There we go. Now it should be fine. Okay, now let's talk about a more advanced loop. So I wrote that one in Claude Code. It is a scheduled routine. It is pretty simple, but I'm gonna also pull up Codex and show you another loop that I think is really interesting that's a little bit more complicated and a little bit more technical. Before I go into writing a more complicated loop, I wanted to call out some of the things that I like in Codex when you're thinking about or learning how to write loops. So in Claude Code, they're called routines. In Codex, they're called automations. And what I like about what Codex has done is they have these templates. And so they actually have given you a couple good ideas of quote unquote loops, automations, routines that you can run. So if you're looking for inspiration, I would really look at these automation templates. And I'm actually going to use one. I think it's this from recent PRs and reviews suggest next skills to deepen. And so this is sort of a meta tool that I'm gonna use, which is look at all the code we shipped, look at all the code commits and comments, and then come up with skills that our coding team, including agents, could use to deepen the work. And so I'm gonna select that one. It's gonna happen Fridays at 10 a.m. It's gonna happen weekly. I think weekly is right. Again, you want enough data in these loops for it to do a good, long job. But I'm gonna give it a little bit more information. So the prompt is out of the box. From recent PRs and reviews, suggest next skills to deepen, grounding rules, anchor each suggestion to concrete evidence, avoid generic advice, make each recommendation actionable and specific. I'm actually gonna be more specific.
If we have developed any tools for agents or developers to automatically validate their work, ensure that we have a skill for those tools, specifically command line tools or MCPs where an agent or a software engineer can run a test suite or a smoke test against a specific use case are very important to build skills around. I'm gonna add one more thing to the skill just to show the power of sub-agents and automations. If you identify a skill, spin up its own thread and use that skill validated against the base branch of the repo. We want to confirm that the skill actually works and outputs high quality. Okay, so this is like a loop with sub-agents that is probably gonna generate its own loop. I'm actually gonna force it to generate its own loop by saying, you should use a goal when validating this skill. So when you prompt the sub-agent, make sure you prompt it with a very specific goal it can use to validate against. You know, basically when you write a loop or a goal or an agent, you just say validate loop, goal, validate loop, goal, and you're good to go. But this basic prompt is saying, okay, every Friday, I want you to look at all the code I merged. I want you to identify skills that are missing. There are specific types of skills that I think are very important, which is skills to use some of the internal tools we've developed. If you see a new skill, I want you to spin up a sub-thread, another chat. I want you to validate that skill with a goal loop. So not only are we setting a loop on a schedule basis, we are setting up sub-agents to work on specific things. And then we're using a goal in those sub-agents, which is a different type of loop, to validate the work. So this is like a very meta task, but I think one that illustrates the power of loop-based prompting. It doesn't just have to be on a schedule. It can be: on a schedule, set up a team that does work on a schedule; or, on a schedule, set up a team that does work on a loop until it's done.
And so I'm gonna go ahead and create that. And then again, I'm gonna just run this now. And we're gonna see here that this agent is going to spin up, the automation is going to start. One of the things that Codex does is kind of interesting, is it sets up its own memory. So you can see here a little bit of the scaffolding of what an automation looks like. And then it just gives its own prompt. Now, again, it's gonna go ahead and search the code, run its own commands. It's gonna look at GitHub and it's hopefully going to create those new skills. And then what ideally we're gonna see in the left-hand side in these all chats is new threads being kicked off to test the skills that it's identified it needs to run. And so it found one strong automation candidate. Let's see if it actually kicks off a thread to validate it. Okay, so it did. It identified a chat smoke CLI skill. Basically, this is a command line tool I built to sort of test chats without having to use the UI in ChatPRD. And it basically spawned a dedicated sub-agent to test the skill with a goal to test it against the base branch and tell us whether its instructions actually hold up in practice. So look, it spun up this agent. You can see agent, it's got a little key name. And it's given it a goal. So you can see here, it's pursuing this goal, which is validate the local repo chat smoke CLI skill on the base branch. And it's basically gonna loop until that validation is done. So what we're gonna see is more and more sub-agents being kicked off. You can click them here by clicking this little dropdown. So I see Gauss, which is working on my smoke CLI skill. And then let's see, Galileo is working on a different skill. It's working on the GitHub address comment skill. So it's basically like a babysitter PR skill. And so this automation that I've set up happens on a Friday. It's gonna look at our repo. It's gonna create skills. 
And then it's gonna create sub-agents that are on, again, a goal, which is a type of loop to validate that those skills work. And it's just gonna do it over and over again until it has done as many skills as it thinks is appropriate for the last week. And so that is like my mega loop that actually I did not think to do until live on this episode. And it's gonna be really useful for me on a regular basis. So I'm gonna let that run. But before I get you out of here, I just wanna talk about a couple warning signals around loops. This is amazing. We all want our agents to work for us on a schedule whenever we want, doing work that we don't wanna do. It's great. What are some of the problems? One, loops can get expensive. So I just kicked off an automation that happens on a regular basis. It does wide-ranging work. It decides when to spin off sub-agents. And it does loop-based validation, which means it's burning tokens until it hits a threshold that it decides is successful. If you do not write that loop well or your validation criteria is too thin, guess what? Your agent is going to burn tokens. I think we've seen this with OpenClaw in particular, or some of