Building AI Employees for Hospitality: How AITropos Takes Orders Where Customers Already Are

Overview

This episode is a conversation with Santi Marchiori, CEO of iTrepos, and Juan Aedo, CTO, about building "AI employees" for hospitality. Their company focuses on restaurants and hotels, with a tighter focus on one hard, high-value job: taking orders accurately and fast enough that customers treat the experience like texting a real person.

A big theme runs through the whole discussion: the flashy demo is easy, production quality is not. They argue that the real work is turning messy human conversation into structured, reliable actions inside POS systems, payment flows, delivery checks, and support processes.

Key Takeaways

iTrepos did not start by trying to automate everything in hospitality. They narrowed in on order taking because it was both difficult and valuable. That choice came after years of domain experience and about two years of focused idea exploration together, including many dead ends and a pivot from tools for waiters to direct customer ordering.

The founders make a sharp distinction between "AI" as the product and service quality as the product. They say guests do not care about the model stack. They care whether the right order shows up, whether the response is quick, and whether the interaction feels natural. That framing keeps them from chasing novelty for its own sake.

One of the more useful points in the episode is where the real difficulty sits. Integrations matter, but Juan and Santi say the hardest part is translating non-deterministic conversations into structured order data that can be trusted by deterministic back-end systems. Menus, modifiers, combos, sizing, delivery rules, and POS differences make this much harder than a chatbot connected to a FAQ.

They also show how fast this market shifts. At one point, they were stuck on a problem and then a new model release made the system suddenly workable. Their view is that you sometimes have to build ahead of the current model and bet that the next generation will close the gap.

Their main quality metric is simple: did the system identify the correct items? That cuts through a lot of vague AI evaluation talk. If the order is wrong, nothing else matters.
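
As a rough illustration of that KPI, here is a minimal sketch of an item-level accuracy check. The episode does not describe how the team computes the number, so the function name, the string item keys, and the choice to score only the items the customer actually ordered are all assumptions made for this example.

```python
from collections import Counter


def item_accuracy(expected: list[str], predicted: list[str]) -> float:
    """Fraction of the items the customer ordered that the system captured.

    Hypothetical scoring rule: duplicates count separately, and extra
    predicted items are ignored (a stricter metric could penalize them).
    """
    if not expected:
        return 1.0 if not predicted else 0.0
    expected_counts = Counter(expected)
    predicted_counts = Counter(predicted)
    # Multiset intersection: an item only counts as many times as it
    # appears in both the real order and the captured order.
    matched = sum((expected_counts & predicted_counts).values())
    return matched / len(expected)


# Two large margheritas and a cola were ordered; one pizza was missed
# and an unrequested side of fries was added.
score = item_accuracy(
    ["margherita-large", "margherita-large", "cola"],
    ["margherita-large", "cola", "fries"],
)
print(round(score, 2))
```

Averaging this score over many test conversations would give a single number to track from one release to the next.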

Practical Steps

If you are building AI products, a few concrete moves stand out:

Pick one job, not a department. iTrepos chose order taking instead of trying to cover every hospitality workflow.
Start with the operational bottleneck. Look for tasks where errors are expensive and speed matters.
Prototype fast, but assume the real build starts after the prototype works once.
Measure one core outcome. In their case: how many items were captured correctly.
Use synthetic testing at scale. They run thousands of agent-to-agent test conversations overnight, then use another agent to review failures.
Keep humans in the loop during onboarding. For new customers, they audit live conversations and step in when needed.
Shrink latency aggressively. They improved response time through parallel processing, internal order handling, caching, and preloaded context.
Build for repeated patterns. They can onboard a new pizza shop faster because they already understand that menu structure and ordering behavior.
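
The parallel-processing step can be sketched roughly as follows. This is a minimal example of running several product lookups concurrently instead of one by one. The menu contents, the function names, and the simulated delay are hypothetical stand-ins, not the team's actual implementation.

```python
import asyncio
from typing import Optional

# Hypothetical in-memory menu standing in for a real menu database.
MENU = {
    "margherita": {"id": "pizza-margherita", "price": 12.0},
    "cola": {"id": "drink-cola", "price": 3.0},
}


async def search_product(query: str) -> Optional[dict]:
    """Look up one product; the sleep simulates database or network latency."""
    await asyncio.sleep(0.05)
    return MENU.get(query.lower())


async def search_all(queries: list[str]) -> list[Optional[dict]]:
    """Run every lookup concurrently and return results in input order."""
    return await asyncio.gather(*(search_product(q) for q in queries))


results = asyncio.run(search_all(["Margherita", "cola", "tiramisu"]))
print(results)
```

With three lookups the total wait is close to one simulated delay rather than three, which is the effect described in the transcript: search for all products at the same time, then build one response from the combined results.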

Notable Quotes

Santi Marchiori: "Our product is delivering a high-quality service for guests and restaurant customers."
Santi Marchiori: "Anybody that uses AI can do a prototype in a day. But making sure that this is consistent and that it works every time takes a lot of time."
Santi Marchiori: "We like to stick to one KPI, which is how many items did we identify correctly."

Juan Aedo: "If it's not allowed by physics, then it can't be done. But if physics allow it, then you can do it."

Full Transcript

Source: openai 1h 07m runtime

Welcome to Just Now Possible with Teresa Torres. Hi, my name is Santi Marchiori. I'm the CEO at iTrepos, and I'm an engineer, MBA. I've been doing software product for the past 10 years, and I'm an AI fanatic, absolute fanatic. Love it. I'm Juan Aedo. I'm CTO at iTrepos. I'm a software developer. I've been doing software development since the age of around seven or eight. I'm 42 now. I got my first computer when I was a little kid and never stopped getting into it. And with that in mind, I've always been working with cutting edge technology, always looking for the latest things. Of course, AI excited me as soon as it showed up. I have a degree in data science, and I have a lot of experience, more than 15 years of experience in hospitality software, for hotels and restaurants, been working on that. Yeah, that's how we got where we are now. Amazing. One of my questions I want to dig into is why hospitality and how you found this space. But before we get there, tell me a little bit about what does iTrepos do? Yeah, so we're actually building AI employees for the hospitality industry, and we're trying to generate real operational impact, right? So it's not just a bot, not just a chatbot, but it has a lot of tools, and we have many integrations so that we can get done real operational work. That's what we're doing. We're addressing restaurants and hotels mainly, but we also have bakeries as customers, we're looking into bars, and every hospitality business can use our AI employees basically. Yeah, I love that. Okay, so we've had actually quite a few companies on the podcast where they're basically creating AI employees in one way or another, whether that's like customer service agents, or one of our recent episodes was with a company that's creating agents to help with managing clinical trials. I think this is a very hot space right now because AI is capable of so much, but it's also a little bit of a tricky space, right? 
There's a lot of fear around, are jobs going away? And I think especially in the hospitality industry, I could see there being, these are the types of businesses where it's probably hard to find good employees, especially if we're talking about bakeries and restaurants, the owners are probably tapped out and need help. How do you think about this balance of what's good for AI, what's to do, what's good for humans to do? I think especially in hospitality, I think of hospitality and I immediately think service. Like when I show up to a hotel, I like having a human welcome me. Do you want to just tackle some of this? How do you think of these hard challenges? Yeah, and that's an amazing point. And of course, AI employees are not for every type of hospitality business, but they are super useful for a lot of services. For example, one of our niches is a QSR, quick service restaurants, right? Because in those cases, you don't go for the human attention, you just go for food. And the same thing applies for hotels. There's some hotels that, of course, you have to have the human attention, the human touch, and there's other hotels that you just want to get some room service or that you just want to schedule the taxi to the airport. So that's where we believe those are our targets. They are super distinguished and the customer, the restaurants, and the business owners really understand when these type of services are useful for customers and when they might need the human touch. Yeah, if I'm sitting by the pool and I need a margarita, I'm okay with ordering from an AI. To your point of where it impacts on places where this technology can be used, the places we're looking to do it is not replacing where a human should be, but where the technology is in the middle, in between humans. So basically, instead of having a human-to-human conversation through a platform, we know that we can replace that interaction with human-to-agent or human-to-AI through that same platform, right? 
So even more so, some other places where we think we can be very competitive is where technology even doesn't have a person on the other side. So for example, when you're using an app, you are basically focusing on the conversational interface rather than the app interface. Yeah, and maybe just to finish wrapping up the idea, our product, of course, we market our product as AI employees. We're even thinking about changing that because AI is not our product. Our product is delivering a high-quality service for guests and restaurant customers. And we actually do that, of course, with a lot of AI, but sometimes we need humans to jump in, right? So our main focus is to deliver an amazing experience. And one of our goals is to pass the Turing test. The idea is to make customers feel like they are speaking or interacting with a human. And we are achieving that in many cases. In many cases, people, the final consumers, thank us and even send us pictures, food pictures to thank us. And they think that they explicitly think about how the attention was super close. I like this framing. It's interesting. Let me back up. I can see clearly, like in hotels, there's a lot of roles where I could see your service being really helpful. You already mentioned room service. I joked about the margarita by the pool. But anybody who's been to a resort that's busy has had this experience of you just can't even find a human. Your example of a walk-up restaurant makes a ton of sense to me. I'm curious about the restaurant category in particular. You mentioned a bakery. These are more experiences where I feel like we're used to talking to a human. And I know here in the U.S. during COVID, a lot of our restaurants moved to QR code menus to reduce human contact. But we're seeing, at least where I live, we're seeing all of that go away. People want to connect with humans. But I also know restaurant owners that can't find good employees. 
They're working two full-time jobs and still struggling to run the business. Tell me a little bit about where do you see this playing a role in restaurants? Just so I can get a clearer picture of what types of roles your product is filling. Let's jump right into a specific example. Let's say McDonald's. Today, you have to either order through the kiosk or wait in line to be served by a human. Imagine if you were able to just get into the McDonald's, take a seat and order through a voice message on your phone. And of course, the employee will tell you when your food is done so that you can pick it up at the counter. Or if the restaurant can have runners, they will take the food to your place, to the table where you're sitting. That's one of the main use cases that we're aiming for. Yeah, I can see this being really powerful. What immediately came to mind is my town has a big concert venue and outdoor amphitheater. And I would love to be able to just order a beer and have it come to me rather than walk into the tent and standing in line. Yeah, amazing. In line, there is a huge potential for AI employees to take care of their orders. Yeah, that's a great way to think about it. Why do we stand in line? Correct. Juan, I want to go back to something you said. In your intro, you said you've been in the hospitality industry for 15 years. Is this how the two of you landed in this space? Tell me a little bit about how you found this as the area you wanted to work in. Yeah, actually, we met with Santi on a previous company we were working on that was mostly related to marketing, but did marketing for hospitality, basically, for hotels. But yeah, I got into that company because of my experience also working with hospitality systems. So basically, I worked with the company here in Argentina for around 15 years. Actually, it's more. I always say 15 because I'm used to telling that, but five years ago was 15. So we could say that it was... 
And I'm still sometimes working with them. I do consulting for them, so that's kind of still there. But this company basically builds one of the PMS softwares, property management software system, which is what hotels use for their operations. It's one of the systems that has the biggest market share in Argentina. And they also have, of course, a POS system, which is for restaurants. So I've been working with those systems for a lot of time for all of these 15 years that I mentioned. And that got me into kind of understanding how the business or that market works. So after meeting with Santi and working on all of this, of course, it was mostly... It went on its own, decanting on that specific market because of my experience. Santi, I know he can tell, but he's also been working a lot on the industry as well. Previously, he's been working on other companies. So we both had the knowledge, maybe from different areas, but totally related. Both super nerdy about the cutting-edge technology, super nerdy about AI. And so it was a no-brainer to get started with this. Of course, there were... Just because of the contacts we have or what we've been working on for a lot of years, we had a lot of easier entrance into the market than if we had thought about, I don't know, going into, I don't know, automotive, automobile market, which I know nothing about cars. Yeah. So it sounds like you both had a lot of domain expertise in this area. You were geeking out on the technology and just excited to play in this space. One thing that's interesting to me is you have a very broad problem space. So hotels have a lot of employees. They have a lot of use cases. Restaurants have a lot of employees, a lot of use cases. Tell me a little bit about how did you decide what to do first? That's definitely a great question. We did decide to work in the hospitality industry. 
But actually before that, with Juan, we spent two years meeting with industry experts and analyzing hundreds of different ideas. And we finally met someone with a lot of domain expertise in the restaurants area, in the restaurants industry. And that's when we realized that there is one specific use case, one specific feature that can unlock an immense potential, which is order taking. There's lots of companies, lots of people helping with chatbots, providing information, etc., etc. But there is this specific use case, which is taking orders, which is super hard and super valuable. So actually we do have different employees, but our main feature is being able to take orders. And that's how we actually decided about it. We actually started doing an assistant for waiters. That was the first thing that we started building. We literally spent six months working on a device, on a physical device that waiters would use. And while working on that, we realized that taking orders was the hardest part. So we said, okay, let's focus on this. And then when we started offering this solution to different restaurants, to different potential customers, they started asking for this service to be deployed in for customers directly. And that's how we found out that there was huge potential for a solution like this for customers. So we pivoted and we started focusing specifically on that. And were you using AI already at that time? I can't remember a time when we didn't use AI anymore. I don't even want to think about that. Gives me the chills. I think if you're familiar with the product market fit question of how disappointed would you be if this product went away? I feel like AI, for people that have embraced it, disappointed is the wrong word. How devastating would it be if this technology went away? I think I wouldn't be able to breathe. Yeah, I know. Sometimes I wake up and Anthropic has downtime and I'm like, how do I do my job today? That happens to us. 
When we run out of credits, for example, we use AI a lot to code. As you might imagine, that helps us move faster. And as soon as we run out of credits, it's, oh, what do I do now? Do I have to code manually? No, no. Yeah, that's usually when I eat a meal. I'm like, just step away from the computer, go have a meal, go outside. And it's fun because we literally are the first ones to find out when some of the LLMs is not working. We jump to X and there's nothing there, like silence. Five minutes later, a thousand tweets. Okay, so you know what I really like about your story is you clearly had domain expertise. You still took a lot of time to figure out the right problems to solve. You found an area that looked promising. You started to build in that space. Your customers, it sounds like, pulled you even further. Forget waiters, do this for our customers. Santi, you said you spent two years looking for problems to solve. Was that full time? Were you both working somewhere else? Tell me a little bit about that exploration space. Yeah, and two years is an understatement, to be quite honest. I spent the last 20 years thinking about startup ideas, right? But the two years was specifically related to Juanu and myself. Both of us together, as soon as we met, we started enjoying very much our conversations and started thinking about so many ideas that we could build. Actually, just a very quick, both of us love astronomy. And at some point, we fantasized with building a company that is called TAS, which was Telescopes as a Service. Nice. We were trying, our idea was to use SpaceX to put in orbit a telescope and lease the time of the telescope. So that's how we literally spent a lot of time together thinking about ideas. We were working full time. So yeah, we did it on our spare time. Yeah, I hope you someday also make that telescope company. I think that would be fun. Yeah, absolutely. Okay, so let's get into this a little bit. 
You got pulled into, you were first making software for the waiter to make it easier to take orders. This got pushed into, can we just give it to the customer? The first thing I love is that this is a very specific use case. You're not looking at a hotel and saying, let's do all their jobs for them. You're saying, let's take orders. The other thing I love about this is, Santi, you mentioned this was a hard problem, and I can imagine you're not just spinning up a knowledge base and being an answer bot. You've got to integrate with point-of-service system, point-of-sale systems. I'm imagining you're interfacing somehow with a kitchen that maybe is making food or something real in the physical world where that order is turning into something real. So give me a sense of what does it take to solve a problem like this? What's the big picture? Yeah, so the connections, the integrations, all of that, although they are quite complicated, those are not the hardest part. The hardest part is being able to translate the not deterministic world of human conversations and LLMs into a structured information so that you can feed that to systems. That is one of the hardest parts. So that's what took us a long time. You can do a prototype in a day. Anybody that uses AI can do a prototype in a day. But making sure that this is consistent and that it works every time takes a lot of time. That's one of the things that we realized when we started working. And how did we solve that? Putting a lot of hours. Putting a lot of hours, understanding, and making an architecture that is super advanced. I'm going to let Juanu speak more about all the architecture that these agents are using. They are not a simple prototype that you can build in a day. Let me make sure I understand where you said the challenge was. First, you mentioned the non-deterministic human, which I love this because we all talk about the non-deterministic LLM. But it turns out humans are also non-deterministic. 
Yeah, the platform is actually channel agnostic, so basically, we started with WhatsApp here in Latin America. It's like the OS, the operating system of Latin America. That's what we use, but we could easily connect to iMessage, SMS, any other channels, and it's part of our roadmap as well, and it's the easiest thing to connect for us. We're just using WhatsApp right now because of that. But one thing I wanted to mention regarding what Santi said about the challenges was all of the things that Santi said, not only did they have to be correctly done, but as you might imagine, since you're doing real-time order taking, the agent has to be fast and responsive and respond correctly in a timely fashion where it doesn't keep the customer waiting. As you might imagine, agents today are mostly used for long-running tasks, right? And that takes a lot of time to process information. So one of the biggest challenges, and I had Santi here on my ear constantly, we need to work. I remember, I don't know if you saw The Playlist, the series based on how Spotify was built. Okay, so The Playlist is a series where it shows how Spotify was built from the different perspectives of the builders. And one of the things you could see is how the person who had the idea, I can't remember his name, was constantly telling the technical guy, faster, I need songs to load faster. No, the time between a song and another song has to be faster, faster. That's how I felt with Santi on my side. But that paid off because we're actually right now at a place where, like Santi said, people are not noticing they're chatting with an agent, not only because of the way the agent responds, but also because it responds not too fast, but not too slow either. And that's kind of part of the challenges that we were trying to solve. 
I love that you mentioned not too fast as part of the challenge, because I know, like, when I send a support email and I get a really detailed response one second later, I'm like, yeah, that was an AI. Okay, so it seems like there's this first layer of the human gets to enter whatever they want, so you got to deal with the messiness of this. You have an agent that is trying to understand that message. I imagine your agent is doing the heavy lifting of interacting, integrating, like structuring that input in a way that works with your now deterministic systems, point of sale, whatever. So give me a sense of, I want to go back to, like, day one. What was your first prototype? How did this start? How did you even evaluate if AI could do any of this? Yeah. I thought you can stop at one thing I want to say is that the core piece was always the same one, which was basically have an integration with this external system. That was something that remained along these different iterations. But I let Santi talk about the first product we built, that we iterated. Actually, we had two. I think we're on the third iteration right now, right? Santi, we have first the hardware, then we have the custom app for waiters, and now we're actually at the end. Yeah. Yeah. I can't remember how many iterations we've done on this product, to be honest. But yes, so we always, we were super optimistic about this, Teresa. We were super optimistic. We saw the potential of AI, and we actually never thought that this couldn't be done. I think it was our determination to make this work, which made us push very hard. And again, the first time, one of the best things about AI is that it gives you some very quick dopamine hits, because making a prototype, it's awesome. It's awesome. But that is good and bad at the same time, because you have no freaking idea what you are starting to do, what you are getting into. You have no idea before you start. 
But it gives you this dopamine hit, and it's like you feel like a superhuman, and you're convinced that you can do it. That was where we were. But I honestly had some doubts along the process. We spent a few months working on it, and we still had an unacceptable error rate because we wanted to make this perfect. And that's when we started testing different ideas, playing around with so many tools, so many different agentic architectures. We have five different types of RAG with different... It got complicated. At some point, we started thinking about the physics and how it should evolve, et cetera, et cetera. But yeah, it was super, super hard to be able to really understand every time what the customer is ordering, especially in different restaurants that probably have products that are quite similar. So if you feed that to a prototype, you're done. That's when you say, okay, this might be harder than it looks. But yeah, it was a hard process, but right now it's working so well that we are very proud of what we built. One thing to mention is, and this is anecdotal, but like Santi said, sometimes you could be overwhelmed at things not working and you start having questions. Like Santi said, we never thought that this couldn't be done. We just were thinking whether we were at the right time. And I have this memory, I have this snapshot of a chat I was having with Santi, chatting with him, where we were doing this second iteration. So the first iteration was a hardware that was, the idea was that kind of like having a headset for waiters where they would just talk to an agent, the agent would help them, et cetera. That was super hard, not only because of the model, but also mostly because of the hardware. Second one was something similar, but on an app where it was mostly like a chat app. And then the waiter would just make the order, take the order from the customer, make the order on the app, and then generate the order and send it to the POS. 
Like I said, all iterations had the core idea of integrating with the system. Now this third one is with the chat. So when we were at the second iteration, we were trying to get our agent or agents to build an order, right? Orders are super complex objects in data terms, because you have the product, the product can have a variation in the recipe. The product can have something called a modifier, which is like large, small. The product can have extra products linked to it, so you could have like a promotion. If you buy two products separately, they have one price, but if you buy them together, they have another price. And depending on which POS you have, all POS have a different data structure. And that's actually a regular problem in POS systems. Like, I tend to think that POS systems are still an unsolved problem because they all have different ways. Each restaurant or each venue has a lot of their own different ways of doing things, so they all have in the end to ask for specific custom implementations from the company that developed the software. So there's a lot of, as you might imagine, there's a lot of variants and working on implementation on integration with all of them for us, it's super hard. But that wasn't the hardest part. So when we were doing this second iteration and we couldn't get the agent to correctly build the order, I was like, Santi, I remember just a message from Santi. Juanjo, I'm not sure. Can we do this? It's not fully working as expected. And then I was like, trust, Santi, we're gonna make it. And in the end, it turns out that it's about not giving up because the only limitation is the technology. I remember having, like I was saying, most of the time it's basically trusting your product and understanding whether the problem is that, can it be done or not? And if you think it can be done, what's keeping you from it? 
And it's most of the time, if it's the technology that's limiting you, it's whether do you think the technology will be there at some point or is it like a physical limitation? I think even Elon Musk works with this idea that if it's not allowed by physics, then it can't be done. But if physics allow it, then you can do it. We're super far away from that. But I remember having Santi sending me this message saying, Juanjo, I don't know if we can do this. It's not working as expected. We're taking a lot of time. And I was like, let me see, Santi. Trust, trust we can do it. And just that exact same day, one of these companies that we use for models released a new model, smarter. And it was just a matter of, let me try with this. And I just switched the model and it started working, like without even changing anything from our side. Of course, there were a lot of nuances that we then fixed and improved and we're constantly doing that. But it was just a matter of waiting for the improvement on the technology, the base technology that we were using to have our product fully up and running. And that's when we said, yes, this can be done. And the fact that we actually got to the point where the model that we needed showed up for us to get moving faster or actually get moving, made it, okay, man, we're on the right track and we are early. Yeah, this is, I love this because I think, I think it was Andrej Karpathy who said, it was either Andrej Karpathy or it might've been Boris Cherny from Anthropic, said to build your product for the model that's coming out six months from now. And what's fun about this space is that like you build something, and then time passes. And even if you do nothing, your product gets better. Like the brain behind your product gets [...] If you're a recurring customer, you already have the connection. So you just have to go to whatever messaging platform you're using and search the name of that restaurant. 
I can imagine, too, for like delivery, people have their go-to spots. Like, I know exactly what I want to order from a specific restaurant, so I don't need to look at a menu. And I can see that being a really powerful use case as well. And I love that it's through WhatsApp, because I definitely don't want to call a restaurant ever. I'm not a millennial, but I feel like I have that millennial trait. I just don't want to call somebody. And I also, I want to see, I want feedback that my order is correct. I hate placing an order on the phone. I don't really trust that they got my order correct. And I really think the world should just operate over text. So I think this is amazing. It's basically text, right? We just have a technology that basically converts the audio into text, but the idea is giving the full conversational experience of DMing through WhatsApp. You have with your friends. So whether you want to chat or just send an audio message, which is right. You would clearly be one, a good customer, a customer that uses this tool. I would be, I wish all things could be ordered via text. Okay, let's get under the hood a little bit. It sounds like there's a lot you're orchestrating behind the scenes, whether it's checking delivery zones, checking stock, making sure you have all the right data for the point of sale system. How does this, what does this look like under the hood? Good question. I don't know if I want to tell you our secrets now. Just kidding. No. Okay. These are different. So basically what's going on under the hood is just to give a quick sample is basically we have our system just receives a webhook or a call from whatever API we use to message for the messaging channels. In this case, WhatsApp receives the message. And from there starts a full pipeline that does a lot of things. The main challenge was, so we actually, the first thing we had was that the pipeline was straightforward, like one thing after the other, right. 
Just to make sure that things worked. But upon iterations on that, and we started saying, okay, we need to shrink down the times. I think the biggest challenge was actually figuring out which parts, for example, could be parallelized, right. So you got, do we talk directly with the POS as we build the order or do we build an internal system that takes the order, which is much faster. Of course, in the first time, we just went through the integration part. Next iteration was a no brainer. Let's just do everything inside of our app and then send to POS or the integration. The next one was how many things can we do in parallel that can be done in parallel that doesn't need a sequential sequential processing, right. So for example, if you're ordering for, if you're searching or ordering for multiple products, the agent can basically search for all those products at the same time and then build a response based on all the results, right. So instead of searching one by one, you do a lot of multiple searches at the same time. Then the other thing is like database, the database, basically the database infrastructure, how powerful is the database or the database choice that you use so that it actually has quick results in order. But all of that is basically architecturing different ways of treating the data and parallelization, caching, database infrastructure, database engine, of course, you know, for different things. So all of those things have to be considered at the same time, which is, I think the hardest part. But in the end, what happens is that the agent in our scenario, we're using tools and we decided to use agent tools other than MCP or RAG because it's the fastest and most efficient way for the agent to interact with MCP. It would have to basically go through the MCP, understand what's going on, make a request to an endpoint to, which could potentially be fast, but still it's one extra step with tools. 
With tools, it's basically, okay, call this function, and the tools are already in the agent's prompt. Now, we do some initial RAG, retrieval-augmented generation, for knowledge bases or if we want to preload information sometimes. So we have these hacks that we've been implementing where, for example, our system prompt is built based on the last message and the previous messages. We have two system prompts. One is the main system prompt, and then we have something we call the updated system prompt, which is like an immediate system prompt that gives the agent more information about what's going on, but from the system role. That builds the prompt based on, okay, you're looking for this product, so let's quickly put information about that product into that small system prompt before the agent responds, so that it doesn't have to figure out a tool to use and search for it. We just feed that information right away because we have to figure that out, right? So all of this, like I was saying, is mostly architecture and engineering, but also just coming up with good ideas on how to resolve these problems. Okay, it's taking too much time on a database call. Can we do this faster, figure it out on our own programmatically instead of having the agent figure out which tool to call?

Okay, so it sounds like you started with the pipeline, which a lot of people on this podcast talk about. They start with the pipeline. You have confidence it's going to work. You control more of the process. It's a little more deterministic. I can imagine in this use case, though, that for the customer it feels like they're on rails. It's not an open conversation where anything goes. The agent is guiding them through their order taking. Whereas one benefit I could see you getting from shifting to an agent plus tools architecture is that the customer can drive the conversation a little bit more.
Is that what you found?

Yes. We started using tools from the first moment. We considered MCP and other workflows with the pipeline, but tools was the no-brainer for me to get started, because we did the analysis beforehand and saw that it was the fastest. I think what you're mentioning is more of an emergent property of the fact that we chose tools. This was already happening, right? The agent has access to all these tools. At some point, we did think about using state, just giving state to the agent so that it knows, okay, right now it's greeting the user. Now it's building the order. Okay, now the order is built. The problem with that is that what you just said would happen. You would have the customer on rails and not give them that freedom of speaking. Actually, there's a workflow I can mention that gives a good example. You chat with the agent, start building the order, you're done with the order, and the agent sends you the payment link. Before you even make that payment, you can ask the agent to make a modification to the order, so it can go back, update the order, and send you a new payment link. You're not on rails. You're free to talk as you would with a person in a call center, for example, who takes your order. So I think we were lucky enough to make the right call at the beginning.

Early on.

Yeah.

Okay. I'm imagining some of the tools, but I'm not going to guess. Tell me some of the tools that your agent has access to.

They're just tools that we built. So, for example, add products. You have a tool that is basically create an order, add products to the order. The add products to the order tool has all the logic for adding modifiers, comments, variations, etc.
You've got check product availability, right? You've got search products. You've got search knowledge base. Of course, we have a knowledge base that helps a lot with extra information. We've got geolocation tools that help with, hey, given this address, figure out whether you can use it. We've got a generate payment link tool, which takes care of that. And internally, in all of this, we have multiple providers. So depending on the venue's configuration, whether they have one payment provider or the other, the agent is agnostic to that. It just goes through the tool, and the tool takes care of it. That's a very good thing about tools, right? The same happens for geolocation. Most of the time we use the Google Places API, but you could use another one if you wanted. It's kind of fit. We don't, we try.

This is a constant back and forth we have with Santi. Most of the time, a lot of things go into the prompt, but there are also a lot of things that should happen systematically, right? So when the agent tries to take an action, the tool should tell it whether that action was successful or not, and why, so that the agent can understand what's going on. And this is, like I was saying, the back and forth with Santi, because Santi spends a lot of time working on super amazing prompts that make the agent talk like a real person. But then we have the, okay, does the agent do this? That's when the systematic implementation has to come in. So that's where we're mixing with Santi: okay, it's prompting, but the prompting should be related to what the venue's configuration is.

But just recently, the second version of Mercury came out by Inception. Have you heard of Inception?

Yeah.

This is the diffusion large language model. And I've been on that since the first version. I was like, man, this is really something interesting. And I think this is really the path, because it has this property of being able to correct itself.
Previous tokens can be corrected because of just how it works, instead of, you write a token and that's it, the token is there. So learning all this stuff gives you the advantage of already knowing what's possible and what's not. When you have a problem, you already have the knowledge of the different technologies, of your tool set. It's basically the tools that you have at your disposal, right?

What you're describing is selfishly why I started this podcast.

Amazing.

Right? I saw a need. I would see lots of teams trying to learn this, but selfishly, my thinking was, if I interview a bunch of teams about how they build AI products, by the time I'm building my AI products, I'll have heard lots of ways people have already solved the same problems.

That's a great way to do it.

Which is really fun.

That's a way to see what you build.

Yeah. I mean, I have several AI products in the market now, which is really fun, all around discovery coaching. Okay, let's get into something you said earlier. You told me about this moment of doubt: this is really hard, can we do this? It wasn't good enough. That really raises the question of how you're evaluating whether your agent is working. And I think with orders in particular, I can imagine some failure modes that are really catastrophic. So what are you doing to make sure this works well?

We like to stick to one KPI, which is how many items we identified correctly. That is our most important metric to really understand how well our agent is working. It can say strange things or use less than perfect vocabulary. All of those things can be improved. But the main KPI is how many of the items it got correct. That's it.

Yeah, I love the simplicity of that. And I think from a customer standpoint, that's probably what they primarily care about.

That's the only thing they care about, yeah.
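
The single KPI described above, how many items were identified correctly, could be computed along these lines. This is only a sketch under stated assumptions: the `item_accuracy` function, the item strings, and the choice to count both missing and extra items against the score are hypothetical, since the conversation does not spell out the exact formula.

```python
from collections import Counter


def item_accuracy(expected: list[str], captured: list[str]) -> float:
    """Fraction of order items captured correctly, counting duplicates.

    Assumption: missing items and extra items both count against the score,
    so the denominator is the larger of the two item counts.
    """
    if not expected and not captured:
        return 1.0
    # Multiset intersection: an item counts as correct once per matching copy.
    correct = sum((Counter(expected) & Counter(captured)).values())
    return correct / max(len(expected), len(captured))


# Hypothetical case: the customer asked for two colas; the agent captured one
# cola and added garlic bread that was never requested.
expected = ["large margherita", "cola", "cola"]
captured = ["large margherita", "cola", "garlic bread"]
score = item_accuracy(expected, captured)
print(f"item accuracy: {score:.2f}")
```

Here two of the three items match, so the score is two thirds. In practice modifiers and sizes would need to be part of the comparison, for example by normalizing each item to a product ID plus its options before counting.
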
In your conversations, do you get data to measure that? You mentioned that at the end of an order, they get a payment link, they pay for it, the food gets delivered, and they get a text saying it's delivered. If something's wrong, are they adding that to the WhatsApp chat? Are they saying, I got the wrong thing?

I don't think it ever happened, maybe once or twice.

Yeah. I think the whole picture here is that the agent can actually take, how do you say reclamos, Santi?

Claims.

Claims, yeah. So the agent can take claims from the customer when there's a problem, and it even sends an email to the venue so that they know about the problem. If there was a problem with how it was delivered, even if the user sends a picture, they can send it. The email will come with all the attachments needed. If there's an error in an order, it's more of a post-sale service or customer service matter: okay, let's see how we can solve this problem. But to the point that Santi was mentioning, what we try to do is validate beforehand, and Santi built an amazing tool. He just built this with Lovable. Just a quick mention, Santi is the number one user of Lovable in Argentina, in South America, in Latin America, something like that.

No, in Argentina. Within the top 1% of users.

So they reached out to him to ask him, but he built this amazing tool, which is mostly front-end. Of course, there are a lot of things happening, but basically this tool, what it does, he came up with a way, I mean, Santi, you want to talk about it?

While testing, we realized that if we want to move fast, some things can go wrong, of course, especially with AI. So we said, okay, there are actually two things that we need to do, because our main goal, again, is to take the orders correctly, all of the items correctly. We can do that in two ways. The first one is the safe one that works every time, which is human takeover.
We do live audits of the conversations, so whenever we find things that are off, we take over and correct them by hand. Then we fix that problem and automate it. But before we even get any of the agents into the customer's hands, we do thousands of tests. How do we do that? We actually trained an agent that acts as a customer to test the agent. We run literally thousands of those conversations overnight. When one conversation is done, there's another agent that analyzes the conversation and checks whether the order was correct, whether all of the items were correct. And then after X number of runs, we have another agent that analyzes wherever there was an error, and we start fixing that. Of course, the first time we did it, we had huge error rates. But then we started improving each of the things that were happening. And now, honestly, our production products might make a few mistakes, but we fix them by hand if that ever happens. Once, we didn't catch an error and we added an address that was not the right one for delivery. But since the customer gets a confirmation ticket, he saw that the address was not right and he called the restaurant. So it was actually quite an easy fix. But there's a whole bunch of agents testing the customer agents so that the items are understood perfectly.

I also realized my error there. In the example that I gave, I focused on the wrong food getting delivered. That was because of my bias with DoorDash. I feel like my experience with DoorDash is that I often get food from a totally different restaurant than I ordered from. It's just a weird experience. But I realized in your case, you send them a payment link, and I'm assuming they get to see their order on that page. So there's this human in the loop before the order is even finalized. Is that true?

Yes, but that human is a customer and he assumes that everything is okay.
So they are not testing it. They have a different intent.

Okay, so you're not relying on that as a human in the loop step.

No, no. The human in the loop is someone on our team who jumps in if there are any errors.

And do you literally have somebody monitoring all the conversations? How are you detecting errors in real time?

When we start with a new customer, we try to audit them all. We have team members and freelancers who help us with that. We do it ourselves too. As you might imagine, at 11 p.m. or 12 a.m., we are often reviewing conversations. But after finding out that was super painful, we also have an agent that does a review automatically and sends us an alert email in case there is anything that requires our attention.

So this is part of onboarding a customer. You go through this testing phase to make sure the agent is interpreting the menu correctly and knows how to construct an order. It's not something that you have to do indefinitely. It's just part of the fine tuning to make it work well.

Just a few weeks.

Correct. Just a few weeks of onboarding and then it just runs on its own.

Yeah. It used to be three months. As we improve this, the onboarding time keeps shrinking, which is basically our biggest challenge. What we're working on is improving the onboarding test to make it super, super fast. We think we've already solved the messaging part. Even today, when we are reviewing calls and messages with Santi, just so you get the idea, every noon or evening, at lunch or dinner time, Santi and I are sitting at our computers just watching these conversations going and orders being placed. And we message each other going, did you just see how it solved the problem? We are still amazed by how it's working. So it's impressive. So we think that part is solved.
The part that we're trying to solve right now is decreasing the onboarding times. At times, when the type of restaurant is new, you might have a longer onboarding because of all of the different products. But, for example, you can give us any pizzeria, any pizza store, and we will get it set up pretty quickly because we know how that business works.

It's cool to see how you can build domain knowledge iteratively and start to reduce that onboarding time for different types of businesses. This is great. It's really clear you're passionate about your problem space, that you've really dug in, and that you have an equal passion for the technology, which is fun to see too. Is there anything you wish I had asked you that I didn't?

I think your questions were amazing. We got to talk about a lot of stuff.

All right, then let me ask you one last question before we wrap up. What's next? What's the big challenge that you're tackling next?

Yeah, it's a great question. I'm honestly super proud of our product, and we've seen it working in lots of venues. So now our goal is to scale it. We wanna be