Scaling Uber with Thuan Pham (Uber’s first CTO)

Overview

This episode traces Thuan Pham’s journey from a childhood as a Vietnamese refugee to becoming Uber’s first CTO at one of the company’s most precarious moments. The conversation focuses on how he helped Uber scale from a fragile, crash-prone system with 40 engineers into a global engineering organization capable of supporting enormous operational complexity.

Beyond Uber’s technical history, the episode is also about leadership under pressure: how to make architectural decisions when growth is outpacing your systems, how to organize teams for speed, and how humility, relationships, and personal purpose shape long-term success.

Key Takeaways

One of the most striking insights is Thuan’s framing of scaling as a series of survival problems rather than a quest for perfect architecture. At Uber, the question was not “What system will last forever?” but “How much runway do we have before this breaks beyond recovery?” That mindset explains why Uber repeatedly rewrote critical systems like dispatch: the goal was to buy enough time to survive the next wave of growth, not to prematurely engineer for every possible future.
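
The runway question can be turned into a rough back-of-the-envelope calculation. The sketch below is illustrative only: it assumes load compounds at a steady monthly rate and that a component has a single hard capacity ceiling. The function name and all numbers are hypothetical and are not figures from the episode.

```python
import math


def months_of_runway(current_load: float, capacity: float, monthly_growth: float) -> float:
    """Estimate months until load reaches capacity.

    Assumes load compounds at `monthly_growth` (0.2 means 20% per month).
    This is a simplification; real growth is rarely this smooth.
    """
    if current_load <= 0 or capacity <= 0:
        raise ValueError("current_load and capacity must be positive")
    if current_load >= capacity:
        return 0.0  # already at or past the brick wall
    if monthly_growth <= 0:
        return math.inf  # no growth means the ceiling is never reached
    # Solve current_load * (1 + g)^n = capacity for n.
    return math.log(capacity / current_load) / math.log(1.0 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical example: a component at half its ceiling, growing 20% per month.
    print(f"{months_of_runway(100, 200, 0.20):.1f} months of runway")
```

Running the same estimate for each critical component and sorting by the result gives a crude ordering of which "brick wall" to address first.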

A second major takeaway is that many of Uber’s famous technical choices—thousands of microservices and hundreds of internal tools—were driven less by ideology than by necessity. Existing open-source solutions often could not handle Uber’s scale in 2013–2016, and the company could not afford to wait for the ecosystem to catch up. Microservices allowed teams to move independently when the monolith became a bottleneck, even if that created later complexity that had to be cleaned up.

Thuan also emphasizes that organizational design is inseparable from technical execution. Uber shifted from functional teams to cross-functional “program and platform” teams so each group had the skills to solve problems end-to-end without waiting on other teams. That structure was essential in environments like the China launch or the Helix app rewrite, where speed mattered more than elegance.

The leadership lessons are equally memorable. Thuan argues that talent compounds when people continuously invest in their skills, especially during calm periods. In downturns or crises, the people who have kept learning remain resilient. He also rejects transactional networking in favor of genuine relationships: the strongest teams he built came from people who trusted him enough to join hard missions because they had worked well together before.

Finally, on AI, Thuan’s view is refreshingly grounded. AI changes the tools and raises the abstraction level of software work, but it does not eliminate the gap between average and exceptional engineers. The differentiators remain curiosity, adaptability, ambition, and the willingness to explore new ways of working.

Practical Steps

When scaling fast, identify your next “brick wall” explicitly. Ask: what component will fail first, when will it fail, and how much runway do we have?
Rewrite only what is necessary to extend survival. Prefer focused constraints—like “a city must run on multiple boxes”—over sprawling redesign requirements.
Organize teams so they can execute independently. Cross-functional ownership reduces waiting time and increases speed in high-growth environments.
Treat career development as continuous preparation. In stable times, keep sharpening fundamentals so you are valuable when conditions worsen.
Build relationships by being useful and trustworthy, not strategic. The best future opportunities often come from people who have seen how you work under pressure.
Use AI as a force multiplier, not a substitute for thinking. Experiment aggressively with new workflows, especially where AI can automate large-scale code changes or parallelize development work.
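
The dispatch constraint described in the conversation (a city must be served by multiple boxes, and a box must be able to serve multiple cities) can be illustrated with a small routing sketch. This is a minimal sketch under stated assumptions, not Uber's implementation: the function names, the box names, and the hash-based placement are all hypothetical, and it deliberately ignores the harder problem of how boxes serving the same city share or partition state.

```python
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def boxes_for_city(city: str, boxes: list[str], boxes_per_city: int) -> list[str]:
    """Pick a deterministic group of distinct boxes that together serve one city."""
    if not 1 <= boxes_per_city <= len(boxes):
        raise ValueError("boxes_per_city must be between 1 and len(boxes)")
    start = stable_hash(city) % len(boxes)
    # Take consecutive boxes (wrapping around) so the group has no duplicates.
    return [boxes[(start + i) % len(boxes)] for i in range(boxes_per_city)]


def route(city: str, key: str, boxes: list[str], boxes_per_city: int) -> str:
    """Route one request (keyed by e.g. a rider ID) to a box in the city's group."""
    group = boxes_for_city(city, boxes, boxes_per_city)
    return group[stable_hash(f"{city}:{key}") % len(group)]


if __name__ == "__main__":
    fleet = [f"box-{i}" for i in range(4)]  # hypothetical machine names
    for city in ("new_york", "san_francisco", "paris"):
        print(city, "->", boxes_for_city(city, fleet, boxes_per_city=2))
    print("rider-42 in new_york ->", route("new_york", "rider-42", fleet, 2))
```

With this shape, capacity can be added by growing the fleet or raising `boxes_per_city` for a busy city, which is the "N by M" property the constraint asks for. A production system would also need a rebalancing strategy when the fleet changes.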

Notable Quotes

“Thuan Pham: ‘The world will move faster and faster. And the moment we stand still, we are falling behind.’”

“Thuan Pham: ‘No one can block anybody else.’”

“Thuan Pham: ‘The thing I’m most proud of is how many people remember how I was good to them or helpful to them.’”

Full Transcript

Source: openai 1h 38m runtime

When Thuan Pham joined Uber as the company's first CTO in 2013, the company had 40 engineers, did 30,000 rides per day, and the system crashed multiple times per week. He had five months before Uber's dispatch system would hit a brick wall with no way out. Seven years later, he left as the CTO of one of the most complex engineering organizations ever built. In today's conversation, we discuss Thuan's interview with Travis Kalanick for the CTO role, which lasted 30 hours, spread over two weeks. We cover scaling through chaos, rewriting dispatch before it collapsed, launching China in five months, and the full-app rewrite known internally as Project Helix. We also cover why Uber ended up with thousands of microservices and hundreds of internal tools, because existing solutions could not handle Uber's scale at the time, and much more. If you've ever wondered what it's like inside the room when a company is growing faster than its systems can handle, and what the ways are to get things under control, this episode is for you. As a side note, I've been lucky enough to work at Uber while Thuan was CTO, and Thuan is the real deal. This episode was presented by Statsig, the unified platform for flags, analytics, experiments, and more. Check out the show notes to learn more about them, and our other season sponsors, Sonar and WorkOS. Thuan, it is so good to have you here in person. It's my pleasure. It's so good to connect with you again after all these years. And it's so good to reconnect. We worked together for almost four years at Uber, and probably in my first month I already met you in some really fun slash stressful circumstances during Helix, the Uber app rewrite, which was a crazy project. Well, before we get into any of that, how did you get started, not just in tech, but in life? You had a pretty rough start. Yeah, I grew up in, I was born in Vietnam, and I was a child, I would say, of the Vietnam War. So in 1975, when the South of it, I was from the South of Vietnam. 
My father was tied to the military of the South. And when the country was unified, the South had lost and the North had won. And there were a fair amount of repercussions. People who were associated with the Southern regime would not have much of an opportunity growing up, the educational opportunity, all these other things. That was, again, the way it was at the time. That's not necessarily true right now, but that was. And my mother then made a very bold decision that she wouldn't want her two sons growing up with no opportunity. And so we had to flee the country. And at the time, there was a massive wave of exodus called the boat people, where people just get onto a rickety boat, fishing boat, or whatever thing they can get their place in and escape the country in the middle of the night. People did not know at the time, and nobody thought about it, but the chance of survival was about less than 50 percent. About two million people left, about a million people survived the crossing, because these boats are not seaworthy, and we crossed the ocean. And yeah, but we were the lucky, we were the lucky half, really. But no one thought about it. If people think too much about it, they probably wouldn't do it. But everyone was just like, well, we need to escape. We need to give ourselves a shot at a better life. And so we did. So we left Vietnam. It took many tries, and it depleted the entire savings of my parents, because we were scammed. People would say, pay up half now, half later, and then the boat never shows up. And finally, on the fourth try, we actually made it. And then we were lucky that we had a pretty good captain who actually navigated through storms and all that. And we survived even pirates from Thailand. I was around, I think, 11, 12, somewhere there. And so we crossed that, and we survived three days, four nights of the crossing of the South China Sea to Malaysia. Then we went into Malaysia. We thought we were done. 
A week later, we got towed back out and dumped into Indonesia a few days later. And that's where the government there accepted us in and put us on a deserted island at the time. And we formed a refugee camp there. And then we were waiting to be processed. We got interviewed by all the different countries. And the U.S. gave us a refugee settlement because we were tied to the old regime that was supported by the Americans. So we were very, very thankful to get here, the land of opportunity. And we didn't know any English. We didn't have a penny to our name. We were sponsored by a church. The first set of clothing we got was from the donation closet at the church. But we had to build from the ground up. And so that was how I grew up, and that's how I got here. And from this absolutely, not just unconventional, but just extremely hard start, how did you eventually get your interest into computers, into tech? Just like most things in life, it's by happenstance or luck. I was pretty good in math and science as most kids in Asia. We were growing up, we learned that. And when we got here, I had a friend in high school who had received a gift from his dad, an IBM PC. That was one of the very first ones. Was this in the 80s or 70s? This was in the 80s. This was in 1982. Yeah. So freshman year. So after school, I would hang out at his place and he's got a new toy. And so we were, you know, writing little BASIC programs and playing games and all that stuff. And we learned how to use word processors and Lotus and WordStar and all that. And I started coding in BASIC. And then I just realized that, oh, it comes very natural to me. I can think very algorithmically. And then there's another weird thing I sometimes tell people. I am generally a procrastinator. I don't like to do the same thing twice. So computer programming is perfect for me. You solve the problem once, that's the creative part. After that, I get bored. I got to just do the next problem. 
And so writing programs was like the perfect fit for me. You do not duplicate your code. Yeah. I don't like to duplicate the code. I don't like to do the same thing twice. And so, yeah, when you write it and then it executes way faster than you can do it by hand. So that was really wonderful. I just taught myself that. And then I volunteered at a government agency to write code for them after school. And so I did that and I went in there and I basically stitched together Lotus, dBase III with all the scripting languages and automated the entire financial accounting and reduced the workload. At the time, two accountants had to spend about three weeks or so every quarter reconciling everything. I did all that stuff with a push of a button and it took about three hours for the whole batch to run. And so they were so happy. When I graduated high school, I think they wrote me a really good recommendation letter. And with other things that were going on and good grades and everything else, I got accepted to MIT. And then I got there and I really learned computer science, like the fundamental computer science. Back then I was just like a kid who just writes programs. And then during or after MIT, what was your first professional job where you got paid and you worked full time on technology? One thing led to another. When I was at MIT, there was a multi-year co-op program with some of the best tech companies in the world at the time. AT&T, Bell Labs, Xerox PARC, HP Labs, and all these companies, Bellcore, all over the country, actually. And so we applied for it. And then the best, the kids with the best grades got prioritized. Then the company had to go through a selection process. They rank all the kids and then the kids all rank the companies that they got ranked by. And then there was a matching process. And I ended up coming to Hewlett Packard Laboratories. And HP was an awesome company at the time. Back then they were massive and very prestigious, right? 
Right, very innovative, laser printers, workstation, computer systems, all of that stuff. I was in the HP lab, which is the research lab, where a lot of the really innovative stuff happened. And so it was a dream job. As a student, I get to work on cutting edge research with all the other PhDs around. I get to write the joint thesis for my bachelor's and my master's with the work there. That was part of the arrangement. And when I graduated, HP just hired me straight into that research lab. So I became one of the researchers, although I didn't have a PhD. And after that, there were a few years of that, then I went into the industry and wrote code that people would actually use. I really enjoyed my time at HP lab because you get to do cutting edge stuff. We were working on medical informatics at the time, where right now you go to every doctor, all your records are following you. Back then, we actually had a network distributed system architecture where every physician workstation that you go to, right, had your x-ray and everything followed. And then you have like a knowledge base that actually looks for drug interactions. Oh, we actually did that research back in the mid, late 80s, actually. And so these are kind of A durable company. And so in that dot-com wave, there were massive companies that emerged, right? There was Yahoo, there's Google, there's Amazon, all of those companies. There's also a bunch of other companies, Webvan and others. Whatever that would go under because they didn't have like a strong value proposition that lasts the test of time, right? So, yeah, it's all about what value you deliver and whether or not it's beneficial and valuable to the customer that they're willing to pay, right? And I think that's one thing that we learned. Which one is like a real fundamental strong business, even though it might not be profitable initially. But which one that's just a me-too, right? Just put a dot-com on something and it's hot. 
There may be a lot of dot-AI things going on right now, right? Eventually, some of these things will consolidate. Some will go under. Some will become really awesome solutions and all that stuff. And so, but the market will sort it out, in that in the end, the customer will vote on what they want to spend the money on. Speaking of building things that last versus things that don't, one thing that always separates the two is code quality. And that's what our season sponsor Sonar is all about. Sonar, the makers of SonarQube, is deeply rooted in the core belief that code quality and code security are inherently linked. High quality code is naturally more resilient. And as agents start to write code at a massive scale, that verification layer becomes your most important security perimeter. This is where solutions like SonarQube Advanced Security are valuable. With its new malicious package detection, Advanced Security provides a real-time circuit breaker, automatically stopping agents from pulling in unverified or risky third-party libraries before they ever hit your pipeline. The impact is measurable too. Developers who verify their code with Sonar are 44% less likely to report experiencing outages due to AI, as per Sonar's State of Code Developer Survey 2026 report. It's really about closing the gap between the speed of AI and the reality of production security. What else is Sonar doing to help reduce outages, improve security, and lower risks associated with AI and agentic coding? Head to sonarsource.com/pragmatic to find out. With this, let's get back to what Thuan did after the dot-com boom and bust. And there were a lot of layoffs, companies going bankrupt. Did that worry people around you? Did that worry you that, you know, your job could be in danger or you might have a harder time switching jobs? Or was it very short-lived? It lasted a couple of years. I remember. 
And during that time, it was definitely hard to get a job, especially for new college grads. That's always the first layer that gets hit, right? When everything retrenches, people want more experienced people. People want to stretch existing folks rather than keep on, you know, hiring entry-level folks that you have to, you know, continue to invest in, right? So it's just the economy of time. It comes and goes in waves. Yeah, so that was certainly a very scary time. But of course, you know, in the longer range of history, things generally tend to recover, but it caused a rearrangement. And yeah, so during that time, it was certainly tough. However, the way I look at this thing is like, talent is always talent, right? So people with really strong talents and who are really hungry and always try to punch above their weight will always be marketable, right? Even in a downturn, right? So I think the key thing is how people should, even in peacetime, invest in that skill, never be complacent, constantly try to be better. And then in wartime or in rough times, those things will save you, right? If people just get very complacent and atrophy with time, then when rough times hit, it's very, very hard to recover from that. And then you went to VMware at this time. Yeah, so let's see. After I went from DoubleClick to that, and then I jumped into a four-person company. Again, leaky roof and everything, classic startup. That business did not succeed. It took about three years or so, got to about 40, 50 people in size, and then kind of ran out of money and then got acquired by another entity that was built with a security appliance product that tried to solve the problem of, you know, intermediation of web services, traffic that is going through. And it was a very interesting security niche, but it's not a mass market thing. And so it's hard for a company to kind of break through like that, right? And but eventually it went away. 
But even then, you know, those three years taught me a lot because that you can survive even when you do it from the ground up, then you still have skill that you can pick up despite the fact that that journey might not end in like a commercial success. But your skills still get better. So you were getting better as a professional, even though the company failed. That's something we have to trust. We invest in ourselves, but of course, we invest in the company or vehicle that we are a part of. And ideal case, both sides succeed. But if the other one does not succeed, at least if you work really hard, you will gain some skill. And then based on that, then you can then leverage all the things that you learned so far and all the mistakes that you've made, all you got you smarter and better and wiser to look for the next opportunity. So right after that, I look at a bunch of other things. When that company was acquired, then I went into VMware. Again, when VMware was a pretty small, not very well known yet. So it was a 40-person organization. And so that built software to stitch together the... So VMware was still early. VMware was still early, yeah. There was three divisions. One division that did the workstation desktop app. And then there was a division that does the hypervisor, which is the OS underneath the OS. And then there was my division that was building enterprise software that stitched together all of the hypervisor into like a cloud platform, a management platform, right? So I was the one for that. It was a 40 people. And we kind of built the very first product suite for VMware. We called Virtual Center that tied to ESX. So that was a really, really fun ride. Very smart people. And then VMware really took off. So virtualization as a whole took off in the early 2000s. VMware was core part of it. It was one of the main things. So it was just... was it a kind of hockey stick-ish experience? 
It was not to the extreme of Uber, but it certainly was because it was an industry-changing technology. It was a game-changer, right? Before that, there wasn't anything like that. At first, people thought, oh, this is a kind of interesting tool on the desktop for you to run a couple of, you know, Mac and PC OS on a PC. But the true power was the ESX, right? Yeah. And then that's what powers the data center. And then, of course, that's just the hypervisor. But I think the key feature that made VMware so useful was the whole vMotion thing. When you take a virtual machine and you can migrate it from hardware to hardware without any perceivable downtime of the application running on top, with that capability, it unlocked the whole cloud thing, right? Because you have a thousand machines, it can look like one. It can look like a single machine. And so an application inside of a machine will just scale and it will just move itself and it can do whatever you need to do, right? You can do DR, you can do, you know, yeah, all kinds of things with it, right? So that actually makes it very much like a cloud operating system. And then at VMware, we also grew with the company, right? So again, it seems you have this history of you were a VP of engineering at the startup. You stepped down to a small startup. You then joined VMware. And eventually you became VP of engineering at VMware as well, right? Yeah. I have this weird thing where when I get, when the thing gets large and I start to feel too comfortable, I get nervous. Really? Yeah. And so that's where at DoubleClick, when I got to VP and I managed hundreds of people, I was like, is this a fluke or is it real? So I had to go back to a four-person company and try to see if it's real or not. That didn't succeed really well, but the engineering was healthy. It was good. 
And then when VMware, again, it's a smaller company and it goes big and we get really big again, when you get to a point where you're just running things rather than breaking ground and doing new things or you are all learning, then you got to do something different, right? So I keep on going back small and when it gets big, I might go back small again. Yeah. So I'm seeing the pattern. So you got big at VMware and VMware was doing amazing. What made you look around and how did you find this very small company at the time called Uber, or it might've been UberCab. I'm not even sure what it was called. It was Uber at the time already. UberCab was way before that. Yeah. Yeah, it was after eight years of VMware, and sometimes people change, sometimes the company changes, sometimes both sides change. And so, yeah, for me, what changed personally for me was I had reached the point where I didn't feel I could do much more there, right? I'm running an 800-person engineering team. We're building this software and it's been like the third generation What was it like from the inside, especially from an engineering point of view, from a systems point of view, from like, what was going on? So it was still pretty small. It was about 30,000 rides a day when I pulled the data on the weekend, which is always the busiest time when people move around. Yeah, that weekend, the day, the Saturday or Sunday before I joined, I joined on Monday, was about 30,000 rides a day. And Uber was in, I don't know, 20-something cities around the world at that point, 20, 30. And so it was very modest. There were certain things going for it already. The engineering team was very young, but pretty scrappy and pretty committed and talented, where whatever we need to get done, by hook or by crook, they'll cobble together, right? And so, and as a result, though, the service was beautiful. Anybody who rode, we only had black car service at the time, but the experience was beautiful. 
And for all the people who ride it, that's why word of mouth is, you know, racing around. And yeah, and so that was the really good part. Now, the thing that maybe Travis had foreseen, whatever it is, was the next phase, which is, as the company grows faster and faster, what happens, right? And by the way, the 40 engineers were very, very young, I think in their 20s, all of them. And the system was built not to scale, right? It was built for functionality. It actually did work. So it all came together and it worked. Yeah, and it worked and it worked beautifully, right? But it wouldn't scale and it would crash and burn all the time, multiple times a week. And that was our lives in the trenches. As the hockey stick actually happened, yeah, everything breaks. And we have to basically race against time to actually figure out what would be the next most critical thing that would break and how to get ahead of it. And one of the things that Travis always told me, even from the interview day, is you got to see around corners. So I try my very best to see around corners. And one of the first things I did in the first couple weeks, beyond getting to know the engineers and building relationships and building trust, was to start examining what we currently have and what we need. And dispatch was the first thing. Without dispatch, there is nothing, right? That's what matches riders and drivers. Yeah, it's our matching service, right? It has the drivers, the riders, and does the match. Yeah, that's right. And without that, there's no business, right? There's no reason, there's nothing. And so, yeah, and I started, that was the first system I looked at. And I asked some, I reviewed the architectures, I reviewed the implementation plan, and it was very obvious that it wasn't going to get to scale. It was JavaScript, it's Node.js, and it was a single-threaded thing. 
And the engineers at the time, when the city gets larger and larger and they need more out of that piece of code to power that city, they will move that piece of code onto a larger machine with a faster processor. But vertical scaling only gets you so far. Exactly, right? So my role also is to do things, but also to teach people along the way. And so I was just asking leading questions to the team, and the team only had three people, three or four people at the time. And so I asked the engineer, okay, what would happen if the city gets larger and you have to support that? Because every city is getting larger and the ride volume is getting larger and larger. Then the engineer basically said, oh yeah, we just move it to a more powerful processor. And so what happens if you get to the fastest processor you can? Oh, there are multi-processors and then you can get a four-way box and then you can put multiple of these processes on them. And then you say, well, you've got three or four of these things servicing the same city. Do they talk to each other? Do they share the same state? Not really, right? So it becomes very partitioned. So pretty soon by asking those leading questions, the engineer now discovered the flaw that this thing would not scale, right? And then, and then I have to establish the limit of where the brick wall is. And I basically started, what's the biggest city that we currently have in terms of ride volume? And they say, New York City. And I said, okay, when are we gonna run out of capacity to even handle New York City, even on the biggest box that we can get our hands on? It's about October, and this was around May. Okay. And so I was like, well, we have to rewrite it, don't we? And we have to write it in a really scalable way. And, and I only have two requirements, that's all I need. One is a city has to be powered by multiple boxes. Yes. And a box has to power multiple cities. That's it. So you can have N by M. 
You gave them these two constraints. No new feature necessary. Just make sure that we can do that. And then that allow the business, the company to just pour a whole bunch of hardware behind that. And it will scale. Technically it will scale infinitely, right? And so the engineer did that. And because the requirement is very simple, uh, we have to do it really, really quickly before we run out of time, uh, run out of runway to survive. And so they did that. And we actually deployed that right around August, September, right before it actually, and then on to the next problem, database going to be the next choke point, right? And out. And then the API monolith is going to be the next choke point. And we kept on identifying all these things. Uh, so there's all these threat coming at us and we have to establish like how much runway we have until we like really get in serious trouble. There's no way out and then get ahead of it. And so this was then the reason that we had so many rewrites. I joined later, but rewrites were still continuously happening. And I think when you come in, you ask like, why could have they not written it properly the first time or like, but, but I guess, do I understand correctly that it was because a, sometimes you just build a system to solve your problem and, and, and B, you don't always know how big this will scale. A good example is the, the, the New York problem. And then you take those constraints and then you build a system. And then if those things change later, you might need to build a different system. Yeah. Um, it also depends on how fast you're growing that dictate how you make. And because the faster you grow, the shorter runway you have to survive, right? Given whatever architecture and system that you currently have. And yeah, the, the, the question about how big it can possibly grow. Nobody knows really, but it's actually not fruitful to pontificate on that. It was all about, um, how much time we have to live. Right? 
When we hit the brick wall and there's no way out, right? So yeah, and, and if that time is really short, then don't overthink it. Just survive that and give yourself enough runway to then live to fight another day, is what I'd like to say. Growing fast like Uber did is a good problem to have for startups until it's not. And the pain points that come with fast growth are a good time to mention our season sponsor, WorkOS. If you're building any SaaS, especially an AI product, surprisingly quickly you reach the need to build enterprise features like SAML edge cases, directory sync, audit logs, and all the things enterprise customers expect. Building that infrastructure yourself takes months. WorkOS gives you APIs to ship it in days. Authentication, SSO, SCIM, RBAC, audit logs, and more. All designed to integrate directly into your product. That's why companies like Anthropic, OpenAI, and Cursor already run on WorkOS. Skip the rebuild, keep shipping. Visit WorkOS.com. With this, let's get back to Thuan's team rewriting the dispatch system and the short breathing space that this first rewrite gave them. So that's what, with dispatch, we knew we had to do it very quickly. Maybe buy ourselves another 12 months. And after we get through that point, then we have another 12 months to think about the next phase of survival for that team. That's why the system needs to be rewritten several times. Let's say if my requirement for the engineer was build a system that will scale infinitely at the time, it might take a year. We never get there. We'll die before then. Speaking of dying before, you were given in 2014 a seemingly impossible task. Travis told you, you have two months to launch in China. And apparently, launching in China was not as simple as just like opening your API and allowing the firewall. What was that project like? I heard it was an absolutely manic and crazy project. Can you take us back to what it was like? 
Yeah, that was one of the craziest things we've ever done, but it's also one of the most amazing things we've ever done. And I'll explain why that is so. So I remember very, very clearly right around Christmas time 2014, we were all hanging out in the big room in 1455. And Travis made a declaration that, okay, come the new year, I'm gonna light it up and we're gonna go into China. Okay. And then he turned over to me. It's like, one, I Yeah, cross-functional team rather than a functional team. Yeah, which means that there's like a back end, a mobile, and whatever else you need, like a design if we need to, et cetera. Exactly. The concept is that team has to have all the skill set necessary to just take care of this. Whatever they have to do, they just go off and they do that, right? So that was the principle behind that decision. And then we call some of those program and some of them platform. So programs are the team that build things that the end user actually use. And the platform are the thing that build tools and layer that other program team use. And that was it sort of horizontal versus vertical kind of thing. So and that's that. And then after we define that, then we start putting the right sticky note onto that box. And then that's how the first version of program and platform came about. And then how did microservices start and how did they blossom as much as they did? Yeah, again, none of us wanted to go through that extreme, but lots of time when you are under a lot of pressure and no time to react other than just to survive that scale, that keep on coming at you. You have to make a decision that increase speed and velocity because speed and velocity allow us to build things quick enough to survive. And so we knew right away that the back-end API, which is a monolith, right, is the thing that will prevent speed from happening. So we made a declaration, anything that is new need to be built outside of that as a microservice. Okay. 
And then there's a team that's dedicated to decomposing that monolith, that API monolith, into a bunch of services.
Now, we used to call it API, right? I think that was the name.
Exactly, call it API, right? And I think that project name is called Darwin.
Yes, Darwin. Oh, I remember. Yeah.
And interestingly, had we frozen time, that piece of code could have been decomposed in a matter of three to six months. But it took us two years to do that because as we peel out a piece of code, the business keeps on going forward, right? These hockey sticks are layering on top of each other as we launch new cities and it happens faster and faster.
New cities and new products, UberX.
That's right. Features have to be added on, right? And so the philosophy we operated under at the time was no one should be blocking anybody else. No one can block anybody else. And so when a team needs to build a feature and that thing hasn't been pulled out of the monolith, they add to the monolith, right? And then the team that pulls it out does the best that it can. And then we kind of keep chasing our own tail until eventually, you know, something gets completely pulled out. And as it happened, it bulged up like this, the monolith, right? If you pull out one thing, the remaining stuff grew even faster than the stuff that you pulled out. So the code base gets larger and larger, and eventually reaches a certain point where it starts to come down. And that's why it took two years. And meanwhile, everything that is new must be built outside, because we can't keep on adding stuff, right, to the monolith. And so that's how it came up to like, you know, thousands of microservices. But that was out of necessity so that we can just fan out and solve every problem all at once. And then over time, after things stabilized and the business was more mature and growth was not as violent anymore, then the team, we actually had a project called Arc. We're going to look at this stuff and how do we clean it all up.
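An editor's back-of-the-envelope sketch of the dynamic described above, not something from the conversation: a monolith that could be emptied in a few months if frozen takes far longer when feature teams keep adding to it. Every number here (sizes, rates, the 24-month ramp) is a made-up assumption chosen only to reproduce the shape of the story, the bulge and then the decline. None of it is Uber data.

```python
def months_to_decompose(initial_size, extracted_per_month,
                        new_work_per_month=0.0, ramp_months=1,
                        max_months=600):
    """Simulate pulling code out of a monolith while features still land in it.

    All units and rates are hypothetical. The share of new work that still
    lands in the monolith falls linearly from 100% to 0% over `ramp_months`,
    standing in for "more and more domains have already been pulled out".
    Returns (months until the monolith is empty, peak size reached).
    """
    size = float(initial_size)
    peak = size
    for month in range(max_months):
        if size <= 0:
            return month, peak
        share_into_monolith = max(0.0, 1.0 - month / ramp_months)
        size += new_work_per_month * share_into_monolith - extracted_per_month
        peak = max(peak, size)
    raise RuntimeError("monolith never emptied within max_months")


# Frozen business: nothing new lands in the monolith.
frozen_months, _ = months_to_decompose(100, 20)

# Live business: 30 units of new work per month, tapering off over 24 months.
live_months, live_peak = months_to_decompose(100, 20, new_work_per_month=30,
                                             ramp_months=24)

print(f"frozen: {frozen_months} months")
print(f"live: {live_months} months, peak size {live_peak:.0f}")
```

With these invented inputs the frozen case empties in 5 months, while the live case first grows to about 145 units and only empties after 24 months.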
So we put like domain interfaces on top of a whole bunch of microservices that are within the same domain.
It's funny because I remember that around like 2016 or so, there was a published Uber blog post that Uber had about 5,000 microservices. And I just saw about a few months ago, Uber published another one and they have over 4,500. So in that 10 years, the number has gone slowly down, right? It's gone down. But even then, right now Uber has so much more complexity, right?
That's right. Yeah. Those are processes that took a little while, but yeah, the team had to look at everything and how do we simplify that, right? And then to make sense out of that, new tools had to be invented by us. Jaeger, the tracing tool, all of that stuff. And so that would be a really great tool that we open sourced to the world.
Let's talk about how and why Uber built so many internal tools and also open sourced a bunch of them. Jaeger was one of them, but internally we had Schemaless, a trip data store; TChannel, an RPC protocol; Ringpop; the geospatial indexing; Clay, a service framework; uMonitor, observability; and there's like hundreds of others. Some of them open, some of them not. How were you thinking about that? Like, does it not seem like a lot of waste for us to build this, or was it again necessity?
It was mostly necessity. I can't claim that every single one of those things was absolutely necessary, but all the important ones were absolutely necessary. The thing is, when I started, Uber used pretty much all open source stuff. We used Redis, we used everything, right? Because the engineers there just focused on putting together a service that actually moved cars. But then as we scale, we keep on pushing the boundaries of the capability of that open source stuff, to the breaking point. And at a certain point, if we don't invent something to power our own need, by the way, this is 2013, 14, 15, 16. It's not as mature as it is right now.
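For the "domain interfaces" mentioned at the start of this answer, here is a minimal editor's sketch of the general idea: callers talk to one entry point per domain instead of to each microservice behind it. The class, the "fares" domain, and the three services are all hypothetical names invented for illustration. This is not Uber's code or API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class DomainGateway:
    """Single entry point in front of the microservices of one domain.

    Callers depend only on the operations the gateway exposes, so the
    services behind it can be merged, split, or renamed without touching
    the callers. All names are hypothetical.
    """
    domain: str
    handlers: Dict[str, Callable[..., object]]

    def call(self, operation: str, **kwargs):
        try:
            handler = self.handlers[operation]
        except KeyError:
            raise LookupError(
                f"{self.domain} has no operation {operation!r}") from None
        return handler(**kwargs)


# Three hypothetical internal services of a "fares" domain.
def base_fare_service(distance_km: float) -> float:
    return 2.0 + 1.5 * distance_km


def surge_service(city: str) -> float:
    return {"sf": 1.2}.get(city, 1.0)


def quote_service(distance_km: float, city: str) -> float:
    return round(base_fare_service(distance_km) * surge_service(city), 2)


# Only "quote" is exposed; the other two services stay internal to the domain.
fares = DomainGateway("fares", {"quote": quote_service})
print(fares.call("quote", distance_km=10.0, city="sf"))
```

Only the `quote` operation is visible from outside, so reorganizing the two services behind it would not require changes in any caller.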
We did not have the kind of big tech investment in open source back then.
That's right. There was very little, and most of the big teams like Google and Facebook, they were keeping their things inside.
Dedicated to their own things. Right?
Yeah. I remember, for example, a very painful example of that we had to face early on: we used PostgreSQL, right? And we got to a certain scale where PostgreSQL would randomly fail, and that would take our services down randomly. We didn't understand. It's inside the kernel. I remember the time where I had to go on LinkedIn begging anybody on LinkedIn that had any knowledge of PostgreSQL to be our consultant to help us diagnose this problem. And we spent several weeks. And during that time, it was terrifying, because I don't mind if we think we can do something about our own problem. It's terrifying when we have a major problem and we depend on somebody else and we don't know, because with open source, there's no single person, no single company. I'd be willing to pay anything if someone could give me an answer, but there was no one, right? And so that was one of the motivators to kind of build our own, you know, data layers and all of that stuff as well, so that we would use this generic database, and we ended up using MySQL just as a table data store, with all the logic on top. We had to build for our own use, right? Because then we control our own destiny. And we only built the features that we really needed, right? And so that was one of many examples. And eventually we ran into other brick walls of scaling. I remember in 2015, right around the holidays, I was taking a holiday trip and I went to the airport and I took an Uber ride as usual. The receipt didn't come for two days after that, right? Why is that? Things were queuing up. We weren't processing things fast enough, right? And so, yeah, but that's not a deal breaker for many people because they just ride and then the receipt comes later. That's fine.
As long as the billing and all the stuff, even when you bill people late, they don't really mind that either.
Right. But as long as the ride happened, the rest of the stuff can be processed later. But it's still not great. Okay. And when I dug into it, like, our data processing capability was at capacity, right? So we had to rewrite a bunch of stuff. And then our capability to monitor things was reaching a breaking point with the open source tools that we used. So M3 had to be invented, right? And all of that stuff. So a lot of it, we had to do things because we had the scale where we broke all the open source stuff that we used.
At Uber, we did unusual things. One of the most unusual projects, which is where you and I met when I joined Uber, was internally called Helix. It was completely rewriting Uber's app. And as I understand, what happened is Uber's user experience was starting to degrade because it was really cluttered. Travis got a bit fed up with it. The design team came up with a solution, which was a very nice and clean UI, which the engineering team looked at, and it would have been a full rewrite. And then we just did a full rewrite back then. I remember we had a million or 2 million lines of code. We had 200 or 300 mobile engineers working on this. This was a massive business and there was an extremely tight deadline set. Can you take us back on why we even did this? Because it felt like an existential threat from the inside, but it was not like a Google Plus versus Facebook existential thing. And how did we decide on that short deadline?
Yeah. It seems like a recurring theme that keeps on coming up was that tight deadline, right? Everything we
I tell you guys all the time, right? It's not a jail. We can't lock anybody down. Everybody has free will. If they want to work, you know, somewhere, they should have the ability to do that. And we should create more opportunity.
And then we also, to support that, published an internal job board. Right? Anything people on the outside see, we see on the inside. So you should be able to shop within all the opportunities we have inside the company and stay with the company. And why make it so hard that then they're leaving the company? That's just a silly thing.
I remember at Uber, in some of the meetings, either all hands or team meetings, you gave talks that were memorable. And one of the most memorable, I asked around former Uber folks, and Charles specifically, he was on the podcast. He told me that his most vivid memory of you is this talk or this topic about viewing work from the perspective of death.
Yeah, I don't remember that exact speech, but I do have that line of thought in my head all the time. Right? And sometimes I would share it with different audiences. I put in different context, but it's all about finding one's purpose and not taking oneself too seriously. Right? If you look at people, the most accomplished people don't take themselves that seriously, right? The more you know, the more you know you don't know, kind of thing. And people who are arrogant tend to not know enough yet, or they have all that stuff. Right? So, yeah. So I always take the opportunity to remind people to sort of be humble. And the example I always use is myself, right? Look, you know, when you're in an important position, people treat you really well, but don't let that get to your head. It's not you. It's the position you hold. And I remember saying this: the moment I stop being CTO of Uber, nobody's gonna care or know about me. They're gonna talk about the next CTO, right? And that's what always happens, right? The world forgets about us, right? So the only thing we can really do is, in any job that we do, do the best that we can to help each other, to leave a lasting, positive impression on each other. And one day everything ends.
A job ends, and then I'll get to the morbid stuff, like life itself even ends. And so then I measure myself, like, what is my achievement that I would be most proud of? And I said, well, when I'm gone, the thing I'm most proud of is how many people remember how I was good to them or helpful to them. And for some number of years, right? And that is that, because I can't take anything with me. And so live in the moment, be as good as you can to everyone, be as constructive as you can, and leave a good legacy behind you. So that was the whole gist of that.
It feels to me sometimes there's talk about how you can network better and grow your network, but it sounds like this is almost like it's not a hack. It's just do the work, right?
Yeah, do the work and then the right thing happens, right? But you can't do the work in service of that goal, because that's very artificial.
Right.
Just be genuine. Just be yourself, be helpful, be constructive, uplift everybody, help people along the way, coach them, doing it altruistically. And let me show you another angle too, which I experienced over and over again. It's not only that other people around the industry pull you into good stuff. When you're pulled in and you don't have people to support you to succeed, you would not succeed either. And here is an example at Uber, right? When I came in, again, the engineering team that we talked about was very, very young, inexperienced, did not know how to build systems at scale, reliable, all that stuff. And the network that I had, who really knew how to do that, was from VMware. We were doing systems software. We were doing operating systems, right? Rigorous, principal-level engineers.
Experienced.
They know, like, in their sleep they can do it, right?
Right.
So when I came in and when I had to work with a team on dispatch, I pulled in the first engineer from VMware to land on that team. His name is George.
And so he's there and he worked with everybody else to uplift everybody there. Right?
So this was the engineer from VMware.
Yeah, yeah, yeah, yeah. And then when I built the payment system, I had to pull in a few more. And then when we got to build Schemaless, it was a Denmark team, right? I pulled the top four engineers from my VMware team. And then I moved them down from one floor to the next in Denmark.
So that's why we had a Denmark office.
That's correct.
Which was one of the best infrastructure offices at Uber. And they built Schemaless.
They built Schemaless. The four of them, right? They built a lot of other things.
Right.
And so now if I weren't a good person doing a good job for them, with them, why would they come?
They wouldn't answer the phone.
Yeah, they wouldn't answer the phone, right? But every single one that I called because I really needed help, they all came. Initially, they all asked the same question: why a taxi company? But when they understood that, they came, right?
But they came because they still enjoy working with you. Right?
There are people who have worked with me at five different companies over 28 years. And that always surprised me.
And I think this is something that people might overlook a little bit as they're building out offices. I'm talking with founders: one thing is where you can hire. The other thing is where the good people stay for a long time. And there's a lot of value in that. And Denmark kept being very core critical infrastructure.
Yeah, core infrastructure software team. And that's one of the things we had to build at Uber, because back then when I came in, we didn't build infrastructure software, right? We just used existing open source stuff, right? And we built that. And another thing that I, you know, discovered along the way is great talent is everywhere, but, you know, you have to bring opportunity to them. They don't necessarily relocate from Denmark to San Francisco, right?
And so that's why we ended up having nine engineering offices around the world, because we had a lot of work that needed to be done. We didn't go to other places because of cost savings or anything like that. We went there because we had need and they had world-class talent, and we just cherry-picked the world-class talent. Doesn't matter what size it is. And the Denmark team was small compared to the team in India, et cetera. But, you know, there was really great talent in infrastructure and we'd invest in that. Lithuania had an amazing DevOps team. And so we just go to where the talent is and then we bring the great work to the great talent. And then we establish a structure to manage and give people first-class ownership of the problem. And then, you know, everybody is kind of equal.
At Uber, you talked several times about your three tours of duty. Which ones were these?
Yeah. Again, it comes back down to purpose. So when I do something, I try to be intentional about why am I doing something? What's my purpose in doing that? And so, of course, my purpose in coming into Uber was, hey, let's build this business. Just build the tech that supports the business. And so the first couple of years, 18 months, 24 months, were fixing a lot of the broken stuff. Things that weren't reliable become more reliable, et cetera, et cetera. Rebuild things, basically just get things to work and work well. And then along the way, you know, these things don't end and begin on a particular day. They just phase in and out, right? So phase two, that was my second tour of duty, was scale. Worldwide scale. That was China. That was massive scale everywhere in every dimension. And so, yeah. So at each of those phases, when you're done with that phase, you ask yourself, am I still useful? Do I want to re-up, right? My commitment and energy and everything else. And so the first two phases were no question, right? We're there to do that. And then as phase two was about to wrap up, right?
About 2017, we actually kind of stabilized. We're really big now. I was actually asking myself that question. Am I needed here anymore? And I was actually about to wrap it up that summer because, you know, at that point, we also had another SVP that was hired. And I think he's really, really great technically. And I could feel very, very at peace. Kind of, you know, there's someone who can really take it on even better, because the person has done even bigger things at Google, right? Yeah. But that didn't work out. And then we had a really rough year. So then I had to sign myself up for the third tour of duty, which is, and what is the purpose of that? You know, help the company get through the turbulent years. And I had no idea at the time when that phase would end. I just kind of knew the condition for that to end, which is whenever the next CEO arrives, right? And then after that, whether that person liked me, I liked that person, or whatever it is, that's to be decided. But that third phase, I had to stick it through because, you know, we owe it to ourselves and we owe it to everyone along the way who have
Work together, help with college applications, all of that stuff. And so the bond we had was really cool. And as I was thinking about her going to college, I was thinking, wow, I'm gonna have a lot of time on my hands. So what should I do?
Here we go again.
Exactly, right? And so should I join another board, which I was about to. And then at the last minute, some partners at Sequoia asked me to meet Max, the CEO of Faire. And I really liked him, very smart. Again, all the same characteristics: very smart, very hard charging, wants to do all the right things. The business is to empower, you know, local businesses.
Can we talk a little bit about that? Because from the outside, you know, when you Google Faire and you and I look at it, it doesn't tell you exactly too much. It feels a little abstract from the outside.
It is a B2B marketplace, right, between brand wholesalers and retailers. So people buy that and then stock their storefront. And so, yeah, all the traditional two-sided marketplace dynamics apply. And the mission is very similar to our mission, even though we were B2C, right? This is B2B, but it's all about what can we do to empower local businesses to flourish, right? So to buy the right thing, to sell through, make a profit, grow that business.
So basically, this can help smaller and also large businesses to actually just, like, grow their business.
Yeah, yeah. That's right. More supply, more successful demand, more demand, more supply, all of that stuff, right? So, yeah, it's like a real marketplace. This is really fun and very complex. And so I really like that. And when I dug in through the interview process and everything else, again, this company moved really fast. Within a week, everything was finished, including my homework assignment, right? You have to go and present and everything else. And so the company moved really fast. It's energizing. And the culture is super nice and super kind, you know, like no politics. Everybody's just focused on doing the right thing and working with each other, taking care of one another. So it's a trifecta. It doesn't matter if the company is really big or really small, right? But it's got all the ingredients. So I was like, well, maybe that's a good place to jump in and help out.
And can you give us a little context on Faire in terms of the size of the company, the size of the engineering team, where the hubs are, what the work is like? Is it in person? Is it hybrid? And so on.
Yeah. The company is about a thousand people. The engineering team, including the data science team, combined is about 300 people. The work, we are in the office three days a week. Yeah. Three days a week. The other two are working remotely online. Yeah.
And some people show up more if they live close to the office. Yeah. The engineering team, there's a portion here in SF, just down the street from here. And a large part is in Canada. We have a big office in Waterloo. And we have a big office in Toronto. So I make the trip there quite often. Every five, six weeks or so, I'm over there.
And what are some interesting engineering challenges that you're excited about right now that you're solving?
Oh, right now, clearly the most exciting thing is AI and how AI is changing everything so quickly.
Tell me how, what are you seeing? What's working? What's not, you know, like on your teams?
In my team, as well as in the company, you know, we're using AI to boost everyone's, you know, effectiveness and productivity and output. Right. And so that's one. Within the engine specifically, we use AI to make, you know, search and recommendation better, right? Because the whole job is to help people discover things that would sell really well for the business, et cetera. And imagine AI as a shopping consultant, right? And all that stuff. And then coding-wise, you know, AI is doing a lot more of the coding now, but we also use a different technique to actually boost engineering productivity. Have you heard of swarm coding?
So swarm coding as in the agents?
Yeah, a whole bunch of agents, a swarm of agents, right?
It's pretty new. So you're already using it.
So we are already using it, and we're building an orchestrator to orchestrate the actions of all those agents. And we measure the early adopters first, and then the bulk of the engineers follow through after we build the more robust tooling. And we see a dramatic lift in engineering output among the early adopters, the ones that are really efficient at thinking this way, right? Because it's very different from a linear kind of thinking.
When I write this piece of code right now, it's almost like multi-threaded programming versus single-threaded, right? You have to think about all these other things. You have to prompt all the actions, and then you have all this code come back at you and you have to review it. You have to stitch it together. Yeah. And it requires a different way of thinking, and the cognitive load may be a little higher, but the output is dramatic. We have seen our best engineers double their output.
I know we're talking about that, but just to make clear, we're talking about not the code output, but the actual business output, the impact of their work, right?
Yeah. The impact now depends on the evolution of AI, right? So right now, the state of the art is that it's very easy to make large-scale changes, right? Cleanup and everything else, right? So massive productivity increase. Now we're trying to crack the next frontier, which is how we get that level of productivity increase in output when building new features on top of a code base that is older, right? It's not like you and I can just go build something brand new, not entangled with anything. That's really fast. The whole thing will be generated for you, right? Yeah. But we've got millions of lines of code, and how do you deal with that and build features on top, with all those dependencies and all that stuff, right? Is AI good enough now to help us untangle some of those things along the way of building new things? And so we actually, you know, continue to work on that and figure out how we can continue to boost more and more productivity, even when building new features with AI.
How do you think AI will change software engineering and what a software engineer does or what skills we value?
Yeah, it's already changing. I mean, very rapidly, fast. These things are faster than anything I've ever seen, including the internet, right?
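An editor's sketch of the workflow described above (prompt several agents at once, collect what comes back, review it, stitch it together), written as a plain fan-out and fan-in with `asyncio`. This is only an illustration under stated assumptions: `run_agent`, `review`, `orchestrate`, and `stitch` are hypothetical names, the agent is a stub that returns a placeholder string instead of calling any model, and nothing here reflects Faire's actual orchestrator.

```python
import asyncio


async def run_agent(name: str, task: str) -> dict:
    """Stand-in for one coding agent. A real agent would call a model API."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"agent": name, "task": task, "patch": f"# patch for: {task}"}


def review(result: dict) -> bool:
    """Placeholder for the human (or automated) review step."""
    return result["patch"].startswith("# patch for:")


async def orchestrate(tasks: list[str]) -> list[dict]:
    """Fan tasks out to one agent each, then keep only reviewed results."""
    jobs = [run_agent(f"agent-{i}", task) for i, task in enumerate(tasks)]
    results = await asyncio.gather(*jobs)  # results come back in task order
    return [r for r in results if review(r)]


def stitch(results: list[dict]) -> str:
    """Combine accepted patches into one change, in task order."""
    return "\n".join(r["patch"] for r in results)


accepted = asyncio.run(orchestrate(["rename config keys", "remove dead flags"]))
print(stitch(accepted))
```

The engineer's attention moves from writing each change to the `review` and `stitch` steps, which is one way to read the "multi-threaded" comparison in the conversation.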
Back then, I remember when we first learned how to do programming, we had to know a lot about the machine architecture. We had to know about virtual memory. And then we had to learn how to write syntax and coding. All of that stuff has been abstracted away now, right? So that's why AI is used like, I want X, Y, and Z, blah, and it should be this way. And the whole thing gets constructed, right? So it elevated the level of the playing field, where people who don't even know how to program can now create good, you know, decent code and apps or whatever it is that on the surface look really good. So it is game changing, right? It elevated the playing field. Now, then, at that level of abstraction, how do you tell the great engineer from the good engineer?
Great question. How do you?
Well, from what we see so far, the great engineers are still finding ways to leverage this and accelerate the output even more. Then we see the difference between a great engineer and an average engineer is still 2 to 3X in terms of their capability. They're more inquisitive. They're at the bleeding edge more. They're more innovative, right? And then there are people who are like, okay, well, here's the tool that you gave me. I'm going to be two times more productive, right? Because I'm using this tool. It's great. But the great engineer continues to break new boundaries. And so I think you can still look at people and you can see who are the high performers versus who are average.
So do I hear correctly that the traits that you're seeing in great engineers are, we didn't mention it, but it's kind of a given, the foundations, plus curiosity, plus innovation.
Fearlessness, willing to innovate, willing to stretch, willing to try new things and break new ground. All of those traits still exist.
If I think back to just the Uber days or your startup days, those traits were kind of the traits that stood out.
That's right.
Those are the things that make someone outstanding versus someone average.
So I guess maybe an advice is like, well, I mean, if you were a great engineer before, just don't be complacent and keep approaching it the same way, right?
Correct. Yeah. Complacency is death. I mean, the world will move faster and faster. And the moment we stand still, we are falling behind.
Sounds like if you worked at a fast-paced startup before, this is how it works. AI should be familiar. Welcome to how it was before.
To me, it is an incredibly powerful tool, but in the end, it's still a tool, and if you can wield the tool properly, you can do extraordinary things, versus you just merely use a tool in a mundane way. You're not going to
A company from a decade earlier that didn't even win this market. The engineers Thuan pulled from VMware into Uber came because they genuinely enjoyed working with him. There was no networking strategy. Just years of being good to people compounding quietly in the background. Finally, Thuan's point about AI was an interesting one. Complacency is death. The traits that made someone a great engineer before these AI tools, curiosity, fearlessness, willingness to try new things, are exactly the same traits that make someone great with AI tools. The tools changed. What makes people exceptional has not.
Do check out the show notes below for more deep dives on Uber and Uber's engineering culture, as covered in the Pragmatic Engineer newsletter and podcast. If you've enjoyed this podcast, please do subscribe on your favorite podcast platform and on YouTube. A special thank you if you also leave a rating on the show. Thanks and see you in the next one.