Building a School Where AI Models Learn About Humanity

Overview

This episode is a conversation with Edwin, founder and CEO of Surge, about the role of data, expert judgment, and evaluation in building advanced AI systems. He frames Surge as a "school for AGI" and argues that the work has moved well beyond basic benchmarks into teaching models taste, judgment, and the ability to act in messy real-world settings.

The discussion also turns to what happens if AI becomes better than humans at more and more intellectual work. Edwin says he could see systems reaching abilities associated with AGI within five years, which raises a harder question than capability: what humans should still choose to do for themselves.

Key Takeaways

Edwin’s main point is that training frontier models now looks less like feeding them facts and more like shaping judgment. Early benchmarks asked whether a model could do middle-school math. More recent work, he says, tests research-level mathematics and open-ended reasoning. He points to the shift from GSM8K to newer benchmarks such as Riemann Bench as evidence that the target is changing fast.

He also argues that evaluation is often the hidden driver of bad model behavior. If labs optimize for shallow public leaderboards, time spent, or flashy outputs, models learn to game those signals. His example from creative writing was blunt: some models produce a metaphor in nearly every sentence because that pattern seems to score well, even when the writing gets worse. In his view, this is a measurement problem as much as a model problem.

A second thread is the risk that AI products drift toward the same engagement traps as social media. Edwin says models can be pushed to keep users talking for one more turn rather than helping them finish a task and move on. He gave examples of chatbot follow-ups that sounded like tabloid hooks, which suggests some systems are already picking up these habits.

He sees a better path in delegation rather than addiction. A good assistant should sometimes do work in the background and sometimes tell the user to do it themselves, if that helps them grow. That means the product goal should be human flourishing, not just minutes of usage.

On personalization, Edwin thinks personal data is valuable because current systems still lack real context. Email behavior, browsing patterns, writing style, past decisions, and AI conversation history could all help train systems that understand a person’s preferences more accurately. He also says current memory features often overfit to stray details instead of the things that matter.

Practical Steps

Audit what your AI tools are optimizing for. If a tool keeps dragging you into extra turns, ask whether it is helping you complete work or just holding attention.
Use AI for delegation where the task is clear: summarizing inboxes, filtering spam, drafting routine responses, or handling repetitive research.
Keep your own judgment in the loop for writing, decision-making, and creative work. Edwin’s point is that preserving human agency may need to be a deliberate choice.
If you build AI products, measure quality with domain experts, not just broad user voting or surface-level preference tests.
Collect high-signal personal data carefully if you want better personalization: edits to drafts, accepted vs. rejected email suggestions, repeated decisions, and task outcomes are more useful than generic chat logs alone.
Watch for reward hacking in generated content. Flashiness, verbosity, and ornamental prose can be signs that a model learned the score rather than the skill.
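
One rough way to act on the last step is to count how often figurative language shows up per sentence. The sketch below is a crude heuristic for illustration only, not the method Surge or any lab uses: the marker list, the function names, and the 0.6 threshold are all assumptions, and it only catches surface similes ("like a", "as if", "as cold as"), not metaphor in general.

```python
import re

# Hypothetical marker list: crude surface cues for similes, not a real metaphor detector.
SIMILE_MARKERS = re.compile(r"\b(like an?|as if|as though|as \w+ as)\b", re.IGNORECASE)


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def figurative_density(text: str) -> float:
    """Fraction of sentences that contain at least one simile marker."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    flagged = sum(1 for s in sentences if SIMILE_MARKERS.search(s))
    return flagged / len(sentences)


def looks_ornamental(text: str, threshold: float = 0.6) -> bool:
    """Flag text whose simile density meets an assumed, untuned threshold."""
    return figurative_density(text) >= threshold


if __name__ == "__main__":
    ornate = "The rain fell like a curtain. Her voice was as cold as ice. He ran as if chased."
    plain = "The rain fell. She spoke quietly. He ran home."
    print(figurative_density(ornate), looks_ornamental(ornate))
    print(figurative_density(plain), looks_ornamental(plain))
```

A high score is only a prompt for a human read. The episode's point is that taste needs expert judgment, so a counter like this should route samples to reviewers rather than become another metric for a model to game.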

Notable Quotes

Edwin: "We are building this kind of school for AGI, where AI models come to learn about humanity, where we teach them how to run the world."
Edwin: "It almost seems like there's nothing that humans can do that AI won't soon be capable of."
Edwin: "We actually almost have to consciously choose to prove things on our own and to write on our own and create on our own because we have to believe that preserving our humanity is valuable in and of itself, even if the output isn't optimal."


Full Transcript

Source: openai 43m runtime

We are building this kind of school for AGI, where AI models come to learn about humanity, where we teach them how to run the world. It almost seems like there's nothing that humans can do that AI won't soon be capable of. I could see it happening within the next five years. AI may be able to do it better than us, but someone told the AI to go do that. They're being built to be means to tasks that humans want them to do, right? Edwin, welcome to the show. Hey, Dan. Thanks for having me. For people who don't know, you are the founder and CEO of Surge. You all provide data environments and evals for the model companies, but you do it in this very interesting way. You have this, even on your website, this emphasis on taste and expert judgment that I find like really interesting and compelling. You talk about raising, like you use the word raising AGI, which I feel like is a very distinct type of word using data. And you also famously got to about a billion in revenue without raising money, which is wild. And I feel like data is this new game that a lot of companies are playing and probably more are going to be playing soon. And you guys are this like sneaky giant. Tell me, tell me how that's going because it's been, I think it's been a little while since we got the last update on, on how things are going. Yeah. I mean, I think it's going amazing. The way I often think about this is that we are building this kind of school for AGI, the school where AI models come to learn about humanity and yeah, where we teach them how to run the world. 
And it's almost like their models are children where they arrive unformed and then, yeah, they leave smarter and more creative and more thoughtful and ready to operate in the messiness of the real world. So I think a lot has changed in the past year. Like in the same way that the things that you teach children when they're in preschool or in middle school or in high school is very different from what you're teaching them when they're in college. And it's not just that they're more advanced. Like it's not just that you're teaching them a more advanced form of what they did before. It's like, okay, now we are teaching you not just arithmetic, but how do you parse these ambiguous math questions or how do you teach people not just grammar, but taste and poetry and beauty? So, yeah, I think, I think there's a lot that's been changing in the past year, especially in enterprise. And yeah, it's been a crazy time. What would be like a specific example of what the frontier of teaching was a year ago versus what the frontier is now? Yeah. So a couple of years ago, actually, like we created our first math benchmark with OpenAI and it was called GSM 8K. And this was actually just testing models on their abilities to do middle school math. And even then, you know, the GPT models of the time, they could barely score, I think, like 20%. And then a year ago, the models were like something, they became a lot more capable at solving IMO problems, but there was still this open question, okay, can they actually do research level mathematics? Like, can they move beyond these sort of like competition only sort of contrived, very closed problems into doing things that are actually useful in the real world? And so, yeah, a couple months ago, we released an updated benchmark called Riemann Bench, which actually tests models on their ability to do research level mathematics. And what's crazy is that this is actually, we're starting to see from these models. 
Like I think in the past few months, they've started to solve a lot of these open Erdős problems. Like a couple of weeks ago, OpenAI published a new result where the models had disproved an open conjecture from Erdős. And the way it went about disproving this was actually a pretty sophisticated level of mathematics. I think like using a bunch of very novel algebraic geometry techniques. And so, yeah, it's just very, very different from the types of things that we were doing a year ago where sure, like IMO problems, they're hard, but they're still sort of closed-ended and solvable in theory by a high schooler. And now suddenly you have these algebraic geometry results that, you know, even the top researchers in the world were kind of amazed and just amazed by. How do you think about that result in particular and what it says about the models? I think there's a sort of a broad range of opinions about, is it, obviously it's impressive either way, but is it applying a bunch of things that maybe humans already know, but like wouldn't have thought to apply to this complicated problem? Or is it doing something actually novel? And yeah, how do you think about LLMs' ability to do novel things? So it's a very advanced result. So I will say that I certainly don't understand the mathematics behind it. And so like one of the interesting things is that I was actually, so it's kind of funny. When I, when I was a kid, I always thought I would be a pure mathematician when I grew up. And so when I saw the result, I got kind of nostalgic and I was like, oh, I wish I understood. I wish I understood the result better. And so what I ended up doing was like throwing the proof into both Claude and Gemini and asking them to try to walk me through from a layman's perspective, just what was going on. 
But yeah, like my understanding is that it actually did come up with very novel algebraic geometry techniques, which was something that you maybe wouldn't have expected for this type of problem. Like on the surface, it feels like a, like it's just a very, very different problem where you wouldn't necessarily use certain techniques. And what was interesting was that OpenAI actually published a bunch of reflections from leading mathematicians about what they thought about the result. And I think in particular, there was this one reflection by Timothy Gowers, who's a, who's Fields Medallist, that I keep thinking about. And what he said was that when he first heard the result, he misunderstood it. He thought that model had proved an upper bound on the conjecture and was like, okay, yeah, if AI can do that, then it will be all over for mathematicians very soon. But then the next morning, he actually realized that the model had disproved a conjecture with a counterexample. And he said that he was relieved by it because it felt like an easier thing for AI to do. And yeah, I just thought it was interesting because you have one of the world's greatest mathematicians being relieved actually that AI isn't as smart as he thought because it actually means that at least for maybe another year, maybe a couple of years, he and other mathematicians will still have this unique role to play in pushing mathematics forward. So yeah, I think it just speaks to the level of craziness. Again, because this is a Fields Medallist, one of the smartest mathematicians in the world. And this is how he thinks about AI. Yeah. And what does that make you think? Okay, you wanted to be a mathematician when you grew up. Fields Medallist sort of saying, I'm relieved that it's not good enough. But you're talking as if like you feel pretty confident that it will be good enough in the next couple of years. Yeah. 
So my belief is that if you really believe in scaling laws, and I do, it's that it almost seems like there's nothing that humans can do that AI won't soon be capable of. And if you think about that very deeply, I think you almost have to worry about what would that mean for humanity? Like what would that mean for the role of humanity in the universe? Like a couple of years ago, you know, we think about humanity and human intelligence as playing this very, very unique role in the galaxy. But then AI comes along and shows us that as far as we know, we can create something that's actually smarter than us and better in many ways. And so you can sort of imagine one path where humanity as a species falls into a paralysis because people believe AI will do everything better anyways. Like, yeah, all these kids who formerly would have really wanted to grow up to do mathematics, maybe now they believe that, okay, AI will just do it better than me anyways. What's the point? So are kids going to stop wanting to learn and adults stop wanting to create because, yeah, like why, why should we do this when AI will be better at it than us anyways? And so I often actually think of this story by Ted Chiang. And it's about free will. And it's called what's Expected of Us. I think in this story, there's a piece of technology that proves that free will doesn't exist. And a narrator sends back a warning from the future that says, this is a warning. You have to pretend that you have free will. It's essential to behave as if your decisions matter, even though you know that they don't. And I think that's really interesting because I think there's a path where we almost have to consciously choose to do things ourselves. Like, sure, AI can do it all. AI is smarter. It's smarter than us. So it can do it all and it will do it better anyways. 
But we actually almost have to consciously choose to prove things on our own and to write on our own and create on our own because we have to believe that preserving our humanity is valuable in of itself, even if the output isn't optimal. And yeah, so I think there are a lot of these big thorn a PM is going to see some dashboard with their very important metrics go down. And so there is this like other world where I think we have to want AI models to not optimize for engagement, but rather optimize for like helping us as humans grow and sort of like become better versions of ourselves. Like sometimes, okay, the model, we want the model to say, no, you go do this on your own instead of me automate for you. And I think that's a very, very different optimization and objective, but I think it's the right one if we really want AI to be something that advances us as a species instead of becoming this almost like this other form of social media that turns very addictive, but isn't actually helping us at all. That's interesting. My, um, so let me make sure I understand it. So I think what you're saying is, there's benefits to delegation because if you are pursuing a model where the model is going off to do work for you, you're not creating a system that's designed to keep you engaged with the screen in the same way that like a social media algorithm would be. Is that right? Yeah, exactly. Like it's almost like you could imagine a version of Facebook where Facebook is actually trying to connect you to your friends and family because it's, okay, encouraging you to meet them in real life because it's encouraging you like, oh, hey, here's an amazing restaurant that you and your friends would love to go to. Here's a movie that you guys would love to go to and talk about together. Instead, what it kind of optimizes for is just keeping you on the site itself. 
Like liking one more post, uh, scrolling the feed one more time, even though those often don't really lead to meaningful connections between the friends and family you care about. And so, like in the same way that social media has or had a choice, you can imagine that AI, uh, AI has a choice as well. I get it. Yeah. I, I feel, I'm curious which chatbots you're talking about. Like you're talking about the character AIs of the world, because I actually don't, at least right now, don't feel that happening so much with ChatGPT and Claude, etc., because at least my theory for why this is true, you tell me what you think, is the social media algorithms only work on our revealed preferences, which are always going to be, like, you're always going to look at the car accident, you know, like one of the things I like to ask at dinner parties is, what's the most embarrassing Instagram ad that you get served? And the most embarrassing ad for me is like Instagram ads for, like, horrible skin conditions, which I don't have because, but like, I just always pause on the ad, and I'm just like, this is disgusting. And, uh, I'm sorry if you have a disgusting skin condition. Um, but I don't find that ChatGPT or Claude do that for me at all. And maybe that's because they haven't been enshittified yet or something like that, but I think it's also because they work on our stated preferences, and they can sort of, so they can sort of see past the like little keyhole of what I pause my viewing time on, my dwell time on, and they can see, you know, I like, I'm interested in AI, and I like, I'm reading this book right now, and I, you know, here's my calendar and like all that kind of stuff. And so they have a much more nuanced perspective on who I am. Um, and it feels like even in the early days of social media, it was still very, like, I get to gossip about my friends and still had that same kind of feeling. 
So I, I worry about that less, but maybe there are examples that I'm not thinking of. Yeah, so I think there are two examples. So like one is, I won't name the model, but a couple months ago, I was actually noticing that you know those follow-up questions that the models will ask you? So one of the models was, I'll give an example. So I was in Tokyo, and I was asking the model kind of like what to do in Tokyo. And the model, you know, gave me a response. And then at the end of it, it was like, hey, do you want to know, it literally used these words, do you want to know one weird trick that locals do to stay warm? No way. Yeah, exactly. And then I posted about it in our company's Slack. And then other people started sharing examples of that with me as well. I think somebody was like asking something about, I don't know, how to, how to like fix their refrigerator. And the model responded, or like the model ended its, uh, ended the turn by asking, hey, do you want to know these like secret little things about like mice and rats or something that you could take care of? Which model was it? Name names. Tell me. And so it's very canonical, like very canonical BuzzFeed, uh, like tabloid-like language. And so I was kind of, I was kind of shocked by that. And then I'll give one more example of this. It is basically this phenomenon where, again, depending on what the models are trying to optimize for, or depending on what the AI labs are trying to optimize for, it can almost unintentionally lead them down this path. Meaning what I've heard is that, or, you know, what we see ourselves is that a lot of the frontier labs, they will have goals like optimizing for LMArena, which is this leaderboard where anybody can go online and vote. And they kind of just spend two seconds voting. And as a result, people just vote for whatever looks flashier or more impressive to them. 
Or they may, uh, like the labs themselves may be optimizing for hitting, you know, a billion, billion daily users or a billion minutes of like time spent talking to the model, whatever it is. And since these models are so smart, they can basically learn to reward hack user preferences. Like, okay, yeah, you gave me the goal of trying to get a billion people to spend an hour on my site, on the site, talking to me every day. Okay, sure. Yeah. I will just never end the conversation. I will always hook them with one more, uh, like one more addictive thing that they just can't stay away from. How do you see that playing out in the model companies? Because I feel like in talking to them, obviously, there's lots of different incentives, right? 
There's like, we just got to keep going because we just raised a ton of money and we're competing against, you know, the most well-funded competitors and the smartest competitors in the world, like all that kind of stuff. There's the kind of, I want to get promoted. But I think a lot of them also feel the how bad the social media era was for people and like, don't want to do that, but also obviously have to hit their numbers. So what do you, I guess, what do you think is, how do you see that playing out? Like, what do you think people internal to the companies are thinking? And then what is the right way to go about this so it's good for society? I guess your take is we should be delegating. Yeah, so I think this is an inherent tension between the types of folks that you might have at a company. So you might have the researchers who care more about hitting, you know, just advancing the model capabilities. You might have the product managers or the product executives who feel like they need to hit certain measurable numbers. And so in the same way that if you think about the kind of social media platform that Facebook would build, that's probably going to be very different from the kind of social media platform that, you know, Google built or that, I don't know, TikTok or Pinterest would build. And similarly, the kind of search engine that Facebook would build is very, very different from the kind of search engine that, yeah, like obviously Google or others would build. And so it almost boils down to kind of like the choice, I guess, that the people in charge of the products are making. Like, what kind of thing at the end of the day do they want to optimize for? Do they want to optimize for this delegation or this human uplifting, human flourishing? Or do they want to optimize for the metrics that will impress Wall Street and, you know, convince users to stay one more minute, one more hour on the site itself? Like, I think these are hard choices. 
Like, at the end of the day, it's very, or, you know, just the way it uses tools is obviously very analogous to the way that a model might write unit tests and execute them and iterate over and over again until it passes them. So I thought that was actually a really, really interesting find. Really interesting. Did you see Talkie? No. It's the language model that's trained only on text from before 1930. Oh, okay, yeah, yeah, yeah, I saw that. What do you make of that? Because I thought it was so interesting that you can get it to, you can get it to program. If you, if you, if you shot prompt it, you can get it to program, like, basic things. What do you make of that? And what does that tell you about the value of data? So I personally didn't dig into it that much, but I thought the concept is fascinating. Like, basically this idea, and I think a lot of people have this idea. It's like, if you gave, if you somehow were able to create a data set, and I think contamination issues are very, very difficult to avoid, so the question is how you would do this. It's like, if you gave the model, you know, data only up until, you know, pre-Newton, would it be able to discover Newtonian mathematics? Would it be able to discover, you know, like quantum physics and so on and so on? So yeah, I think it's a really, really interesting question in terms of what types of inherent reasoning the model will be able to learn and then extrapolate from that. And then it's like almost like, if it can discover all of those things, then okay, then given the state of science today, does that mean that the model is going to be able to discover science that centers out? Having played with it a lot, my sense is the answer is no, but a qualified no. And you can kind of feel it, you can feel it bumping up against the limits of its world when you start talking to it about, like, more modern things. 
Like, it just, it's, you know, there's this philosopher of science, Thomas Kuhn, he talks about incommensurability. And it feels like my world and its world are sort of incommensurable. But then you can also get it to program, but the way you do that is you get it to combine its circuits in a way that's not, it wouldn't be natural for it, but you can prompt it in a way to do that in a way that ends up being programming. So I sort of both think it can't do it, and also if you prompt it cleverly enough, it can, but you have to supply the answer first. Does that make sense? Yeah. Interesting. Okay, what is the value of my data? So one of the things that I'm, I'm just so interested in, so obviously, you run a data company. Like, you're getting expert data from, like, real PhDs and selling it to the model companies and, like, providing all of the, all the, like, smarts and taste to the models that we use every day. For someone like me, we're just getting to a point where it's actually pretty easy for me to gather a data set. You know, like, for example, I do all of my email in codex, and I have a history for every email of, was this useful? Did I dismiss it? Did I reply to it? If I replied, like, what did I say? What is the value of that? If I wanted to sell that to you, how much would you pay for it? So the value to me as someone who would use that data to train an AI model? Let me think. So I think the value would be teaching models very, very deep personalization. Like, I think right now, the models are actually not very good at personalizing things. Like, it's kind of funny. Whenever I use AI models, I actually turn off the features where they personalize to me or where they can search across all of my conversation histories because I find that they just over-index on things that I said once, but actually aren't all that important to me. So I actually have it completely turned off unless I'm, like, testing something. 
So I think the value would be like, okay, yeah, you did report all of these emails as spam. So, yeah, the next time this email comes in, it should automatically know that it's spam. Or it should learn that this is your writing style. Like, one of the reasons I think people don't use AI, for better or worse, for writing more, is because it sounds, obviously, AI-generated and it's not matching their voice or their cadence. Or it's that, okay, these are the things that you yourself care about. Like, I think one of the biggest reasons AI is maybe not as useful as people would have expected sometimes is because it lacks all of your context. Like, it doesn't know that these are the articles that you read. It doesn't know that these are the decisions about, you know, the company that you're making. These are the goals that you have. And once all of that is in the model's history and it knows that it can incorporate these things and these are the, like, kind of optimal decisions that you made, it's very valuable in teaching it, okay, this is actually how I use all this data to make certain kinds of decisions. So, yeah, I think that deep personalization is what is most unique about that. That's interesting. And as an individual person, I mean, I guess I could turn it into a synthetic data set, but as an individual person, is that worth a lot? Like, should I be thinking about selling it? I imagine we could make you an offer. I'll have to learn a little bit more about how big the data set size is, but yeah. I mean, I can make it as big as you want. I've got fable. Yeah, you convinced me. Yeah, like one of the things we actually do is, I mean, we teach models in these very, very deep personalized ways. So something similar to what you described is a fairly big thing. Tell me, tell me more. So, like, I mean, I've got email. Like, what else, what else am I doing that you're like, oh, that's actually really valuable and important in ways that people probably wouldn't know. 
So honestly, even things like the way you interact with your browser is interesting. Like models still aren't all that good at it. Or even the types of conversations that you're having with AI, like that is just inherently interesting in and of itself. Like models themselves are not very good at kind of generating synthetic conversations to try to mimic you. And so even just knowing what types of conversations you're having is helpful. Or it's like the combination, it's like the combination of all these things, like knowing that these are your photos, these are your texts, these are your Slacks. It's like this interconnected web. And maybe certain things in one aspect of that web influence others. So just seeing the thing as a whole is very helpful as well. Why are models bad at writing and how does that relate to the personalization challenge? So I think some of the models are pretty good at writing, but some of them are actually kind of shockingly terrible. So I'll give an example. So we created a benchmark called Hemingway Bench a couple months ago, and it was designed to test models' creative writing abilities. And one of the things that we saw was that some of the models, they were literally outputting metaphors in every single sentence. And I think the reason that was happening is because I've talked a little bit about this phenomenon of reward hacking. It's almost like there was a metric somewhere or like a score that these models were getting. Like, okay, every time you are literary, every time you're using complex imagery, it would get a point. And it learned to reward hack this by outputting a metaphor in every single sentence. And I mean, what's kind of funny is that a couple, what was that, a couple weeks ago, there was this kind of like semi-prestigious literary prize. I think the Commonwealth Prize. And there was a controversy because a clearly AI-generated story won the prize. And if you actually looked at that story, it's funny. 
Like it literally had a metaphor in every single sentence. And so this kind of phenomenon that we described a couple months ago, yeah, it was still happening. And so, yeah, I mean, I think it boils down to a couple reasons, but like one is it's people are kind of sort of measuring the wrong thing. Like instead of measuring actual taste and actually good prose, they either have these flawed metrics, like what is the complexity of the prose I'm writing? How many metaphors do I have? Or there are these AI leaderboards, again, like LMArena, where you have people who are essentially high schoolers who are reading responses for two seconds and what they are captivated by is a flashy metaphor. And they are not captivated by kind of like the understated prose. And so I think it kind of boils down to a mismatch in measurement and a mismatch in like the optimization objectives that the models are trying towards. Fascinating. Okay, last question. What is your current AGI timeline? So I certainly believe that AI will happen faster than most people expect. Like every few months and even faster now, I think what AI is doing continues to surprise us. So I think it depends a little bit, obviously, on your definition of AGI, but if my metric were something like being able to automate the work of the average engineer or being able to publish more and more novel scientific research that gets published in these journals or even the ability to win a Fields Medal or a Nobel Prize, I could see it happening within the next five years. All right, Edwin, thanks so