Where Is All the A.I.-Driven Scientific Progress?

Overview

This episode of Hard Fork examines what AI is actually accomplishing in scientific research versus what is being oversold by major AI labs and policymakers. Hosts Kevin Roose and Casey Newton interview Sam Rodriguez (CEO and co-founder of Futurehouse and Edison Scientific), who argues that AI is already generating real scientific value, especially in data analysis and hypothesis generation, while cautioning that “cure all disease in 10 years” narratives collide with experimental and clinical realities.

Key Takeaways

Two distinct “AI for science” tracks matter—and they’re often conflated. Rodriguez separates (1) modeling the natural world (e.g., protein structure prediction, generative antibody design) from (2) modeling the process of doing science (agents that read papers, write code, analyze datasets, and propose hypotheses). This distinction clarifies why some claims feel magical and others are more like “turbocharged grad student work.”
AI agents can compress analysis timelines, but not eliminate validation. Rodriguez describes Cosmos, an “AI scientist” agent that runs for ~12 hours, reads ~1,500 papers, and writes ~42,000 lines of code per run—costing ~$200 per prompt. In tests with academic collaborators, Cosmos reportedly reproduced months-long analysis work overnight and also produced several novel conclusions, but users must still interpret, check, and experimentally validate outputs.
The biggest bottlenecks in medicine aren’t only discovery—they’re trials and execution. Even if AI improves target selection and trial design, clinical trials remain slow because of manufacturing, recruitment, time-to-observe outcomes, and basic experimental uncertainty. Rodriguez argues AI can help ensure expensive trials are better designed by extracting insights from existing, underused data.
Generative biology is a standout frontier capability. Rodriguez is especially excited about generative models that create proteins/antibodies “from scratch” with desired properties—an ability he views as qualitatively new and potentially transformative, even if downstream proof comes years later.
Hype check: timelines are the core disagreement. He rejects a 10-year “cure all disease” horizon as unrealistic, while viewing 30-year transformative progress as plausible—assuming sustained advances and better experimentation.
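
To put the per-run figures quoted in the episode in proportion, here is a back-of-the-envelope sketch. The inputs are the approximate numbers Rodriguez gives (about 12 hours, about 1,500 papers, about 42,000 lines of code, $200 per run, and $5,000 to $10,000 to gather a typical dataset); the variable names are illustrative, and the results are rough ratios rather than measurements.

```python
# Approximate figures quoted in the episode for a single Cosmos run.
RUN_HOURS = 12
PAPERS_READ = 1_500
LINES_OF_CODE = 42_000
PRICE_USD = 200

# Rodriguez's rough range for what a scientist spends gathering a dataset.
DATA_COST_LOW_USD = 5_000
DATA_COST_HIGH_USD = 10_000

papers_per_hour = PAPERS_READ / RUN_HOURS    # 125.0
lines_per_hour = LINES_OF_CODE / RUN_HOURS   # 3500.0

# Price of one run as a share of the cost of gathering the data it analyzes.
share_of_cheap_dataset = PRICE_USD / DATA_COST_LOW_USD    # about 4%
share_of_costly_dataset = PRICE_USD / DATA_COST_HIGH_USD  # about 2%

print(f"{papers_per_hour:.0f} papers/hour, {lines_per_hour:.0f} lines of code/hour")
print(f"Run price is {share_of_costly_dataset:.0%} to {share_of_cheap_dataset:.0%} of data cost")
```

On these numbers the analysis run is a small fraction of the cost of the experiment that produced the data, which is the point Rodriguez makes about price not being the limitation.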

Practical Steps

If you’re a working scientist, start with the lowest-friction wins: use AI for coding assistance and literature synthesis before trusting it with central scientific claims. These areas already reduce bottlenecks without forcing high-stakes reliance.
Treat agent outputs as hypothesis generators, not conclusions. Build a workflow where AI produces candidate mechanisms/interpretations, then you create a validation plan: cross-check citations, rerun analyses, and design experiments explicitly meant to falsify the AI’s top claims.
Use AI to improve trial and experiment planning, not just discovery. Apply it to systematically review prior datasets and literature to justify endpoints, biomarkers, inclusion criteria, and alternative hypotheses—so the eventual expensive experiments are better targeted.
Demand measurable benchmarks. Favor tools and teams that can articulate evaluation methods (replication studies, blinded comparisons, known-answer tests) rather than relying on anecdotes about “breakthroughs.”
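
As one illustration of the “known-answer tests” mentioned above, the sketch below scores an agent by how often its output recovers a finding that is already known. Everything here is hypothetical: `KnownAnswerCase`, `score_agent`, the toy agent, and the placeholder findings are invented for illustration and are not how Cosmos or any real tool is evaluated. Substring matching is also a deliberately crude stand-in for expert review.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class KnownAnswerCase:
    """One benchmark item: a research objective with an already-known finding."""
    objective: str
    expected_finding: str  # key phrase a correct answer should mention


def score_agent(agent: Callable[[str], str], cases: Sequence[KnownAnswerCase]) -> float:
    """Return the fraction of cases whose expected finding appears in the agent's output."""
    if not cases:
        raise ValueError("need at least one benchmark case")
    hits = sum(
        case.expected_finding.lower() in agent(case.objective).lower()
        for case in cases
    )
    return hits / len(cases)


# Hypothetical stand-in for a real agent: returns canned text per objective.
CANNED_OUTPUTS = {
    "dataset-1": "Analysis suggests marker A rises with dose.",
    "dataset-2": "No clear trend found.",
}


def toy_agent(objective: str) -> str:
    return CANNED_OUTPUTS.get(objective, "")


cases = [
    KnownAnswerCase("dataset-1", "marker A rises"),
    KnownAnswerCase("dataset-2", "marker B falls"),
]

replication_rate = score_agent(toy_agent, cases)
print(f"Replication rate: {replication_rate:.0%}")  # prints "Replication rate: 50%"
```

A real evaluation would replace the substring check with blinded human grading or a replication study, but the shape is the same: fixed cases, a fixed scoring rule, and a number that can be compared across tools.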

Notable Quotes

Sam Rodriguez: “A decade is crazy… you have to run clinical trials.”
Sam Rodriguez: “Checking the work is like always going to be faster than producing it in the first place.”
Sam Rodriguez: “You should not expect that you're one day going to… ask it how to cure Alzheimer's and it will just tell you.”

Full Transcript

Source: openai 39m runtime

Cynthia Erivo is the best singer in the world. She's incredible. I don't know what it is about her voice, but it brings me to tears every single fucking time I hear her. She has the most incredibly emotional voice. I was trying to figure out what it was, but it's just like, she's just the best. Obviously, she has the power, but there's all these textures in there. Did you see the Design to Go viral clip of her visiting her old school? Yes, I obviously lost my shit. The absolute best is that the students start singing and they just sound like shit. No, it was so sweet. Just imagine you're one of these kids. It's just like after-school class. It's a little club. You're just doing it for a little bit of enrichment, and you're just kind of plodding along, trying to get through the day. And then fucking Cynthia Erivo shows up, and they're like, all right, kid, you're up next. What do you got? No, it was so sweet. I would throw up. It was so sweet. I'm Kevin Roose, tech columnist at the New York Times. I'm Casey Newton from Platformer. And this is Hard Fork. This week, Future House CEO Sam Rodriguez joins us in the studio to separate the hype from the reality of AI science. Well, Casey, it's time for some science. Yeah, give me a second, Kevin. I'm just going to put on my lab coat here, get out my Bunsen burner, and see what you've got cooking for us today. So I have been obsessed with this question of what AI is and isn't doing for science and scientific discovery. Obviously, this is something we hear a lot about from the leaders of the big AI companies, people like Dario Amodei, Sam Altman, Demis Hassabis. They have all been saying things in recent months about how close they believe we are to solving new scientific problems and curing diseases and fixing the climate with all of these new AI tools that they're building. And some of that is obviously hype, or at least has the sort of markings of hype.
But there's actually a lot of real stuff going on in AI and science that I just do not feel personally qualified to evaluate. Yeah, and I would also say that science has become one of the main ways that the leaders of these tech companies want us to evaluate them. Because whenever one of their models does something horrible, the message we basically get back in response is, don't worry, we're about to cure cancer. Just hang on tight. I know that this chatbot might be driving you to madness, but if you could just give us a few more releases, we're gonna do some really good stuff. Yes, and this is something that we're also hearing now from the US government. The Genesis mission was announced by the White House just before Thanksgiving. That is what they're calling a dedicated, coordinated national effort to unleash a new age of AI-accelerated innovation and discovery that can solve the most challenging problems of this century. I thought the Genesis mission was just them trying to get Phil Collins to play the White House Christmas party. I guess not. And so today we have brought in a bona fide scientist to help us understand which of the sort of scientific discoveries and possibilities out there are real and which are not. We need an expert with a broad focus, someone tracking the impact of AI, not just on biotech or drug discovery, but across the different sciences. And Casey, we have found the perfect person. Let's hear about him. Sam Rodriguez is the co-founder and CEO of Futurehouse and Edison Scientific, which is a San Francisco-based, I guess it's both a non-profit and a for-profit. Where have I heard that before? Yes, come back when he has his board coup. Futurehouse is the non-profit, Edison Scientific is the for-profit that spun out of it. I've been to their office in Dogpatch. It's really fun. It sort of feels like a kind of wacky mad scientist lab. They've got all these like, you know, sort of lab machines that I don't understand. 
You know, people running around in lab coats, and they're all talking about AI, and it just feels like kind of a cool place to be. And they are building what Sam calls an AI scientist, which is an AI agent that can do sort of parts of the process of scientific research. And Sam is also himself a scientist. He has a PhD in physics from MIT. And before he launched Futurehouse, he spent several years running an applied biotech lab. So he has sort of seen this stuff happening from a couple different angles. Yeah, and today we want to talk to him about what he is up to, but also kind of get his vision of the entire landscape. Tell us what is working, what isn't, where's the hype, where's the real stuff? Sam has a lot to say about it. Yes, and I think it's fair to say that Sam is on the more optimistic end of the spectrum of beliefs about what AI will do for science. But as you'll hear in our conversation, he's more skeptical than some of the most optimistic people who are claiming that we'll cure all disease in five or 10 years. Yeah, if you've been craving a little bit of cold water for the wildest projections, he has some of that to offer you. So let's bring him in. When we come back, we'll be joined by Sam Rodriguez. Sam Rodriguez, welcome to Hard Fork. Hello, thank you. So we have brought you here today to be our science expert, our guide to the biggest recent AI-powered breakthroughs that are happening in science. This is an area that I sort of understand in an ambient way is important, and there are big things happening, but neither of us are scientists, although I did make a killer baking soda volcano in elementary school. So Sam, welcome to Hard Fork. Thank you for having me. So we have so much to talk about today, but before we get into some of the particulars, I want to ask you about your project that you've been working on.
Last month, the commercial arm of your nonprofit, which is called Edison Scientific, launched a new AI scientist called Cosmos that you say can accomplish work equivalent to six months of a PhD or postdoctoral scientist in a single run of this model. Tell us about how Cosmos works and where that six-month number comes from. Yeah, yeah, exactly. And actually, I will just start out by saying that when I got that six-month number, my reaction originally was like, there is no way that this is true, right? And we've now measured it in a bunch of different ways. I can walk you guys through that. But basically, just to take a step back, so we've been working for two years on figuring out how to build an AI scientist. And the concept here is there's so much more science that we can do than we have scientists, right? And so how do we scale up science? And the thing that happened with Cosmos that is pretty cool is Cosmos is like the first thing that I think that we've made that actually really feels like an AI scientist when you're working with it, right? Which is to say that you go in, you give it a research objective, it goes away and it comes back with insights that are actually really deep and interesting and sometimes wrong, but about 80% of the time right, which is like kind of similar to like if you ask a human to go away and do something, comes back like similar percentage of the time is right. And it's like, it's a kind of new experience working with it. So that's very exciting. The six month number specifically, the way that we measured this was we had a bunch of academic collaborators, scientists who had done a bunch of science previously that they had not published yet. And we basically gave the same research objective and the same dataset to the AI, to Cosmos. And we ask it to go away and just make new discoveries. And it would come back and it had found the same things that the researchers had found overnight. 
And then you go and you ask the researchers, how long did it take you to find this in the first place? And they would say like three months, five months, like six months, whatever. And so that's where it comes from. And it's like, that's the amount of time that it took them to come up with the finding. Right. So let me just ask you a couple of questions so I can ground myself here. Is this tool kind of a box you type into like the other chatbots? And if so, what is powering it? Did you guys sort of build your own model from scratch? Did you sort of make fine tunings to another company's model? Yeah, exactly. Yeah. So it is indeed a box that you basically type into. You ask a research objective. It's not a chatbot, right? It runs for 12 hours or so before eventually coming back to you with its findings. In terms of how it's built, we build on top of a bunch of different language models from OpenAI, from Google, from Anthropic, like in any given run, we use models from all the different providers. We also have like our own models for specific tasks that we've trained internally, where those models are like much better for the specific tasks that we train them on than the models that the frontier providers make. Got it. And then the key insight in Cosmos is basically this use of what we call like a structured world model. So one of the main limitations with AI systems today is that they're just limited in the length of the task and the sophistication of the task they can carry out before they kind of go off the rails. You know, forget what they're doing, they no longer are on task. And what we figured out was a way to have them contributing to this world model that gets built up over time that basically describes like the full state of knowledge about the task that they're working on, which then means that we can orchestrate hundreds of like different agents running in parallel, running in series, and have them all working towards a coherent goal. 
And that was like the real unlock. Right. Another thing that I found interesting about Cosmos is the cost. This model costs $200 per prompt. Yeah. So every time you give it a task, you're paying $200. Why is it so expensive? I mean, it uses a lot of compute. I mean, that's like the fundamental answer is it uses a lot of compute, right? But give us a sense of how much. Well, so an individual run from Cosmos will write 42,000 lines of code and read 1,500 research papers on average. Like if you run Claude, it might write like a few hundred lines of code, so that gives you some sense. It's like there's a lot of compute that is going into this. Have you ever had like a scientist whose cat walks across the keyboard and accidentally hits enter and all of a sudden spends like $600? This is a problem. This is a problem. And we were like, right, so the thing that you have to understand is that if you are a scientist and you go and do an experiment, you get some data back. You're going to spend $5,000 or $10,000 gathering that data. And so what scientists want is they want the absolute best performance that they can get. And like scientists who have used Cosmos generally come back to me and are like, they can't believe we're only charging $200 for it. Right? And, you know, I will say like, you know, $200 right now is a promotional price. We actually have to eventually charge more. Oh, it's going up. So get those prompts in before Christmas day. Get those prompts in, exactly. But like, but really, you know, it's like if you have to spend thousands of dollars gathering the data, like the cost at the end of the day is not the limitation. We do have to be very generous with refunds because people, you know, make mistakes over time. I made a typo. Yeah, exactly.
So what you just mentioned about the sort of the tests that you all ran to figure out how long this thing could run for, how much time it was saving scientists, that's about like sort of replicating existing research that's out there. But a lot of what we hear from the people who are running these big AI labs is the possibility that pretty soon AI will start making novel scientific discoveries. We'll start doing things that existing scientific methods and processes can't do. How close are we to that? That's already happening, actually. So if you go and you read the paper that we put out about Cosmos, we put out seven conclusions that it had come to, three of which were replications of existing findings, four of which are net new contributions to the scientific literature, like new discoveries. And of those, what's the most impressive? So like one of the ones that we really like, the human genome contains millions of genetic variants, right? These are differences between different people's DNA. That are associated with disease. And for the most part, we know that a variant is associated with a disease, but we have no idea why, right? And so we asked Cosmos, we gave it a bunch of raw data about a huge number of different genetic factors. So like what the variants are, what proteins bind near the variants, right? Like all these kinds of things. And just asked it for type 2 diabetes to go and identify a mechanism associated with one of these variants. And it came back and it identified this was a variant that was not in a gene. And Cosmos identified that this is actually somewhere where a different protein binds. It was able to identify what protein binds and what gene is being expressed and connected that to the actual mechanism of that gene, SSR1, which is involved in the pancreas in secreting insulin, right? 
Okay, so in this case is what I'm hearing that your model was able to do some very fancy reasoning over some existing data and identify something that sort of no other human scientist had gotten around to and might not have for a really long time. Yeah, that's right. Okay. And I think science generally consists of deciding what data to gather, gathering that data, and then drawing conclusions. And so at this point, basically, it's like step number three that Cosmos is aimed at, and there's more work done. You left out step zero, which was getting the Trump administration to unfreeze your funding, but everything else was right. So what happens when you get a discovery like this from Cosmos? Do you have to then go validate it? Do you hand it to like a team of researchers who then have to like make sure it works? Or like what happens next? Yeah, absolutely. You have to go and validate it. And so that's actually one of the things also, you know, in the paper, where actually we describe how we went and validated that particular variant. In general, when people are using it, yeah, you go in. I mean, actually, literally, when you run a Cosmos run, the first thing you have to do is you have to understand what it's telling you, because it has just done something that scientists think is like six months worth of work, and you're going to sit there for a long time just like reading and understanding it. Once you've read it and understood it, then yes, indeed, you're going to go and you're going to run, you know, various experiments, do your own analysis, cross-reference to try to like convince yourself that this is true. And then based on what your research objective is, you'll decide next steps, right? You know, in this case, I think it's probably low likelihood there's a new drug target like from this particular finding, right? But you could go and you could run this on other findings. And then eventually, maybe you find your drug target, you start a drug program. 
That's, you know. So one concern that I've heard people express about models like Cosmos is that this is just like sort of not where the roadblocks are, that the sort of reason that we don't have more AI-discovered and AI-designed drugs out there curing diseases is not actually because like we don't have the research methods to discover those. It's because there's like, you got to go to trials and you got to recruit human subjects and you got to get FDA approval. Like all that stuff just takes a lot longer than the actual discovery of the drug. So what problems are models like these helping to solve in our scientific process right now? So absolutely, I actually like, you know, I really agree that like the bottleneck at the end of the day in solving medicine is basically, you know, clinical trials. I mean, and the easiest way to see this is if you look at the number of diseases that we like know how to cure in mice, right? It's like astronomical because obviously you can just like run experiments. And in humans, things are just slow. That said, if you think that every experiment that is being run right now by pharma companies, like every clinical trial that's being run is like optimally planned and optimally, you know, conceived given the full state of knowledge, you are off your rocker, right? There's like no way. And those experiments cost hundreds of millions of dollars. And so the question is like, we do at the end of the day have to run clinical trials. How do we make sure that those experiments are the best experiments we could possibly be running given all the knowledge that we have, given all the data we have? There's so much data that we have that has insights in it that are waiting to be found where we just like do not have people to go and find them. And that's ultimately going to feed into better experiments, better trials, right? Well, so then I'm curious how you see like your tool fitting into the workflow of today's scientists.
Is it the sort of thing where like I have completed my experiments and now I want some help doing some analysis? Is it I have all these old experiments that I only did a little bit of analysis on and I'm curious if I can like sort of squeeze any more juice out of them? Or like what other ways are you seeing the AI being like really good right now for a working scientist? Yeah, yeah, great question. So going back to me in 2019, which is when I was wrapping up my PhD, right? I had this gigantic data set and I wanted to graduate because I was a PhD student, which meant that I was making like $40,000 a year or something. And like there were a ton of great opportunities to go out and like not be a PhD student anymore. OK, so I spent six months literally just like sitting at my desk, like trying to analyze the data and drawing conclusions, reading papers, right? For right now, that's where Cosmos fits in. It's like, you know, you would just take that data and you give it to Cosmos. It comes up with a lot of findings. Right now, you need to go and do a bunch of manual work to validate those findings and so on. Pretty soon it's going to come with findings and you're going to be like, great. Sam, I'm curious if you could help sort of give us and our listeners a state of the world of AI science right now. Recently, the White House announced what it's calling the Genesis Mission, which is a federal effort to kind of corral and harness all of these data sets that the federal government is sitting on and use them to do new scientific exploring. We also have lots of efforts, including yours, but lots of things going on in and around the tech industry, the biotech industry, people doing AI for materials science. Give us a sense of like the lay of the land of like what's hot right now in AI science. Where is the effort and money going? Right.
In order to understand the landscape of AI and science, the first thing like fundamentally that you have to understand is that AI is about building models. Right. So, for example, right, like a language model, like what is a language model? A language model is fundamentally a model of human language. It just so happens that when you build a model of human language, it like learns how to think like a human in some sense, because humans like encode their thoughts in language. This is like one of the greatest discoveries, right? Certainly of the 21st century, maybe of all time. So similarly, when we talk about AI and science, what you have to think about is that you are modeling things. That is what AI does. And there are kind of two fundamental categories. There's modeling the natural world. Right. And there's modeling the process of doing science. These things are fundamentally different. And the reason to make this distinction is because, you know, what we are doing, right, we are modeling the process of doing science. The other side of the AI for science world is building models that can, for example, predict the structure of proteins that can generate a new antibody that can create a new organism from scratch, which are all things that have kind of like happened in 2025, where there's just a huge amount of momentum. Yeah, that makes sense. I mean, of the things that are happening in the part of the sort of process of modeling the natural world, you mentioned protein folding, novel organisms. Like, what has most excited you as a scientist that you've seen? So it's absolutely what's most exciting right now, I think, without a doubt, is this trend towards what we call generative models. So these are things where these are models that can produce examples of, you know, proteins or antibodies or whatever that have desired characteristics, basically from scratch. This is a new capability that we have never had before, and it's huge. 
I'm curious about the reliability piece as you're running all of these experiments. You know, I saw this going around on social media this week. I reproduced it myself. If you asked Google, “Is 2026 next year?” it said, no, 2026 is not next year. It is the year after next. So in such a world, Sam, some people might get concerned at the idea that we're now entrusting the AI with all of our data analysis. So how much time are scientists having to spend going back and essentially rechecking the work of the AIs? And what kind of tax does that place on their work? Yeah, this is very funny. I mean, look, you have to spend a lot of time going back and checking. Yeah. But like, to be clear, this is true regardless of whether or not an AI does it or whether you ask a friend to do it. If you're going to publish a paper, you damn well better go back and check it and like, be sure that you are confident. And it's never going to be 100%, right? The best you're going to do is you're going to get to a place where it is similarly good to if you were doing it yourself, which is not 100% because you're not infallible, right? And checking the work is like always going to be faster than producing it in the first place. Got it. Right? By a lot. A lot of our biggest scientific breakthroughs in history have come from these kind of strange accidents, these moments of serendipity, you know, penicillin starts growing in a petri dish, so we just go, oh my God, you know, this is great. Does AI preserve that kind of serendipity, those kinds of accidents, or do they sort of optimize it away? Yeah, this is a great question. And the fact of the matter is we just really don't know yet. This is going to be a really important core question that a lot of people are asking. What's your intuition on it? I mean, I think that they probably will, because...
They probably will preserve it, because with penicillin, my understanding is that basically like the window was left open on some agar with like no antibiotic and it obviously didn't have antibiotics. This was the discovery of the first one, right? So the window was left open with some agar and like, you know, some spores flew onto it and began growing and they observed that the bacteria was inhibited, right? That's a mistake. Someone screwed up, right? And that mistake led to something fantastic and you will have mistakes, I think, that will be preserved. But in the meantime, scientists should always leave their windows open. You never know what's going to happen. You have no, you know, seriously though, like there's so much, when you get a graduate student in academia, right? When you get graduate students, first year graduates, they have no idea what to do. They have no idea what to do and that is a huge source of scientific progress because they just do the most random kooky stuff that no one who knew anything, who knows anything would ever think to do. And it's actually, it's actually really important. You almost want your like AI scientist model to hallucinate a little bit. Totally. So that it doesn't lose that quality. Or just add noise, right? We talk about this, it's just like adding noise in order to, this is actually important for like biological evolution also, right? Like the genome has a lot of noise and that's how the evolution randomly comes up with like new stuff. There's a protein that like is just totally random, doesn't do anything. Then one day all of a sudden, oops, it does something and that's great, right? So. What do you make of the leaders of the big AI labs, people like Demis and Dario and Sam Altman, who are saying, you know, AI is going to allow us to cure all diseases or most diseases within the next decade or two? A decade is crazy.
Oh, and I'm happy to take a very strong stance on this because if I'm wrong, it's a great thing, right? But if I'm wrong, everyone wins. But like a decade is crazy. Why is it crazy? Because for the reason that we were talking about before, you have to run clinical trials, right? If we had a drug right now that prevented aging, completely halted aging in humans between the ages of like 25 and 65 or something, you would not know for 10 years because you can't detect in humans in that age range whether or not they're aging for like at least like five or 10 years. You don't detect from one year to the next that you're aging. So you won't know if the thing is working. I don't know. Some people at my 10-year high school reunion were already looking pretty good. I hate to say it. I did say 25. Fair enough. But right. I mean, you know, so we have to conduct experiments. Those experiments will take time. Now, will we in like 30 years, I think is very plausible. We don't know what is going to be possible. We don't know if it's possible to halt aging. We don't know if it's possible to like cure all diseases or whatever. But they're like between now and 30 years from now, I think you should expect to see a humongous leap forward. Let me drill in on that a bit there because I think some people might hear that and saying that like this is essentially a regulatory issue that like we just don't have, you know, the FDA set up to measure this. I'm curious about the experimental side of it though, right? Because my understanding is like we don't really have enough biologists to run all the experiments that we might not have like the funding to fund the experiments. And you did raise the point that some of these experiments just actually take a long time to run, right? So like what are all of the factors that in your mind are just going to make it so hard to cure this disease? 
My gosh, you have to go and you have to like, you know, even supposing you have a molecule that you want to test in a human and you know which humans you want to test it in, you have to go and make it, right? Humans are big. They require like a lot of it. You have to make sure it's like high enough grade that you can actually put it into a human. You have to find the patients, which means forming relationships with the doctors, right? Actually, you know, waiting until you have enough patients who are willing to do it. For many diseases, like there just aren't that many patients. And so finding the patients is hard, right? And then you have to actually dose them. You have to wait and see what happens, right? Even with no regulation, it would be slow. There's no AI shortcut for almost any of that, at least not right now. No, like what AI will allow us to do is it will allow us to discover a lot of things where we already have the information to discover it. We just haven't figured that out yet. You should not expect that you're one day going to like get GPT-7 and just like ask it how to cure Alzheimer's and it will just tell you. My expectation is that there is not enough knowledge. We do not have enough knowledge to solve it in principle, even with infinite intelligence, right? Like with infinite intelligence, there would still be some things that are just not known about the world where we have to conduct the experiments to see. You'll be able to plan the best possible experiment given everything that's known, but you will not just be able to like, you know, de novo kind of figure it out, right? Casey, I took Latin. That means from new. Oh, thank you. Thank you. That's saved me a step of Googling. When we come back, we'll play a game of overhyped or underhyped with our guest, Sam Rodriguez. This isn't quite science per se, but I'm curious what you make of this, Sam. 
All of the big AI labs are obsessed with math, with winning the International Math Olympiad, with putting up a gold medal score, with solving these unproven math theorems. And I have a take about this, which is that I believe that this is because these labs are filled with people who were themselves competitive mathletes in high school and took part in the IMO and did pretty well. And a lot of those people think that like AGI will just sort of be like a slightly smarter version of them. But I'm curious, why are these places so obsessed with math as being one of the sort of first places that they want to make a lot of progress? There are two reasons. I think that one of the reasons is exactly what you just said. It's just familiar, right? But the other reason is that you can measure progress, right? So ultimately, what drives progress in machine learning, a big part of what drives progress is benchmarks. With math, you can tell whether or not your proof is right. And there's kind of like an infinite number of things to go and prove. So it's just really easy to tell whether or not you're getting better. And things like the IMO just present great opportunities. By contrast, if you look at some of the biggest breakthroughs recently, biggest breakthroughs this year in AI for biology, right? Things like, you know, Chai Discovery, Nabla coming up with these like extremely good models for producing antibodies de novo, right? Huge breakthrough. But like, ultimately, the win for them is going to be like when it's approved in a human, and that might be another five years or something. Arc Institute putting out like the first time anyone has designed an organism from scratch, they designed a bacteriophage. It's a kind of virus that infects bacteria. Incredible, right? But like, just harder to evaluate. Like, how good is it? Like, you're not going to release it into the wild. And so, it's sort of like it's harder to evaluate, whereas like the IMO is just like super clean. 
And so I think that's one thing that we think about a lot is just like, you know, how do we get really clear benchmarks that we can pursue to measure whether or not we're doing a good job at science? I have an answer here. International Cancer Curing Olympiad. I like that. Should we start this? I think that would be great. We can give people a medal if they win. Let's get on it, labs. So, when the CEOs or the leaders of these companies make these statements about how we're going to cure all disease using AI in the next 10 years or 15 years or whatever timeline they give, are they doing that because they don't understand the bottlenecks? I mean, these are very smart people. So, what are they not seeing or are they just doing this as sort of a marketing exercise? Is this an attempt to get people excited about AI who might otherwise be freaked out about it? Why are they giving these projections? No, look, I mean, I think that they are reasonable people could disagree. There are lots of reasons why you could argue that like actually the models will get super smart and they will figure out ways to measure whether or not we're making progress before you run a clinical trial. And that will increase the iteration cycle, right? Like, there are reasonable arguments to be made about that, right? Like, you know, that we are just going to not do full clinical trials anymore. We'll just like use biomarkers. Like, that's not crazy. And that's one way that I could be wrong. And maybe in 10 years, we do have cures for all diseases. So, that's part of it. Like, obviously, there's part of it, which is that they want to hype the thing. Part of it is that, you know, does Sam Altman, like, really intimately understand, like, what it takes to go and manufacture, like, scale up manufacturing for a small molecule to put into the clinic? Like, probably not, right? So, there's a mixture. I don't think any of it's in bad faith. It's just people are very excited. 
There will be a little bit of a collision with reality at some point. We're going to see exactly where that is. But regardless, the future is going to be awesome. At this moment in 2025, how much do you think AI tools have changed the life of a working scientist? And how different do you expect that will be a year from now? I think that you'd be shocked at the extent to which they have not yet. Scientists, in general, are extremely conservative people, because if you're running an experiment, you, like, never actually fully know, in biology, at least, you usually do not, like, fully understand, like, why the experiment works and why not. There are some things that you've inherited from protocols that you've run in the past, and where it's like, we do it this way, you could go and test it, but there are way too many things to test. So, you're just kind of, like, locked in in your methods, and it's what works, and you just want to do what works. And so, for that reason, like, biologists just adopt new methods slowly. I think most labs around the world are still probably doing science the way they've done it before, and probably will continue to do so for a while. And that's okay. You know, one place, I think, with coding, a lot of people are already adopting it, because in biology, historically, coding has been a big bottleneck. It's a huge unlock now that biologists who didn't know how to code can, like, do a lot of coding using Claude Code, using OpenAI's models, Gemini, etc. So, that's a huge unlock. I think that's going to see a lot of adoption quickly. Literature search, right? Like, being able to parse the immensity of the scientific literature, that's a huge unlock. That's going to get adopted very quickly, right? The tools, like, what we're building are, like, a little bit more frontier. Ultimately, people will adopt them when they see other people using them and getting great results. Sam, can we play a little lightning round game here with you? Yeah. 
We're calling this one overhyped, underhyped. So, we'll tell you something, and you tell us whether, in your scientific opinion, it is overhyped or underhyped. Right. You ready? Yeah. Vibe-proving. This is when AI systems go out and, like, write math proofs. Probably, just, if I have a forced choice, probably overhyped. It's great for, I mean, it's great as, like, a progress driver in AI. It's like, and we'll probably have not, you know, being good at it will probably have implications elsewhere. But is it itself that useful? I'm not sure. Robotics for AI lab automation. Robotics for automating AI labs? Yes, or for automating scientific labs. Robotics for automating scientific labs. I think appropriately hyped. It is going to be totally transformative. The technology is not at all there yet. There's a lot that we need to do, but, like, yeah, probably appropriately hyped. AlphaFold 3? That's an interesting one. I mean, I think that I would say probably, like, underhyped in that I think, like, all of the protein structure models, there's a lot of hype around them, but they're still probably, like, they're going to be extremely transformative. So, maybe I would say probably underhyped. It's a hard, there's a lot of hype around it, though, so it's a hard decision to make. Virtual cells. We heard from Patrick Collison this summer about what the Arc Institute has done with making a virtual cell. This is overhyped, but for a specific reason, right? Like, the models that they're building at Arc are awesome, the models, and they're doing similar things at, like, NewLimit, Chan Zuckerberg, right? Like, many of these places, many of these great companies and great organizations are doing it. I think that, like, calling it a virtual cell, like, is a little bit, that's, like, a little bit over, that's overhyped, right? Like, ultimately, that kind of model models something, like, very specific. 
Like, actually building, like, a true virtual cell, like, being able to simulate a cell in a computer is an amazing goal. We are very far away from that. Quantum computing. Um, overhyped. Brain-computer interfaces. Um, I'm also, oh, man, this one's really hard. Um, I will, I'm going to say overhyped. I'm a huge believer in, in BCIs. I think, like, effective BCIs, the way that we imagine them in sci-fi, are further out than people imagine. Even, like, Neuralink is making amazing progress. Yeah, Casey's got one in his head right now. Yeah. It's, it's on the fritz. Yeah. There are a lot of great people who are making progress there, but it's further out, I think, than people think. So, we're, we're nearing the end of the year. Uh, if we can put you in a bit of a reflective mode, what do you think were the top three AI-driven scientific advancements this year? Um, yeah, I think that, honestly, like, this year is the year of, has been the year of agents. This was the year when people discovered agents. And so, I, I do, like, you know, in good faith, have to put myself, I have to put us on that list. Also, with Google's co-scientist, I mean, we're not the only people who are working on this. Um, uh, you know, Google has been doing a great job. There are a bunch of other people. So, AI agents for science, definitely. And then, like, generative design is just having a huge moment, right? So, the other ones would probably be, um, the work that Chai has been doing, the work that Nabla has been doing, um, and many others on de novo antibody design. I'm really glad you defined de novo earlier in the broadcast, by the way. It's come up a lot. Yes. Um, sorry, when I say de novo, I just mean, like, literally, you just, like, it generates it from scratch. You don't give it anything, right? You just, like, or you give it a target that you want to bind to, and it generates it from scratch. 
This is huge, because, like, basically, the promise that companies like Chai, Nabla, and so on are going after, um, is a world in which you can say, like, we know to cure this disease, we have to target that protein. You click a button, and you have an antibody that you can go and put in humans tomorrow. It's huge. It cuts out an enormous amount of what people had to do previously. So, that's a huge one. And the third one, I just think, like, what Brian Hie, Patrick Hsu, and so on at the Arc Institute have done with, like, generating organisms, sorry, generating organisms from scratch. We can say it. We know what it means now. That's the important thing. This is our, like, Pee-wee's Playhouse Word of the Week. The de novo design of organisms, um, is it useful? I don't know. Is it awesome? Like, absolutely. It's so, it's such a big breakthrough. And, Sam, what should we be watching for next year? What are you excited about that may be coming down the pipe for 2026? Honestly, it is, again, going to be the agents that see an explosion. We are right now at, like, the beginning of that S-curve, and that is going to continue. Maybe a year ago, I would tell people that I thought in 2026 or maybe 2027 that, like, the majority of the high-quality hypotheses that are generated by the scientific community would be generated, like, by us or by, like, agents that are like the ones that we're building. And when I said it in 2024, I thought I was overhyping, right? I mean, I was just like, I need some hype. At this point, it may be real. I mean, I think 2026 would be ambitious for that. I mean, that's a huge, right, for the majority of the good hypotheses that come out to be made by agents, that's a huge leap. But, like, 2027, yeah, man, I mean, 2026 is going to be the year when we just see these agents start to, like, infiltrate everything, right? Infiltrate labs, infiltrate people's normal life. I mean, it's already happening. Cool. Yeah. Well, I look forward to it. 
Sam, thank you so much for giving us the science education that we clearly didn't get in school. Yeah, you've really given us some de novo things to think about. I appreciate that. Good. Thank you, guys. Thank you. We're edited by Jen Poyant. Today's show was fact-checked by Will Peischel and engineered by Chris Wood. Original music by Diane Wong, Rowan Niemisto, Alyssa Moxley, and Dan Powell. Video production by Sawyer Roque, Pat Gunther, Jake Nicol, and Chris Schott. You can watch this whole episode on YouTube at youtube.com/hardfork. Special thanks to Paula Szuchman, Pui-Wing Tam, and Dahlia Haddad. You can email us at hardfork@nytimes.com.