
Why Vlad Tenev and Tudor Achim of Harmonic Think AI Is About to Change Math—and Why It Matters

A few years ago, Vlad Tenev (also founder of Robinhood) and Tudor Achim noticed the rise of a community around Lean, an esoteric programming language gaining traction among mathematicians. That, combined with the rising power of LLMs, led them to found Harmonic to create mathematical superintelligence and, along the way, verify all safety-critical software.

Summary

Robinhood founder and CEO Vlad Tenev has teamed up with Tudor Achim to create Harmonic, an AI research lab focused on developing mathematical superintelligence. Their goal is to push the frontiers of human knowledge by creating AI systems capable of advanced mathematical reasoning and problem-solving. 

  • Mathematics as the foundation for reasoning and problem-solving across domains. Tenev and Achim believe that by focusing on mathematical capabilities, they can create AI systems with strong general reasoning skills that transfer to other fields. Just as humans who excel at math often perform well in other quantitative areas, they expect AI systems with advanced mathematical abilities to demonstrate superior reasoning in science, engineering and beyond.
  • The power of recursive self-improvement in mathematical AI. Harmonic’s approach uses Lean as a formal verification tool to objectively evaluate mathematical proofs. This allows for rapid cycles of reinforcement learning and self-play, enabling the AI to improve its capabilities much faster than in domains with less clear reward functions. The lack of an upper bound on mathematical knowledge means this recursive improvement could potentially continue indefinitely.
  • Synthetic data generation as a key advantage. While many AI companies face data limitations, Harmonic can generate vast amounts of synthetic mathematical data to train its models. This allows them to create training examples of increasing complexity, mirroring how humans learn math by building on simpler concepts to tackle more advanced problems.
  • Accelerating breakthroughs in mathematics and adjacent fields. Tenev and Achim are optimistic about AI solving major open problems in mathematics—potentially including Millennium Prize problems—within the next 5-10 years. They also see applications in theoretical physics, software verification and any field requiring advanced reasoning skills.
  • Redefining the role of mathematicians and researchers. As AI systems become more capable, the focus of human mathematicians may shift towards guiding and interpreting AI results, posing new problems, and deciding where to direct computational resources. This represents a fundamental change in how mathematical research is conducted.

Transcript

Vlad Tenev: Last I checked, there’s a 43 percent chance that the next Millennium Prize will be solved by AI or, like, human, with significant AI assist. I think that’s an underestimate. I mean, we could be lucky, and Larry Guth might be on the path to the Riemann hypothesis, which would be amazing. But I think that if the next one is solved by a human, it would probably have to be in the very near future. And for sure, the next next one will—will probably be significantly solved by AI.

Sonya Huang: We’re excited to welcome Vlad and Tudor to the show. We’ve had the pleasure of knowing Vlad for many years at Sequoia, but what many of you may not know about Vlad is that in addition to being founder and CEO of Robinhood, he’s also an enormously talented mathematician. Vlad and Tudor have teamed up to create Harmonic, an AI research lab with the goal of pushing the frontiers of human knowledge. Specifically, they hope to create mathematical superintelligence with the thesis that understanding math allows you to better understand and reason about much of the broader world. We’re excited to talk to Tudor and Vlad about Harmonic, about the ingredients that go into creating mathematical superintelligence, including synthetic data, reinforcement learning and self play, and when AI will win the IMO or a Millennium Prize, or even solve the Riemann hypothesis.

Pat Grady: All right, Vlad, Tudor, welcome to the show.

Vlad Tenev: Oh, thanks for having us.

Math is reasoning

Pat Grady: All right, so, you guys, you have this core belief that math is reasoning. And you have what might be a contrarian belief that the best way to train a model to perform well in math is to directly teach it math, versus allowing math to emerge as a property of scale, which is what a lot of the other foundation model companies are doing. Can you talk a bit about that core belief? Why do you need to teach the model directly math? And also, maybe, what does it mean that math is reasoning?

Tudor Achim: When we started the company, we had a really big focus on math, and maybe we can get to that later. But if you look around at all fields of science and engineering—well, almost all fields—math is really at their foundation. And math has essentially become the way that people understand the universe. It’s the way you model phenomena from black holes to atoms, it’s the way you design things in engineering.

And there’s a couple of reasons for that. First of all, it just happens to be the case that the universe is explainable by math, so you can write down a fairly compact set of symbols that explain things. But the other really important thing is that it’s a way to build these shared cognitive theories that are very objective and clear and transparent. And if you and I are discussing something that’s rigorous, we can write down a set of deductive rules, we can agree on what the ground rules are of whatever we’re modeling, and then from there we can derive conclusions.

And so with humans, what you see is that when people become very good at math, they tend to be good at other quantitative areas in science and engineering. And so our bet is that if you make a system that’s really good at math, you’re probably going to see the same phenomenon where it’s true it might not immediately write the world’s best history essays, but when you ask it to do something scientific or something in engineering, it’s just going to be really, really good at that. That’s why we started with math.

Pat Grady: And where is that boundary between, you know, “Help me with my math homework” and “Write a history essay?” There is some boundary that it’s hard for math to cross. What do you think are the outer edges of what’s possible with a model with sort of math at its core? Where are those outer edges?

Vlad Tenev: I’ll give you sort of the non-AI perspective. So I studied math, and I was really good at math from when I was a little kid. And I remember there were always the seventh graders in math class that would raise their hands whenever the teacher would come up with something. It was always an abstract thing like, you know, side-angle-side congruence of triangles. There’d be the annoying kid that was like, “When are we ever gonna use this?” And, you know, the teacher would kind of mumble a little bit and they’d be like, “Well, you know, math, just like—you’re probably not going to use it soon, but it’ll make you really good at other things.” And, you know, the other kids were always skeptical of that, and I bought into it. And so I just kept taking more and more advanced math.

I went to Stanford and I majored in it. Then I went to grad school to do a math PhD. And my belief was that okay, if I just focus on math, then I’m going to get really good at problem solving, and business problems and other problems will be easy compared to solving these really tough, like, abstract algebra problem sets that I was banging my head against the wall for 10 hours every week trying to do. And I think it basically ended up being correct, right? It’s like I didn’t really pay attention to anything else. I took maybe, like, one computer science class, Intro to Computer Programming at Stanford, and five years later, when I became an entrepreneur, I found it really easy to pick up code. I found it really easy to pick up contracts. And of course, I’m no lawyer, but you could understand that stuff. I mean, the logical underpinnings are relatively simple compared to abstract algebra and analysis. So I think for humans at least, I consider myself an example of, like, math transferring to other very monetizable things. And I think for AI, my intuition seems to suggest that it should be the same.

Pat Grady: Yeah.

Tudor Achim: And you already see a little bit of evidence of this. You know, at this point, it’s an open secret in the industry that training on a lot of code data leads to much better performance on reasoning benchmarks. So you can imagine what that’ll look like when you have incredible math datasets that encompass a lot more general types of reasoning.

Studying with the world’s greatest living mathematician

Pat Grady: Yeah. Yeah, that resonates with the idea that math teaches a human how to think critically, how to think logically, and that skill can be ported to a bunch of other domains. It stands to reason that that would work in AI, also. Vlad, you casually mentioned that you studied a little bit of math, and just for anybody who’s not quite familiar with your background in math, I believe you studied briefly under Terry Tao, who is perhaps the world’s greatest living mathematician.

Vlad Tenev: Yeah.

Pat Grady: And then one of the things you mentioned to us was that you still catch up with him every now and then, you have lunch when you’re in LA, that sort of thing. So I’m curious, when you have lunch with Terry Tao, what do you guys talk about? Do you get any—do you get me stock tips?

Vlad Tenev: [laughs] No. No, I’m not allowed to do that. Yeah, one of the—yeah, one of the unfortunate things of being a public company CEO in the financial space is I have lots of stock tips, but I can’t share any of them. I have to, like, keep them to myself—and I can’t even use them. I basically can’t trade anymore, which is unfortunate because I love trading.

But yeah, I—so to backtrack how I got to UCLA. So Terry Tao is a professor at UCLA, and I think what’s really amazing about him is the breadth of the work. So most mathematicians get very deep into a pretty narrow domain, and Terry can get very deep across dozens of domains. You know, he’s made contributions to number theory, combinatorics, harmonic analysis, applied math. He’s one of the leading Lean contributors at this point, I’m sure. He’s, like, formalizing his papers in Lean, and actually hopping on the community Zulip and engaging with students. And then he has a very popular blog.

And I think the way that he’s been able to do this is he’s just smarter than 99.999 percent of people, probably even more than that. And you have this picture of him at the IMO, the International Math Olympiad. He’s Australian, so he was on the Australian team, and it’s like five, six-foot-tall Australian high schoolers, and then Terry Tao, who I think was like five foot six. I mean, he’s literally like a third their height. I think he was the team captain. So from a very early age, it was very clear that he was, like, on another plane.

And I studied—I did my math honors thesis in undergrad under this professor, Larry Guth, who’s also really amazing. I mean, he actually had a recent result that came out that was a groundbreaking result in, I want to say, number theory or something about the Riemann hypothesis. But yeah, this result in non-AI math really was quite something. And he kind of suggested I look at UCLA. I was really interested in his field, and I ended up going there and being fortunate to study under Professor Tao.

But I should be clear. I am a dropout, and it’s amazing that I can claim that after grad school, but I will claim dropout status. I only did one year at UCLA, and it was intro graduate-level analysis. Terry Tao taught my first year, which was pretty amazing. And one thing I remember was I was doing some reading with Professor Tao, and he gave me this book, and he signed it. And I think he signed it because he wanted to make sure I would give it back to him when I was done reading it, and little did he know that by signing it, he guaranteed he would never get that book back. And I bring it up every time I see him. I’m like, “Hey, you’re not getting that book back. It’s on my shelf next to all my other autographed first editions.”

What does the math community think of AI math?

Sonya Huang: What does the math community think of AI math? Are people split, or do people think it’s, you know, the path to the promised land and the way that we’re going to solve Riemann and everything else?

Vlad Tenev: I think it’s split. I think it seems to be split.

Sonya Huang: Okay.

Vlad Tenev: And there’s sort of like, the younger mathematicians, I think, are very pro-AI and pro-verification and tools like Lean, and I think the older folks are a little bit more skeptical. So not surprising. I think you see that in pretty much every field. I think that my guess would be that this will evolve. My mental model is something like chess, where at first there will be perhaps a lengthy period of humans plus AI assist, and that will lead to a lot of really good results, but over time, I think the AI will get better and better.

And you look at chess right now, and it’s sort of like if there’s a human assisting the AI, the AI would be annoyed by it. They would just want to just delete all of the input because it would only make the results worse. So I’m not sure if we’re going to get to that point. I think humans will at some point—I mean, they’ll need to guide the algorithms, but I think the kind of definition of what a mathematician will do will fundamentally change.

I was talking to one of my friends who’s a mathematician at MIT, and I asked him when we were first starting this, “What do you think?” And this is a young professor, like, very excited about the field. I was like, “Are you worried that you’re kind of in a field that is going to fundamentally change?” He’s like, “The field of math has always changed.” Back in the 1800s, mathematicians used to be kind of like in the royal court, and they would be glorified calculators. They would solve quadratic equations by hand. And of course, they were worried that when computers and calculators came out, the job would no longer exist. But mathematicians get to define what math is.

So I’m sure at some point it’ll be prompting and kind of guiding these AIs to solve problems. And I think—yeah, I think that’s going to be very huge. Even if an AI solves the Riemann hypothesis, a human will always be in the loop because the humans kind of pose the Riemann hypothesis to begin with.

Tudor Achim: Yeah, just to hop on that. I think there’s—like, in the future, you’re going to have a lot of compute resources dedicated to math, and the question will be a very human one, which is: by which procedure do humans decide where to direct all that reasoning firepower? And I think that’s going to be the job of mathematicians. They have to choose what do we work on? How do we interpret the results? How do we interpret failures to find answers? That kind of thing.

Sonya Huang: Do you think an AI math system can solve Riemann? Or where is the ceiling, do you think?

Vlad Tenev: I think that it should be able to solve it or, you know, prove that it’s undecidable. That would also be an interesting result. Yeah, I think the—if we think about what a great mathematician like a Terry Tao, for instance, is capable of doing, they’re able to synthesize lots of papers, lots of frontier results and learn from them in a way that the other top human mathematicians can’t. And kind of find connections between these things and sort of use them to create new and more complex theories. I mean, that’s exactly kind of how the system we’re engineering works, and that’s what computers are great at. And AI models are great at synthesizing large amounts of information, finding patterns, recursively self-improving. I think now on Metaculus, last I checked, there’s a 43 percent chance that the next Millennium Prize will be solved by AI or, like, human, with significant AI assist. And I think that’s an underestimate. I mean, we could be lucky, and Larry Guth might be on the path to the Riemann hypothesis, which would be amazing, but I think that if the next one is solved by a human, it would probably have to be in the very near future, and for sure, the next next one will—will probably be significantly solved by AI.

Recursive self-improvement

Pat Grady: One of the things that you said I want to hit on, which is this idea of recursive self-improvement, because it seems like in the world of AI, there are—if you were to draw a spectrum of human only to AI only, and then human in the loop is sort of the spectrum in the middle, from lots of human, a little bit of AI, to lots of AI, a little bit of human, one of the things that is interesting about Harmonic, at least the way I understand it, is because of Lean, you can encapsulate math and code. Because of formal verification, you can objectively determine whether things are right or wrong, which means that you have an objective reward function that you can use with self play to have very fast cycle times with reinforcement learning, which means that the progress of your model has a chance to be extremely fast, because there are no humans in the loop with that recursive self-improvement. Like, objective function is clearly defined. You can do self play to just make the model better and better and better and better and better. Which is not something that we see in a lot of domains of AI. Most domains of AI, it’s a lot messier to kind of get the cycle time on improvement. Can you just talk through the system a bit, a little bit of what I described? Like, what feeds into your model, what governs the rate at which it can get better? Because it seems like something that will be able to get better at a pretty quick rate.

Tudor Achim: Yeah, I’m happy to cover that. One point before going to that is just that I think the most interesting part about this is that, you know, there are other areas where recursive self-improvement can work. For example, again in those board games like AlphaGo. But I think what people don’t realize—well, what a lot of people don’t realize—is that in these, let’s say, perfectly observed zero-sum games, you improve recursively just by playing against yourself, but you hit an optimal strategy.

Pat Grady: Yes.

Tudor Achim: So at that point, it doesn’t matter what system you have, it won’t do better. The most exciting part about math is that there is no upper bound. So you’re just going to keep putting compute in and it’s going to keep getting better, and there’s no end to it. And so when we talk about whether AI can solve the Riemann hypothesis or get a Millennium Prize, like, those are very human milestones. And I think the real question is, like: will it ever stop? I mean, yeah, because it clearly will get there. And I think we’re going to end up solving problems that are much, much harder than the Riemann hypothesis, which we haven’t even conceived of yet, because it’s almost like beyond us to write down such a hard problem. But coming up …

Vlad Tenev: Have you guys ever seen that Minecraft video of the AI beating Minecraft in 20 seconds?

Pat Grady: No.

Vlad Tenev: I think that’s a good analogy. It’s like, you know what Minecraft is, how a human would play it, and then the AI beating it in 20 seconds is just, like, incomprehensible. Like, you can’t even kind of grok what’s going on in the video feed.

Tudor Achim: Yeah, but I think if we just talk about how Harmonic works, you can just think of it as there’s a collection of agents that are essentially trying to solve problems. And it’s true, because we use Lean, we’re able to check whether our answers are correct, and thereby derive a variety of training signals that we use to improve the system. But just to be clear, the use of Lean just lets you verify things. Lean doesn’t itself tell you whether you’re getting closer to the answer or whether you’re getting smarter or not. It’s just telling you whether it’s correct or not. So there’s a lot of open scientific challenges to making it get better quickly.
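The loop Tudor describes can be sketched roughly as follows. This is a toy illustration, not Harmonic’s actual system: the function names (`verifier`, `collect_training_signal`) and the trivial string check standing in for the Lean compiler are all hypothetical.

```python
# Toy sketch of a verify-then-reward loop: agents propose candidate proofs,
# a checker (standing in for the Lean compiler) accepts or rejects each one,
# and only verified attempts become training signal. All names are illustrative.

def verifier(candidate: str, theorem: str) -> bool:
    """Stand-in for Lean compilation: here, accept iff the candidate mentions the theorem."""
    return theorem in candidate

def collect_training_signal(theorems, propose, check):
    """Keep only (theorem, proof) pairs the checker verifies."""
    verified = []
    for thm in theorems:
        candidate = propose(thm)      # agent's attempted proof
        if check(candidate, thm):     # binary, Lean-style verification
            verified.append((thm, candidate))
    return verified

# Toy usage: a trivial "agent" that echoes the theorem name into its proof.
examples = collect_training_signal(
    ["add_comm", "mul_comm"],
    propose=lambda thm: f"by exact {thm}",
    check=verifier,
)
```

The key property, as Tudor notes, is that the checker gives a clean binary reward; everything else—how the agent proposes, how verified examples feed back into training—is where the open scientific challenges live.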

What is Lean?

Pat Grady: And can you just say a word about what Lean is, just in case people aren’t familiar?

Tudor Achim: Yeah, totally. Lean is just another programming language, a really great one, created by Leo de Moura.

Vlad Tenev: The best programming language.

Pat Grady: [laughs]

Tudor Achim: We might all be writing—or the AIs might just be writing Lean in the future, but the idea is that the mathematical statements are encoded in the type system of the language. So just very simply, you have functions in Lean, and the input types correspond to the assumptions of the mathematical theorem, and the output type is the conclusion. And the point of Lean is that if you write a program that implements that function and it compiles, that means you can derive the output type from the input type, which in turn implies that you can conclude the conclusion from the assumptions. So that’s really the fundamental—that’s how you use Lean for math.
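Concretely, the propositions-as-types idea Tudor describes looks like this in Lean 4 (a minimal example of ours, not Harmonic’s code; `Nat.add_comm` is a lemma from Lean’s core library):

```lean
-- Hypotheses become input types, the conclusion becomes the output type.
-- If this definition compiles, the theorem is proved.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b   -- supplying a term of the right type is the proof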

Vlad Tenev: And I think one thing that’s super interesting about Lean is if you look at Leo de Moura, the creator, who’s at Amazon AWS now, he’s not a mathematician, and he wrote this as a software verification tool. So he has the belief that in the future, software will be verified, and the existing tools, things like Coq and Isabelle, which are kind of multi-decade-old software verification tools, are just not good, you know, and they’re frankly unusable. The experience for a developer is poor, and so he wanted to create a better software verification tool in the hopes that if he built something better, more people would use it and we’d have better software, which is a super noble goal in its own right.

But then what he didn’t realize was software verification, all it is is just proving that software has certain properties. And this thing became very popular in the math community, and you had, like, thousands of mathematicians and math students building up an organic library called mathlib, which you can think of as kind of like the largest math textbook. Open source, it’s on GitHub, and it’s just growing at a pretty fast clip. And the usage of Lean for math, I think, to some degree has surpassed anyone’s expectations. It might be more than the usage for verified software at this point. And that might change as time goes forward. And with AI.

Why now?

Sonya Huang: One of the questions we always ask is why now? Because reinforcement learning has existed for a long time, math has existed for even longer, and it seems like math is really hitting an inflection point. You guys have chosen to start Harmonic at this point in time. Why now?

Tudor Achim: Oh, I mean, there’s two really good reasons why now that we’re excited about. So first, you know, one is just that the AI systems have gotten better in an interesting way. So I was actually talking with a friend, a close friend, about RL for theorem proving back in 2015 and 2016. And one issue back then was that there wasn’t even a great notion of an AI system that could predict something in an infinite action space.

So in Go you can place a piece somewhere, and it’s either a black or white piece, but in math you can really do anything. Like, you can just generate any next step. And so we didn’t have great systems to do that. So autoregressive language models have gotten pretty good. So that’s one thing that makes it possible. I’m talking on the timescale of like a decade here, but that’s really important.

And the other thing that’s kind of crazy is that Lean has gotten really good. So if you had told a mathematician 20 years ago that a large fraction of the field would be excited about formal methods in math, they might have thought you were crazy, because back then, formal methods were really isolated to formal logic or certain types of graph theory. Like, if you guys have heard of the four-color theorem, that was one big success for formal math. But what’s changed is that Lean is so flexible and so exciting for people that they’ve contributed this thing called mathlib. So now there’s a large body of knowledge that you can build on to prove things. And so the combination of AI starting to even be a possible fit for this problem, plus Lean working really well—and Lean 4 was only released officially in September 2023. So those two things happening together really made it the right time to attack this.

Synthetic data is the fuel for the model

Pat Grady: Can you say a word about data, and specifically synthetic data and what it is that fuels the model that you guys are building?

Tudor Achim: Yeah, so synthetic data is the fuel for the model. There’s a—there’s an amazing resource called mathlib. It’s that open source repository. So that’s a lot of human written Lean, and it’s written in a way that’s very general and compact. So they’re really proving advanced theorems, right? It’s not necessarily the best fit for problem solving. And so as a result, almost the only data you can use for this is synthetic data you generate yourself, because that original data is not very applicable. So it’s kind of a data-poor regime compared to most areas of AI.

And so that process that I described, where the agents themselves are trying to solve problems and thereby generate training signals, that’s the primary way in which you can get data. And the other issue is that, you know, you have to progress through levels of intelligence. So you’re not necessarily proving the Riemann hypothesis upfront, you’re proving really simple things, but then you kind of amplify yourself recursively throughout the process.

Sonya Huang: Turns out there’s not as much math data on the internet as cat videos. [laughs]

Tudor Achim: Unfortunately not. Unfortunately not.

Pat Grady: Well yeah, it’s interesting, though, because there’s the data wall that the foundation model, you know, the general purpose foundation model companies are running into. And at this point, they’ve exhausted what’s available on the internet. And if you can generate most of the data that’s required to train, that’s another advantage of having math at the core, versus hoping for math as an emergent property of scale.

Vlad Tenev: Yeah, and I think the data wall kind of manifests itself in two ways. One is, just like you said, we’re out of internet data.

Pat Grady: Yeah.

Vlad Tenev: The other is the actual internet data quality that’s out there, you can think of that as providing a ceiling to how smart these models can get, because if you train on the cat videos and all the nice Wikipedia content and the internet content, it’s an open problem how to get something that’s significantly smarter than that. And so you do need to get into some kind of self-reinforcing self-play regime, in our opinion, to get to a point where you can surpass the ability of human mathematicians and researchers at multiple tasks.

And so I think, in many ways, the path, it’s inevitable that it takes kind of the AlphaGo to AlphaZero approach, and we learn how to make these models create the vast majority of their data, and have the data actually increase in complexity as these models continue to iterate.

I think the great thing about math is there’s a simple path to doing this. You can basically measure the complexity of a math problem and how difficult it is by how many lines of Lean it takes to solve. So you can actually look at the complexity of a system, and a lot of problems are solved by breaking it down into smaller chunks and actually solving those chunks. And if you kind of think about how that works, the smaller chunks are then more manageable because there are sort of like, fewer lines to solve than the big one. So if you get really good at that and then you get good at solving the chunks, then you can kind of train your model to do better. And as you kind of keep turning the gears on that, the model gets better at solving incrementally harder and harder and more complex problems.

I think that works very well in math, because it mirrors how we solve math, like, on pen and paper.
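The chunking Vlad describes is visible in ordinary Lean tactic proofs. In this toy example of ours (not Harmonic’s; it assumes mathlib for `mul_self_nonneg` and `add_nonneg`), the goal is split into two smaller `have` steps, each a shorter subproof, then combined:

```lean
import Mathlib

-- Decomposition in miniature: each `have` is a smaller, easier chunk,
-- and the final step just assembles the chunks into the full proof.
theorem sum_of_squares_nonneg (a b : ℤ) : 0 ≤ a * a + b * b := by
  have h1 : 0 ≤ a * a := mul_self_nonneg a   -- first chunk
  have h2 : 0 ≤ b * b := mul_self_nonneg b   -- second chunk
  exact add_nonneg h1 h2                      -- combine the chunks
```

A model that learns to produce good `have` decompositions can then train on solving each shorter subgoal, which is the incremental-difficulty loop Vlad sketches.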

Pat Grady: Yeah.

Vlad Tenev: And we’ve been able to start with, like, simple axioms and build up just giant complex structures. Maybe the Riemann hypothesis would be hundreds of pages, if not more, to solve. Fermat’s last theorem was, I think, 200 pages. Very, very complex. So yeah, I do think eventually you’ll get to a level where you’ll be able to solve these things, and the math is to some extent like the original synthetic data.

How fast will your model get better?

Pat Grady: Yeah. What determines the rate at which your model can get better?

Tudor Achim: The rate at which it can get better? Well, I think the highest level one is energy. So the more energy you can put into it, the more attempts can happen in parallel, which means you generate data faster. So I mean, there’s no rate-limiting step—sorry, there’s a bunch of rate-limiting steps, but there’s no, like, fundamental constraint on how far—how fast it gets better. So it’s really just about how much compute you put in.

Vlad Tenev: I think it’s also—I mean, there’s still a lot of unsolved problems in this field, right? Like, we benefit a lot from core theorems that have been proved in the past. And, you know, there’s—if you think about, like, competition math contests, there’s theorems that every student would just learn and use, like, you know, the AM-GM inequality, things of that nature. And so to some degree, mathlib is incomplete. There’s very little content about geometry, for instance. And it’s very theoretical and abstract. And so a limiting step is what’s in mathlib.

And, of course, at some point, the models have to solve the problem of creating new theories and new structures and kind of like, expanding to other domains, and getting really, really good at formalizing things that haven’t been formalized by humans. I think that’ll be a big unlock, and that’ll certainly happen within the next several—several years. You’ll be able to say, “Hey, here’s just like this situation.” It could be as simple as, like, a baseball team, and they’re, like, throwing balls back and forth to each other, and, you know, systems would be able to kind of, like, auto-formalize that and turn that into Lean code on the fly. And I don’t think we’re quite there yet to the point where that’s reliable, but when it does get reliable, I think that’ll be a really big unlock.

Exploring the frontiers of human knowledge

Sonya Huang: If everything goes right, what do you think Harmonic becomes?

Tudor Achim: Well, our mission statement is to explore the frontiers of human knowledge, so it’s very important to us that the things we produce are correct and useful to humans. So I think in the best case, you know, we’re able to build a tool that a lot of mathematicians use to close all the Millennium Prize problems and to go far beyond that. I think that would be a great service to humanity. There’s also other areas of commercial application for the software. So the dream for software engineering is to be able to just check that code is correct. To do that, you need to have a very good model of how code works. You need to be able to understand how the libraries work, what they promise, that kind of thing. So you can imagine a future where safety-critical software is proven correct, general software is proven correct, and the way software engineering is done can change as well. So there’s just a lot of applications if you can make a system that’s very good at math reasoning and very good at checking its reasoning. Yeah, really, we think there’s a lot of applications.

Vlad Tenev: Yeah. And I think the—I mean, math and software are two fairly obvious ones. I think software engineering as a discipline is changing really quickly. I’m sure you guys are seeing all the reports of people doing crazy things with Cursor and Claude 3.5. I think in the future, software engineering will be less about, like, reviewing and collaborating on code as an artifact, and will be more about collaborating on specs. What do we want the code to do? Can we be more rigorous about that?

And I think that’s where verification will become a bigger thing, because as the cost of software goes to zero, the cost of verified software will also go to zero, and suddenly this thing that was very impractical and expensive, because you needed specialist humans to do it, will just accelerate dramatically with AI. So I think you look out 5, 10 years from now, the vast majority—I mean, if we progress along the capability curve as we have been—the vast majority of software written will be verified and provably correct. And I think that’s a really exciting future.

I also think, like, on the more theoretical side, it’s not just math, but physics is essentially math. Theoretical physics is one of the main ways the frontier of math gets implemented. And I think it would be amazing to me personally to accelerate some of the fundamental physics research at the frontier, and really just develop an understanding for why the universe is the way it is, why the laws of physics exist, and also help figure out experiments to test those. So I would be very proud if we contributed to that effort.

Sonya Huang: And do you think you’ll mostly be contributing to math and math-adjacent areas like physics and software engineering, or do you think anything that involves reasoning is in scope for Harmonic?

Vlad Tenev: Yeah. I mean, I think we try to be focused on the things that—we’re still a small company. Over the long term, I think if you believe math is reasoning—and we do—then yeah, if we get really, really good at math, and computer science is a very natural analog, then yeah, I mean, anything is in scope for those models. Even the history essay, I think.

Tudor Achim: We’ll see.

Sonya Huang: [laughs] Tudor’s not sure on the history essays.

Vlad Tenev: Yeah, I really enjoyed writing history essays, even though my parents were like, “You know, humanities, language arts, just ignore all that stuff.” But I think my math skills led me to write great history essays, too. So hopefully Aristotle will be no different.

Pat Grady: One day.

Vlad Tenev: Aristotle wrote some great historical commentary.

Sonya Huang: You are truly a polymath.

Vlad Tenev: Yeah. I mean, Poetics is—if you’ve read Poetics, it’s …

Tudor Achim: [laughs]

Lightning round

Sonya Huang: Should we wrap it up with some rapid fire?

Pat Grady: Let’s do it. Lightning round.

Sonya Huang: Okay. In what year will you win the IMO?

Vlad Tenev: [laughs] What do you think, Tudor?

Tudor Achim: Soon.

Pat Grady: All right. 2025?

Tudor Achim: Soon.

Vlad Tenev: Maybe 2024.

Pat Grady: All right, we’ll sign you up for 2024. All right. How about the Millennium Prize?

Vlad Tenev: Ooh, that’s a tough one. I would guess 2029.

Pat Grady: 2029?

Vlad Tenev: Yeah.

Pat Grady: Okay. I heard 2028.

Vlad Tenev: Is that what they’re—yeah, I guess it’s fully AI-unassisted Millennium Prize, or AI-human hybrid?

Pat Grady: Well, how about what do you think for hybrid?

Vlad Tenev: Hybrid, I could see 2028.

Tudor Achim: Are we talking an easy Millennium Prize or hard?

Pat Grady: [laughs]

Vlad Tenev: Yeah. Is it, like, …

Pat Grady: Easy Millennium Prize.

Vlad Tenev: Navier-Stokes might be 2026. Riemann hypothesis, I’ll give you 2029.

Pat Grady: All right. All right. There we go. Good.

Sonya Huang: Given we can’t even do arithmetic today with LLMs, that’s pretty amazing. When do you think we’ll have human- or superhuman-level reasoning more broadly defined?

Vlad Tenev: I think to some degree, if you define it as something that can reason and solve math problems in excess of any human, like something that would, you know, give Terry Tao a run for his money, I think we’re a couple of years away. But I think the models within the next year will get to probably, like, the 99.99th percentile.

Sonya Huang: Would Terry agree with that?

Vlad Tenev: I think so, yeah. I don’t know. You’d have to ask him, but I think he would agree with that.

Pat Grady: One of our favorite questions is: who in the world of AI do you admire most? And we’ll modify it slightly for you guys. Who in the world of AI or math do you admire most?

Vlad Tenev: I like von Neumann. We were just talking before about von Neumann’s biography. I think what I find really interesting about him was he started as a mathematician and he was discouraged. I think his father, who was like a Hungarian businessman, was trying to discourage him from doing math because it wasn’t very monetizable. And so he got his friend who was a great mathematician to try to talk him out of it. But the friend came back and he’s like, “I can’t do it. This guy’s too good. It would be just a disservice to society if he didn’t use his talents for math.”

And then he pioneered computer science, and the von Neumann architecture was the blueprint for all modern computers. He contributed to the Manhattan Project, which is a little controversial, but very, very practical and impactful. And, you know, created probably the canonical text in game theory as well. So yeah, I think it’s pretty amazing. Also, a fellow Eastern European. Well, some people debate whether Hungary is in Eastern Europe.

Tudor Achim: Yeah, it’s an interesting question, I think. I definitely admire almost all scientists and mathematicians, but if you’ve heard of the mathematician Leibniz, what was kind of shocking to learn during the course of working on this company is this: Leibniz was competing with Newton to create calculus, and Newton’s formulation won out, but Leibniz was basically there.

But one thing people don’t know is that Leibniz also had a lot of other work, and one piece of work that is just incredibly prescient—keep in mind, this is the late 1600s—is an idea he created called the “universal characteristic,” which is essentially the notion of having a deductive language, an automated procedure to deduce things using that language, and an encyclopedia of work in that language that you can build on to derive things.

And so the amazing thing to me is that this thinker hundreds of years ago essentially predicted what would be happening in 2024, and it seems that the only thing that was required was having AI get a little better and having computers that can do something like Lean, right? And I think it’s just incredible to have a human being predict that with no concept whatsoever of what’s going to come later, but to understand that that’s such a fundamental thing that we’re going to end up working on that hundreds of years later.

Pat Grady: Awesome. Thank you, guys.

Sonya Huang: Thank you.

Vlad Tenev: Thanks for having us.

Tudor Achim: Thanks for having us.

Mentioned in this episode

  • IMO and the Millennium Prize: Two significant global math challenges Harmonic hopes to conquer (soon)
  • Riemann hypothesis: One of the most difficult unsolved math conjectures (and a Millennium Prize problem), most recently in the sights of MIT mathematician Larry Guth
  • Terry Tao: perhaps the greatest living mathematician and Vlad’s professor at UCLA
  • Lean: an open-source functional language for formal verification, launched by Leonardo de Moura at Microsoft Research in 2013, that powers the Lean Theorem Prover
  • mathlib: a community-built library of mathematics formalized in Lean, often described as the largest math textbook in the world
  • Metaculus: online prediction platform that tracks and scores thousands of forecasters
  • Minecraft Beaten in 20 Seconds: The video Vlad references as an analogy to AI math
  • Navier-Stokes equations: another important Millennium Prize math problem. Vlad considers this more tractable than Riemann
  • John von Neumann: Hungarian mathematician and polymath who made foundational contributions to computing, the Manhattan Project and game theory
  • Gottfried Wilhelm Leibniz: co-inventor of calculus and (remarkably) creator of the “universal characteristic,” a system for reasoning through a language of symbols and calculations—anticipating Lean and Harmonic by 350 years!