Phaidra’s Jim Gao on Building the Fourth Industrial Revolution with Reinforcement Learning

Jim Gao convinced his bosses on the Google data center team to let him work with the DeepMind AlphaGo team. The initial pilot resulted in a 40% energy savings and led him and his co-founders to start Phaidra to turn this technology into a self-improving infrastructure product.

Summary

Phaidra co-founder and CEO Jim Gao discusses the application of reinforcement learning to optimize industrial systems, drawing from his experience leading DeepMind Energy at Google. Gao’s insights offer valuable lessons for AI founders looking to tackle complex real-world problems using advanced machine learning techniques.

  • Reinforcement learning can drive significant improvements in complex industrial systems. At Google, Gao’s team achieved a 40% energy savings in data centers using reinforcement learning. This success demonstrated the potential for AI to optimize complex systems beyond human expertise, discovering novel solutions that even experienced engineers hadn’t considered.
  • The key ingredients for applied reinforcement learning are objective functions, actions, and constraints. Gao emphasizes that as long as a problem can be mapped into this framework—essentially a constrained optimization problem—it can potentially be solved using reinforcement learning. This insight helps founders identify suitable applications for RL technology.
  • AI creativity is a powerful differentiator from traditional automation. Gao argues that AI’s ability to discover new knowledge and solutions, rather than simply automating existing processes, is its most transformative aspect. This “AI creativity” is particularly valuable in complex domains where human intuition is insufficient but data is abundant.
  • Turning technology into products is crucial for real-world impact. Gao’s experience at DeepMind taught him that to make a significant impact, cutting-edge AI technology must be transformed into usable products. This insight led to the founding of Phaidra and underscores the importance of productization for AI startups.
  • The “AI readiness journey” is a critical consideration for industrial AI applications. Many potential customers are at different stages of data infrastructure maturity. AI founders must be prepared to help clients progress through stages of sensorization, data storage, cleaning, and accessibility before implementing advanced AI solutions.

Transcript

Contents

Jim Gao: A lot of times, like, when we talk about AI, both in the Valley and elsewhere, I think there’s a conflation between AI and automation, right? Like, AI can absolutely automate things. There’s no doubt about that, right? Especially, like, routine things, right? But I think that honestly undersells the real promise of AI, right? I think the real promise of AI is what—Demis the CEO of DeepMind calls, like, “AI creativity,” right? It’s the ability to acquire knowledge that did not exist before, right? And I, of course, experienced this firsthand. The reason why I’m such a true believer in the technology is because again, I was the expert who helped design the system, but this very AI agent that we created is telling me new things about the system that I didn’t know about before, right? And that’s a very, very powerful feeling.

Sonya Huang: Hi, and welcome to Training Data. Please welcome Jim Gao, founder and CEO of Phaidra. Jim was previously the leader of DeepMind Energy, one of the first and only AlphaGo-style reinforcement learning applications in the wild. DeepMind Energy used reinforcement learning to manage Google’s data centers and drove some staggering metrics, including 40 percent energy savings. We’re excited to ask Jim about reinforcement learning in the industrial world, and to learn more from him about what other real-world applications are poised to be transformed next by deep reinforcement learning.

What is Phaidra?

Sonya Huang: Thank you so much for joining us. Maybe before we get started, we’re going to spend a lot of time today talking about your DeepMind Energy journey, but maybe can you give everyone one or two sentences on your background and what you’re building?

Jim Gao: Yeah, of course. So Phaidra is an AI company, of course. Fundamentally, we are an AI automation company. So what we do is we use a type of AI known as ‘reinforcement learning’ to directly control and operate our customers’ very large, mission-critical industrial facilities. So in practice, these AI agents, they act as virtual plant operators, virtual members of the plant operations team.

“Reinforcement learning plus data centers equals awesome?”

Pat Grady: Let’s go back in time and talk about the journey that led to this journey. And I believe that you once sent an email with the subject line, “Reinforcement learning plus data centers equals awesome.”

Jim Gao: Question mark.

Pat Grady: Question mark. Yes. “Awesome?” Sorry, sorry. Let me say it differently. “Reinforcement learning plus data centers equals awesome?” Can you tell us, who did you send that email to? Why did you send that email? What was on your mind at the time? And then, of course, what did that lead to?

Jim Gao: Yeah, of course. So the reason why there was a question mark is because it was generally an unknown if the combination of reinforcement learning with industrial facilities would actually be awesome. So that was an email that I had sent to a person named Mustafa Suleyman, who would later become my boss at DeepMind. And really, the impetus was something called AlphaGo.

So to set the stage properly, I had been experimenting with machine-learning technologies as part of my 20 percent time at Google—the famed 20 percent time. And it was actually a very specific course, “Introduction to Machine Learning” by Andrew Ng on Coursera, that had just come out. This is back in 2013, mind you. I think I was, like, the second cohort or something. And that class completely changed my life. I taught myself how to program, and just started tinkering around with machine learning on the side. It was a very interesting technology.

Pat Grady: And your background was mechanical engineering and environmental systems type thing?

Jim Gao: Yes, that’s absolutely right. So my responsibility at the time was to one, help Google design and operate their very large data centers. And once these very large data centers, which consume enormous amounts of energy, were built, we of course shifted our focus to operating these complex industrial systems in the most energy-efficient way possible, because they use billions of dollars in electricity, right?

So that was kind of the background. I was already tinkering around with machine-learning technologies on the side to analyze the enormous amounts of data that Google’s data centers were generating. In 2016, AlphaGo came out, and I was one of hundreds of millions of people around the world watching. It was like 3:00 am in the Bay Area or something, and I found it absolutely captivating, and to the point where I sent an email to Moose describing this idea that if DeepMind could beat the smartest, most intelligent people in the world at complex games like Go, then surely we can train these same AI agents to play a very different game that I’m familiar with called “Let’s optimize the PUE, the power usage effectiveness, of Google’s data centers,” right?

So that was the context for that email. And I remember internally the way I pitched it to Google’s leadership—specifically, Joe Kava, who leads Google’s data centers, and Moose—was I showed a picture of a Go board on one side and a video game controller, like an Xbox controller, on the other, and I’m like, “Look, there are objective functions that we’re trying to minimize or maximize. There are concrete knobs and levers, so actions that we can control. There are constraints that we have to stay within. And all of this happens within a very measurable environment. I think reinforcement learning and operating large, complex industrial systems are actually one and the same thing.” So that was the original kernel of insight, I guess, that inspired it all.

The three key ingredients for applied reinforcement learning

Pat Grady: And I know Sonya has accused me of going rogue with some of the questions we ask here. I’m gonna go ahead and go rogue for a minute.

Jim Gao: Already? It’s been like one minute. [laughs]

Pat Grady: We’re gonna come back. I don’t want to skip the story. This is a brief diversion. Bear with me. The three things you mentioned that allowed you to see the parallel between reinforcement learning and control systems, or control theory: objective function, actions, constraints.

Jim Gao: Yeah.

Pat Grady: Are those the three key ingredients for where reinforcement learning can be applied to real-world systems?

Jim Gao: Yes, absolutely. That is 100 percent how we think of it, right? You know, the reinforcement learning systems, they need, like, KPIs to optimize for. They need to know how good or bad an action is, right? They obviously need things to control, and they need to know what constraints they have to stay within. So really what we’re saying is, as long as we can map the problem we’re trying to solve into a reinforcement-learning framework—which really, from a mathematical perspective, means we’re solving a constrained optimization problem, right? If you can define that constrained optimization problem and map it to the underlying data, then it should be solvable using reinforcement learning, right?

So that’s very much the lens through which we look at things at Phaidra as well. And, you know, to take it one step further, you know, we often talk about how reinforcement learning and, you know, controls and optimization are like two wildly different fields historically that have somehow independently converged to the same area, right? Like, they’re two very similar concepts. Well, we’ve been calling them by different names this whole time. So you’ve had, like, almost these independent evolutions, right? Different ways of tackling the same problem. And Phaidra is really kind of the intersection of both of these.
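The framing Gao describes—objective, actions, constraints, all in a measurable environment—can be sketched as code. This is a hypothetical toy, not Phaidra’s system: the class name, the fake pump physics, and the limit values are all invented, purely to show that those three ingredients are enough to define an RL environment.

```python
import random

class PlantEnv:
    """Toy environment: one chilled-water pump whose speed we control.

    Objective : minimize energy use (reward = -energy)
    Action    : pump speed setpoint, as a fraction 0.1..1.0
    Constraint: supply temperature must stay below a hard limit
    """

    TEMP_LIMIT = 24.0  # degrees C — an invented safety constraint

    def __init__(self):
        self.load = 0.5  # fraction of design cooling load (drifts over time)

    def step(self, speed):
        # Invented physics: energy grows with the cube of pump speed,
        # and temperature rises when speed is too low for the current load.
        energy = speed ** 3
        temp = 20.0 + 8.0 * max(0.0, self.load - speed)
        # The environment is dynamic: load drifts a little each step.
        self.load = min(1.0, max(0.1, self.load + random.uniform(-0.05, 0.05)))
        if temp > self.TEMP_LIMIT:
            reward = -1000.0   # heavy penalty: the constraint was violated
        else:
            reward = -energy   # otherwise the objective is to save energy
        return temp, reward

env = PlantEnv()
temp, reward = env.step(0.6)   # try running the pump at 60% speed
```

Anything that fits this shape—a reward to maximize, knobs to turn, limits to respect—is, as Gao puts it, a game an RL agent can learn to play.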

The email to Mustafa

Pat Grady: Okay, let’s get you back onto the story. So you sent the email—sorry for the diversion. So you sent the email to Mustafa, and then what happened?

Jim Gao: Yeah, so we sent the email to Mustafa. Two weeks later, Moose had actually flown out to Mountain View, where I was working at the time on Google campus with a team of DeepMind folks. And we actually started mapping out exactly, like, how reinforcement learning could be used to control and optimize Google’s data centers.

So that actually kicked off the original partnership between Google and DeepMind around the application of reinforcement learning to the data center work. It was very, very fascinating, but most importantly, it’s actually also how I met one of my two other co-founders, right? So Veda was one of the original engineers on the AlphaGo project. So he had gone to—you know, he went to Seoul, South Korea, right? And, you know, he actually got to meet Lee Sedol and Larry Page and—you know, all this fun stuff. And after AlphaGo, he came back to the States—or rather to the UK—and he was wondering, “Well, what is my next big thing going to be?” And I managed to convince Veda, like, “Hey, what if we applied self-learning frameworks like AlphaGo to control and optimize Google’s data centers?” So that’s actually how I started working with my co-founder, Veda.

Sonya Huang: Did people think it was going to work? Or was it like, “This is a crazy moonshot. Let’s just try, but who knows?”

Jim Gao: I didn’t even know if it was going to work. Conceptually, it made sense in my mind, right? I’m like, “Hey, operating a data center is just a different game to play, right?” And there’s all kinds of different games in the industrial world, right? Maybe the game is maximize energy efficiency. Maybe the game is minimize water consumption. Maybe the game is maximize the yield of a factory, right? But there’s all these games that we’re constantly playing, right? So in my mind, it made sense.

But to answer your question directly, no, I had no idea if it was going to work. I still vividly remember to this day when we turned on the AI system and we watched the energy just drop. And it was so surprising for two reasons. Number one, well, we had designed the system. I played a role in designing that very mechanical system, right, that the AI was now controlling and optimizing. So in theory, I’m literally supposed to be the subject matter expert who knows everything about these systems, but the AI is teaching me things that I didn’t know about the system I helped design in the first place, right?

And two, the moves that the AI was making were just very counterintuitive, right? Like, when we looked at the decisions that were coming out, you know, we looked—the plant operators and I, you know, we were sitting in, you know, a giant cornfield in Iowa where, like, Google likes to put its data centers, and we were looking at the decisions, and we thought to ourselves, “There’s no way this is right. Like, this AI sucks. It learned the wrong thing. But we’re here anyway, so let’s try what the AI is saying.” And we tried it and it worked. And we just—we saw the energy plummet. So I think that was kind of when I became a believer in this technology, right? That fundamentally, this technology is creative, it helps us discover new knowledge that didn’t exist before from raw data.

Pat Grady: Was there a performance trade off, or was this just straight up Pareto gain? Like, performance held …?

Jim Gao: That’s a great question. No, it respected exactly the same constraints as the plant operators and engineers had already put in place. So this is pure gain, respecting exactly the same temperature profile, exactly the same constraints around how quickly you can turn on and off a chiller, minimum pump VFD speeds, all that sort of stuff. So this is pure optimization, pure gain. Which I think is one of those crazy things we don’t really expect. Like, usually when you think about energy efficiency, for example, in the world that I come from, people usually think about expensive CapEx, like, “Oh, we gotta rip out the chillers. We gotta buy a bunch of new chillers from Johnson Controls and Trane or whatever, and then we have to install them.” So they’re like hardware efficiency gains, right? But you don’t really think about, like, pure software, like data-driven efficiency gains, right? And I think that’s part of what was surprising for us.

Industrial control systems

Sonya Huang: Can you walk us through the before and after, maybe before, what you all implemented? Was this industrial control systems? Was this manual plant operators turning knobs? How did this work before and then after?

Jim Gao: Yeah, it’s a great question. So let me set the stage for—you know, for folks who are not as familiar with, like, large industrial facilities, right? So they’re very—modern industrial facilities are very, very complex, right? There’s all kinds of machines that people are operating and controlling, right? So I often tell folks to do, like, a simple thought experiment.

So imagine you have just 10 machines you’re controlling. So say they’re like pumps, right? And each one of those machines has 10 possible set point values, so 10 modes associated with it. So think something like 10 percent pump speed, 20 percent pump speed, 30 percent pump speed, et cetera, right? Then in this very simple toy example, you have 10 raised to the 10, or 10 billion different permutations for how you can operate your toy system, right?
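The arithmetic of that thought experiment is easy to check: with n machines and k setpoints each, the number of distinct operating configurations is k raised to the n. (The machine and setpoint counts below come straight from the toy example; the 50-machine figure is just an illustration of how fast this grows.)

```python
machines = 10    # e.g. ten pumps
setpoints = 10   # e.g. 10%, 20%, ..., 100% speed
configurations = setpoints ** machines
print(configurations)  # 10_000_000_000 — ten billion ways to run ten pumps

# Real facilities are far bigger. Even a modest 50 machines with
# 20 setpoints each yields a number no one could enumerate by hand:
print(20 ** 50)
```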

So then the question becomes: well, at any given point, what is the most optimal way of operating your toy system? And by the way, these are dynamic systems, right? So the IT load is changing, the weather is fluctuating, right? You know, the people operating these systems are changing, the pipes are corroding, the heat exchangers are fouling, right? So the point is, these are very complex dynamic systems. Real-world systems have a lot more than 10 machines, and each machine has a lot more than 10 set points. So you can start seeing why technologies like AlphaGo, which managed to navigate immense complexity, are helpful here. It also helps explain why there’s often so much room for optimization in the first place. Because there’s so much complexity, right?

Like, if you think about the total action space, all the possible actions within a modern data center, for example, because of risk aversion, but also because of hard-coded rules and heuristics, we’ve only ever explored, like, 0.00001 percent of all the possible ways that you could operate that system. So then the question becomes: what is in this 99.99999 percent of the action space we’ve never explored? Surely there are more optimal ways of operating the system than what we’ve done historically. So it’s kind of an intuitive explanation, hopefully, of why there can be such large efficiency improvements in the first place that are undiscovered. And the way that we operate these facilities is constrained by a mixture of hard-coded controls logic, right?

So don’t get me wrong, these are automated systems today already, right? They’re just not, you know, automated intelligently, I would argue.

Sonya Huang: Yeah.

Jim Gao: And, you know, there is a healthy mixture of human intuition as well, where we have people like myself or plant operators who are constantly monitoring the system, who are, like, nudging the system, adjusting things, or adjusting the rules for that system, the constraints that the system has to operate within. But fundamentally, human intuition plus hard-coded controls logic is still limited when you talk about this degree of complexity, right?

The results at Google

Sonya Huang: Yeah. Can you talk to us about the key results? So you saw the energy levels drop immediately but, you know, what results were you able to drive for Google?

Jim Gao: Initially, so there’s two types of results, right? You know, for Google in particular. There was results from the pilot. So in 2016, we released—we announced, like, the results of the pilot, right? Now the pilot was done on a couple of data centers, but fundamentally, it was not an autonomous control system, right? So what I mean by this is it was the AI generating recommendations for human experts like myself to manually review and implement. And of course, you know, we didn’t want to jump straight to taking our hands off the steering wheel because it’s a new, novel technology. But also, no one knew at the time, is it even possible to use AI from the cloud to control big-ass infrastructure, right?

So step number one was do the pilot. The AI generated recommendations. That’s where we saw, like, really steep, like, 40 percent energy savings, right? Now that experience taught us, like, “Hey, we think there’s something real over here. We should actually just let the AI control things directly, to get the value automatically.” And also, quite frankly, the plant operators were getting tired of checking their email, like, every 15 minutes, waiting for the AI to tell them what to do and then manually inputting things. They had better things to do, right? So we actually decided—or rather, Moose and Joe decided—like, hey, it’s time to go to a fully automated system, right? This was total uncharted territory. At that point, like, forget about can AI control things, we didn’t even know, like, is it even possible to control machines—huge industrial infrastructure—from the cloud? Because to our knowledge, no one had done it before, right?

Pat Grady: Is it fair to assume that a lot of the hardware, a lot of those machines are things that Google built from scratch? Or does Google use a decent amount of commercially available data centers?

Jim Gao: It’s a mixture of both, right? So, you know, obviously, Google does a lot of things in house, but it doesn’t manufacture chillers and that sort of hardware. So Google does buy off-the-shelf hardware, but there’s a lot of modifications and Google-specific things that we did, right? Like, for example, programming some of our own PLCs or making modifications to the building management system. So, like, the software control layer looked quite different. That was done in house.

But I still remember, like, very vividly, actually, to this day: Veda and I, we were standing in a large 90-megawatt data center. So it was a fairly large data center. And Veda is typing away on his MacBook, right? He submits the PR, it gets merged, and all of a sudden, this huge honking chiller that is the size of a bus, that we’re standing right next to, roars to life. And as it’s coming to life, right, like, the ground is shaking vigorously and we’re like, “Oh, my God! With a few keystrokes on a MacBook, we just turned on this enormous chiller.” And that was the very first data point to us. Like, yes, it is possible to control things from the cloud. So now the next question is: how do we control things intelligently from the cloud, right? You know, where all the compute resources live.

AI creativity

Sonya Huang: What were your biggest takeaways from that experience? You mentioned the creativity of the machine. Any other big takeaways or learnings?

Jim Gao: Yeah. So the creativity is absolutely a big one. I think just to elaborate on that briefly, a lot of times when we talk about AI, both in the Valley and elsewhere, I think there’s a conflation between AI and automation, right? Like, AI can absolutely automate things. There’s no doubt about that, right? Especially like routine things, right? But I think that honestly undersells the real promise of AI, right? I think the real promise of AI is what—Demis the CEO of DeepMind calls, like, “AI creativity,” right? It’s the ability to acquire knowledge that did not exist before.

And I, of course, experienced this firsthand. The reason why I’m such a true believer in the technology is because again, I was the expert who helped design the system, but this very AI agent that we created is telling me new things about the system that I didn’t know about before, right? And that’s a very, very powerful feeling. It’s kind of like when, you know, if you think back to AlphaGo, right? Like, Lee Sedol was the best in his field at Go. He was the world champion for a decade. He was at the top, and his Elo rating was just something outrageous. It was like 2800 or something. It was outrageously high, and—but it had flatlined, right? So for a full decade, his Elo rating was the same, and there was no one to challenge him because he was already at the top. So once he hit the top, he just kind of plateaued.

And then after AlphaGo happened—and he actually got to play against AlphaGo privately a few more times because DeepMind had let him continue interacting with the system—what happened? For the first time in a decade, his Elo rating started climbing.

Pat Grady: Huh!

Jim Gao: And this is what I mean when I say that I think the real power of AI is helping us discover knowledge that we didn’t necessarily know about before. And where you’re going to see the most gain from that, it’s not going to be in routine automation things like call centers or whatever, right? It’s going to be, I think, in very, very complex areas, right? Like, areas where human intuition is insufficient because of immense complexity, but that is yet underpinned by data.

So that’s why you’re seeing such things like protein folding, for example. I mean, yeah, that’s fucking extraordinary, right? And, you know, it’s those areas of just, like, massive permutational complexity underpinned by data. That’s where I think we’re going to see some of the most interesting companies and products. So that was a rather long tangent. But so one, creativity is something that I learned. The other one lesson that my co-founders and I learned is really around, you know, if you want real impact, you got to turn the technology into a product.

And this is actually the core reason why we decided to leave DeepMind and start Phaidra, right? Like, over and over again, we were seeing that the technologies we were helping to develop at DeepMind were just extraordinary, right? I mean, they were achieving crazy things like with protein folding, but the problem is, you know, in order for the technology to make the most impact, it has to get into the real world. People have to actually use it, right? And that fundamentally means we’re talking about a product. Turning a technology into a product is like—I mean, you guys would know much better than myself. It’s like a hundred-fold, a thousand-fold more work. And that, for us, led us to the conclusion that, like, hey, it’s time to leave. It’s time to actually start a company that creates these intelligent virtual plant operators, these intelligent AI agents, as a real product.

The AI readiness journey

Pat Grady: Let’s talk more about that. For what you’re building now, how much of what you learned at Google DeepMind sort of translates directly into what you’re doing now? How much is new? Because the environments are different, the customers are different. There’s something different about it.

Jim Gao: I think the most important thing that we learned from our Google DeepMind experience is that it’s possible. Like, this is not a crazy—and that isn’t to, like, downplay what we learned. It’s actually a huge thing, right? We learned that it is, in fact, possible to use closed-loop learning systems like reinforcement learning, to drive very large improvements in complex industrial facilities. It hadn’t been done before, to our knowledge, right? And that was a massive proof point. I think the problem, though, is that the real world is quite diverse. Every single customer is diverse, but especially when you talk about industrial facilities, like, every industrial facility is a snowflake, right?

So for us, I mean, the learnings have just been, like, massive since we left Google and DeepMind, right? Because every time we onboard a new customer, we’re learning something new about how equipment is connected, or some product gap that we didn’t know about before that needs to be fixed, or new ways that data can break. At this point, I can tell you, like, a hundred different ways that, you know, data associated with mission-critical cooling systems can break. Probably not the most interesting party topic for most folks but, you know, I personally find it quite interesting. But yeah, there’s certainly been quite a lot of learnings in that regard.

Sonya Huang: Are the folks you’re talking to, are they ready to let the technology take over the system and, you know, let the cooling system just start going?

Jim Gao: Yeah. I mean, yes and no, right? And that actually gets back to your early question, Pat, as well, about the specific learnings from Google. I mean, when I look back, I think what we helped pioneer at Google and DeepMind could only have been done right at a company like Google. The reason why I say that is because Google is a very forward-leaning company.

Pat Grady: Yeah.

Jim Gao: But also, one of the things I’ve learned too is that Google is absolutely an anomaly when it comes to, like, how much data it has, and the pristine quality of the data and the ease of access of the data. Google is fundamentally a data analytics company, right? And as such, it invested all this infrastructure in high-quality, high-availability data on which you can do things like real-time intelligence applications, like what we were doing, and there are many other examples of this within Google and DeepMind.

Having left the nest, one of our rude awakenings was Google is definitely an anomaly. And I mean, gosh, everyone is in various stages of their AI journey, right? Like, Google is certainly on one extreme. We have customers we’ve encountered where, like, you know, forget about real-time intelligence, they’re not capturing the data in the first place, right? Or, you know, they may be sensorizing. In the industries we work with like pharmaceuticals and district cooling and especially data centers, almost always the customer is sensorized, right? Because these are billion dollar facilities, so of course it makes sense to throw a million dollars worth of sensors on it. But that doesn’t—just because you sensorized doesn’t mean that you’re storing the data, right?

A lot of customers of ours aren’t necessarily storing the data beyond, like, 90 days or six months or a year or whatever, right? And—you know, and they might cite some reasons like, “Well, it’s costly to store the data,” right? Or like, well, we’re not—or the more commonly, “We’re not using the data for anything,” which is a true statement, right? A lot of our industrial customers, they aren’t using the data. It’s more like a forensics thing where if something goes wrong, then we go back and we look at the logs to see what happened, right?

Pat Grady: Yeah.

Jim Gao: And then, you know, so if we think about it like Maslow’s hierarchy of data needs or something, you got your sensorization, you got your storage. Then you have to invest in making sure that the data is cleaned, right? There’s a lot of effort, as we all know here, you know, around making sure the data is actually cleaned and usable, right? And that requires you to know what bad data looks like, what good data looks like, and how to convert bad data into good data so it’s actually useful.

And then once you have clean data, you also need to make it accessible in a streaming and batch historical manner, right? So there’s different gradients, I guess, is what I’m trying to say of AI readiness. The customers whom we work with are all over this—are all over the spectrum. But, you know, like Phaidra today is at the point where we are autonomously controlling data centers for our customers.

Pat Grady: So I was going to ask you if the basic workflow or the basic loop is: data goes in—which is a lot of what you just talked about, getting the data into the system. Step one. Step two, decision is made.

Jim Gao: Yep.

Pat Grady: Step three, action is taken as a result of the decision that was made. Step four, action is evaluated against the objective function of the system, and then the loop continues. So the front end of that process, which is data goes in, sounds like there’s a lot of work to get the real world data ready to go.

Jim Gao: We call it “the AI readiness journey,” right? So, like, if you think about our work with customers, there is a chunk of upfront work where it’s just like, hey, we’re going to get your facility—we’re going to get you and your facility AI ready.
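The four-step loop Grady outlines above maps directly onto a supervisory control loop. The skeleton below is a hypothetical sketch, not Phaidra’s code: `StubEnv`, `StubAgent`, and the method names are all invented stand-ins, there only to make the observe-decide-act-evaluate cycle concrete.

```python
class StubEnv:
    """Minimal stand-in for a plant: reward peaks when the action is 0.7."""

    def read_sensors(self):
        return 0.0  # a single dummy observation

    def apply(self, action):
        # Returns (new observation, reward); reward is better nearer 0.7.
        return action, -abs(action - 0.7)


class StubAgent:
    """Greedy stand-in for an RL agent: keeps nudging its action upward
    for as long as the observed reward keeps improving."""

    def __init__(self):
        self.action = 0.1
        self.best = float("-inf")

    def choose_action(self, obs):
        return self.action

    def update(self, obs, action, reward):
        if reward > self.best:  # improvement: keep exploring upward
            self.best = reward
            self.action = min(1.0, action + 0.1)


def run_control_loop(env, agent, steps):
    """The closed loop: observe -> decide -> act -> evaluate, repeated."""
    history = []
    obs = env.read_sensors()               # 1. data goes in
    for _ in range(steps):
        action = agent.choose_action(obs)  # 2. decision is made
        obs, reward = env.apply(action)    # 3. action is taken
        agent.update(obs, action, reward)  # 4. evaluated against objective
        history.append(reward)
    return history


rewards = run_control_loop(StubEnv(), StubAgent(), 10)
# rewards trend upward as the agent homes in on the good setpoint
```

The “AI readiness journey” Gao describes is, in effect, everything required to make step 1 of this loop reliable before the rest of it can run.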

Self-improving infrastructure

Pat Grady: How about on the action-is-taken piece of that? Are the systems ready to be controlled by some sort of autonomous system, or is there work that needs to happen there, too?

Jim Gao: Yeah, it’s a really good question. Yes and no, right? And I’ll elaborate on what I mean by that control systems today were, like, designed in, like, the 1980s.

Pat Grady: [laughs] So was I.

Jim Gao: [laughs] Well, me too, for that matter.

Pat Grady: There we go.

Jim Gao: But, you know, like, what I mean by that is, you know, that was the—the ’70s and ’80s was the third industrial revolution, right? So with that was the shift from analog to digital and the advent of the first automation systems, right? In order to automate, you fundamentally first have to sensorize. But these are simple automation systems. The fourth industrial revolution is—we’re biased, but Phaidra, we think the fourth industrial revolution means intelligent infrastructure, infrastructure that can operate itself and fundamentally get better over time at doing so, self-improving infrastructure.

But right now, we’re shoehorning intelligence into systems from the third industrial revolution, right? So they certainly weren’t designed for this. But what we do instead is, most importantly, we ride on top of the existing control system. So there is a hard-coded layer of rules and heuristics. So millions of lines of if-then statements programmed into what we would typically call the BMS, the building management system, or a SCADA system, that defines how the facility should operate.

The problem with hard-coded systems is that because they’re hard coded, they operate the same way today as they did yesterday, or a year ago, or five years ago, or more like ten years ago, because people very rarely go into the back-end programming to update that control logic. Now what Phaidra does is insert a new cloud intelligence layer at the very top of the control stack. We don’t introduce any hardware, we don’t introduce any new sensorization, right? We actually ride on top of the existing control stack. That’s really, really critical.

You can think of it as a general in the battlefield. The general has a global view of everything that’s happening across the system, and it’s issuing command signals to the troops on the ground for actual execution. So the AI is looking, in our case, at 10,000 trends a minute in real time, right? And it’s issuing decisions like which pumps to turn on and what their pump speeds should be to the local BMS system and/or the PLCs for automatic implementation and execution.

So that’s why I said it’s a mixture of yes and no. Were they designed for this in the first place? No. There is a lot of work that we have to do with our customers to get their systems to accept this type of external intelligence. There’s a lot of work that we do in defining the safety nets and guardrails to ensure that the AI can’t do bad things to the customer’s system, right? But fundamentally, we are still riding on top of the existing controls architecture. And to be clear, we always want to do that. You don’t want AI controlling things like how fast a valve opens and shuts. Like, that’s a terrible application of AI. A hard-coded valve will do great there. So if you were to look at the overall system, like, 90 percent of it is fine with just hard-coded rules and heuristics, because it’s granular controls logic that doesn’t need non-deterministic, crazy-powerful intelligence behind it, right? But it’s the higher-level thinking and reasoning where you want the AI. It’s the global optimization.
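The guardrail idea Jim describes—an AI that only recommends, with a hard-coded layer deciding what actually gets executed—can be sketched as a simple clamping function. This is purely illustrative: the bounds, ramp limits, and names below are invented for the example and are not Phaidra’s actual safety logic or BMS interface.

```python
# Illustrative guardrail layer: the AI *proposes* setpoints; a hard-coded
# layer of absolute bounds and ramp-rate limits (standing in for the
# BMS/SCADA safety logic) decides what is actually executed.
# All limits and signal names here are made up for illustration.

SAFE_BOUNDS = {"pump_speed_pct": (20.0, 100.0),   # never below minimum flow
               "chw_supply_temp_c": (5.0, 12.0)}
MAX_STEP = {"pump_speed_pct": 10.0,               # max change per control cycle
            "chw_supply_temp_c": 1.0}

def guard(current, proposed):
    """Clamp an AI-proposed action to hard-coded safe bounds and ramp rates."""
    executed = {}
    for key, value in proposed.items():
        lo, hi = SAFE_BOUNDS[key]
        value = max(lo, min(hi, value))                    # absolute bounds
        step = MAX_STEP[key]
        prev = current[key]
        value = max(prev - step, min(prev + step, value))  # ramp-rate limit
        executed[key] = value
    return executed

current = {"pump_speed_pct": 60.0, "chw_supply_temp_c": 7.0}
proposal = {"pump_speed_pct": 5.0, "chw_supply_temp_c": 30.0}  # unsafe ask
print(guard(current, proposal))
# pump clamped to 50.0 by the ramp limit; temp clamped to 8.0 by the ramp limit
```

The design point is that the guardrails are deterministic and sit between the learned policy and the plant, so the AI can never command anything outside the envelope the engineers defined.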

The results for Phaidra’s customers

Sonya Huang: Have you seen any of your customers at Phaidra kind of get the DeepMind order of magnitude results?

Jim Gao: So I’m glad you asked. So one of the things we’re really excited about—actually, just this week, Merck Pharmaceuticals became our first public customer. So we’re pretty proud of that. We’ve actually been working with them for two years now. They’ve been using Phaidra, the full autonomous AI system, for over two years to control a massive 500-acre vaccine manufacturing facility in Pennsylvania.

This is the definition of mission-critical complexity. They’ve got 62,000 tons of cooling: four very large plants interconnected with each other across 500 acres of manufacturing space, right? Hundreds of machines interacting with each other. Like, this is where the AI really shines. And yeah, the results that we saw with them were quite strong, right? Like, you know, I think Merck actually just shared some data at a conference we were at which showed 16 percent energy savings when we first, you know, trialed the system at one of their chiller plants.

But, you know, what I always tell our customers is don’t over-index on the magnitude of the energy savings initially. Like, we honestly have no idea what the energy savings are going to be, or the reliability improvements are going to be ahead of time, right? Because these are non-deterministic systems. And by definition, if I could tell you what things you’re not doing in order to get energy savings, like, why do you need the AI in the first place, right?

So, you know, but what we do know is that the unique thing about this technology, about Phaidra and about reinforcement learning in particular, is that it is a closed-loop system. It is a self-learning system. It can learn because it’s able to take actions and it can measure the impact of its actions against its predictions, right? And that means it gets better over time. So maybe we start off at one percent energy savings. Maybe we start off at five percent, maybe we start off at ten percent, right? But fundamentally, it will learn and it will get better over time. Now not infinitely, because there still are laws of physics, right? But it will get better over time, and once it reaches optimal, it will stay at optimal.

That’s super important because with hard-coded rules and heuristics, when you tune the system as you’re commissioning it, so when you’re turning it on for the first time, that system today no longer performs the same way that it did 10 years ago when you first, you know, commissioned that system, right? Because the pipes have corroded and the heat exchangers have fouled and the cooling towers are scaled or whatever, or you ripped out equipment. But the promise of an adaptive self-learning system is that it will change with you as your customers are, for example, now putting in a bunch of H100 and soon H200 GPUs, right? Well, the system will learn and adapt on the fly with you, so it can stay optimal.
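The self-learning loop Jim describes—predict the impact of an action, measure what actually happened, and correct the model by the prediction error—can be sketched with a few lines of online regression. This is a hypothetical toy: the “plant coefficient” and learning rate are invented, and a real system would learn a far richer model, but the measure-against-prediction feedback is the same idea.

```python
# Sketch of "measure the impact of actions against predictions": an online
# model predicts the energy effect of an action, compares it with what was
# actually measured, and nudges its parameter by the prediction error.
# The true plant response is hidden from the model; it must be learned.

TRUE_COEFF = -3.0   # unknown to the model: each unit of action saves 3 kW

coeff = 0.0         # model's current estimate of the plant response
lr = 0.05           # learning rate
for step in range(2000):
    action = 1.0
    predicted = coeff * action
    measured = TRUE_COEFF * action      # the plant's actual response
    error = measured - predicted        # prediction error drives learning
    coeff += lr * error * action        # simple online regression update

print(round(coeff, 2))  # converges to -3.0
```

Because the update is driven by measured outcomes rather than fixed rules, the same loop keeps correcting itself as the plant drifts—corroded pipes, fouled heat exchangers, new GPU load—which is exactly what a hard-coded system cannot do.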

Other real world applications of RL

Sonya Huang: I’d love to transition for a minute beyond industrial control systems and get your opinion on—I mean, you were one of the first and maybe one of the only real world applications of reinforcement learning.

Jim Gao: Yeah, we’re definitely not the only.

Sonya Huang: Not the only. I’d love to get your thoughts on the “not the only.” So I mean, what else is—what else are people doing with reinforcement learning in the wild today?

Jim Gao: Yeah, absolutely. So unfortunately, my knowledge is very heavily indexed on the Google and DeepMind space because that’s where we spent so much time, right? But even within Google and DeepMind, there were other very cool reinforcement learning applications. For example, the team that sat right next to us used RL systems to help prolong battery life. So you may notice that your Android phone’s battery life has been increasing. And yes, there are hardware changes associated with that, but there are also intelligent software changes behind the scenes that proactively manage your battery life.

There were reinforcement-learning systems for—you know, for, like, YouTube video recommendations, for example, right? And a whole host of other things. So absolutely, there are, you know, reinforcement-learning applications in the wild. To your point, though, I wouldn’t say that there are a whole lot of them. And I think that it’s not a coincidence that you tend to see them at more of, like, the big tech companies where they’ve already invested in the data infrastructure, so the underlying infrastructure, so that they can benefit from this technology. Like, outside of the biggest tech companies, there are very few applications of, like, real world reinforcement learning. Like, in production, at least.

Pat Grady: Yeah.

Sonya Huang: And do you think that’s because of kind of low applicability? You know, you started this podcast by talking about the necessary ingredients for RL to be a good solution. Do you think it’s just there’s not that many applications where RL is a good solution, or do you think it’s just tech readiness?

Jim Gao: No, absolutely not. No, I think the applications for reinforcement learning are freaking massive, and Phaidra is one of many examples. And we’re just scratching the surface as an industry of what we can do with this technology, right? Like, fundamentally, the power of the technology is that it is a self-learning system. AlphaGo and its successor, AlphaZero, taught itself to become the best in the world at Go, chess and shogi. Three vastly different games, same learning framework, right? And it taught itself.

So I think there’s a lot of very interesting application areas. I think the data infrastructure is missing in a lot of them, but just to, like, you know, like, list off a few, right? I mean, obviously, you know, we’ve already talked about, like, the protein folding, right?

Sonya Huang: Yep.

Jim Gao: But, you know, there’s an entire untapped field around logistics. Like, that is such a gnarly computational challenge, right? You know, when you start looking at operations research, operations research underlies trillions of dollars’ worth of, you know, industrial activity, right? Not just industrial, but all sorts of other activities, right? Like shipping, airplanes, FedEx driving routes—these are all applications of operations research. Grid balancing, right? I mean, I think grid balancing is probably the single most important way that AI can fight climate change. I generally believe that is where AI will have the most impact on climate change.

AI load balancing on the grid

Pat Grady: If you had to guess, first time you deployed this into a data center at Google, you saw 40 percent energy savings. If we had just killer AI doing load balancing on the grid, what sort of energy savings do you think we could see?

Jim Gao: Oh, my gosh. I mean, that would be wild. I think it’s not so much about the magnitude of the energy savings per se, but rather about the potential cost savings, because then you could start shifting your loads around to when it’s most cost effective to do compute. Or if you had CO2 signals, you could start scheduling loads around when it’s the least carbon intensive to do your non-latency sensitive workloads, which I think Google has already been experimenting a bit with.

But honestly, I think it’s really more around the global system-level optimization, right? We have to keep in mind that data centers already are, and increasingly will be, just massive, massive load banks, right? Like, data centers were 1.5 to 2 percent of US energy consumption. That’s about to increase to four percent, right? Like, this year, I think. And then by the end of the decade, it’s projected to get up to, like, nine percent of the US. And Ireland, right? Right now, 22 percent of Ireland’s national electricity consumption goes to data centers alone. The International Energy Agency predicts that that’s going to increase to 37 percent by the end of the decade, right? Like, just mind-boggling numbers.

But the point, the reason why I mention this is because these are massive load banks on the grid, right? There is an actual opportunity if you could somehow coordinate the data centers together to help balance the grid. And that is such a gnarly, gnarly challenge. And it is what is holding the energy transition back, because as more and more renewable energy starts coming onto the grid, the supply side becomes increasingly stochastic. We used to have this perfectly deterministic system, at least on the supply side, where a grid operator can call someone who operates a coal-fired power plant and say, “Hey, ramp up or down your power production.” That’s deterministic. But now …

Pat Grady: Yeah, ramp up or down the sun.

Jim Gao: Yeah, totally, right? So now as you get more and more renewable penetration coming onto the grid, you have a somewhat non-deterministic demand side—it’s somewhat predictable, but there are definitely spikes—and a massively non-deterministic supply side. And what is the problem with that? The problem is that, you know, because we do not know how much energy we’re going to generate, you now have all this wasted excess capacity in reserve, right?

So there is a concept of spinning reserves on the grid, where there are peaker plants, you know, like giant natural gas turbines, that as we speak are just sitting there idling, just like your car idles at a stoplight, in case we need that power as a buffer against the uncertainty. And as renewable penetration increases, ironically, the amount of buffer you need also increases. If you look at Germany’s failed energy transition, they decommissioned their nuclear baseload while ramping up their renewable energy penetration, right? Good motivation on the surface, although I personally think we need a lot more nuclear on the grid. But that’s a whole other topic.

But it ended up backfiring, right? Because Germany actually ended up needing to build more fossil fuel power plants to buffer against all the renewable energy that was coming onto the grid now. So that’s why I think AI for grid balancing, we need it. And it probably is the single most impactful thing that AI can do to solve climate change.

The intersection of transformers and RL

Pat Grady: Let’s talk a bit about some of the limitations of reinforcement learning, and also where you see it intersecting with transformers.

Jim Gao: Yeah. So I should state that first of all, my co-founder, Veda, is by far the expert on this topic. He knows way, way more than me. You know, I’m just a simple mechanical engineer who happened to learn a bit about AI. I think the intersection is really interesting. Like, very, very potentially complementary strengths and weaknesses is how I would describe it, right? It’s certainly not mutually exclusive, right? Like, what I mean by that is—and I was just talking with Veda about this earlier, right?

So Veda will tell you that, like, you know, all intelligent systems have certain hallmarks of intelligence, so that we can say they are intelligent. They need to deeply understand the world, the environment, that—you know, that they’re modeling, right? There needs to be some element of memory, right? So, like, remembering things. And very importantly, there needs to be the ability to plan and reason, right? Very interlinked. Transformers are clearly quite good at the first one, in the sense that they can take in huge amounts of structured and unstructured data to learn quite good models of the world. But it is limited in the sense that these models are primarily through correlation and not causation, right?

That makes it challenging, at least for what Phaidra does, because we work with real-world systems; we have to have causality. Like, we have to understand why the AI is doing certain things, right? Like, why is it not doing other things? How do we force a certain behavior that we know has to exist in our system, right? These are mission-critical systems, is what I’m trying to say. There has to be causality. So that’s where the limitation is. With reinforcement learning systems, I mean, the power of RL-based systems is very much in the planning and reasoning part, where you’re able to plan long trajectories of actions and learn really, you know, like, intricate policies, right? I think where it gets really interesting is the intersection, where transformer architectures could potentially learn models of the world, or value functions, that the AI can then learn policies against. But without that causality piece, it’s going to be quite tricky to cut it over into industrial control applications like what Phaidra does, at least.

Lightning round

Pat Grady: Should we move into a rapid fire round? What are you most excited about in the world of AI in the next five or ten years?

Jim Gao: So in the very near future, I’m excited about just the absolute explosion of AI applications. It feels kind of like a Cambrian explosion of sorts, where there’s, like, a primordial soup, and all these AI startups and services are all of a sudden springing up, right? So it’s quite exciting. But when I look at where that activity is happening, where that research and that entrepreneurial activity is happening, it’s very clearly focused around LLMs, and even more specifically around, like, natural language interactions, text-based interactions, right? And that certainly is a large part of the economy. It is very exciting. But in the five-to-ten-year frame, to answer your question, I’m most excited about when we can start getting some of this technology into real-world physical applications. It’s the intersection of this technology with the real-world infrastructure that we live in: big industrial systems, cars, homes, like, you know, physical things. I think that’s where we’re going to see some really interesting things in the future.

Sonya Huang: Who do you admire most in the field of AI?

Jim Gao: Gosh, a tricky question. I admire a lot of people. [laughs]

Pat Grady: You’ve worked with some of the greats, and so it’s going to be hard.

Jim Gao: Yeah. I mean, of course, my mind jumps immediately to a lot of the people whom I’ve worked with, right? You know, I admire very much the DeepMind researchers whom we worked very closely with. I often tell people working at DeepMind, it’s kind of like being a kid in a candy shop if you’re a technologist like myself. It’s like you get to see years in the future all this cool technology on the forefront, and then it just makes your head spin as to all the possible applications of that technology.

I admire Moose a lot, my old boss, who has, of course, since moved over to Microsoft. I was saying earlier, like, one of the biggest lessons I learned and my co-founders learned at DeepMind, is that making a technology like what we did for Google’s data centers versus making a product like what we’re doing at Phaidra? Totally different things. Wildly, wildly different things. And there are few people as good in the world as Moose at, like, taking technologies and turning them into real products.

I remember my co-founder Katie and I were grabbing drinks with Moose at some random dive bar in Seattle, right? He happened to be up there. And this is before OpenAI released ChatGPT and just ushered in a world of craziness. And, you know, he was raving to Katie and me about the applications of LLMs and, you know, how powerful these systems are. And we were like, “Okay, Moose, but let us tell you about Phaidra.” Like, we had no idea what he was talking about, right? [laughs] But I mean, he was prescient. Like, he saw this ages in advance, right? You know, the technology that was being developed and the capabilities that it would usher in. And then, of course, he went off and he started Inflection. So I admire him a lot for the ability to turn technology into actual products.

Pat Grady: All right, last question. You are building a very ambitious business, very hard business to build. And you’ve been at it for a while in the context of the new wave of AI startups. What advice do you have for other founders, or would-be founders who are trying to build companies here?

Jim Gao: I mean, I’m not sure I’m even qualified because, one, I hope you’ll ask again in one or two years when hopefully Phaidra is wildly successful. We certainly didn’t choose the easy path by focusing on real-world infrastructure. [laughs] Honestly, my mind gravitates more towards, like, would-be founders—people like my co-founders and me who were thinking about leaving to start something new, right?

And my advice there is twofold. One, make sure you have co-founders. Like, my God, it is so stressful. [laughs] There are so many things that can go wrong, and you’re constantly on this emotional roller coaster of ups and downs. Having co-founders to lean on, both for the workload but also just for the emotional support and mental sanity, is so important, right?

Advice number two would be the risk is less than you think it is. I’m biased, but I think people should take the jump, right? A lot of times when I talk with my former colleagues and other people who are thinking about making the jump, they’ll say things like, “Well, but I’ve got a nice job over here, you know? You know, they pay me well. They’re—you know, I’m on a rising trajectory.” But my point to these folks is always like, no matter, you know, like, how valuable and successful, you know, you are today in the organization, like, you will only be more valuable and successful for that organization or other organizations or to society in general if you learn new skill sets, right?

Like, take the plunge, go out, start a company, learn what it’s like to turn technologies into products, right? And if that fails for whatever reason—hopefully it doesn’t, right? But if you fail, then the Googles, the Microsofts of the world, they will only want to hire you back at an even higher premium. So why not take the plunge, right? And obviously much smarter people than me have said this for a really long time, but the best investment you can make is in yourself: up-leveling yourself, learning new skill sets. That’s always the best thing you can do.

Sonya Huang: Thank you, Jim. This is a fascinating conversation.

Jim Gao: Yeah. Thank you very much for having me. I really enjoyed it.

Pat Grady: Thank you.

Mentioned in this episode:

  • Mustafa Suleyman: Co-founder of DeepMind and Inflection AI and currently CEO of Microsoft AI, known to his friends as “Moose”
  • Joe Kava: Google VP of data centers to whom Jim sent his initial email pitching the idea that would eventually become Phaidra
  • Constrained optimization: the class of problem that reinforcement learning can be applied to in real world systems 
  • Vedavyas Panneershelvam: co-founder and CTO of Phaidra; one of the original engineers on the AlphaGo project
  • Katie Hoffman: co-founder, President and COO of Phaidra 
  • Demis Hassabis: CEO of DeepMind