Roblox’s Stef Corazza: How AI is Empowering Video Game Creators
Training Data: Ep28
Stef Corazza leads generative AI development at Roblox after previously building Adobe’s 3D and AR platforms. His technical expertise, combined with Roblox’s unique relationship with its users, has led to the infusion of AI into its creation tools. Roblox has assembled the world’s largest multimodal dataset. Stef previews the Roblox Assistant and the company’s new 3D foundation model, while emphasizing the importance of maintaining positive experiences and civility on the platform.
Summary
Roblox Studio head Stef Corazza brings deep expertise in computer vision and machine learning to his role leading generative AI development at one of gaming’s largest platforms. In this episode, he outlines how AI is transforming game creation through a unique approach that emphasizes empowering creators while preserving human creativity and maintaining platform safety.
- The community comes first in game development. Roblox’s approach to AI centers on a reciprocal relationship with its creator community—they provide training data and in return receive free access to sophisticated AI tools that enhance their creative capabilities. This model has gained overwhelming community support while generating one of the world’s largest multimodal datasets for game development.
- AI should remove friction, not replace creativity. Rather than attempting to automate game creation entirely, Roblox positions AI as a “dishwasher”—handling tedious tasks like bulk modifications while preserving creators’ essential creative vision. This philosophy has driven a 180% increase in code creation and a 60% increase in material generation among creators using AI tools.
- Constrained creativity enables wider participation. By providing AI tools that work within defined boundaries—like customizing vehicles or characters rather than entire worlds—Roblox makes game development accessible to those with creative vision but limited technical skills. This “constrained creativity” approach maintains quality while democratizing creation.
- Neural rendering will transform game visuals. Within five years, Stef predicts AI-powered neural rendering will enable instant visual restyling of games without changing underlying assets or physics. This technology will allow developers and potentially players to dramatically alter a game’s appearance through simple text descriptions or reference images.
- The future of AI is 3D-native. While 2D AI has made remarkable progress, achieving true temporal and spatial coherence in games and other applications requires training on 3D data. Roblox’s massive dataset of 3D assets, physics interactions and player behaviors positions them to develop models that truly understand game development rather than just generating assets.
Transcript
Stef Corazza: We have this unique synergy and collaboration with the community, where basically we told the community, “Hey, give us access to your data to train AI. We are going to make the best AI companion, the best AI assistant that we can. And that assistant goes back into Studio and is free, right? So we’re not making money off your data, we’re actually helping you create more.” And so we found the overwhelming majority of the creators in our community gave us permission to use their data for training. And that’s why I was mentioning earlier, we have not only one of the largest data sets in the world, but also the most multimodal.
Konstantine Buhler: Welcome to Training Data. Today, we have an amazing guest in Stef Corazza. He leads generative AI at Roblox, one of the largest gaming platforms on the planet. Roblox has 79 million daily active users, and they have a creator economy that pays out hundreds of millions to creators. Because of this, Roblox is uniquely positioned to transform how games are made and played with AI.
Stef is a founder at heart. He started Mixamo, a pioneering AI company for character animation that was acquired by Adobe. Roblox brought him in to revolutionize how games are made with AI on their own platform. Under Stef’s leadership, Roblox is pushing the boundaries of AI in gaming—from their groundbreaking AI assistant, which lets you generate games with simple natural language, all the way through their 3D foundation model technologies. Welcome, Stef, to Training Data.
The scale of Roblox
Konstantine Buhler: Today we get to talk about games. In particular, we’re talking about AI at Roblox. Roblox is one of the largest gaming universes on the planet. And I said ‘universe’ instead of ‘platform.’ ‘Platform’ is a pretty overused term, and really, Roblox has created much more than a platform. It’s a place where you can create, it’s a place where you can play, where you can meet new friends. And it’s all done online virtually.
Now we are technologists, but we’re also investors. And I want to spend a moment on how remarkable of a business Roblox is. It’s obviously an amazing technology—and we’ll get there. But the business is exceptional. You’ve got a $29 billion market-cap company with over $3.5 billion of run-rate revenue. An amazing stat here is they’re really building an economy. It’s not just a selfish company, it’s a company that has actually produced over $800 million for their creators, for the people actually building on Roblox, over the course of a year. 70 billion hours of gameplay on Roblox per year—70 billion! And they’re able to deliver cash operating profit. So a $600 million operating profit.
That’s because they’ve catered to a huge audience: 79 million daily active users. Over time, they’ve actually shifted up in age, and 46 million of those daily active users are now over the age of 13. 3.8 million daily active voice users. And the numbers just continue to grow on this amazing business.
Stef, we’re so excited to have you here today. We could not have asked for a better person in the category of AI gaming. From machine learning to computer vision to biomedical engineering, you have a pretty impressive technical background, an amazing journey from what was initially biomedical to generative AI efforts. And we were hoping that you could kick it off by just telling us how you got here. How did you get to become the head of generative AI when you started off as an engineer in a very different field many years ago?
Stef Corazza: Thank you for the introduction. I think Dave is very kind, maybe slightly overstating, but I find this flattering. And thank you for the introduction about Roblox. It’s really like an amazing example of a compounding effect. And every day we are mesmerized ourselves about the success. My journey started at Stanford about 20 years ago when I came from Italy as part of my exchange program. And I was focusing on computer vision, machine learning for the measurement of human motion.
And so you have basically two markets there: one is the biomedical that you mentioned, and the other one is animation. And so we were basically at the boundary between the two, and at some point I realized there was a much bigger opportunity in the animation space. And so I basically worked on, like, video-based motion capture and animation creation solutions that then led to the spinoff of Mixamo, the company that I started in 2008, which later got acquired by Adobe in 2015 and is still today one of the most used machine learning services in the industry to rig and animate characters.
And so after a few great years at Adobe, actually seven of them, I helped build the 3D offering there, including products that my team built like Adobe Stager and Adobe Aero for AR. And then we acquired Allegorithmic, and so we built the full 3D portfolio. And then after that I was really passionate about gen AI. I was working with the CTO of Adobe trying to figure out completely new ways to generate things. And that’s when basically Roblox reached out, and I had, you know, breakfast a few times with Dave, and we talked about it. And I really realized the potential that gen AI had in the gaming space, and specifically in Roblox, being a platform where so much data exists—you know, 15 million experiences are played every year. I found Roblox to be the place where this could really blossom with a massive impact worldwide.
Konstantine Buhler: So before we get into the AI components, can you tell us a little bit more about the platform? You’re an engineer by background, and so scale is incredibly exciting to you, I would imagine. The sheer scale at Roblox is pretty mind-boggling, I think, to anyone, especially users. There are obviously platforms that have more daily actives—the Facebooks of the world. There are platforms that have more monthly actives, et cetera. But it’s very rare to have the amount of bandwidth and compute and graphics and everything in one place. Maybe you can share a little bit about what that means to have 15 million daily sessions and 79 million daily actives, and a little bit about the stats and what that means technically.
Stef Corazza: Yeah, we always like to talk about the daily actives, which is the 79 million that you mentioned. But the monthly actives, which I think we don’t communicate to the outside world, is even more staggering, being in the several hundreds of millions, right? So it’s a massive community that is growing very healthily and pretty fast. And sometimes people ask us where those games that people play are coming from. Some of those games are now, like, worldwide franchises with tens of millions of concurrent players. And then sometimes I give this number, which is also like a reminder of the scale: every day we have roughly 90,000 experiences and games published on Roblox.
Konstantine Buhler: Wow!
Stef Corazza: And so that gives you the scale of the human creativity, if you want, and how much, really, this is becoming more and more of a creation platform and game development platform that has incredible numbers in terms of scale of creation. And also it’s an economy of its own. As you mentioned, we paid out $800 million to our creators. And so people have jobs, people are buying houses, people have, like, companies—you know, we have game studios that have, like, more than 100 people. Some of them are VC funded, and so it’s basically creating its own, like, creation economy.
And so that, I think, is very humbling, and at the same time has incredible potential, because the uniqueness, I think, of Roblox is that we are one of the most vertically-integrated companies on the planet. You know, we own our own data centers, we have many data centers around the world where we own, like, the hardware, but then we also own the app that distributes all these games and the players and all the services like video chat and live chat and chat translation and so on. But then we also have the creation tool, which is Roblox Studio, that I have the privilege of leading with my team, and then also we basically have all the services for creation that go with it.
So it’s all the way from, like, the bare metal CPU-GPU cluster all the way to the creation tools, with the big difference that we only charge our—we only basically take some revenue ourselves when our creators make money, right? We make money when they make money. There’s no upfront fee to get into the game, there’s no upfront fee for the tool. The tool is free, a lot of services are free. And so there’s really very little friction to start using Roblox. And if your game has one user or a million users, you don’t have to worry about anything. We scale it for you. We pay for all those, like, CPU-GPU instances, storage in the cloud and everything. And it’s completely opaque to you as the creator. So I think that’s the uniqueness. And as part of the fully vertical integration, we are also able to subsidize AI, right? So we are one of the few companies out there with a full-fledged AI assistant offering for game development, for code creation, material texture, assets, everything. And it’s all free to the creators.
Sonya Huang: I’d love to get into that. Maybe can you just walk us through today, what is the experience of creating a game? Like, who are the typical creators on your platform? Is it a high school student? Is it a professional game developer? And what types of games are they creating?
Stef Corazza: That’s a great question. So we have several million creators on a monthly basis on the platform, so big numbers there as well. And usually I think the average age is in, like, the mid-20s. It’s a little bit older than our player demographics, of course. And typically, the majority of them are doing, like, world building. They’re, like, building stuff. They’re artists, they are making games. And then we have about 30 to 40 percent that actually are coding. Then of course there’s an overlap between the two audiences. But basically, roughly this is what we are seeing. And it’s used for the most varied creations you can imagine, from natural disaster simulations to learning to the more classic gaming experience. Events, concerts, fashion design. We are seeing new types of experiences popping up on a daily basis, which is very fascinating.
Sonya Huang: What’s your favorite game?
Stef Corazza: [laughs] I’ve been playing Driving Empire quite a bit lately. I like car games, so that’s a good one that I really like.
Building Roblox Assistant
Konstantine Buhler: Stef, I was actually surprised to hear that you said 30 to 40 percent of the developers are actually coding as opposed to world building. Why is that? And maybe that kind of parlays into what you’ve created with Assistant.
Stef Corazza: Yeah, so there’s just—it’s a skill that is harder to master and to get into, right? A lot of people just, like, go from being players, and they want to create something, and so they start, like, building the world. And that is probably more intuitive than writing code, where you have to understand these, like, high-level constructs and apply them to get interactivity.
And so that is one of the things that we wanted to tackle with Assistant. We wanted to basically remove that friction of having to learn to code, having to learn a programming language in order to create interactivity. And so that was one of the initial inspirations, and that’s why Code Assist was the first feature that we released. Now this was in March 2023, which feels like two decades ago, and now a year and a half later, we basically have a full Assistant that has game development capabilities. They go from writing code, autocompleting code, explaining code, debugging code, to applying scripts to parts and objects in your scene. So that’s all the coding stuff.
Then we have documentation, right? People ask, “How do I do XYZ?” And usually they have to browse through dev forums and documents on the Internet. Instead, now Assistant can basically summarize that information for them. And then the third aspect is creation of assets. So we rolled out a material generator, and we rolled out a texture generator, which was a lot more complex, where you can basically texture any 3D object with quite good fidelity and resolution just from a text prompt. And so all of this together is what we call Assistant, as an umbrella name. And basically, it now allows you to create entire simple games from scratch just by typing natural language. In the future there will be more multimodal input, like through images, but right now we have, like, simple games that people are making, also as a test, where they’re only using Assistant. And so you can imagine you can make that game on your phone, you know, with a microphone. You just speak to it, and then Assistant will generate the world, will create the forest, will create the enemy, the boss that you have to fight, and then also add all the game mechanics to it. Everything automated.
Sonya Huang: How good are the games created with Assistant? Like, if you had to give a score out of 10, the games created from coding versus what people are doing right now with AI versus where it’s going?
Stef Corazza: I mean, if you can code, of course you get to another level of sophistication in terms of the complication of the gameplay and all the nuances that make, like, a game fun to play. So of course we’re not there. The examples that I see are mostly like, we do game jams and, you know, we spend, like, two hours to make a game with Assistant, and what can come out is pretty incredible. But we haven’t had, you know—in the community, I’m sure they’re going to take it to the next level. The goal is not to exclusively use Assistant. The goal is to basically combine Assistant and learn skills through Assistant. So maybe at the beginning, Assistant will write the first script for you and will attach it to a part. Then you know where the script should go, and then you know how to, I don’t know, make some platform move up and down, and then you learn on the way. So we see Assistant as a companion that shows you, by doing, how you make a game. And then over time, people really develop skills that are otherwise hard to acquire.
Konstantine Buhler: Stefano, have you seen that the type of development has changed? As in, the Assistant not only has language, like the documentation you described, it also has code, it even has images. I saw that it’s—and we’ll get into the technical specifications in a little bit. But as a user, you have language, you have code, you have images. Have you seen the behavior change for how people develop in Roblox, as in, tactically? Have you seen the number of people actually coding going up? I would have guessed that from the outside looking in, because the barriers to entry of coding have gone down. But does that just mean that more people are developing in general and the ratio actually stays the same?
Stef Corazza: That’s a great question. So what we have seen, we have measured the productivity of people that use Assistant versus not, and so we have found that people that use Assistant create 180 percent more code.
Konstantine Buhler: Wow!
Stef Corazza: And so the individuals are a lot more productive.
Konstantine Buhler: That’s benchmarked against people who already wrote code? Or does that also say just more …
Stef Corazza: Yeah.
Konstantine Buhler: Wow. Okay.
Stef Corazza: Two cohorts of coders, one using Assistant and Code Assist. Code Assist suggests code and Assistant creates from scratch, but they all kind of integrate into a similar user workflow. And then if you’re looking at the same cohort comparison for Material Generator, creators that use Material Generator create 60 percent more materials. And so also on the art front, there’s more productivity. And then if you look at the final goal, which is how much they publish, right? Because publishing the game is the ultimate goal. The lift is about 30 percent. So people that use Assistant publish 30 percent more than people that are not using it. And remember that this is still mostly in beta. It’s going to get out of beta soon. I can’t give you the exact day but, you know, it’s going to be relatively soon. And so we’re going to see even broader adoption there, of course, and impact.
Konstantine Buhler: One more question on this, on Sonya’s quality question. Like, what about usage? So they’re publishing 30 percent more. Do you have any KPIs that actually give you a sense of if the Assistant games, the hours spent, the Robux spent on them, whatever the KPIs might be, if they are also that 30 percent lift?
Stef Corazza: Yeah, that’s a great question. So our number one KPI right now is around retention. And so we are seeing, like, a week-over-week retention of people using Assistant that is much higher than any other features that we roll out. And also we are seeing that over time, that retention in the long run increases quite a bit. For people using Assistant with Studio, we are seeing significant lift in the overall retention of Studio, and then of course we see a high retention on Assistant itself. So the number of daily users has been growing organically and very steadily. We don’t do any marketing on this, of course. And it’s free, which is I think the best marketing. You know, it’s not cheap, let me tell you that, right? I think Roblox is very generous on that. But we have this unique synergy and collaboration with the community where basically we told the community, “Hey, give us access to your data to train AI. We are going to make the best AI companion, the best AI assistant that we can.” And that assistant goes back into Studio and is free, right? So we’re not making money off your data, we’re actually helping you create more.
And so we found the overwhelming majority of the creators in our community gave us permission to use their data for training. And that’s why I was mentioning earlier, we have not only one of the largest data sets in the world, but also the most multimodal, if you want. Because we have code, we have images, we have 3D assets, we have audio, video. All of that is part of a gaming experience, and the interactivity is the glue for all of it, along with the analytics on the users, right? So it’s a very powerful data set that we are, you know, treating with a lot of respect and a lot of, you know, the best practices to keep that data very secure, but that at the same time allows us to really harvest the value.
And ultimately what we are doing is we are teaching AI game development. Basically that’s what we are doing, right? We’re not teaching how to make an image, we’re not teaching how to write code, we are teaching game development. That’s the ultimate goal. And so all these tools at the beginning will feel a little bit, hey, this is a tool to make a material, this is to make the textures, to write code. We are already seeing that they are converging. We already started that process of converging some of those, like, lower level tools into larger ones where basically the AI is actually learning how to develop a game, as opposed to just how to do a small task.
Sonya Huang: I love the vision of the AI learning how to develop a game. And the question I have is: If you break up game development into its component parts, which parts do you think are most likely to be taken over by AI in the near term, versus what do you think humans will be uniquely good at for a while?
Stef Corazza: That’s a great question. So we don’t see AI as taking over, by the way. I think the best parallel that we made at RDC, like a couple of weeks ago, was AI is your dishwasher, right? Like, nobody wants to wash the dishes, and so we are really focusing on tasks, especially with the last release of Assistant Actions. We are focusing on the tasks that you don’t want to do, right? Washing the dishes, doing the laundry.
So we have introduced amongst the Assistant capabilities a new capability where Assistant can basically make large-scale modifications to the data model. I’ll give you an example. I have made a beautiful open-world game with a huge forest. This forest has 100,000 trees. Now all of a sudden, I want these trees to actually follow the seasons and, you know, get leaves more yellow because fall is coming. The amount of work that it would take me to implement that would be huge. Assistant can do that with three lines of text. I can say “Select all the trees,” or I can say “Select all the pine trees.” Actually, those don’t become yellow, so that’d be the wrong one, sorry. “Select all the trees other than the pine trees and make the leaves yellow.” Right? I can just give these three-liners, Assistant can go, can select, you know, the 57,000 trees that are not pine trees, and can change the color of the leaves for me in just a few seconds.
So that’s the kind of task where we are seeing a lot of value, and honestly, this was the feature that the community loved the most. Of all the AI features we released in Studio in the last two years, this was by a landslide the one that got the highest score and appreciation from the community, because again, it was the dishwasher. We are not replacing your talent, which we believe is irreplaceable, but we are actually helping you with the tasks that you don’t want to do.
Konstantine Buhler: I am always so impressed by Roblox’s emphasis, genuine emphasis, on community and teaching. You’ve said this in the past few minutes. You’ve talked about community quite a bit, about teaching how to code and how to create games, and really uplifting the entire community, frankly. And I just want to say that this is very much true all the way to the core of the business. I got to follow my really good friend Craig Sherman to board meetings in 2017 and 2018 at Roblox. And even behind closed doors years ago, this was always the focus. It wasn’t the banality of a lot of board meetings on monetization and financials. It was community. It was uplifting community, teaching them how to code, teaching everyone how to use Roblox in a way that actually benefits themselves in their own learning. And this dishwasher analogy sounds very consistent with the ethos of your business. How impressive. How did you implement it? It sounds like a Segment Anything type of algorithm. Was it a segmentation approach? Or this particular feature, how did you do it?
Stef Corazza: That’s a good question. So Assistant is really good at generating code, right? It’s based on LLMs, and some of them actually we are supporting as open source projects, like StarCoder, and we train with our own data. So over time, Assistant is getting better and better at generating code. Some of that code, instead of running at run time, you can actually run at edit time in Studio. Studio has a command bar where you can just execute some of that code. And so Assistant Actions basically is creating code that is executed directly in Studio at edit time. And because we integrate in Studio, it has full awareness of the data model of your scene, and so it knows, “Oh, this is a tree, this is a car.” It knows what you did and it has full awareness of that. And so we have combined the ability to generate code, the ability to execute that same code as Lua commands in Studio, and the awareness of the data model. Those three things coming together can basically unleash the power of the LLM onto, like, data model manipulation.
Konstantine Buhler: Do you have these things labeled, or is it dynamically determining that this is a car?
Stef Corazza: There’s a good amount of inference that just happens.
Konstantine Buhler: Cool.
Stef Corazza: Yeah.
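To make that mechanism concrete: the episode doesn’t show Assistant’s actual output, but a request like “select all the trees other than the pine trees and make the leaves yellow” could plausibly compile down to a few lines of edit-time Luau run against the data model. A minimal sketch, assuming trees are Models tagged “Tree,” pine trees carry an additional “Pine” tag, and leaf parts are named “Leaves” (all of these names are illustrative, not Assistant’s real output):

```lua
-- Hypothetical example of the kind of edit-time Luau that Assistant
-- Actions could generate for a bulk data-model modification.
local CollectionService = game:GetService("CollectionService")

local autumnYellow = Color3.fromRGB(218, 165, 32)

for _, tree in ipairs(CollectionService:GetTagged("Tree")) do
	-- Pine trees keep their color, so skip anything also tagged "Pine".
	if not CollectionService:HasTag(tree, "Pine") then
		for _, part in ipairs(tree:GetDescendants()) do
			if part:IsA("BasePart") and part.Name == "Leaves" then
				part.Color = autumnYellow
			end
		end
	end
end
```

Because code like this runs at edit time with full awareness of the scene, the same pattern generalizes to any “find everything matching X and change Y” request.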
How will Assistant evolve?
Sonya Huang: How do you imagine the creation experience evolves, let’s say 10 years from now? How do you see Assistant evolving?
Stef Corazza: So when we met roughly two years ago and we said, “Okay, how is AI going to impact creation and where should we start?”, the one paradigm shift that we envisioned, that we thought was going to happen in the industry, not just at Roblox, was a shift from fine control over the creation to capturing intent. And so I spent quite a bit of time in the Photoshop land, where basically you’re given control over the color of the individual pixel of an image, even though most people don’t need it, but some might, right?
And so there it’s all about 100 percent control, non-destructive workflows, but 100 percent control over the actual artifact that you are generating. And we are now moving towards a completely different generation of tools, where the tool is successful and can produce good outcomes only as long as it can capture the intent from the user. All the digital tools that we have seen in the last 30 years have been about surfacing more control to the user, and then the user will figure out how to use that control, right? Instead we are migrating from control to capturing intent.
And so we’re going to see probably quite a big change. And there’s going to be, you know, a thousand startups trying to do it in different ways, and there will be new UX paradigms popping up. But we can see audio as an input, we can see high-level gestures, not just with your hands but also with mouse and keyboard. So things that are just providing an input into what you want to do. We are seeing multimodal input: I can describe a world better if I can type and I can provide maybe a concept art and maybe some high-level sketching on top of it, right? So this is very different from having the phenomenally granular control, but at the same time the velocity is, like, two orders of magnitude faster.
The challenge in all this is that for casual creators, or people that have a limited amount of time to spend on it, what AI is providing is already good enough to share on TikTok or to make an experience on Roblox and invite your friends over. But for people that want to spend a lot more time, you still have to provide that fine control, right? And so the challenge is: how do you make a tool that is really good at capturing all the initial intent, but at the same time allows the real pros to iterate with the same level of control as the traditional tools? That’s a little bit of the challenge that we are facing in Studio, and a lot of other companies are also facing.
Konstantine Buhler: We have a really fun episode of Training Data with the team that created a company called Dust. And one of the founders, Gabriel Hubert, talks a lot about rasterization versus vectorization. And this is definitely your language as a graphics person. They also were Stanford computer vision-type folks. And it seems like that transformation has happened, and the vectorization is kind of what you’re describing: you can expand it a great deal based on intent, you can shrink it to very small and, almost like fractals, go down into more and more detail. When do you think we will get there? When do you think we’ll get to the point where you can say, “Hey, this is the intent,” and then you can go in and, at a fractal level of detail, change each pixel also based on intent, but on a much smaller scale?
Stef Corazza: Yeah, that’s a million-dollar question. So what we’re trying to do is allow Assistant first to be able to perform those operations, right? So the same AI that gives me the rough slot-machine type of input—I throw in some text, an image, I pull the lever, let’s see what we get. We want basically Assistant to also be able to go in and do the fine-grained change. So, “Hey, only the trees that are above five feet, can you just change the color of those?” Right? “And then can you take the texture of the tree and open it up, and I’m going to paint over it.” And so we want to allow AI to already be able to go beyond one shot and allow for iteration. We believe iteration is a fundamental way people create, and so we want to make sure that AI can support that from the get-go.
And then we will always have a fallback with some tools. And maybe there, you know, it’s going to be more like progressive disclosure, where we don’t throw it in front of every user, but only for the users that want to go deeper; then, you know, we can basically pull back the curtain and allow them to go a little bit deeper. In some cases, honestly, we would just, like, interop with other tools. There are things that you can do only in Blender, or only in Photoshop or Substance Painter. You know, Studio won’t become, you know, this crazy place where you can do everything, but it’s hard to do anything, right? So we are very committed, I think, to Studio being good at what it does, and then it’s okay, for really the pros that want to go deep, to have great interoperability with external tools.
We talked a lot about Studio, but actually the thing that we are the most excited about is taking all this AI goodness and bringing it to in-experience creation. And so we think that’s going to be the next frontier. So there are two aspects where, in the industry, we are at the very beginning, and we are curious, too. We don’t know how it’s going to play out and we are super curious. One is using AI for substantially different gameplay. Like, companies like Ego.live are experimenting with that. And the other is taking all the ability to create that AI has unleashed and bringing it to in-experience creation. So I’m there, I’m playing my game on Roblox, now I want to modify this level. I want, with my friends, to create something new that we can play together. Those are, like, completely different types of creation experiences that we’re going to see unleashed by AI. So right now we are incubating in Studio. We are making it solid, robust, we are making sure the output is high quality, but we are very excited to bring those APIs in-experience, and we think that’s going to be the real impact.
Sonya Huang: I’d love to dig more into that, maybe starting with just the in-game gameplay experience. Like, how do you think from the player’s perspective these games will be different?
Stef Corazza: I think, with basically no effort or very little effort from the developer, these games can be more personalized, and also they can be always different. Like, if you play the same game but there’s an LLM that’s aware of your past, all the things that you have done in past sessions, and then you come back, it can make the game different, more interesting, can change the challenge, can spin the story a little bit. So all these things, when you have a very smart AI backend that keeps track of what’s happening in the history and knows also who you are, it can really morph the game into something that is really more enjoyable for you, specifically. So we think that will be a big opportunity. There’s lots of companies that are now doing experiments, and we are very excited to basically provide AI as a platform and then let them experiment with different gameplays.
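As a rough illustration of the “AI backend that keeps track” idea, here is a minimal Luau sketch, assuming a DataStore named “SessionHistory” and player attributes like “Deaths” (all names here are illustrative, not part of any announced Roblox AI API). The persisted summary could be folded into an LLM prompt at the start of the next session to tune difficulty or story:

```lua
-- Hypothetical sketch: persist a per-player session summary that an
-- LLM-driven backend could use as context in the next session.
local DataStoreService = game:GetService("DataStoreService")
local Players = game:GetService("Players")

local history = DataStoreService:GetDataStore("SessionHistory")

Players.PlayerRemoving:Connect(function(player)
	-- Save a compact summary of what the player did this session.
	local summary = {
		deaths = player:GetAttribute("Deaths") or 0,
		lastLevel = player:GetAttribute("LastLevel"),
	}
	pcall(function()
		history:SetAsync(tostring(player.UserId), summary)
	end)
end)

Players.PlayerAdded:Connect(function(player)
	local ok, past = pcall(function()
		return history:GetAsync(tostring(player.UserId))
	end)
	if ok and past then
		-- e.g. prepend `past` to the prompt context for the game's LLM,
		-- so it can adjust the challenge or pick the story back up.
		print(("%s died %d times last session"):format(player.Name, past.deaths))
	end
end)
```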
Konstantine Buhler: That sounds, Stefano, technically very hard. And I remember even for the simplest type of Roblox gaming—which there isn’t really a simple type, you’re generating digital worlds continuously—it’s actually a very, very heavy lift. You mentioned having your own data centers, right? Low latency, being able to play with people internationally across the world. How do you think about adding this new level of complexity, this AI inference complexity for gameplay, as in generating the world, varying it on the fly? What will have to change from an infrastructure perspective?
Constrained creativity for better gameplay
Stef Corazza: That’s a great question. So I think the first step towards that will be NPCs, so non-playable characters that you hook up to an LLM. That doesn’t need …
Konstantine Buhler: That’s pretty straightforward.
Stef Corazza: Right. That’s pretty straightforward, and I think that’s going to be the first rep, and we’re going to see what kind of impact that creates. Then we have seen other experiences where they actually want to create the whole world. And of course, there are some challenges there. And let me tell you, what I think is the biggest challenge is actually moderation, because at Roblox, safety is our number one product, and we want to keep the platform safe and we want people to connect with civility. And when you allow people to create anything, of course the bar goes a little bit higher. And so we opened this Pandora’s box, but we also have to build the guardrails so things, you know, stay positive and, you know, everybody can have a positive experience on the platform.
So I think that is more of a challenge than, like, latency and infrastructure, because we can use CDNs, we can cache things, we can pre-generate some of the content, we can use level of detail. So there’s a lot of things that the gaming industry has developed in order to basically stream. We are using streaming as well on the platform, so we can basically generate and stream more assets in real time. I think the challenge will be more like: if you now have control of the game that you’re playing, how do you make sure that everybody else has a great experience? I think that’s going to be the challenge, both on the moderation side and also, like, in making it fun. Again, we have infinite respect for game developers because they know how to make things fun. And not every player, you know, has mastered the same skills of, like, making games fun for the last 20 years, right? So it’s like, how do you give this freedom to create while you somehow control the gameplay and the story so it stays compelling? These are all things that we are very eager to learn from the community, honestly. And I think that’s the beauty of being a platform: We don’t need to have any opinion about it and we don’t need to figure it out ourselves. We just provide the APIs to the community, and the community, with their infinite creativity, will figure it out.
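For the “NPC hooked up to an LLM” first step, here is a minimal server-side Luau sketch. The dialogue endpoint, its URL, and the JSON shape are all illustrative assumptions; Roblox’s HttpService and Chat services are real, but the LLM service itself is hypothetical:

```lua
-- Hypothetical sketch: an NPC that answers players via an external LLM.
-- Requires HTTP requests to be enabled in Game Settings.
local HttpService = game:GetService("HttpService")
local ChatService = game:GetService("Chat")

local LLM_ENDPOINT = "https://example.com/npc-dialogue" -- illustrative URL
local npc = workspace:WaitForChild("Shopkeeper")
local talkPrompt = npc:WaitForChild("TalkPrompt") -- a ProximityPrompt on the NPC

talkPrompt.Triggered:Connect(function(player)
	-- Send the player's context to the model and speak the reply in-world.
	local ok, response = pcall(function()
		return HttpService:PostAsync(
			LLM_ENDPOINT,
			HttpService:JSONEncode({ npc = npc.Name, player = player.Name }),
			Enum.HttpContentType.ApplicationJson
		)
	end)
	if ok then
		local reply = HttpService:JSONDecode(response).text
		ChatService:Chat(npc:WaitForChild("Head"), reply)
	end
end)
```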
Sonya Huang: So I’ve heard NPCs as that first stepping stone in game AI frequently. What do you think is going to be the next stepping stone for how game mechanics or gameplay might change? Because I imagine you’re not going to go from NPCs to entire worlds, right? Is there another—what’s the second stepping stone?
Stef Corazza: I’ve seen games where they limit the creation to one specific item. So for example, there’s a very popular game on Roblox called Build a Boat. You go into this game, you’re building a boat, and then you’re sailing that boat, and then you’re racing other players. And so you don’t create the world; the world is predefined by the developer of the experience, but you basically allow that constrained creativity. As in: these are the materials that you can use to build the boat, and this is the size it has to be. And then you can go crazy and build whatever thing you want. And because Roblox is such a physics sandbox, right, we have aerodynamics, fluid dynamics, and so we can actually see what you create and, like, simulate the wind, and you’re going to see the outcome of what you created. And so I think experiences like that right now are a little bit more difficult without AI, and AI, I think, can really power those up quite a bit, where you can create your race car, you can create your boat, your airplane, or, like, some element of the gameplay, and based on what you create, you can have, you know, an advantage in the game, but you’re not completely changing the game itself.
Sonya Huang: Constrained creativity. That’s a great answer.
Stef Corazza: Yeah, exactly.
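The constraint side of this is straightforward to picture in code. A minimal Luau sketch of developer-defined bounds for a player build, in the spirit of the boat example (the material palette, size limit, and function name are all illustrative):

```lua
-- Hypothetical sketch: validate a player-created build against a
-- developer-defined palette of materials and a maximum size.
local ALLOWED_MATERIALS = {
	[Enum.Material.Wood] = true,
	[Enum.Material.WoodPlanks] = true,
	[Enum.Material.Metal] = true,
}
local MAX_SIZE = Vector3.new(30, 20, 60) -- studs

local function validateBuild(build: Model): boolean
	local _, size = build:GetBoundingBox()
	if size.X > MAX_SIZE.X or size.Y > MAX_SIZE.Y or size.Z > MAX_SIZE.Z then
		return false -- too big to launch
	end
	for _, part in ipairs(build:GetDescendants()) do
		if part:IsA("BasePart") and not ALLOWED_MATERIALS[part.Material] then
			return false -- off-palette material
		end
	end
	return true -- inside the sandbox rules; physics decides the rest
end
```

Anything that passes can be handed to the physics simulation, and an AI generator would simply have to produce builds that satisfy the same checks.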
Konstantine Buhler: I have two follow-ups from that. The first is you just mentioned physics engines, and there’s been a lot of enthusiasm for years now around PINNs and these neural physics engines, et cetera. Is that now in the physics engine at Roblox, if you can disclose? Are you using neural networks as part of the physics engines, as part of the—you mentioned aerodynamics, right? Are you guys at the point where you’re estimating Navier-Stokes because it’s cheap enough because you can use neural networks? Or is that maybe in the future? And then my second question is about NPCs. Would you guys ever allow a world or a game that is purely NPCs, as in something you watch as opposed to something that you participate in?
Stef Corazza: Those are great questions. So on the physics side, I think the use of neural networks for physics has not been proven effective. I think there is more—like, there’s an infinite amount of, like, real-world approximations that you can use that are, like, very computationally efficient. And I think that’s more important than how well you’re approximating the world. And so on that front, I’m not sure neural networks are going to provide that much value, honestly, especially when you have, like, already implemented all the basic functionalities and people are already using them successfully.
Konstantine Buhler: Especially for a purely digital world, I mean, if you’re trying to—if there’s a noisy real world, I imagine it’s different from, like, Roblox world, where everything can work in the physics estimations. Is that fair?
Stef Corazza: Yeah, that’s fair. So maybe, I would say, all I can say is basically “maybe.” I think at the moment it’s still not fully proven as a path. And then on the NPC front, you know, if you think about a game where all the players are NPCs and you’re just watching it, to me it sounds like TV, right?
Konstantine Buhler: Yeah.
Stef Corazza: So people watch this all the time. And, you know, these NPCs can be super smart and do super fun things, can race, can do whatever they like, and maybe it’s just fun to watch. So definitely, we are seeing such a huge community of people that just watch other people play games. And so those players could be LLMs in the future and still generate great entertainment and fun. So I will not exclude that. I think at the beginning we’re going to find a hybrid model, which is, like, you know, people injecting NPCs into the game to just make it more interesting. And, you know, you also can populate games that just launched. They don’t have, you know, a thousand concurrent players yet, and so you can basically populate a huge world with characters that are interesting to talk to without having to have, you know, the real people there right at the beginning. So I think there’s going to be a lot of potential on that front, for sure.
Generative AI and UGC
Sonya Huang: What about for user-generated content? Like, how do you imagine UGC changes now that generative AI is really coming into the experience?
Stef Corazza: That’s a great question. We have seen some of this with the GenAI features that we released for avatar creation. So avatar creation is another huge area of creativity. There’s a massive community in Roblox. They work on, like, making assets and clothing and accessories for avatars. And a lot of people make a living doing that.
And so we have found that, as always, there were a lot more people that had amazing creative ideas but may not have had the tools to, like, do 3D modeling, for example, right? 3D modeling is a very narrow, specific skill; they would have to learn, you know, Blender or Maya, and so not all of them were up for the challenge, but they had a great idea about what outfit to create. And so there we experimented with using images as an input, or text as an input, to generate avatars, and we found that it substantially democratized how many avatars could be created. And this is an area where we launched, like, an early beta of our avatar auto-setup, and we are doubling down on that. We’re going to do a lot more in the coming months that you’re going to see. But basically, we want to allow people to create avatars, clothing, accessories with just multimodal, easy input. Again, there are more people that have great ideas than people that can actually execute on those. And so we want to basically remove those barriers and really leverage AI for that type of creative expression as well.
Sonya Huang: Do you think that generative AI blurs the line between what it means to be a user versus a creator? Like, do you imagine those two become kind of the same thing? And I imagine you think of them distinctly today.
Stef Corazza: Oh, for sure. Yeah. And I mean, think about what happened with the music scene, right? The boundary between the composer, and the people that could play an instrument, and the people in the audience was just so set at the beginning. And then, you know, I think karaoke, like, completely blurred that, right? And then now everybody can create things in GarageBand. And now with AI, people can just type lyrics and the genre, and then a full song is created. So the boundary got completely blurred, I think, in the audio and music space. And, you know, for game development, it’s going to take a little bit longer because it’s just more complex as a type of content. But I think, you know, entropy only increases. And so we’re going to see that, for sure.
Roblox’s 3D foundation model
Konstantine Buhler: Stefano, you mentioned that you guys have an absurd amount of data. You guys have an insane amount of video data. I think Roblox is still the number one VR application in the world by a lot, like, including on VR headsets and then also obviously personal computers, all sorts of devices, et cetera. Ton of audio data, ton of textual data, like, kind of everything data. And so you kind of—you mentioned this and then kind of walked by it, but obviously Sonya and I are very curious, with all that training data—yes, the name of the show—are you guys going to have a world model? I think I know the answer to this. I think it might even be announced already, depending on when this airs. Are you going to have a world model? What’s it going to look like? What will be its boundaries, and where do you go from there?
Stef Corazza: Yeah, that’s a great question. So yeah, we do have an enormous amount of data, and I think the challenge for us is less about gathering data and more, like, being able to use it, because there’s a lot that goes into going from the raw data to being able to train an LLM, or AI in general. So that is where, like, the work is focusing. We announced just a couple of weeks ago that Roblox is working on a 3D foundation model. Our intent is to open source that, and basically what it will do is allow the digital synthesis of scenes and worlds from multimodal input. So that’s the goal.
And I think also we can go a little bit beyond that. As I said before, our goal is basically to teach game development to AI. And so it’s not just about world creation. I think that will maybe be the first area that we can really attack in a meaningful way, but then there’s going to be all the interactivity of that world, the ability for things to move, bend, doors to open, characters to run around, right? And then the next level will be all the interactive gameplay. And so we are seeing the data, and we also can see from the data what works, what is fun for the user. And so we will be able to guide AI to not just make games, but actually make games that are fun, because you can see which games get more traction or, like, which specific levels get played more. And so there could be, in the inference, aspects of quality of the experience: not just, like, “Hey, I’m just going to generate whatever you ask me,” but, “If I can pick, I can pick things that are more fun to engage with.”
And so again, that one is going to be—you know, it’s going to take a while to figure out, but, you know, we have the data that can support it. I think it’s more on us to be able to pick it up. You know, there’s going to be a lot of, like, unsupervised learning that we have to do, and so we have to find ways to pick up signals of intent from our creators that we can then use in an unsupervised way to classify things, and then figure out what AI should be learning from.
Konstantine Buhler: I’d imagine because of the physics engine, you guys have such a competitive advantage with object placement, object interaction. Like, the struggle with a lot of training data for these video and world models is, like, they’re two-dimensional, and there’s no concept of—I mean, maybe you can infer from it the concept of what is three dimensional and how objects interact, but is that something that you guys are leaning on particularly heavily? I’d imagine it’s a big advantage to the 3D model you create.
Stef Corazza: Yeah, it’s a huge advantage for getting consistency over time and spatial consistency. If you actually have the full 3D model, and if you have a full 3D scene, then you can anchor the spatial and the temporal coherence in a much stronger way than if you’re estimating those things from 2D. I mean, the classic example is if you’re trying to stylize video and you use just video-to-video, you’re going to see drifts and artifacts and everything. Then, you know, with ControlNet and the depth map, which is like a, you know, 2.5D approach, this video-to-video type of stylization became a little bit more reliable and more consistent temporally. But you still, like, don’t have—you basically have the situation where, like, if a character is looking at you, and then you stylize that face, and then they look away and then look back, they are, like, a different person, right? We have all seen those examples. And so the only way, I think, to overcome that is to actually train with 3D data.
And the industry went so fast so far with, like, just 2D, right? And is now making magic with 2D. But I think the work to incorporate 3D information into those algorithms still has to come. And I think it’s going to bring to fruition this, like, temporal and spatial coherence that right now we don’t have, or that is so hard to generate out of 2D data, because it just is not rich enough.
Sonya Huang: Just to make sure I understand what you just said, do you think that 2D is going to lead to a dead end and you have to start over from 3D, or do you think that 2D can kind of get there with scale as well?
Stef Corazza: I think if the goal is to operate on a single image, 2D has done fantastic. And so if you need to texture a character or stylize an image, all these things, it’s fantastic. If you want to actually have video, so if you add the axis of time, and now you are also maybe moving the camera, so the camera is moving and then, you know, there’s, like, 2,000 frames ahead of you that have to be consistent. At that point, if you’re not using 3D data, you’re at a disadvantage and you’re going to be dealing with drift. And yes, you can enforce it in many possible ways, but the problem of 2D data is that it has occlusions, right? So if you’re looking at me and my arm is behind my back, and then at some point my arm pops out, you have nothing to enforce consistency of the look of my hand, because you just have not seen it before.
So all these problems don’t exist if you’re operating on a 3D representation of your scene, because even the data you are not seeing from the camera is there and available. And so I think the industry has, like—you know, made amazing progress in trying to cope with that, which is a fundamental lack of data. And if that data is actually used and can be incorporated in, let’s say, a Stable Diffusion, then we’re going to see a much better outcome.
What is neural rendering?
Sonya Huang: Stef, in our homework for this episode, we heard that you’re passionate about neural rendering. Can you tell us what that is?
Stef Corazza: Yeah, I don’t know where you read that, but yes, it’s true. I’m very passionate, because I have a strong belief that neural networks will change the way we’re going to stylize games and make games visually a lot more compelling than they are today. If you look at the history of game development, the creation of assets and the style of the game were always tied together. You know, if you’re making Super Mario and that’s the style of the game, you can’t just, like, on the flip of a coin say, “Okay, make it look like a Call of Duty.” It would just not work. You would have to redo all the assets from scratch.
But if you’re using neural rendering or generative rendering techniques, you’re able, with some text description and maybe some reference images, to restyle your game in real time. And you leave the geometry where it is, so it’s physically consistent, right? Your physics capsule is in the same place, but the visual look can be very different. And this can also be used to make games completely photorealistic: even though you’re using some low-resolution meshes and textures, you can use this as a final pass. Like, in games you typically add bloom at the end to make things look really cool, and so it could be, like, a final generative rendering pass: make it photorealistic, or stylize it in this specific way that I like. And you can restyle your game in a way that doesn’t actually add any extra assets that you have to download. It doesn’t change the assets. It’s really, like, a compute layer that happens at the end. Right now it’s very compute intensive, but looking at the speed at which things have moved, you can imagine that in the future developers can just add this filter to their game, describe the style, give a reference image, and then have the game look beautiful without changing a single geometry in the assets.
Konstantine Buhler: Wow.
Stef Corazza: So I’m a believer that in five years, this will be the way games will be built, and it’s going to run also at least on high-end phones. And yeah, we’re very excited about it.
Konstantine Buhler: Stef, does it ever become the end gamer’s choice how it’s rendered? As in, I put my skin or style onto whatever game?
Stef Corazza: That’s a good question. I think if the developer of the game allows that freedom to the player, why not? Right? I think it’s going to be an artistic choice. “Hey, you can play my game and you can make it in any style you want.” And again, you can still play with people that will visually see a different game, but the game is still consistent. All the physics, all the gameplay still holds.
Konstantine Buhler: Stefano, thank you so much for joining us today. We learned an incredible amount, certainly about the scale and complexity of Roblox, not just from the technology, but also the scale of impact—the 79 million daily actives, the hundreds of millions of monthly actives, the amount of technical depth that’s necessary to get there. We learned about how you’re using AI as a dishwasher to empower your customers and developers and gamers, and not just in the Assistant, but also in Code Assist and the Material Generator. And we learned a little bit about what you think the future is going to look like. We might be using infrastructure for guardrails to make sure that the civility of Roblox is preserved. We might be watching some NPC television, and we certainly will have a 3D world created by Roblox. Can’t wait to live in that oasis.
Stef Corazza: Thank you so much for having me. It was a pleasure.
Sonya Huang: Thank you.
Mentioned in this episode:
- Driving Empire: A Roblox car racing game Stef particularly enjoys
- RDC: Roblox Developer Conference
- Ego.live: Roblox app to create and share synthetic worlds populated with human-like generative agents and simulated communities
- PINNs: Physics Informed Neural Networks
- ControlNet: A model for controlling image diffusion by conditioning on an additional input image that Stef says can be used as a 2.5D approach to 3D generation.
- Neural rendering: A combination of deep learning with computer graphics principles developed by Nvidia in its RTX platform