Google NotebookLM’s Raiza Martin and Jason Spielman on Creating Delightful AI Podcast Hosts and the Potential for Source-Grounded AI

Training Data: Ep17

NotebookLM from Google Labs has become the breakout viral AI product of the year. The feature that catapulted it to viral fame is Audio Overview, which generates eerily realistic two-host podcast audio from any input you upload: a written doc, an audio or video file, or even a PDF. But to describe NotebookLM as a “podcast generator” is to vastly undersell it. The real magic of the product is in offering multi-modal dimensions to explore your own content in new ways, with context that’s surprisingly additive. Raiza Martin and Jason Spielman join us to discuss how the magic happens, and what’s next for source-grounded AI.

Summary

NotebookLM’s Product Lead Raiza Martin and Design Lead Jason Spielman created a breakthrough AI research and writing tool at Google that gained viral attention for its ability to generate stunningly lifelike podcasts from any source material. Their innovative approach to source-grounded AI and intuitive user experience demonstrates how AI tools can augment rather than replace human capabilities, while making complex information more accessible and engaging.

Understand your users’ real needs: Building NotebookLM began with a personal insight—Raiza’s experience as an adult learner struggling with textbooks sparked the idea for an AI tool that could make dense information more digestible. This led to discovering broader applications, from sales teams distributing knowledge more efficiently to venture capitalists accelerating their analysis of investment materials.
Start simple but make it magical: The team’s “one-click” approach to generating audio overviews exemplifies how reducing friction can accelerate adoption. Rather than overwhelming users with controls and options upfront, they focused on creating a delightful initial experience that naturally draws users into the product’s deeper capabilities.
Build for source-grounded workflows: NotebookLM’s fundamental design choice to be source-grounded, requiring users to upload content before engaging with the AI, creates a more focused and valuable experience. This constraint actually enables more meaningful interactions by providing concrete context for the AI to work with.
Balance innovation with familiarity: The team approached new AI interfaces by making them recognizable—like turning document summaries into podcast-style conversations—while pushing boundaries in ways that feel natural to users. This “skeuomorphic” approach helps users adapt to novel AI experiences through familiar formats.
Move fast but stay focused: Despite operating within a large organization, the team maintained startup-like agility by prioritizing shipping features and gathering user feedback. They kept their team small and used creative techniques like “fake deadlines” to maintain momentum while taking advantage of Google’s advanced AI models and technical expertise.

Transcript

Chapters

Introduction
Google’s ChatGPT moment?
NotebookLM’s genesis
Making Audio Overview magical
User experience and use cases
Surprising moments
Design choices
Challenges for a new AI-native experience
Where does NotebookLM go from here?
Making Audio Overview conversational and adding personality
Where did the idea for Audio Overview come from?
Building NotebookLM at Google

NotebookLM Host 1: Hey everyone, we’re here on Training Data.

NotebookLM Host 2: It’s great to be here. I’m a longtime listener and fan of Sequoia Capital.

Host 1: Pretty exciting to have us, the hosts of another podcast, join as guests.

Host 2: Yeah. I really just want to say a huge, huge thank you. Thanks for having us on the show.

Host 1: Yeah, seriously. Thank you to Sonya and Pat for having us.

Host 2: I mean, it sounds like we’re on the show because we’ve had lots of listeners ourselves, listeners of Deep Dive.

Host 1: Oh, yeah. We’ve made a ton of Audio Overviews since we launched, so it’s nice to get to talk about it.

Host 2: Definitely a cool experience getting to be here.

Host 1: And exciting to share what we’re going to do next.

Host 2: Exactly. We’ll keep learning and getting better for you.

Host 1: We’re glad you’re along for the ride.

Host 2: So yeah, keep listening. Keep listening and stay curious. We promise to keep diving deep and, uh, bringing you even better options in the future.

Host 1: Stay curious.

Sonya Huang: Two weeks ago, a mysterious, experimental product from Google became the talk of the town: NotebookLM, an AI-powered research tool that went viral for its ability to create staggeringly and hilariously realistic podcasts from any source material. Today we’re excited to feature Raiza Martin and Jason Spielman, the product and design leads on NotebookLM at Google. We talked to Raiza and Jason about the inspiration for the product, the development process for a project like this inside a large organization like Google, the surprising use cases that have emerged, and what’s next for NotebookLM.

Sonya Huang: Raiza, Jason, thank you so much for joining us today, and we’re glad that we could get you on today before your AI podcast hosts take over the podcasting world and put us out of a job. So thank you for joining us.

Raiza Martin: Thank you for having us.

Sonya Huang: I’d love to start by asking, you know, people are calling NotebookLM Google’s ChatGPT moment. It was an experimental product kind of in preview mode, and just went viral. The GPUs are going, brrr. You guys are, you know, the talk of the town. Do you agree with that take?

Raiza Martin: I mean, ChatGPT was pretty big for me, and so to imagine the comparison there, for me, feels a little bit like, whoa, is it? But I think what we’re seeing from a lot of people is that it’s having a similar impact of, like, wow, this is AI? This is what AI can do? So that’s been really cool.

Jason Spielman: Yeah. And I think I would agree in some sense that the first time I listened to an Audio Overview, you know, when that second host came on, it really was like a mind-blowing experience. But I think it’s also, like, at the underlying layer, like, the fact that we have Gemini 1.5 Pro digesting all this really complicated information and spitting it out in a way that’s pretty concise, I think the combo of those, to me, is definitely a pretty unbelievable moment.

Sonya Huang: Just for everyone who’s listening, what is Notebook, for anybody that hasn’t played with the product yet?

Raiza Martin: Yeah. Notebook is an AI-powered research and writing tool, but I think nowadays it’s more commonly known as: upload a source, and then it will generate an audio overview for you, or a podcast.

Sonya Huang: And did that happen by accident? Like, did you start out wanting to build a podcast host killer? Or did that come out by accident somehow?

Raiza Martin: I think, you know, honestly, we were always working on the different modalities for output. I think voice was the next one, and we chose dialogue. Did we know that it was going to be a killer? I want to say no. I thought it was pretty magical, but the way that it’s really landed with people has been delightful and surprising.

Pat Grady: And I know you guys have been working on NotebookLM for a while. Can you take us back to the beginning of the project? What was the initial idea? How did it come to be?

Raiza Martin: Yeah. I mean, I remember I was working on AI Test Kitchen, and this was last year, and Notebook actually started as a 20 percent project. We had one engineer who had been working on something called “talk to small corpus.” And super funny! I know, I know. I was like, “What is a corpus?” But then I would chat with him and he was like, “Hey, you know, it’s really the idea that you can use LLMs to talk to your data, try to extract stuff from it.” And I was like, “Oh, that’s super interesting.”

So I started thinking about, okay, what are the practical use cases here? And I actually went to school as an adult learner, and to me, I was like, wait a second, if I could use an LLM and I understood what LLMs could do, and I could use this maybe to talk to something like a textbook. Like, oh, this is pretty exciting. I could see how that could change my life. It could change lots of lives. And that’s when we started really revving on, like, hey, what do we build to introduce the first version of this to people? And, you know, it was in May of ‘23 that we introduced Project Tailwind, and it was just that. You uploaded a source, a PDF, you could chat with it.

Jason Spielman: Yeah. I think that the fact that we are source grounded is what makes the product so unique. I think even when I started thinking about this project, I didn’t realize that everything in my life that I create often has some sort of prior artifact or document that I used to create something new. And so I think right now at least, I would call this a source-grounded tool. But we’re really becoming a source-grounded tool for creation and a bunch of other stuff as well.

Sonya Huang: Are there any stats you can share about NotebookLM?

Raiza Martin: I think what I’ll share is that we were on a steady growth path before audio overviews, but since we launched it, it sort of rapidly accelerated. And that’s been really exciting. It’s been a really good hook to bring people into the product. I think the other thing I’ll say is that while it brings people in, people generally stay for the rest of the features. And that’s been also really interesting to see in terms of, like, what people are trying to get out of a tool like Notebook.

Sonya Huang: So the podcast or the audio overview experience is absolutely magical. Can you tell us a little bit about how it works behind the scenes? Like, how did you make it so lifelike? How did you make it—how did you make the dialogue so good and engaging? It just draws you in. Like, how do you do it?

Raiza Martin: Yeah. I mean, first I’ll tell you it was a lot of work. It was a lot of teamwork. There was a lot of craftsmanship that really went into it. But at the heart of it is really Google’s models. You’ve got Gemini 1.5, which is such an incredible model in terms of taking all of that data that you give to NotebookLM and then producing something new out of it. And then you have the voice models, the audio models, that back NotebookLM.

I’d say the real sort of powerhouse between those two is something we’ve built called Content Studio. And that’s really what brings to life sort of the editorial, right, between you bringing your content and then coming out with the podcast, there’s some editorial liberty that we take with the Studio.

Sonya Huang: Hmm. And so in the future, do you see yourself exposing the Studio element to people? You know, make this one funnier, make this one more serious?

Raiza Martin: So I think, like, we hear a lot, like, particularly because so many people are using it, so many people are delighted by it. I think the next step is then people want the knobs, right? They want to be able to control it. And this is where, you know, my gut reaction was, “Okay, let’s ship the knobs.” But I’m trying to have a little bit more discipline in thinking that, hey, you know, people fell in love with it because it was delightful, it was magical. How do I ship delightful and magical knobs?

Jason Spielman: Totally. [laughs]

Raiza Martin: And that—you know, there’s only so much I can do, but I think there’s a way. And so I’m very interested in doing that.

Jason Spielman: And I actually do think part of the explosion of audio overviews was the fact it was a simple one-click experience. You know, I was on the phone with my grandma, trying to explain to her how to use it, and it actually didn’t take any explanation. You know, I’m like, “Okay, drop in a source.” She’s like, “Oh, I see. I click this button to generate it.” And I think that the ease of creation really actually is what catalyzed so much explosion. And so I think as we think about adding these knobs, I think we want to do it in a way that’s very intentional.

Raiza Martin: Also just like fun.

Sonya Huang: Yeah. You mentioned people come for the podcast and then stay for everything else. What are some of the best use cases you’ve seen for the everything else?

Raiza Martin: I think I’d say one of the most surprising ones—I talked a little bit about the educational use case. It was very personal to me, and I saw a lot of students and a lot of educators using NotebookLM. But what’s been surprising is to see the amount of people that are using NotebookLM at work.

So one good example is we ran a pilot case study inside of Google. And so within the ads team, we have a lot of ad sellers, ad specialists, and I didn’t know this, but for these ad sellers, a lot of their sales training and documentation are hundreds of pages long.

Pat Grady: Mm-hmm.

Raiza Martin: I was just like, how does anybody learn this? Right? And then the stuff changes all the time. And so it’s very hard to keep track of how something works well enough such that you can sell it. And what the sales teams normally do—or before NotebookLM what they would do is they would ping each other. They’d be like, “Hey, Joe.” Right? “How does this thing work? How do I position it for this client?” You wait for Joe to respond, and then you’re like, okay, let me copy-paste this into an email, fiddle it a little bit, and that was it.

But it turns out people like Joe, who have a lot of that knowledge and read all of this documentation, they build the notebooks, and then they distribute it to their sales associates. And then that’s hundreds of people automatically that are using the notebook because now they don’t have to ping Joe. And that’s really interesting to me because I was like, oh, it’s like a really simple use case. And then there’s so much more that you can really build on top of that.

Jason Spielman: Totally. Actually, I was talking to a friend who’s in sales who was like, “Dude, it’s great. I made this whole notebook, and when I’m on calls and I don’t know the answer, I can quickly ask and get a response.” And so I think that this idea of knowledge distribution is really helpful for a large sales team or data centers and stuff. I think another use case that I also think is really interesting that actually you may align with is I have a lot of friends who work in venture and PE, and this idea of a confidential information memorandum, a CIM, I had never heard of this before, but I have a friend who, he’s like, “This is my whole job is basically going through these packets of information. And so what I do is I take these documents I receive or slides, I put it in a notebook, and I’m able to now way faster than before, go through all this fairly complex information.” And I think that he was telling me he 10Xed his job speed, which is great. And it’s just this is empowering him to be faster.

Sonya Huang: Podcast host and venture capital. You are really going after our jobs.

Raiza Martin: [laughs]

Jason Spielman: We’re helping out your job.

Sonya Huang: What have been the moments that have surprised you the most? For me, it was the moment where the AI hosts kind of realized that they’re AI, that was a really cool moment. But what have been the moments along the way that have surprised you the most?

Raiza Martin: I mean, I’ll start. I was going to bed. This was last weekend, I think, or just several days ago, and I was on Twitter—probably not healthy to be doing this before bed.

Sonya Huang: Your Twitter is fire, by the way.

Raiza Martin: [laughs] I was scrolling through and I saw the poop fart one. Folks who haven’t heard it …

Pat Grady: [laughs]

Raiza Martin: I also now I’m just like, this is a …

Jason Spielman: This is a very important mention, by the way. Like, if you haven’t heard it, you really do need to understand how magical this one is.

Raiza Martin: Okay. If folks have not heard the poop fart one, somebody …

Pat Grady: Explain it to us.

Raiza Martin: Somebody decided they were gonna upload a document where the only words in the doc were poop and fart over and over and over and over again. So it was a pretty lengthy doc, but it was just those two words. And I saw that’s what they had done with it. They described it, and I was like, “Oh, man, should I listen now? It’s 11:00. If I tap this and it’s a safety flag, I’m not gonna be able to go to bed, right?” Because I’m gonna have to open a bug, I’m going to ping the engineers. It’s like, “Hey, we got this thing going on.” I was like, “All right, I’ll just listen.” And it’s actually unbelievable.

Jason Spielman: Unbelievable. I also—like, I saw it and I was like, “Uh oh! Like, we—let’s see what this is gonna be.”

Raiza Martin: Yeah! [laughs]

Jason Spielman: And you listen, you’re like, “Oh, this is, this is fantastic. Like, this is even better than I could have ever imagined.”

Raiza Martin: It was one of those moments where I was like, “Well done, NotebookLM. You did good, little guy.”

Sonya Huang: Amazing. What design choices have you made that have made Notebook work so well and so intuitive for people?

Jason Spielman: You know, I think that—I clarify, we’re still making those decisions. I think right now we’re very much in the process of, you know, launching quickly and then working closely with our users to understand what’s best and what they want. You know, tech is evolving so fast right now that it’s really hard to know what’s even possible. And so I think that we’re really pushing for this model where we launch quickly and then work kind of alongside our users to build the best product.

But to answer your question more specifically, I think that one thing that we’ve done that I think was almost a happy accident in a way was make that left source panel really clear. I think that we are a source-grounded project, and we need to make it clear that you’re talking to the sources that you’ve uploaded. And so I think that having those sources there on the left is a pretty crucial part of this project. I do also think, though, as I was mentioning earlier, audio overviews being one click seemed to actually pay off, that we really leaned into this simple experience. But that being said, there’s a lot more coming and we’re actively working with users to improve the product.

Raiza Martin: I think one of the things that I’ll chime in in terms of design choices and really, I think, on the product prioritization sort of side of things is really thinking through what does it take to make something new, intuitive? And it’s really hard, especially something as nuanced as, like, oh, first you have to upload a source. Like, users generally balk at that step of, like, why? Right? Like, “I don’t have to upload a source to ChatGPT. I don’t have to upload a source to Gemini. It just works.” And so I think we still have a lot of work to do around the “it just works” category.

Sonya Huang: What do you think are the biggest challenges remaining as you kind of bring people onto this new AI-native experience?

Jason Spielman: Yeah, I think that we’re kind of in this quote-unquote skeuomorphic era of AI design. And I think to explain skeuomorphism, it’s when a virtual object reflects a real world object. And that was seen in early iOS when the notes app had a leather binding at the top and the pad was yellow, and that was made to kind of ease users into this virtual world from the physical world. I think now we’re seeing something similar with AI, where we need to build UIs that help meet users where they are. And I think right now we’re doing our best to kind of be really creative and think about these new kind of crazy experiences while also understanding that many of these users, this is their first time interacting with artificial intelligence.

Sonya Huang: How do you think about—you know, one thing I think Midjourney has done extremely well is just making it easy to, you know, get over the blank wall prompt problem. And so, like, to me that’s something that Midjourney has done extremely well. Are there any other applications that have kind of approached some of these UI challenges that you admire?

Raiza Martin: I have one. Recently, I just tried Pika, and I really love the Pika effects where you can see exactly what’s going to happen to your image if you upload one. Because Pika is similar to the extent that you have to upload something and then you have to maybe write a prompt or choose an effect. And I think it was really well done where it’s like, hey, here’s a preview, right? It will squish the thing. I was like, “Oh, this is fascinating!” And of course, I uploaded—there was like one that was like cake. I uploaded a drink, a picture of a drink, and I was like, “Make it cake!” And just like, the anticipation of the drink is gonna become cake. I was like, “Come on, come on, come on!” I was like, “Should I pay for it now?” It was like my first generation, too. I was ready to pay for it. That’s how I knew. I was like, oh, like, there’s definitely something there about show the user what’s on the flip side that I think really incentivizes the user to not only give you the image, but they’re really excited to see what happens. And then if you’re me, you’re like, “Take my $10!” So it’s really effective.

Jason Spielman: I think for me, I love Claude Artifacts. I think they’ve done such an amazing job of code creation. We’ve talked a lot about writing code creation, and it was awesome to see others in the space thinking about that as well. I think that right now, as I was just briefly mentioning, we’re in this space where we want to equal the hierarchy between, you know, AI and human. And I think we definitely don’t want to take your jobs. We just want to help support your jobs, you know? And I think that Claude Artifacts was a perfect example of that in my mind, which was cool, you can talk to the chat, but also start building out something on this right side as well.

Sonya Huang: Hmm. How do you think your products kind of compare and contrast to the approach that Claude has taken? Do you think it’s similar things that you’re going after, or how do you think about the differences?

Jason Spielman: I think first and foremost, right now at least, we’re a source-grounded tool, which kind of immediately makes us a bit different. That being said, I think we’re thinking a lot about creation broadly, utilizing the sources that you uploaded.

Raiza Martin: Well, I think to that point, the contextualization of your LLM interactions is really powerful, and I think it creates a stickier user experience. I think that if I had to guess, the folks at Claude probably know this, or the folks at Anthropic, the folks at OpenAI probably know this. Certainly the people at Google know this, but I think there is a question of when to introduce it and what are the right surfaces? So I think this is where for NotebookLM, I’m excited because we started there, right? So there’s like a little bit of hey, as people catch on to the importance of, like, source-grounded workflows, source-grounded stories, this could be the tool that they’re looking for. And if we just sprint at this, hopefully we’ll get farther along before everybody else who’s juggling all of these other use cases.

Sonya Huang: You mentioned earlier that, you know, Chat is kind of a skeuomorphic interface for AI, and that you guys are experimenting with crazier things. Like, what might the crazier things look or feel like? Give us a taste.

Jason Spielman: You know, I think just that at a high level, I’m super intrigued with these kind of dynamic UIs. Actually, Claude is an example of that, right? You see this artifact come in that wasn’t originally there. I think we’re thinking a lot about how to—we’re trying to do a lot here, right? Reading, writing, and I think there’s only so much that you can do before a user gets overwhelmed. And so I think we’re really exploring how do we take advantage of what you’re doing at that moment while also not overwhelming you with all the other possibilities?

Raiza Martin: I think, for me, I think a lot about leaning more into new modalities, which is what does it mean from an input and an output side? And I do a lot of prototyping on my own and experiment with a lot of my own behaviors. And one of my favorite ones is sort of the idea that I can sort of walk and talk with my LLM, right? Or with, like, an AI sort of ecosystem.

And one of my favorite recent examples is I started doing this with a daily journal, where instead of writing my journal, I just go back and forth, and it creates the journal log for me. And then it creates a visualization of basically like, hey, you know, this week you had more bad days than good, or you had more good days than bad. Here are the things that made you happy. Here are the things that made you upset. And I think there’s a lot of richness there in interaction where I think about it, it’s like, hey, source-grounded AI, of course, there’s some really practical sort of work use cases there, there’s some educational use cases, but the personal ones are also really compelling. And so I’m trying to think about how do I take these learnings and bring it back into NotebookLM? And, you know, probably in something like the mobile app, right? We might see more of that.

Sonya Huang: So you now have—you know, you have lightning in the bottle with NotebookLM. Where do you hope to take it from here?

Raiza Martin: I think honestly, just keep going. I just want to keep building more cool stuff. We want to deepen the experience for users. We want to make it really useful. I think there is a lot of magic right now, a lot of delight, and I think we want to deliver on the promise of that initial hit and just show people, like, hey, you can stick around. It’s gonna be great.

Pat Grady: What do you think is the biggest thing missing from the product today?

Raiza Martin: Oh. Well, I’ll say if I could rewind back in time and build more things as part of this launch, I would definitely build a better share experience. Just the amount of when I scroll on X and I see all the videos and visualizers that people use instead of our native ones, you know, as a product lead, I’m like, “Oh, I’m missing out on counting this user here because they’re on a different surface now.” So I think for me, it’s really the sharing and sort of collaboration around the audio overviews that’s missing.

Jason Spielman: And I think as we started talking about, I think I’m really excited about the addition of a writing experience. I think that we know that people are often doing Q&A and then taking that answer and then creating something new. And so I’m just excited to help kind of fulfill that whole user journey.

Sonya Huang: How do you make it—like, are you prompt engineering to make it kind of—like, do you tell it to be conversational? Funny? What are you doing on the technology?

Pat Grady: Yeah, how do you design the personalities? I’m really curious about that, too.

Raiza Martin: So there’s a lot that we do in the background, and I think you hit on some really good aspects of it, especially around, you know, the show is called Deep Dive. There’s clearly two hosts. And what I’ll say is that there’s a lot more editorial liberty that the personas themselves take to generate that show. And I think that’s where, you know, even for me, I am always interested to see where they’re going to take the show based on the sources that are uploaded.

Sonya Huang: Oh, interesting. So you’ve given each of the sources its own personality, its own how it approaches things, and then you let it create the podcast.

Raiza Martin: Yeah. In a nutshell, I think that’s the best explanation for what we’ve got going. And so when we think about the editing experience, right? It’s like, oh, what are the controls for something like that? Of course, there’s basic stuff, which is maybe I don’t want a deep dive. Maybe I want a different show. Maybe I want a different length. Maybe I want shorter, maybe I want longer. Maybe I just want to specify a topic instead of the whole thing. Because today it’s like an overview-based audio. So I think there’s a lot there that we can tweak, but the heart of it is really this editorial liberty around your sources and trying to give you an overview of it.

Sonya Huang: Hmm. And every time I’ve joked that you’re going to take our jobs, you say we’re not. But I don’t know if you’re just saying that to be nice, because what you’ve generated is legitimately so good. And so the real question I have is: when you say that it’s not good enough to replace real podcasts, why do you say that? Because to me, it feels good enough to replace a real podcast.

Raiza Martin: I think that’s a good question, and one that I try to approach really carefully, particularly because hey, if there’s real risk, I want to look at it in the eye and say, “Okay, how do we address this?” But from what I’ve seen, a lot of what people are making are not the same things that I feel like we would have a real podcast about, right? Like, do I want to listen—do I want to take an article and make a podcast out of it that replaces, you know, one of my favorite podcasts? Lenny’s, right? I listen to Lenny’s all the time. It’s like, no, I want to listen to Lenny. I want to listen to what he thinks about this particular topic.

And then what’s funny is like, people are making audio overviews of things like their resumes, right? Their LinkedIn bio, or startup founders putting in their landing pages and trying to figure out, oh, was my messaging clear? Like, that stuff is really cool because it’s like, no one’s ever gonna make a podcast of that. I mean, maybe not at this stage, right? But that’s where I think, okay, this feels really good. It feels like we’ve created a space where personalized generation really is about meeting my needs exactly where I’m at, and there isn’t an existing thing out there. And that’s really special.

Jason Spielman: It does almost feel like a different media type. Like, sure, it sounds like a podcast, but I think you give great examples to kind of prove all these random use cases people are using it for. But I think there’s also a reason that reaction videos are so popular online. People aren’t just listening to this right now because of us, because they want to hear from both of you who are in this space. And I think that’s also important to remember when thinking about podcasts.

Raiza Martin: I will say one interesting thing about the dynamic is even though people are sharing the audio overviews that they’re generating, they’re very personal. It’s like, I made this for me. I didn’t make it for you to listen to my resume. It was me. I was delighted by the audio overview of my resume. Or there’s this really cool TikTok of where this woman uploads her diary from 2004, and it’s like, it was interesting to listen to that together, but it was really her reaction to her diary that, you know, she wasn’t going to listen to a podcast about that ever.

One of my favorite use cases, actually—I don’t know if this was in the Discord, but somebody recently took—they said over the weekend their group chat with their college friends had blown up. And so they didn’t read the messages, but they took all of it, and they copy pasted it into a doc, and they’re like, “Well, Monday morning, I’m gonna listen to what my college friends said on my drive to work.” I was like, that’s incredible. And I think that’s what personalized generation is.

Pat Grady: So in a world of chat boxes, where did the idea of hey, people want to listen to this? People want to consume this content in podcast form? Like, where did that idea come from?

Raiza Martin: Yeah. I mean, I think that it goes back a little bit to something Jason was saying, which is: how do we deliver new things in a recognizable format, or in a way that’s easy for people to understand such that they would be willing to try it? And I think the combination of upload your source, generate a new voice thing, we’re like, well, what are—what’s the universe of voice things that we could generate? We have this really powerful voice model. And we experimented. We were like, we could do a monologue, we could do a dialogue. We could give the user a switch. But it was really the dialogue that was resonating with people because it was like, oh, it’s a podcast, right? It’s not just like a text to speech, like reading the output like we typically expect. And I think once we saw how much that delighted people, we knew that was the thing we should do.

Sonya Huang: Okay, so you now have this killer feature in the podcast, and you have an incredibly general horizontal surface as well. Where do you go from here? Do you go deeper into the podcast thing or do you go build out the …?

Pat Grady: When do we get YouTube videos?

Raiza Martin: Yeah. Yeah, I think, you know, that’s just a cost problem. [laughs]

Jason Spielman: You could drop those in now as an input, but an output, yeah, I think we gotta work on that a little bit.

Raiza Martin: Yeah, I think it’s exciting because the roadmap is, I don’t want to say it’s fairly straightforward, I feel like I’m gonna jinx myself and something’s gonna happen tomorrow. But we know that we want to deliver on the promise of bringing in all the inputs that matter to you, and letting you use the power of AI to create something new. And I think the podcasts are definitely one type of output that we want to go deeper on, especially because we’ve seen how much people care about them. So that’s one part of it. But I think we want to deliver on the rest, like the more practical things as well, just because everybody has a different preference, right? Even—I think this was two days ago, someone was like, “Hey, can you just output better code? It’s like, the podcast is cool, but can you just output better code?” And I was like, “Oh, that’s a good idea.” I mean, it was on the roadmap, but I was like, “Yeah, for sure.” We should just deepen the investment into the outputs themselves.

Pat Grady: Well, I’m gonna ask you a sensitive question.

Raiza Martin: Okay.

Pat Grady: Not sensitive in terms of like, emotionally sensitive.

Raiza Martin: Oh, okay then.

Jason Spielman: We’re both here to help, actually.

Pat Grady: Possibly sensitive question. You guys, it seems like you have executed on this much the way you might see a startup execute on something. Like, pretty scrappy, lean team, move fast, a lot of user feedback, iterate in real time, release something imperfect to the world, and kind of test it—test it in production, which seems a bit different than what people might stereotypically expect of something coming out of Google. And so I guess the question is: in what ways has being part of Google helped with NotebookLM? And in what ways have you maybe broken the mold a bit with this project?

Raiza Martin: That’s such a good question. I think I’ll start from the perspective of what’s been great and what’s been really special at Google, which is the two top things that I’ll say: access to the models before they’re fully ready and just being able to see the capabilities that are being planned helps me to think about the way we build the product in a different way, which is okay, knowing that these capabilities are coming, how can I make this particular journey better? And that’s been really—that’s been really good.

I’d say the second thing that’s been really special is just really the people. Just really smart, really talented, and really collaborative people that also just want to build cool things. And so having the combination of these two things is really, for me as a product builder, is like, wow, right? Like, this is it. This is all I need. And I can just, like—all I have to do is execute. I just have to deliver. And if I keep going, we’ll ship something interesting.

I think, on maybe the stuff that doesn’t quite fit the mold or maybe things that we’ve done a little bit differently, I’d say that coming into Labs, I knew the most important thing for us to do would be to ship. And it’s easier not to ship than it is to actually do it, right? Especially from my experience at Google, I think there are many times that I second-guessed myself. I was like, “Oh, how would it affect this or that?” Like, there’s so many considerations, but I think once you change your orientation to know the P0 is to ship, and you have to do it at all costs. And now I’m about to say it on the podcast, and I hope our engineers aren’t listening. I also create a lot of fake deadlines.

Pat Grady: [laughs]

Raiza Martin: And it’s really funny. It’s really funny because it works. I’ll be like, “Guys, October 10. We have to do it! It has to ship!” And everyone’s like, “October 10? That’s in two weeks!” It’s like, “Yeah, what do we do?” And they’re like, “All right, well, we got to, like, do it now.” It’s like, “Oh, yeah. I know.” And so we just, like, really crank on it. And it’s—you know, I’m making light of it, but for the most part, people don’t really ask, like, “What—what’s happening on October 10?” And so it works. It works for us. It’s worked for two years. So hopefully they really don’t listen to this.

Pat Grady: [laughs]

Jason Spielman: But I do actually think there is a misconception that Google is slow. You know, in my seven years at Google, I’ve actually been surprised how quickly things move. But you just also have teams that are really big that affect billions of users every day. I think we’re in a sweet spot now where you have all the values of an incumbent, like a big company, like the scale and the data. But I do think also now, because we’re a small team of about 10 people, we also can move quickly.

Pat Grady: Yeah.

Sonya Huang: We can’t wait to see what you guys continue to build with the product, and hopefully don’t put us out of a job too soon. But it really is delightful what you’ve built so far. Congratulations.

Raiza Martin: Thank you. Thank you for having us on. This has been really fun.

Jason Spielman: Thank you.
