Kumo’s Hema Raghavan: Turning Graph AI into ROI
Training Data: Ep26
Hema Raghavan is co-founder of Kumo AI, a company that makes graph neural networks accessible to enterprises by connecting to their relational data stored in Snowflake and Databricks. Hema talks about how running GNNs on GPUs has led to breakthroughs in performance, as well as the query language Kumo developed to help companies predict future data points. Although approachable for non-technical users, the product provides full control for data scientists, who use Kumo to automate time-consuming feature engineering pipelines.
Summary
Kumo AI co-founder Hema Raghavan brings a unique perspective to building AI companies, after leading machine learning teams at LinkedIn during their growth from 400M to 700M users. She founded Kumo to democratize graph neural networks. Her insights emphasize the importance of reducing complexity while maintaining power and flexibility in enterprise AI systems.
- Build simple interfaces to complex technology. The key to widespread AI adoption is creating simple interfaces (like Kumo’s SQL-like query language) that hide complexity while preserving power. Give users an intuitive way to access sophisticated capabilities without requiring deep technical expertise.
- Optimize infrastructure for real-world economics. Don’t just chase performance at any cost. Kumo’s hybrid CPU/GPU approach shows how thoughtful infrastructure design can dramatically improve cost efficiency without sacrificing capabilities. This makes enterprise-scale AI accessible to more customers.
- Focus on customer time-to-value. Rather than requiring customers to convert their data into graph format or build and maintain complex feature engineering pipelines, provide automated solutions that let them focus on business outcomes. The faster customers can see results, the more likely they are to expand usage.
- Make trust and transparency core features. For enterprise AI adoption, especially in regulated industries, explainability can’t be an afterthought. Build capabilities for model transparency and bias detection into your core architecture from the beginning. Ideally, be able to report explanations at a per-instance level.
- Meet customers where they are. Success in enterprise AI requires aligning with customers’ existing data infrastructure and capabilities. Rather than requiring customers to radically change their systems, integrate with established platforms like Snowflake and Databricks where their data already lives.
Transcript
Hema Raghavan: If you have your data laid out as relational tables, Kumo just sucks it in. So you just specify through connectors, tell Kumo what your schema is, and then you can just start writing predictive queries. So the graph is abstracted away, but if you have someone like a data scientist who loves tweaking the neural network parameters …
Sonya Huang: In case Konstantine …
Hema Raghavan: Yes. Decides to …
Konstantine Buhler: I would be interested.
Hema Raghavan: Exactly.
Konstantine Buhler: Guilty.
Hema Raghavan: Exactly. You can look under the hood. And the analogy we always use is we’ll give you the self-driving car, but if you want to look under the hood or if you want to drive stick, we’ll let you drive stick.
Konstantine Buhler: We have a brilliant guest today on Training Data. Welcome Hema Raghavan, Co-founder and Head of Engineering at Kumo AI.
Hema brings decades of experience leading AI initiatives at LinkedIn. She came up with the “People You May Know” technology and other core features that leverage the power of graph learning.
Her journey in AI predates many of the technologies we all take for granted today. She was working on NLP before BERT was even a thing. With Kumo, Hema and her team are revolutionizing how companies harness AI by making advanced graph neural networks accessible. These neural networks let you do AutoML, automated machine learning, on any platform, from Snowflake to Databricks.
Kumo’s innovative approach allows companies to leverage their existing data warehouses in order to build sophisticated AI models faster, cheaper, and easier. You don’t need deep expertise in graph learning or to maintain complex feature pipelines; you can go straight to business value. Welcome, Hema, to Training Data.
AutoML on GPUs
Konstantine Buhler: Welcome to Training Data. Today we have the amazing Hema Raghavan. You are building Kumo AI, which is AutoML on the data warehouse, using advanced neural networks and graph neural networks. AutoML was incredibly promising a few years ago. It was a major trend in the last wave of AI, five, six years ago. It went through a little bit of a trough of disillusionment. A lot of the AutoML players receded from the forefront, and companies started to store their features in feature databases and the like. Why are you focusing on AutoML? What’s different about Kumo?
Hema Raghavan: Okay, so there’s AutoML and then there’s AutoML on GPUs. And I think that’s the big difference for Kumo AI. And let me give you a little bit of an example from my own career. So I started in NLP, and when we would build systems back in the early 2000s to answer a question like, “When did Marco Polo land in Asia?” we would be encoding features like, “Marco Polo is the subject of the sentence and it’s going to be the subject of the answer,” and all of that.
So we had to know a lot about language, about linguistic structure and so on. And then the GPU revolution came that enabled neural networks to come to the forefront of this technology. We don’t write features like that anymore. Those intermediate layers in a neural network really learn the parts of speech, the named entities, all of those properties of language. It’s the same in other classes of problems. So in the AutoML that was happening maybe a decade ago, we were looking at CPU-based models. So think of logistic regression, think of XGBoost, SVMs and so on. And all AutoML did then was parallelize what a data scientist would have done, which was a lot of hand-computed features. And that required you to write code that thinks like a data scientist. So you were trying to get the machines to think like humans.
Whereas here what we’re doing is we use graph neural networks. So it’s a neural network technology, and you can think of a GNN as a superset of a CNN, which is used for images, or a sequence model, which is used for language. GNNs allow for arbitrary structure, and the GNNs are learning all of the features that you would normally use for predictive problems. So Kumo sits in the space of predictive AI, and we’re really bringing transformer technology to predictive AI problems.
Sonya Huang: Can you say—you mentioned graph neural networks and you gave a great explanation. Can you explain to me like I’m five years old, because that might be where my level of understanding is? Are graph neural networks good for any class of problem? Is it good for—you know, you came from LinkedIn, where you were working on, you know, the social graph of LinkedIn. Is it good for specific types of domains?
Hema Raghavan: That’s a great question, Sonya. So let’s say you’re going to put this podcast episode out and it’s going to be on some video streaming site. And we want to recommend the relevant podcasts for users of that video streaming site.
Konstantine Buhler: YouTube, to be explicit.
Hema Raghavan: YouTube.
Konstantine Buhler: If someone’s watching this on YouTube, you want to recommend someone to watch it or not.
Hema Raghavan: Exactly. So user logs in, and you not only have the content of this podcast episode, but you also have what you might have watched in the past. So you can think of that—the records of what you’ve watched in the past are sitting in a views table.
Konstantine Buhler: Collaborative filtering era.
Hema Raghavan: Exactly, exactly. But the difference with collaborative filtering is it’s just looking at views. How can we take the view data—so the view data is a network. So coming to Sonya’s question, right? There’s a podcast episode, there’s all the users who are watching it. So you’ve got a bipartite graph—the users and the podcasts—but then you have the organization, you have Sequoia Capital, you have the channels from Sequoia Capital, you have other metadata that you may have. So all of that lends itself naturally to a graph, and you can start thinking about links across these, you know, nodes of a graph. And effectively what a graph neural network is learning is: let’s look at what Sonya watched in the past. It seems like she really likes AI.
Sonya Huang: AI and baby shark videos at the same time.
Hema Raghavan: Okay, so AI and baby shark. [laughs] Okay, but—and then the neural network also learns that Konstantine likes AI. And what would it be for you?
Konstantine Buhler: Probably AI. That might be AI and, like, history.
Hema Raghavan: That’s great. So there’s AI and history, right? So the neural network can learn that there’s an overlap between both of you on the AI pieces of content. You both engage a lot with Sequoia content, and it’s learning across this network, right? But the next time Sonya watches a baby shark video, we don’t want to be recommending that to Konstantine, right? So how do you take that content that you engage with—the view data, the click data—and learn across all of these edges? Think of clicks, views, all the behavioral signals connecting you with entities in this world as a graph. And how do we learn across that graph?
So you don’t have to be a social network to have a graph. Everyone—almost every enterprise I know has a graph. Fintech has graphs because they have customers, they have transactions, they have related data. Think of, you know, one of your delivery services. They have the inventory, the suppliers, the means of transportation. So they’re all sitting as tables, they’re all sitting as entities and they’re all linked across each other. Graph learning lets you learn across that.
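To make the tables-to-graph idea concrete, here is a minimal sketch of how rows in relational tables become nodes and edges. The table layout and the use of networkx are illustrative assumptions, not Kumo internals:

```python
# Illustrative sketch only: how rows in relational tables become a graph.
# Table names and columns are hypothetical, not Kumo's actual schema.
import networkx as nx

users = [{"user_id": 1, "name": "Sonya"}, {"user_id": 2, "name": "Konstantine"}]
videos = [{"video_id": 10, "topic": "AI"}, {"video_id": 11, "topic": "history"}]
views = [  # fact rows: each links a user primary key to a video foreign key
    {"user_id": 1, "video_id": 10},
    {"user_id": 2, "video_id": 10},
    {"user_id": 2, "video_id": 11},
]

g = nx.Graph()
for u in users:
    g.add_node(("user", u["user_id"]), **u)    # entities (primary keys) become nodes
for v in videos:
    g.add_node(("video", v["video_id"]), **v)
for row in views:                              # primary-key/foreign-key links become edges
    g.add_edge(("user", row["user_id"]), ("video", row["video_id"]), kind="viewed")

print(g.number_of_nodes(), g.number_of_edges())  # 4 nodes, 3 edges
```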
Konstantine Buhler: That’s a pretty key insight, the tables.
Hema Raghavan: Yeah.
Konstantine Buhler: Before we go there, that was a very smart five-year-old. I think that you have a five-year-old.
Hema Raghavan: I have an eight-year-old.
Konstantine Buhler: An eight-year-old.
Hema Raghavan: And a 12-year-old.
Konstantine Buhler: A 12-year-old. Okay. Well, they’re very, very smart if they understood that explanation. Like, what would it be, to Sonya’s point, if you were five and you were gonna compare graph learning to any other type of machine learning, what’s the difference?
Hema Raghavan: Ah, graph learning versus machine learning. Easy. Fast. I think those would be the two things. Just, you know, low code. I think that would be the key about Kumo.
Konstantine Buhler: It learns all the weights, it learns all the features and discovers them over time. Is that fair to say?
Hema Raghavan: Exactly. And my eight-year-old doesn’t know machine learning, but if you were going to write a classifier the old-school way, you’d be writing features that say, “Okay, for users in this platform, we need to look at clickthrough rate data for the last three months and six months and eight months for every single video.” And we discover that Sonya has a preference for videos that are evergreen. So six-month windows really matter for Sonya. So imagine all of that code being written as features. Graph neural networks eliminate all of that code.
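For flavor, here is a toy version of the hand-built window features being described, the kind of pipeline a GNN learns implicitly. Column names and windows are invented for illustration:

```python
# Toy example of hand-written window features (the code GNNs eliminate).
# Table and column names are invented for illustration.
import pandas as pd

views = pd.DataFrame({
    "user_id": [1, 1, 1, 2],
    "ts": pd.to_datetime(["2024-01-05", "2024-03-01", "2024-06-20", "2024-06-21"]),
    "clicked": [1, 0, 1, 1],
})

def window_ctr(df: pd.DataFrame, days: int) -> pd.Series:
    """Per-user click-through rate over a trailing window: one of many such features."""
    cutoff = df["ts"].max() - pd.Timedelta(days=days)
    recent = df[df["ts"] >= cutoff]
    return recent.groupby("user_id")["clicked"].mean().rename(f"ctr_{days}d")

# Now multiply this by every window, every signal, every entity type...
features = pd.concat([window_ctr(views, d) for d in (30, 90, 180)], axis=1)
print(features)
```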
Sonya Huang: Do you think that means feature engineering goes away as a discipline, or what happens to it?
Hema Raghavan: I think feature engineering goes away. And that’s not a bad thing as such, because prior to Kumo, I was at LinkedIn for almost seven, close to eight years. And data scientists love finding opportunities for the business to create value, right? And feature engineering isn’t where that time is well spent. You’d much rather try out N different models on N different parts of the app or whatever your business is, and drive value. So trying out models in different parts of your application is where a data scientist needs to spend time.
Sonya Huang: We started this episode, and you said AutoML on GPUs is different from AutoML.
Hema Raghavan: Yes.
Sonya Huang: And so what about GPUs specifically makes what you are describing possible? Like, was it even possible to do this on a CPU, or is it faster now? Or what’s different than what you’re doing on GPUs?
Hema Raghavan: Yeah, that’s a great question. So it’s definitely possible. It’s much slower, right? So it’s very similar to what neural networks brought to the text and image spaces, in that we can scale these models to large amounts of data. And while these models existed before the GPU revolution, we can now take an entire enterprise’s data, like fintech data, and learn graph neural networks over it.
Konstantine Buhler: Yeah. In the previous era of AutoML, so much of the juice in the performance came out of ensembles. So you’d do these logistic regressions or these SVMs or what have you, and then you’d ensemble them together. For me, in the Kaggle era, which is how I first met your co-founder Jure, and the data science era of Kaggle and the like, it was always the ensembles that won. Even in the Netflix Prize back in the day, it was the ensembles that won. And there was something to the fact that these ensembles are just tons of little algorithms chained together. And what is a neural network but tons of little algorithms chained together? I mean, you could consider it billions of sigmoids or billions of logistic regressions. And really, the way I see graph neural networks is you’re able to discover the features and the ensemble that you chain together to actually optimize towards the solution. So to me, graphs are the most general data type.
Hema Raghavan: Yeah.
Konstantine Buhler: And a graph neural network is the most general—you mentioned it’s a generalization where even a transformer is a subset of this generalization—the most general type of algorithm that can do some learning.
Hema Raghavan: Yeah, absolutely. And you mentioned ensembles. Something that struck me was: try maintaining that in production. You have N different feature generation pipelines and an ensemble. And I’ve seen a world where you’d have one front-end engineer change how we were logging the view data.
Konstantine Buhler: Yeah.
Hema Raghavan: And everything either needed to change or something—you know, one pipeline …
Konstantine Buhler: Breaks.
Hema Raghavan: Breaks.
Konstantine Buhler: And it’s all done.
Hema Raghavan: And it’s a mess to debug. So graphs give you a simple, elegant framework to get at the same outcome.
Graphs on the brain
Konstantine Buhler: It also reminds me a lot more of our brain.
Hema Raghavan: Yes.
Konstantine Buhler: Right? Our brain, we think, operates like a graph, and is forming and pruning connections more like a graph, even more so than a more structured neural network. And so have you guys experimented or thought about that as an analogy, and any ideas of the pros and cons of that analogy?
Hema Raghavan: I think it’s very similar to—the way I think about it is let’s go back to that video watching example, right? And if I think of Sonya as a node in a graph, and what these neural network algorithms are really good at is learning these embedding representations, right? And on this big graph, which has Sonya with her preference for baby shark and …
Sonya Huang: Or her household’s preference.
Hema Raghavan: Exactly.
Konstantine Buhler: Makes more sense. That checks out.
Hema Raghavan: Your embedding vector would be pretty close to the AI cluster, so you’re close to Konstantine, but you’re also close in Euclidean space, or, you know, in some big N-dimensional space, to all the baby shark-loving folks, right? And we’re basically learning these representations. So people, or all the entities in the graph, even Sequoia Capital in that case, become representations. So in that sense the idea is very similar, but what GNNs do is allow for arbitrary structure. And that’s where I think it’s a lot closer to the human brain, because I don’t think the human brain is wired as a linear sequence or as a grid, like images. Yeah.
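As a toy illustration of what “close in embedding space” means (the vectors and axes below are invented, and nothing here is Kumo-specific):

```python
# Toy embeddings: "close in some big N-dimensional space" as cosine similarity.
import numpy as np

emb = {
    "Sonya": np.array([0.9, 0.8]),            # axis 0 ~ "AI", axis 1 ~ "baby shark"
    "Konstantine": np.array([0.9, 0.1]),
    "baby_shark_fans": np.array([0.1, 0.9]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Sonya sits near Konstantine on the AI axis and near the baby-shark cluster too.
print(cosine(emb["Sonya"], emb["Konstantine"]))
print(cosine(emb["Sonya"], emb["baby_shark_fans"]))
```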
Sonya Huang: Can you say a word about how it works under the hood? Like, how are you able to—let’s say you go and work with, I don’t know, a food delivery service. How does it actually work for you to go and kind of, you know, automatically learn this graph representation? And how are you training models on that?
Konstantine Buhler: Or pick YouTube, given people might be watching us there, and we’re talking about AI baby shark in history already.
Hema Raghavan: Exactly. So there’s two pieces to Kumo. Historically, graph learning has been restricted to, I want to say PhDs in graph learning, because it’s not easy to view the world as a graph. People think in terms of relational data. That’s the most common data layout in companies. And that’s largely because of the analytics revolution that preceded the AI revolution. So everyone thinks in terms of relational data, but really relational data and graphs have a one-to-one mapping because you have data laid out in tables, usually an entity is a primary key in a table, and then you have all these relationships—primary key, foreign key relationships—which encode the edges in a graph.
So that automatic construction from a table layout to a graph layout is one of the innovations inside Kumo. The other bit is we’ve invented a language called predictive query language. And the language allows you to specify any machine learning problem in a few lines that look very much like SQL. So think of SQL with a predict clause. So we’ve created this very simple abstraction layer on top of relational data warehouses. There’s already a universe of people who are writing SQL queries, and we’ve created a language that resonates with them in some sense. So that’s one of the innovations of Kumo.
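A hypothetical sketch of what “SQL with a predict clause” could look like; the query syntax and the client call are assumptions for illustration, not Kumo’s actual product surface:

```python
# Hypothetical illustration of a predictive query. Neither the PREDICT syntax
# nor the client call below is the real Kumo API; both are assumptions.
churn_query = """
PREDICT COUNT(views.*, 0, 30) = 0   -- zero views in the next 30 days, i.e. churn?
FOR EACH users.user_id
"""

# predictions = kumo_client.run(churn_query)  # made-up client stand-in
print(churn_query.strip())
```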
The other one is running these graph neural networks. So once you go from relational to graph, just running graph neural networks at scale. And that again is something that has not been easy to do. There are a few companies in the world that can do it, and it usually takes a huge infrastructure team to build that out, because graphs inherently—unlike databases, where you can think of some logical partitioning, graphs …
Konstantine Buhler: It’s not a matrix.
Hema Raghavan: It’s all entangled. So how do you split it across different machines with limited memory and so on? So all of these bits coming together makes Kumo easy to use. But that said, when we go to a YouTube-like company, we’ll often talk to a data science team that is looking to get faster return on investment in AI. And Kumo makes that really easy, because if you have your data laid out as relational tables, Kumo just sucks it in. So you just specify through connectors, tell Kumo what your schema is, and then you can just start writing predictive queries, so the graph is abstracted away. But if you have someone like a data scientist who loves tweaking the neural network parameters …
Sonya Huang: In case Konstantine decides …
Konstantine Buhler: I would be interested.
Hema Raghavan: Yes, exactly.
Konstantine Buhler: Guilty.
Hema Raghavan: Exactly. You can look under the hood. And the analogy we always use is we’ll give you the self-driving car, but if you want to look under the hood or if you want to drive stick, we’ll let you drive stick.
Konstantine Buhler: So concretely in the YouTube example: historically, if I was in analytics at YouTube and I’m watching this video, I can look and say, “Hey, query all the AI videos.” There’d be some tagging or some system to understand all the AI content historically. Let’s see what the trends are over time. That’s querying the past.
Hema Raghavan: Yes.
Konstantine Buhler: What you’re saying is once you have this in this database, in this structure, you’re able to predict how many people are going to watch AI videos in the next several weeks? How many are going to watch baby shark videos? How much are they going to spend? What is going to be their monetization, what are their ads? What else can you do with this?
Hema Raghavan: So you can say, is this user going to churn, for example.
Konstantine Buhler: Yep.
Hema Raghavan: Right? And then you can say, “What’s the most relevant video that I want to show this user in order to retain them on my platform?” Right? So I want to drive value for my business. Given the past videos that they’ve watched, what’s the next video to watch? And so on. And we can also do demand forecasting. So we have customers—in fact, we have customers in the healthcare sector, and they use Kumo to forecast demand so that they’re well stocked in their emergency rooms, right? So the applications of using Kumo go from consumer to healthcare to fintech, where in fintech we see applications in fraud, for example. Just: is this user’s behavior suspicious? Should we flag the user? So on and so forth. So think of any question which asks “How much?” I love the use of “query the future.” How much? Is this event going to happen? What’s the next best action for this user from an action space? Those are all the kinds of questions that Kumo can help answer. And I love that you said ‘analyst,’ because Kumo aims to be as AutoML as you want it to be. And, you know, we also have a Python interface. So if you want to be a neural network expert, you can go all in.
Konstantine Buhler: Cool. So it’s the brain. It’s the brain, it’s the analytical brain.
Converting data scientists to believers
Sonya Huang: Out of the applications you discussed just now, I would imagine, you know, these are such classical ML problems. Each of them probably already has a five-person fraud team and a 15-person demand forecasting team behind it. What do those ML people think when Kumo is pitching the company? Walk me through that spiritual journey. And are you actually able to get results out of the box that are better than what a 15-person team maintaining it can do?
Konstantine Buhler: I’m okay with it as long as they don’t have VC prediction.
Hema Raghavan: [laughs] So for a lot of the companies we work with, the data scientist is excited about Kumo. As I mentioned, writing those feature engineering pipelines comes with maintenance jobs to keep those pipelines running, and that’s not where they want to spend their time. Data scientists in most companies are incentivized by direct business impact. So: did I push that ad CTR model out this quarter? Did it drive X percent revenue? So a lot of our customers will come to us and say, “You know what, I signed up for X percent revenue, but I’m only one third of the way there. Can you guys help us accelerate?” And we do a four-week POC. And within those four weeks—I’m trying to think of a case when we’ve not shown value, but I can’t remember one—we’ve always shown value.
Sonya Huang: Wow. So you convert them into believers.
Hema Raghavan: Into believers, yeah. And it’s about where you want to spend your time, right? Many times people will come in and say, “Oh, but feature engineering is where I spend all my time. How can you say that I don’t have to do it manually anymore?” But we’ll remind them of the NLP journey. And then, once they get hands on keyboard with the product, they realize that the journey in Kumo isn’t completely automated away, right? Because we say a data scientist knows their business well. So if you’re going to define churn prediction for your business, maybe, on YouTube, activity in the last 30 days is a good predictor of churn. So, you know, you want to bring your events table with a 30-day window.
Konstantine Buhler: Of a schema, the actual structure that you expose it to, yeah.
Hema Raghavan: The schema or the window, right? Because these are all parameters in the queries. Or you could play with 90-day or 365-day activity. You can write five of these queries and say, “Oh, really? On my system, the best predictor of churn is behavior in a 365-day window, and I didn’t even know that because I was spending all my time looking somewhere else.” So the data scientist spends a lot more time finding the relevant tables in their organization that are going to bring value, and then finding the right query formulation, or the right business formulation in this case: what’s the right definition of churn for my business? And once they see that, they realize that this is a lot more fun than what they were doing before.
Sonya Huang: Yeah, totally.
Hema Raghavan: Yeah.
Sonya Huang: You mentioned tables, structured data, schema. That naturally leads me to think about Snowflake and Databricks. A lot of companies have spent the last five years heavily investing in their data warehouses. How do you work with the data warehouses?
Hema Raghavan: Okay, that’s a great question. So at the outset, we started as a purely SaaS company, emulating a lot of the principles from the Snowflake architecture and looking at their success stories. One thing we realized, though, is that data scientists need to see value on their own problem. Because they’re so KPI- or business-impact focused, showing them value on a Kaggle data set doesn’t really count.
So the easiest way to show value is of course when they can connect to their own data. But connecting to your own data on a SaaS product means you go through a huge security review at the company, which in many organizations can take a couple of months. So we wanted to reduce that friction, and we started partnering with the warehouses to think about deployment models where compute can be closer to the data. And we have a deployment with Snowflake that uses what is called Snowpark Container Services. And really, Kumo can deploy as a container in Snowflake’s compute pool.
So from a data scientist’s point of view, we’re also a native app in Snowflake. So a data scientist in an organization, let’s say YouTube, can go in and click to install Kumo. So it’s like an app on your iPhone. It gets installed, and then they can start writing those predictive queries and looking for value. And oftentimes the security team is completely okay with it because there’s no data leaving the ecosystem. We have a very similar deployment model with Databricks, though in that case we manage the GPU compute, but data residency stays completely inside Databricks. So we started that from the point of view of letting data scientists get hands on keyboard with Kumo quickly. But we also realized that it freed us up a lot to not have to think about security, compliance and governance. The data warehouses are already building all of the tools and technology for management of data, so let the data stay there, let it be managed there. Kumo just, you know, talks to the data directly where it sits inside the warehouse.
Konstantine Buhler: Hmm. So you talked about relational versus graph data. And relational data is kind of how many of our brains have been taught to think.
Hema Raghavan: Yes.
Konstantine Buhler: We think about things in spreadsheets, oftentimes. We might go down and say, if we have a series of AI videos, you have them as rows, and then you have some descriptors of them as columns. But really, when you start to see things as graphs—which I did, frankly, back in the day, around Jure’s time as a professor—you can start to see everything as a graph. It’s the most general data type. And when you start to see things as graphs, it’s actually kind of how our brain thinks.
Hema Raghavan: Yes.
Konstantine Buhler: Hey, here’s a video. And that has a pointed characteristic that’s some other part of the graph, which is connected to another part. How do you ingest all of this relational data, which is the way the world has been run, the way computers have been run, for 50 years, and put it into a graph structure? Sounds like a very heavy lift. And doing that inside of Snowflake and Databricks is probably pretty hard.
Hema Raghavan: I want to say that’s part of the magic of Kumo, right? And that was the friction that prevented graph learning from taking off, keeping it within …
Konstantine Buhler: The big companies.
Hema Raghavan: Yes, exactly. The few companies that could hire these individuals. But really, it is a question of—we have a unified schema that’s the graph schema. And looking at the relational schema, we’re able to identify what the entities are. So in that YouTube example, it’s a video ID, it’s a user ID, it may be a channel ID and so on. And often those are primary keys. And then it’s a lot of, I want to say, SQL-like or Spark-like code that runs under the hood that converts this data to the graph format.
Konstantine Buhler: I see. I see. It makes sense.
Hema Raghavan: And beyond that, once we get to the graph, there’s an edge index. So we store all of the edges in a very proprietary and compressed format, and then we distribute out the nodes. Okay? Because we realized that edges …
Konstantine Buhler: To GPUs. Distribute them to GPUs or …?
Hema Raghavan: To CPUs, because we wanted to keep costs low. We only reserve the GPUs for training, when we are doing the learning. We store the edges in what we call the graph engine, and then we have a column store where we store the features. So we can bring in arbitrary features that represent the users, right? So everything about Sonya that we can infer—we’re not constrained by memory; it all just horizontally scales. Everything on the CPU machines horizontally scales. And we’re only using the GPUs for message passing.
Konstantine Buhler: Cool. Amazing, in fact.
Hema Raghavan: Yeah. So that has also reduced costs for our customers. And they’re often surprised that we can run graph learning at the scale that we do, at the costs that we do.
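A rough sketch of the split being described: edges and features live in compact CPU-side structures, and only sampled mini-batches touch the GPU. The data layout and names are illustrative, not Kumo internals:

```python
# Rough sketch: edge index and features on CPU, GPU only for message passing.
# The layout and names are illustrative, not Kumo internals.
import numpy as np
import torch

# CSR-style edge index: compressed, cheap to store, horizontally scalable on CPUs.
indptr = np.array([0, 2, 3, 3])   # node i's neighbors are cols[indptr[i]:indptr[i+1]]
cols = np.array([1, 2, 2])
features = np.random.rand(3, 16).astype(np.float32)  # stand-in for the column store

def neighbors(node: int) -> np.ndarray:
    return cols[indptr[node]:indptr[node + 1]]

device = "cuda" if torch.cuda.is_available() else "cpu"
batch = neighbors(0)                               # neighbor lookup stays on CPU
x = torch.from_numpy(features[batch]).to(device)   # only the mini-batch moves to GPU
h = x.mean(dim=0)                                  # one toy message-passing aggregation
print(h.shape)
```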
Sonya Huang: You’ve made the comparison to large language models a couple times.
Hema Raghavan: Yeah.
Connections between graph world and LLM world
Sonya Huang: I’d be remiss not to ask what are the connections between your graph world and the LLM world? And are there—you know, are there synergies between the two?
Hema Raghavan: Absolutely. What a great question. And so many synergies. So let’s take the example of this podcast once it’s published. It’s gonna get transcribed by an LLM. So you have all of the summaries, you have all of the semantic information that will come from the large language models, right? A graph neural network can actually take all of those features, the semantic representation that is inferred for this particular video, as node features. And what the GNN is learning is it’s learning across all of the interactions that one may have.
Now let me give you another example. We have a demo of this up on our website or on our LinkedIn channel. But an example would be a lot of people think of the LLM revolution as creating chatbots, okay? So let’s say you come to a clothing store and you are searching for yellow summer dresses. So you search “yellow summer dresses.”
Konstantine Buhler: All the time.
Hema Raghavan: Yes. And you’re not a logged-in user, and the LLM is going to probably get you a really good set of things that look like yellow summer dresses. But if you were a logged-in user, and we knew all of that information about the kind of interactions that you’d had in the past, we can actually use Kumo’s predictions to inform the LLM. So think of RAG, and think of Kumo predictions as feeding a RAG algorithm to ground its output closer to what is personalized. So you can do that as well. So there is the bringing in of features, but they’re also complementary, because Kumo brings you all of that personalization based on all the behavioral data that the app has, which the LLM doesn’t take into consideration.
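A minimal sketch of the pattern being described, where per-user predicted scores re-rank retrieved items before they reach the LLM prompt; all function and variable names are invented:

```python
# Minimal sketch: predicted per-user scores re-rank retrieved items before
# they reach the LLM prompt. Every name here is invented for illustration.
def retrieve(query: str, catalog: list[dict]) -> list[dict]:
    """Stand-in for semantic retrieval, e.g. a vector-store lookup."""
    return [item for item in catalog if query.lower() in item["desc"].lower()]

def personalize(items: list[dict], user_scores: dict) -> list[dict]:
    """Re-rank retrieval results by a predicted affinity score for this user."""
    return sorted(items, key=lambda it: user_scores.get(it["id"], 0.0), reverse=True)

catalog = [
    {"id": "d1", "desc": "yellow summer dress, linen"},
    {"id": "d2", "desc": "yellow summer dress, floral"},
]
user_scores = {"d2": 0.91, "d1": 0.34}  # e.g. predictions for a logged-in user

candidates = personalize(retrieve("yellow summer dress", catalog), user_scores)
prompt = "Recommend one of: " + "; ".join(it["desc"] for it in candidates)
print(prompt)  # the LLM now sees personalized, grounded candidates first
```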
Konstantine Buhler: You mentioned RAG and we’ve talked about graphs. Graph RAG is having a moment in AI right now in general.
Hema Raghavan: Yes.
Konstantine Buhler: Thoughts on Graph RAG, which is different from our approach at Kumo, and then also on how it differs from using a graph neural network to do certain inferences tied to some sort of RAG.
Hema Raghavan: Yeah, so graph RAG is a lot closer to what we just talked about. But many organizations may have knowledge graphs, and that’s another entire field of study in graphs. Think medical domains, for example. You have all of your insurance codes, and how the insurance codes connect with each other. You have symptoms, you have all of that. There’s a lot of knowledge bases sitting out there with interconnected nodes. Graph RAG allows you to ground your LLM output in the answers that come from these kinds of knowledge graphs instead of going to a search index. So you can think of RAG as going to a search index, a knowledge graph, or a recommender system like Kumo, and making the LLM output more grounded, so it hallucinates less.
Explainable AI is table stakes
Sonya Huang: I want to ask about explainable AI. One of the things we’ve been discussing in prior episodes of the show is these LLMs: will we ever be able to understand how they think? And I remember the Anthropic results were really interesting. How do you think about explainability when it comes to Kumo’s models?
Hema Raghavan: That’s such a great question, Sonya, because for the kinds of problems and customers that we work with, this was another area where we had to actually develop a solution. We have customers in insurance and healthcare, and they often need to understand why a recommended output was recommended to them. You want to know that the model didn’t over-rotate on race, color, ethnicity and so on and so forth, right? And so it became table stakes for us to actually solve this problem. And at Kumo, we’ve innovated by developing an algorithm which, after training, looks at the graph and at the gradients, and can come down to the table level to say: these were the tables that were used, these were the columns that were used. And we have some early results that show that we can even come down to the instance level and say, here’s an instance, here’s the score, and here’s why Sonya was recommended that video: because of these specific features.
Konstantine Buhler: Wow.
Hema Raghavan: So it was table stakes just given the domain we were going into.
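As a toy illustration of the general technique gestured at here (gradient-based attribution rolled up to column and table level); this is not Kumo’s actual algorithm, which isn’t public in this transcript:

```python
# Toy gradient-based attribution rolled up from features to tables.
# This is only the general technique; Kumo's actual algorithm is not shown here.
import torch

model = torch.nn.Linear(3, 1)  # stand-in for a trained scoring model
x = torch.tensor([[0.2, 0.9, 0.1]], requires_grad=True)  # one instance's features
model(x).sum().backward()

# |gradient * input| as a per-feature attribution for this single instance...
attrib = (x.grad * x).detach().abs().squeeze()

# ...then summed over the columns belonging to each source table.
columns = ["views.ctr_30d", "views.ctr_90d", "profile.age"]  # invented names
by_table: dict[str, float] = {}
for name, a in zip(columns, attrib.tolist()):
    table = name.split(".")[0]
    by_table[table] = by_table.get(table, 0.0) + a
print(by_table)  # e.g. {'views': ..., 'profile': ...}
```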
AI growth at LinkedIn
Konstantine Buhler: So we talked a lot about AI and graph learning. You have been in the AI space for a long time, Hema. Can you tell us a little bit about what you developed at LinkedIn, and specifically what AI growth was at LinkedIn? Maybe, if you can, how graph neural networks helped there, and the types of challenges that you dealt with operationalizing AI at really, really large scale?
Hema Raghavan: Yeah, that’s a great question. So I joined LinkedIn just a couple of years after the IPO, and AI was making its way into various products. I joined the growth team, and the team I first led was the “People You May Know” team. And People You May Know is all about graphs; it’s about large scale graphs. And the amazing thing about LinkedIn was how closely tied People You May Know—because it’s a social network—was to our core consumer metrics.
So I had come to LinkedIn as an AI researcher, and I suddenly found myself responsible for one of its core KPIs, which was sessions and monthly active users. And by then I’d also started owning notifications, which is a huge part of the growth ecosystem at LinkedIn. And along the way, we had to start operationalizing AI. And before MLOps became a word, we were actually thinking about: from the time when you deploy a model, how do you measure, how do you A/B test, and then how do you maintain a model in production? We would see models degrade in production. We would see …
Konstantine Buhler: Why is that, by the way? Why did you see that so frequently? If the graph wasn’t losing nodes or edges, why would it degrade over time?
Hema Raghavan: Because it depends on your business problem and the kind of behavioral change. People behave on LinkedIn in the new year very differently from summer break, right? So creating those pipelines which do auto-training, those all became very important. And then when you talk about scale, that was an interesting problem as well, because I started at LinkedIn when we were, I think, 400 million members, and then it was rapidly growing. So that’s when we had to start thinking about infrastructure, and the fact that we can’t just keep horizontally scaling the CPU-based algorithms. So what would be more efficient ways to run AI models in production? Graph neural networks are at LinkedIn now, and I want to say there’s been an amazing team that took it forward after I left as well. It took about four to five years and many, many, many engineers to build. But now it powers everything from the ads to the feed to jobs and so on. They published a paper about it recently.
The data sophistication curve
Konstantine Buhler: So you guys, the founders—there are three founders at Kumo—were senior AI leads at LinkedIn, at Airbnb, at Pinterest. All of those are massive scale, and also really sophisticated. That can be, I think, intimidating for smaller companies that have problems and say, “Wait, this is a champagne problem. This is what the hyperscalers with hundreds of millions of users have, and we don’t have nearly the same problem.” Is that true? And what kinds of companies are not a good fit for graph learning?
Hema Raghavan: That’s a great question. So I think there are two things. The first is the reason why we ended up inventing predictive query language: what we needed to do was create a platform that was super easy to use, right? And many of the other companies had few data scientists and large numbers of potential avenues where they wanted to bring AI. So oftentimes what we would get is, “Hey, we would love to be a LinkedIn, an Airbnb or a Pinterest, but we can’t put so many people on it.” So giving them that easy-to-use interface, and giving them that managed infrastructure at scale, actually lets us get in. But that said, when is Kumo not a fit? Kumo is not a fit if you’re so early on that you haven’t figured out your data landscape.
So sometimes we’ll talk to customers who are super excited about Kumo, but they haven’t figured out how to measure the value of AI. Or sometimes we’ll talk to customers and they’re still in spreadsheets, and they’re moving to one of the warehouses. So we’ll say, “You know, get the data layout settled.” It’s like building a city, right? You’ve got to have your roads and the foundation first, and then the vehicles come on it. So we’ll wait, and customers come back in a year or so.
Konstantine Buhler: But there’s no category or type of problem. It’s more a data sophistication.
Hema Raghavan: Exactly.
Konstantine Buhler: I see.
Hema Raghavan: Yeah. But you don’t have to be so far along the sophistication curve. You don’t have to be an Airbnb or LinkedIn.
Konstantine Buhler: Yeah, it’s table stakes. You know certain KPIs that can be optimized by an algorithm.
Hema Raghavan: Yeah.
Konstantine Buhler: You can quantify certain things, A. And then B, you have access to that data in something that you can plug into, like a data warehouse.
Hema Raghavan: Yeah. And the way I’d look at the evolution of an organization and its data: you, of course, have to know what your product-market fit is. After that, you start figuring out your data ecosystem. You build your data ecosystem for analytics, because now you’ve built a product, and you’ve got to start measuring what that product is doing, what the behavior is. And that’s when leaders usually start thinking about AI, which is, “Okay, now I know how to query the past, but I now need to start bringing in AI to instrument the change that I need in the ecosystem.”
Vision for the future
Sonya Huang: I’d love to close with some questions about your vision for the future. Maybe—you know, you’ve been in AI for a long time. You mentioned you were in NLP before BERT was a thing.
Hema Raghavan: Yeah.
Sonya Huang: What are you most excited about in AI most broadly?
Hema Raghavan: Oh, I think I’m most excited, very broadly, about the productivity gains it’s giving all of us. I mean, Kumo is one part of it, but just how we write documents, or think about health, right? If health improves, productivity improves. For example, if you’re just using one of those health apps that do monitoring but nudge you toward behavioral change, that’s better health, and it’s better productivity. So what I’m most excited about is how we’re going to evolve as a human race with all of these productivity gains.
Konstantine Buhler: What about technically? What features or approaches or algorithms or venues do you think are going to be most interesting?
Hema Raghavan: I’ve always found the big innovations come at the intersection of hardware and software. And I think while GPUs were invented for graphics, there’s probably something more that has to happen on the processor side so that you can scale these graph neural networks or neural networks further, make models maybe less expensive. So I’m looking forward to that technically.
Sonya Huang: What about your vision for the future of Kumo? What can we expect to come out of the product in the future?
Hema Raghavan: So in terms of Kumo’s vision, I’m actually really excited about the kinds of apps people are going to build with Kumo. We’re starting to see people plumb Kumo with LangChain and Pinecone, and put together apps like the one we talked about, like the, you know, chat agent that recommends for you the yellow summer dresses, right? So I’m very excited about the top layer of applications that are going to get built on top of Kumo, and what that’s going to power.
Konstantine Buhler: So Hema, one of the star qualities about you: you’re incredibly technically deep, but you’re also really good at culture. If you talk to anyone at Kumo, there’s basically been no regrettable turnover ever. And you guys hire some of the best PhDs in the world in ML, and certainly in graph ML. How do you do that? What have you done to make the Kumo culture exceptional, and to have so much retention within your team?
Hema Raghavan: I was at a leadership training once, and we had to think about what our true north was. The ‘true north’ concept defines a value that is who you are: the kinds of problems you solve in your work and what you bring to the table. And for me, it’s always about empowering people to do more than what they think they can. So it’s common to talk about empowering people to reach their full potential, but it’s those a-ha moments, like, “Wow, I built this!”
So you hire a smart team. I think good managers step away, but keep their eye on how the team is operating. And you get people to innovate, get people to own what they’re building and see that vision. You want to hire people around a value, whether it was LinkedIn, where the value was economic opportunity, or Kumo, where the value is building an AI platform that makes AI easy to use. When you bring smart people together who rally around the same value, that’s when magic happens. And my job is to just let the magic happen.
Konstantine Buhler: Hema, why is there passion around relational data? You could have done—we’ve talked about this sometimes in the past. You could have used graph learning as a different type of architecture to do language models, or you could have done graph learning to do any sort of AI.
Hema Raghavan: Right.
Konstantine Buhler: Once you’ve figured out at scale, how to do the generalization, why can’t you do the specifics by taking this big marble block and carving away all the nodes and edges until you get to a superior architecture. But you decided to do it on relational data. Why is that? Relational data usually is not the most exciting thing in the world for most people.
Hema Raghavan: Yeah.
Konstantine Buhler: But it is your life passion.
Hema Raghavan: Because nobody else was doing it. And there’s so much data in relational format, and that was such a pain in our past jobs. So I feel like there’s magic happening in the core area of NLP, and I’m happy to see that revolution and all the investment that’s happening there: people, money and so on. But there’s this whole workload that’s out there, a whole set of data scientists who work on those workloads. How do we bring that magic to them? So it’s really about the opportunity, and the past pain that each of us saw in our previous jobs.
Sonya Huang: Okay, I have one last question. For the young Konstantines out there who are watching this episode on YouTube, what advice do you have for aspiring AI engineers who want to really make a dent in the field in the future?
Konstantine Buhler: She has a specific yellow dress recommendation.
Hema Raghavan: [laughs] I would actually say tools come and go, languages come and go. I know there’s a lot about learning Python and taking the class on the latest deep learning, but I would say don’t skip your probability and linear algebra classes, because whatever method has been there in the last several decades, it’s always come down to core linear algebra and probability. So don’t skip those classes.
Sonya Huang: Great.
Konstantine Buhler: And mine is: there’s a lot of graph enthusiasts out there. When do graph neural networks take the main stage in the AI revolution? Best guess for timeline.
Hema Raghavan: I think we’re getting there. I was at a conference called KDD recently. It’s one of the biggest data mining conferences: a lot of academics, a lot of industry folks, and more than half the papers were on graph neural networks. So I think we’re sitting at that explosion. It’s going to happen.
Sonya Huang: Thank you, Hema. This was fantastic.
Hema Raghavan: Thank you, Sonya. And thank you, Konstantine. It was lovely being here.
Mentioned in this episode:
- Graph Neural Networks: Learning mechanism for data in graph format, the basis of the Kumo product
- Graph RAG: Popular extension of retrieval-augmented generation that grounds LLM output in knowledge graphs
- LiGNN: Graph Neural Networks at LinkedIn paper
- KDD: Knowledge Discovery and Data Mining Conference