Dust’s Gabriel Hubert and Stanislas Polu: Getting the Most From AI With Multiple Custom Agents

Founded in early 2023 after spending years at Stripe and OpenAI, Gabriel Hubert and Stanislas Polu started Dust with the view that one model will not rule them all, and that multi-model integration will be key to getting the most value out of AI assistants. In this episode we’ll hear why they believe the proprietary data you have in silos will be key to unlocking the full power of AI, get their perspective on the evolving model landscape, and learn how AI can augment rather than replace human capabilities.

Summary

As we move from deterministic to stochastic technology experiences, having optionality and flexibility in our workflows will be key to getting the most out of AI tools.

Embrace a Stochastic Mindset: Gabriel and Stan discuss the shift from deterministic to stochastic technology and emphasize the importance of adopting a mindset that embraces variability and uncertainty. Founders and builders should get comfortable with tools that provide drafts or suggestions rather than definitive answers.

Multiple Models and Flexibility: The discussion underscores the importance of using multiple AI models rather than relying on a single one. This allows businesses to switch between models based on specific use cases, data sensitivity, and performance requirements. Teams should consider building systems that can integrate and switch between various AI models to optimize outcomes.

Reasoning Capabilities and AI Progress: While AI models have improved in many areas, reasoning capabilities have not advanced as quickly. Gabriel and Stan think OpenAI’s o1 model represents an incremental advance in reasoning but not a major breakthrough. This highlights an opportunity for founders to focus on enhancing the reasoning abilities of AI systems, which could lead to breakthroughs in AI applications.

Human-AI Collaboration: The episode emphasizes that AI should be used to enhance human capabilities rather than replace them. Gabe and Stan see great promise in AI tools that work alongside humans to improve productivity and decision-making processes.

Enterprise AI Deployment: The conversation points out the challenges and opportunities in deploying AI within enterprises, particularly in terms of data integration, security, and user adoption. Founders should prioritize building AI solutions that are easy to integrate into existing workflows and that address enterprise-specific needs, such as data privacy and compliance.

Transcript

Contents

Introduction

Gabriel Hubert: We’ve asked the entire world to move from calculator technology—punch the same keys, you’ll get the same results—to stochastic technology—ask the same question, you’ll get a slightly different result. This has not happened. This is the biggest shift in the use of the tools that we have since the advent of the computer. We’re asking an entire cohort of the workforce to move to a stochastic mindset. And the only way you get that is by having a risk-reward ratio that you’re comfortable enough with. You know what? I’m not asking it to be right 100 percent of the time. I’m asking it to give me a draft that saves me time many, many, many times over. And that distribution of ROI is something that I’m comfortable exploring with and iterating on.

And I think that that is really one of the predictors that we see in people who’ve tried ChatGPT, or in people who are just curious with new technology is they expect that some of it’s gonna be a bit broken. But the upside scenario to them is so clear and so 10X that they’re willing to make that trade off or that local risk to get things started.

Konstantine Buhler: This week we welcome Gabriel Hubert and Stanislas Polu, the co-founders of Dust, a unified product to build, share, and deploy personalized AI assistants at work. Founded in early 2023 after spending years at Stripe and OpenAI, second-time founders Gabe and Stan started Dust with the view that one model will not rule them all, and that multi-model integration will be key to getting the most value out of AI assistants. They were early to be convinced that access to the proprietary data you have in data silos will be key to unlocking the full power of AI, and they know that you want to keep that data private. We’ve worked together for 18 months and their predictions have been consistently prescient. So today we decided to ask them about those predictions.

We’ll get into their perspective on how they see the model landscape evolving, on the importance of product focus over building proprietary models, and on how AI can augment, rather than replace, human capabilities.

Stan, Gabriel, welcome to Training Data.

Stanislas Polu: Thank you. Glad to be here.

Gabriel Hubert: Yeah. Thanks, Konstantine. Super happy to be here.

One model will not rule them all

Konstantine Buhler: Guys, first thing that I want to ask is you started this company in early 2023. At the time, it seemed like one model might rule them all. And that model at the time was—I think it was probably GPT-3.5. I don’t know if GPT-4 had come out yet, but that was way ahead of the curve and people were super blown away. You guys came out with a pretty contrarian view that there actually would be many models, and that the ability to stitch those together and do advanced workflows on top of that would be important. So far you’ve been completely right. How did you get the confidence to make that decision a year and a half ago?

Stanislas Polu: Yeah, I think on the model part, it was clear that many labs were already emerging. It was not clear to the general audience, but for the people who knew the dynamics of the market, many labs were emerging. And I think it was kind of natural to us that there would be competition in that space, and as a result there would be value in enabling people to quickly switch from one model to another to get the best value depending on their use cases.

Gabriel Hubert: Yeah. And I think from a user standpoint, the point on being able to quickly evaluate and compare is obviously important. Looking ahead, or already in some of the conversations we’re having, it seems that the levels of scrutiny, security and sensitivity of the data that’s being processed may also influence some different use cases. And so, excitingly, we’re seeing people thinking about running smaller models on device for some use cases. And you can imagine a world where you want to be able to switch between an API call to a frontier model for something that’s less sensitive but absolutely crucial to get cutting-edge reasoning capabilities for, and some smaller classification or summarization efforts that could be done locally, while the interface that you use for your agent or your assistant remains the same. And that switching sort of requires the ability to have a layer on top of the models.
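
(To make that switching layer concrete, here is a minimal sketch in Python. It illustrates the pattern Gabriel describes—one interface, with requests routed to a frontier API or a local model—and is not Dust’s actual implementation; the names and the routing rule are assumptions.)

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt: str
    sensitive: bool  # e.g. contains customer or regulated data
    task: str        # e.g. "reasoning", "summarization", "classification"

def call_frontier_api(prompt: str) -> str:
    # Stand-in for an HTTPS call to a hosted frontier model.
    return f"[frontier model] {prompt}"

def call_local_model(prompt: str) -> str:
    # Stand-in for a small model running on device.
    return f"[local model] {prompt}"

def route(req: Request) -> str:
    # Sensitive data and cheap tasks stay on device; hard reasoning
    # goes out to the frontier API. The user-facing interface is unchanged.
    if req.sensitive or req.task in ("summarization", "classification"):
        return call_local_model(req.prompt)
    return call_frontier_api(req.prompt)

print(route(Request("Summarize this contract.", sensitive=True, task="summarization")))
```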

Konstantine Buhler: You guys have been right about this every time, as you’ve called this out. And so many of your predictions over the past couple of years of partnership have been non-obvious and then correct. I think this is still not obvious—as in, there will be many models, you’ll have some local models, you’ll have some API calls, and then you actually as a customer want to choose between them or want to have some control. First of all, why do you think that will be? As in, why will there be multiple models? Secondly, why doesn’t that get abstracted away by some sort of router mechanism, some hypervisory layer? Does that happen? And would you be that hypervisory layer? Yeah, help me understand that.

Stanislas Polu: I think there’s really two modes of operation if we think about the future. So it’s a bimodal distribution basically of the future. There’s one where the technology as it stands today keeps progressing rapidly, in which case there’s still going to be competition from rather big labs because there’s going to be incredible need for GPUs to build those larger and larger models, because the only way we know to get those models better today is mostly by scale. In that world, this kind of dynamic of being able to switch to the best model at time T will remain true for a long time, I guess, until we reach whatever it is that is at the end of that dynamic.

And then there’s the hypothesis, and we can talk about that later in more detail, but the hypothesis of maybe the technology plateauing, in which case it’s not going to be one model, it’s going to be a gazillion models, and eventually everybody will get their model. And eventually on your MacBook M6 you’ll be able to train a GPT-6 in a few hours, in a couple of years. And then the router need kind of disappears, because the technology is really commoditized in terms of just producing the tokens. And every company will have their own model. We would have our own model in that world.

Pat Grady: Well, we’ve got to push you on that. You’re building a business where you sort of win regardless of which one of those worlds we go into. Which one of those worlds do you think we’re going into?

Stanislas Polu: [laughs] This one is definitely tricky. I mean, it’s interesting because—so in terms of capabilities of the models, we’ve seen or had the perception that the ecosystem was moving very quickly over the past two years. We’ve seen larger context, support for audio, support for image and stuff, but at the same time, the one core thing that matters for changing the world is the reasoning capabilities of those models, right?

And the current reasoning capabilities of those models have actually been pretty flat over the past two years. They are at the level of GPT-4 at the end of its training, which, if I remember correctly, was slightly over two years ago for the end of the internal training at OpenAI. And so that means that over the past two years, in terms of reasoning capabilities, it’s been somewhat flat.

Reasoning breakthroughs

Pat Grady: Well, so hang on. So there’s the Kevin Scott point of view, which is there’s actually exponential progress, but you only get to sample that progress every so often. And so in the absence of a recent sample, people interpret it as having been flat, when in fact there’s just sample bias. You can’t see it. So do you think he’s right? Do you think there’s actually exponential progress, we just haven’t gotten to see it yet? Or do you think it’s actually like asymptoting and reasoning breakthroughs have not progressed at the rate one might hope?

Stanislas Polu: As far as I’m concerned, I have a strong feeling that it hasn’t been moving as fast as I would have expected in my most optimistic views of the technology. And so that’s why I’m allowing myself to ask the question or simply consider the different scenarios.

Pat Grady: I think one of your predictions for 2024 also is that we would have a major reasoning breakthrough. Do you think it’s coming?

Stanislas Polu: Yeah, it’s going to be a tough one, because this one hasn’t come, for sure. Even GPT-5—or GPT-N+1 or Claude-N+1, it doesn’t matter who cracks it first—hasn’t come yet. And there are many reasons to believe it might not be a core technological limitation. You can make many hypotheses as to why it might be the case that it takes time. The scale of the clusters required to train the next generation of models is humongous, and it involves a lot of complexity from an infrastructure and programming standpoint. Because GPUs fail when you scale to that many GPUs per cluster—they fail pretty much all the time. All the training is very synchronous across the cluster, and so it might just be the case that scaling up to the next order of magnitude of GPUs needed is just very, very, very hard. And that wouldn’t be an inherent limitation, it’s just a phase where we learn how to go from red one to red five, but for GPUs, basically.

Konstantine Buhler: Stan, you were at OpenAI at a pretty critical point in time. So people know you from the Dust experience, but one thing that has to be remembered about Stan is you were a critical researcher at OpenAI from 2019 through late 2022. You’ve got a bunch of wonderful publications, some of them relating to mathematics and AI. You worked on these with Ilya Sutskever and the crew at OpenAI. Do you think that mathematics will be essential to this type of reasoning breakthrough, or is it orthogonal—something that we’re actually going to learn on textual or language data?

Stanislas Polu: I remain quite convinced that it’s a great environment to study. And it was the thesis that we had at the time with Guillaume Lample, who then founded Mistral. He was working at FAIR on exactly the same subjects. And our motivation was really shared at the time. We really were frenemies competing in the same space, but really driven by the same ideas. And I think the idea there was that mathematics, in particular in its form of formal mathematics that gives you perfect verification, is a very unique environment to study reasoning capabilities and to push reasoning capabilities, because you have a verifier. So you’re not constrained by the ability to verify the predictions of the model, which in an informal setup would require humans checking them to some extent. And so that very bit is probably something that has to unlock something at some point. It hasn’t yet for many reasons, but at some point it should unlock something. So I remain extremely bullish on the kind of maths and formal maths and LLM studies.
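
(To illustrate the “verifier” property Stan describes, here is a tiny, purely illustrative Lean 4 example. The kernel either accepts this proof or rejects it, so a model proposing it can be checked automatically, with no human grading of its output.)

```lean
-- A machine-checked proof: the verifier gives a perfect reward signal,
-- which is what makes formal maths a unique environment for studying
-- reasoning in LLMs.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```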

Gabriel Hubert: Yeah, I remember one of the ways you were presenting it to me when I was still very much ramping up was: maths is the door to software, software is the door to the rest. And starting with some of the critical systems that were the only ones to have been hand-proven and hand-verified as an example of how much more costly it was to do it by hand than by machine, and an indication of the future gains we could expect from being able to extend that and democratize it.

Konstantine Buhler: You guys see a lot of action through the Dust API calls. When you build a Dust assistant, you’re able to choose what type of underlying model to use. You’re able to call many different models. Like me as a user, I often call not just Claude 3, but I call GPT-4, and I call the Dust assistant, and I call my custom assistants. I select one of many options. What have you guys seen in terms of trends? What’s performing really well? I’ve personally been super impressed by the Anthropic models as of late, but you guys have a much closer view of that.

Gabriel Hubert: I think the—I mean, so a word of caveat on trends: you’re going to have the usual cognitive biases. The grass is always greener; people are going to want to switch just to see what it looks like on the other side. And so when you’re observing those switches, you’re not necessarily observing a conviction that the model on the other side is better, you’re observing the conviction that people want to try. But it is true we’ve gotten great feedback on Claude’s latest Sonnet release, and empirically we’re seeing some stickiness on that model in our user base. I think that word on the street is for some coding applications, Codestral is actually performing very, very well. We haven’t yet made it available through Dust, but …

Stanislas Polu: We did yesterday.

Gabriel Hubert: Ah, there we go. Sorry. See? This is the thing you get from being in San Francisco and waking up to do a recording at seven o’clock in the morning. So yeah, Codestral apparently is really interesting on some coding capabilities. And then you have to mix it in with the actual experience that people are getting. So reasoning cannot be fully made independent from latency. Latency at some points last year could basically be a way to tell the time in San Francisco. You know, you could see latency literally in the API as people were waking up on the West Coast. So people have use cases that may be more or less tolerant of those. So we cover the Gemini models, Anthropic’s models, OpenAI’s and Mistral’s right now, and we have seen some interest in moving away from the default, which when we first launched was OpenAI’s models. Not to say that GPT-4o isn’t performing very, very well.

The future of the open source ecosystem

Konstantine Buhler: Over the past year there’s been a lot of enthusiasm about open source models. And it’s actually one of your predictions, Stan. You have these great predictions every year about AI. I always really enjoy reading them. One of them was that at some point this year an open source model would briefly take the lead in LLM quality. That doesn’t seem to have happened yet. And it also seems like the enthusiasm around—not the enthusiasm around, but rather the lead/acceleration of the open source models in comparison to the closed source models has maybe slowed down a little bit—maybe back to that Kevin Scott point about resampling at discrete times as opposed to continuous times, and we just haven’t seen it yet. But where do you think the open source ecosystem is going to go? Will it actually at some point surpass the closed source ecosystem?

Stanislas Polu: I mean, that echoes what we said earlier. It’s really in that bimodal distribution. There’s one distribution where open source goes nowhere and there’s one distribution where open source wins the whole thing, right? Because if the technology plateaus, open source obviously catches up and eventually everybody can train their high quality models themselves. And at that point there is no value in going for a proprietary model. So I think there’s a scenario where open source really is the winner at the end, which would be a fun turn of events, obviously.

And then in the current dynamic it’s true that open source has been lagging behind so far. Obviously, I think the one that has to be called out is really Facebook or Meta’s efforts, because they have what it takes to train an excellent model, and so far they’ve been releasing every model very openly. And so it’s exciting to see what will come out of them in those next four months to maybe make the prediction true. The caveat to that is that, assuming the best models are the largest—which is a somewhat safe assumption, yet it can be discussed—that model will be humongous to some extent. And so that means that even if it’s open source, nobody will be able to make it run, right? It’ll just cost too much money. You’ll need eight GPUs just to do inference. And so that will really limit the usage of those models, even if they’re better, given the current cost of running them.

Gabriel Hubert: It’s an interesting point on consumption, because that means that you might still have a world where there’s a lot of API-based inference and demand for API-based inference, regardless of whether the model on the other end is controlled, hosted, open weights, whatever, just because of the technical ability to perform it.

Model quality and performance

Pat Grady: One of your founding assumptions kind of related to model quality and model performance—and this goes back almost two years now—was that even as of two years ago, the models were powerful enough and potentially economically viable enough that you could unlock a huge range of unique and compelling applications on top, and that the bottleneck even at that point was not necessarily model quality so much as product and engineering that can happen on top of the model. I don’t know if that’s a consensus point of view today. We still hear a lot of people who are sort of waiting for the models to get better. For what it’s worth, we happen to agree with you. But the question is: what did you see in 2022 that gave you that point of view? And if we fast forward to today, what has your lived experience been deploying this stuff into the enterprise in terms of where are the product and engineering unlocks that need to happen to bring this stuff to fruition?

Stanislas Polu: My triggering point for leaving OpenAI was seeing and playing with GPT-4. And it was coming from two very contradictory motivations. The first one is: I see GPT-4, it is crazy useful, nobody knows about it, nobody can use it yet, and still it exists. And literally it’s almost already in the API. I mean, at the time it was GPT-3.5 in the API, which was kind of a slightly smaller version of GPT-4, but on the same training data. It was a crazy good model—it was basically Codex, the base model. And it was much better than ChatGPT, and it was available in the API. And yet the ARR of OpenAI was ridiculously small at the time, barely in existence by all standards of what we see today. And so that was kind of the motivation, and that was mixed with the fact that I had the intuition that it would be hard to invent an artificial mathematician with the current technology. And so I was kind of seeing not a dead end, but a very long, slow path forward on what I was working on. At the same time I was seeing the utility of those models already when you use them for your day-to-day tasks.

And so that was the first motivation. And the very contradictory motivation that I shared with Gabriel at the time was if that technology goes all the way to AGI, it’s the last train to build a company. So we better do it right now, because otherwise next time it’s going to be machines.

Pat Grady: [laughs]

Stanislas Polu: And I absolutely didn’t answer your question, but I’ll let Gabriel answer your question.

Gabriel Hubert: I think what got me excited when we did start brainstorming on the ways to deploy this raw capability in the world, where it made sense to dig, was one insight on some of the limitations of the hype around fine tuning at the time. People were talking a lot about fine tuning; a lot of consultancy firms were selling a lot of slides that were essentially telling big companies to spend a lot of money fine tuning. And the two things that cut it for me were Stan saying, you know, one, it’s expensive and you do it regularly, and nobody knows that they’ll have to do it regularly. And two, it’s really not the right idea for most of the things people are excited to fine tune on. And in particular, fine tuning on your company’s data is a bad idea, as opposed to maybe sometimes fine tuning on some specific tasks where you can see gains.

But the idea that bringing the context of a company, which is obviously every real company’s obsession—how does this work for me? How do I get it to work the way I like it to work?—was going to happen with technologies that weren’t just changing the model itself, but rather controlling the data it has access to, controlling the data any of its users have access to. And those are somewhat hybrid models between new world and old world. The very old world version of it is the key holders are still the same. The CSO is the one deciding how new technology is exposed to members of a company, the guardrails that are in place, the observability that’s available to the teams to measure its impact, and any data leaks. Those are old software problems, but they still need to be rolled out on very new interfaces because the interfaces now are these assistants, these agents.

And then some of the new problems are around access controls. Do access controls look and feel the same in a world where you have half of the actions done by non-humans? I might want to have access to a file—that’s, like, 2020: do I have access to the file, yes or no? And 2024 is like, well, maybe an assistant might have access to the file and can give me a summary of it that leaves out some of the critical information I should not have access to, but still gives me access to some of the decision points that are important for me to move on with my job. And that set of primitives, that set of nuances, just doesn’t really exist in how documents are stored today. So if you think about deploying the capability in a real-world environment where people are still going to have to face those controls and those guardrails, the product layer is actually very thick. The application layer to build the logic and the usability to ensure performance but also adoption is quite thick. And that was the—I think that was the go to say, “All right, there’s a lot to do here and we might get started.”
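
(A hypothetical sketch of the 2024-style access control described above: sections a user is not cleared for never reach the model, so the assistant can summarize decision points without leaking restricted details. All names and the clearance scheme are illustrative, not Dust’s design.)

```python
from dataclasses import dataclass

CLEARANCE = ["public", "internal", "restricted"]  # ordered low to high

@dataclass
class Document:
    sections: dict  # section name -> (required clearance, content)

def visible_text(doc: Document, user_level: str) -> str:
    """Keep only the sections this user is cleared to see."""
    rank = CLEARANCE.index(user_level)
    return "\n".join(
        text for required, text in doc.sections.values()
        if CLEARANCE.index(required) <= rank
    )

def summarize_for(doc: Document, user_level: str, llm) -> str:
    # Redacted sections are filtered out before the prompt is built,
    # so the model never sees what the user may not see.
    return llm(f"Summarize the decision points in:\n{visible_text(doc, user_level)}")

# Example: an "internal" user gets the decision, not the restricted pricing.
doc = Document(sections={
    "decision": ("internal", "We chose vendor A for the 2024 rollout."),
    "pricing":  ("restricted", "Vendor A quoted $1.2M."),
})
print(summarize_for(doc, "internal", llm=lambda prompt: prompt))  # llm stubbed for the sketch
```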

“No GPUs before PMF”

Konstantine Buhler: Maybe you can dig into that, because when we intersected in Q2 2023, Q1, Q2 2023, a lot of people were still starting these foundation model companies. And you guys had a very specific opinion which is the future is application layer, and there’s going to be a lot going on under the hood and we’re just going to be an abstraction layer on top of that and let things happen as it happens. We’re going to succeed in any case by building something that people actually use and love. First, how do you have the conviction for that? Secondly, how has that been playing out? What has been the hard part about it? You mentioned the CSOs and the enterprise and enterprise deployments. You guys have been way ahead of the curve on RAG. I mean, everyone was talking about fine tuning, but you guys have done so much in terms of retrieving. This was before it was even called that, really, retrieving and actually making smart decisions around information. Walk us through the step by step from the idea of application layer to where you are today.

Gabriel Hubert: You can imagine the application layer conviction existing in a world where you still decide to build a frontier model. The reason we split those two is, one, it seemed like a lot of money for a lot of risk—and I mean a lot of money for a lot of risk—to try and develop a frontier model or an equivalent to a frontier model, and also make a bet on the way it was going to be distributed. So our internal slogan was “No GPUs before PMF.” We don’t see the value in training our own model until we actually know which use cases it’s going to get deployed on. And there are much cheaper ways to explore and confirm which use cases are actually going to create the most value and generate the most engagement.

The second reason was really about this data contradiction—the fact that the cutoff dates for training on internet data are hard to move forward continuously. The fact that you can’t actually get an internal understanding of what happened last week in a frontier model means that fine tuning is a hard problem, that it is not a solved problem at scale. And so if you work from that conviction backwards, that means there are many cases where it’s not solved, so another technology has to be the one to deliver most of the gains: extracting a small piece of context from the documents where it lives and feeding it into the scenario, the workflow that you need help for. The one trend that seemed interesting was that actually many decisions require limited amounts of context and information to be greatly improved. So even the context windows at the time, which were small, were already compatible with some scenarios of saying let’s just bring the information in.
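
(In its simplest form, the retrieve-then-generate approach Gabriel contrasts with fine tuning looks something like the following sketch. `embed` and `llm` stand in for an embeddings model and a chat model; everything here is illustrative rather than Dust’s implementation.)

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def top_k(query, passages, embed, k=3):
    """Rank passages by embedding similarity to the query; keep the k best."""
    q = embed(query)
    return sorted(passages, key=lambda p: cosine(embed(p), q), reverse=True)[:k]

def answer(query, passages, embed, llm):
    # Only a few relevant passages enter the context window; the model
    # itself is never retrained on company data.
    context = "\n---\n".join(top_k(query, passages, embed))
    return llm(f"Using only this context:\n{context}\n\nAnswer: {query}")

# Toy demo with a bag-of-letters "embedding", just to make the sketch runnable.
toy_embed = lambda s: [s.lower().count(c) for c in "abcdefghijklmnopqrstuvwxyz"]
docs = ["Refund policy: 30 days.", "Shipping takes 5 days.", "Careers page."]
print(answer("What is the refund policy?", docs, toy_embed, llm=lambda p: p))
```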

And what we’ve seen over the last year, of course, is the increase in size of those context windows, which just makes it easier to expose all the right data—no more than the right data, hopefully—to the reasoning capabilities of the frontier model. And what we’ve experienced is first of all, it takes time for people to understand those distinctions. It’s hard in that you have to get yourself out of your own bubble regularly to realize that it’s true that the future isn’t quite evenly distributed yet, and people have varying assumptions on what it means to roll out AI internally, or roll out the capabilities of these frontier models on their workflows. And you have to walk them back on what they really care about, which is always very simple things. You know, I want to work faster, I want to know the stuff that I’m missing out on. I want to be more productive or more efficient in some tasks that I find repetitive. And then only bring the explanation of what technology is going to solve that when it’s absolutely necessary, because people will worry about their experience and how they feel about it more than how it’s working under the hood 99 percent of the time.

The big insight that’s happened, and that I think we’re leaning into—we have been for a while, and it’s great to see some of the market also doing that—is that people are actually really good at recognizing which tool they need in the toolbox. Like, I think we’ve not respected users enough in saying you need a single assistant that does absolutely everything, and the routing problem should be completely abstracted from you. You should ask this question to the one oracle and the oracle will reply.

People are pretty comfortable telling a screwdriver from a hammer. And when they want to get to work and they need a screwdriver, they’re very, very disappointed when the one they get is a hammer, and it sounds like a hammer response. And so specializing agents, specializing assistants and making that easy to do, design, deploy, monitor, iterate on, improve, all those verbs that require [inaudible] service, it was quickly apparent to us that people were very comfortable with that.

And so the number one question that made us feel like we had an insight to hang on to and lean in on was everybody asking us about Dust was obsessed with the top use case and said, “What are people using it most for? What is the top use case across companies? What are—” and I could almost see the Amazon eyes trying to decide which Diapers.com they’re going to verticalize and integrate. Like, which verticalized use case should we now just build as a specialized version of this?

But I think the full story is fragmentation. I think the story is like giving the tools to a team or to a company to see opportunities for workflows to be improved on, augmented, and understanding the Lego bricks that are going to help them do that. So rather than encapsulate the technological bricks that are useful and abstract them away from users, exposing them at the right level gives people a ton more autonomy, and really just the ability to design things that we had never thought of. Some of the scenarios that have come up, we literally could not have imagined ourselves.

Dust in action

Pat Grady: That idea makes sense. Like, the fragmentation and providing people with the Lego blocks to see what sort of use cases emerge. Just to make it a little bit real, though, can you share a couple of use cases that you’ve seen in your customer base that have been unique or surprising or particularly valuable? Just something to make it a little more tangible.

Gabriel Hubert: There’s obviously a ton that people are thinking about. The category of obvious use cases that have been interestingly and quickly deployed are: enablement of sales teams, support teams, marketing teams. And that is essentially context retrieval and content generation. I need to answer a ticket. You know, I need to understand what the answer to the ticket is and generate a draft to the ticket. I need to talk to a customer. I need to understand which vertical they’re in and how our product solves their problems, and draft an email to follow up on their objections. I need to prepare a blog post to show how we’re differentiated from the market. Again, I’m going to go and plow into what makes us special and generate with our tone of voice. Those were pretty obvious and quite expected.

What I’ve been excited by is to see two types of things. One, very individual assistants—personal coaches. People—generally actually quite young people in the first years of their career—asking for advice on a weekly, on a daily basis. Like, “How did I do today versus my goals? Where do you think I should focus my attention in the coming days? Can you actually break down my interactions on Slack and in Notion over the past couple of days, and say where I could have been more precise? I’m getting the feedback that I’m sometimes talking too theoretically. Can you point out the ways in which I can improve on that in these two notes that I’m about to send?”

Gabriel Hubert: And so that’s exciting because our bet was, you know, we want to make everybody a builder. We want to make everybody able to see that it’s not that hard to get started. And by reducing the activation energy there to see small gains immediately rather than wait for the next model or the next version that’s going to really solve everything for them, personal use cases have been great for that.

The second family of use cases that I’m excited by is essentially cross-functional. So where the data silos exist because the functions don’t speak the same—they speak the same language, but they don’t speak the same “language.” And so understanding what’s happened in the code base when you don’t know how to code is powerful. Having an assistant translate into plain English what the last pull request that’s been merged does is powerful. It’s powerful for people who were blocked in their work and didn’t know who they should bug to actually get an update. So, you know, marketing to engineering, sales to engineering.

The other scenario—extracting technical information from a long sales call—is powerful because it means that the engineer doesn’t need the abstraction of a PMM or a PM to get nuggets from the last call with a key account. They can just actually focus the attention of an assistant on that type of content on their own project and get those updates. So I’d say that’s the family of assistants that we’re excited by, because they really represent, I think, the future of how we’d love fast-moving, well-performing companies to work, where the data that is useful to you and the decisions you should make is always accessible. You don’t need to worry about which function decided on it or created it. You can access it, and that fluidity of information flowing through the company helps you make better and faster decisions day in, day out. Yeah. Any other examples that I’m missing, Stan, that you think you’re excited by?

Stanislas Polu: No, I think what I wanted to add is the fact that as you said, the usage is extremely fragmented. We see over and over the same scenario, and so we have data to back that kind of a proposition, as we—we built Dust as a sandbox, which makes it extremely powerful and extremely flexible, but also has the complexity of making activation of our users not trivial. Because when you have a horizontal sandbox-like product, you’re like, yes, but for what?

And so generally, the pilot phase that goes with our users starts by clearly identifying use cases. So they really kind of try to answer the question: what are the use cases I should care about for my company, and try to identify a couple of them. And we always see the same pattern. We see the first use case get deployed, and usage starts. We try to move laterally to another use case; the second use case gets deployed, and usage picks up a little bit more. And then we generally go through a phase where the usage is kind of flat, increasing slowly, and eventually it reaches kind of a critical mass of usage, and all of a sudden it skyrockets to something like 70 percent of the company. And that’s kind of the pattern of activation of our users. And when it skyrockets to 70 percent, the usage picks up a ton. The original use cases that were identified by the stakeholders become just anecdotal compared to the rest of the usage. And that’s where we feel like Dust provides all its value. And it’s very hard for us to know what all those use cases are, because we have examples of a company with a few hundred people and a few hundred assistants. And so it’s just hard to answer the question: what are the best use cases?

Pat Grady: Those are great examples. And that calls to mind an analogy that I would like to try out on you guys. And you may puke on this analogy, but this is what just showed up in my brain, which was a lot of those use cases you described, you could imagine some sort of vertical application being built around those use cases. And the analogy that comes to mind is: there are a gazillion vertical applications, and yet where does a lot of work happen? Spreadsheets. Why does it happen in spreadsheets? Everybody knows how to use a spreadsheet. They’re there, they’re flexible, you can customize them to your heart’s content. And so the analogy that I’m wondering about is there’s almost like the spreadsheet of the future, you know? Some of these applications may get peeled off as vertical-specific applications, but even then, people are still going to come back to the personal agent, because it’s just there, it’s available, it has access to your data. It’s familiar. You know how to use it. You can build what you want quickly and simply and effectively. Like, is that a reasonable analogy for what this kind of …?

Gabriel Hubert: I think it’s an amazing analogy for another thing that I’m thinking about, which is it took me the longest time to get Stan to use spreadsheets when we started working together. This is way back. This is way back when. This is—this is like I don’t know if it was 20 years ago or 15 years ago.

Gabriel Hubert: And then at one point Stan uses it for something and it’s like, “Oh, wow! This is kind of like a cool REPL interface where you can just get the results of your functions in real time.” And it’s like, “Yeah, that’s how it works. Like, it’s a cool REPL interface for non engineers. I get it now.”

And yeah, I think it’s also interesting for that; the experimentation cost is very, very low. If you think about the way in which some of our customers try and describe the gains that they’re seeing and their excitement for the future: in some functions we’ve had 80 percent productivity gains; in some functions we’re seeing five percent productivity gains, and we’re not even sure that we’re measuring them right. But we’re seeing gains when the specialization of the assistant is close enough to the actual workflow that it is able to augment.

The distribution problem of that with a verticalized set of assistants is almost impossible to solve. How are you going to get that deep into that function at a time where budgets are tight, decision making on which technology is going to be a fit is sometimes complicated, when sometimes that’s where the performance gains are the most obvious? One of our users has seen, like, 8,000 hours a year shaved off two workflows for an expansion into a country where they decided not to have a full-time team. And so basically sparing you some of the boring details, but like the ability to review websites, compare them to incorporation documents in a foreign language, have a policy checker that was making a certain number of checkpoints very clear to the agents that were reviewing the accounts, all in a language and in a geography that none of these people were yet familiar with because they were really exploring the country.

And immediate gains. Like, very, very easy iteration on the first version of the assistant. Two weeks to launch it into production, roll it out to three human agents who were then assisted by these assistants, and their CTO sharing, like, you know, we’re seeing north of 600 hours a month. I’m thinking our pricing is terrible. But what I’m excited by is that that case could not have been explored or discovered with a verticalized sales motion, because I just don’t know how you get to that fairly junior person in a specific team and are actually able to pitch them and deploy that quickly. Whereas if you have that common infrastructure that people understand the bricks of—not everybody knows how to do a SUMPRODUCT, not everybody knows how to do a pivot table, but everybody understands that they can just play around with the basic things and probably get help from somebody close to them.

That’s the other thing we’ve seen. You know, the map of builders within companies, this heat map of people—what’s amazing about it is that it’s people who are just excited about iterating, exploring and testing new stuff, which I think correlates well with high performance or high potential in the future. It’s like Dust is heat-seeking for potential and talent across your teams, because the people using it the most are people who are the most comfortable saying, “I don’t feel threatened by something that’s going to take the boring and repetitive side of my job away from me. I’m excited to have that go away and focus on the high value tasks.”

Konstantine Buhler: I think for this first six months, I was one of the loudest voices saying, “What is that main use case?” I think you guys heard many, many times. And then eventually I realized—this is a primitive. We’re talking about spreadsheets. You could talk about, frankly, a Word document. You could talk about Office suite. When I interface with Dust, I think about it like Slack, except I’m not slacking my colleagues, I’m slacking assistants, and they actually do this kind of work for me and I can show them the kind of work. So it feels, Pat, to your point, something like a spreadsheet meets the ergonomics of a Slack, as in it’s brought to me as opposed to I have to go to it. And that is—it took me a while to get there, and now I see how the fragmentation is the power of what you’re going after.

How do you find “the makers”?

Pat Grady: And Gabriel, I have a quick question on sort of the psychographic of your user, because your comment that it’s like heat seeking for the people who are sort of ambitious and innovative and stuff like that, I don’t know if you have a name for them, but let’s call them “the makers.” You know, the people who are not afraid to try new things and try to build stuff. Have you come up with a systematic way to find those people, or do they tend to find you through word of mouth or some other thing? Because that’s not—you know, LinkedIn profiles don’t say, you know, “Gabriel. Maker.” Right?

Gabriel Hubert: I think it’s a super interesting question at a couple of levels. But our motion is dual, right? So the things that predict a great outcome with Dust—I’m coming out of a call. I’m trying to think about what was most powerful about this call that I had yesterday with the chief people and systems officer at a company who could not stop interrupting me five minutes into my pitch: “Yes, I did a talk on this. Yes, I’ve already written about this. I’ve got a blog post on this. Okay, when can I demo? Where do I put my credit card? Let’s call you next week.”

The top-down motion is enthusiasm and optimism about this technology changing most things for most people who spend most of their days in front of a computer. You need that. That’s a necessary condition, because I think it unlocks three things: one, the belief in a horizontal platform for exploration; two, the ability for security to be in the supporting business rather than a blocker; and genuinely, sometimes, example-setting. Like, we have founders and leadership teams asking, how have you augmented your own workflows last week? And leadership meetings are being asked this; they’re doing off-sites about, like, how are you going to get better at answering some of your team’s queries faster with us?

So once you have that, then you have the right sandbox—I’d say the right petri dish. I don’t think we’ve fully cracked the builder identification. So right now it’s more like bait. The product is incredibly easy to use. Anybody can create an assistant, even if they have not been labeled a builder by their organization. It’s just the sharing capabilities of their assistant that are somewhat throttled. But we can see, from the way in which people explore the product, create assistants for themselves, and share them with their teammates in a limited way, a great predictor of that type of personality.

Now if you ask me to look at LinkedIn and predict who is going to be in that family, I’d say the number one discriminator—to a degree, it’s a bit ageist—is people who are maybe earlier on in their careers, who have a mix of tasks that they obviously know they can get an assistant to help with, and so they have use case one just laid out for them. People who have repetitive tasks, and people who scripted their way out of a lot of repetitive things before.

Konstantine Buhler: Just to be explicit, like we had the conversation, I think it’s okay to say: it is people under 25, like we were saying yesterday. The power users, the people that are using this all the time at the companies, are the people under 25, because they aren’t set in their ways. And that doesn’t mean everyone. You can be 70 and constantly innovating in a new way, but in general they don’t have the patterns that they’ve been set into. And by the way, that’s true of a lot of the next generation of productivity tools—Notion, which Pat works really closely with, is an under-25, power-law-type business. And you know, the teammates here under 25 keep pushing me to transfer over to Notion. It’s just a different type of thinking. It feels like a very similar motion at Dust.

Gabriel Hubert: Yeah, I think that the one thing we have which is useful is that the immense B2C success of ChatGPT as a now obviously world famous product has made it really easy to set up pilots by just telling teams, “Do you know what? Send a survey out. Ask people how often they’ve used ChatGPT for personal use in the last seven days. Rank by descending order, and that’s your pilot team.” That’s the people you want to have poke holes at it, kick tires, because they have—you know, we’ve asked the entire world to move from calculator technology—punch the same keys, you’ll get the same results—to stochastic technology—ask the same question, you’ll get a slightly different result. This has not happened. This is the biggest shift in, you know, the use of the tools that we have since the advent of the computer. We’re asking an entire cohort of the workforce to move to a stochastic mindset. And the only way you get that is by having a risk-reward ratio that you’re comfortable enough with. It’s like, you know what, I’m not asking it to be right 100 percent of the time. I’m asking it to give me a draft that saves me time many, many, many times over. And that distribution of ROI is something that I’m comfortable exploring with and iterating on.

And I think that that is really one of the predictors that we see in people who’ve tried ChatGPT, or in people who are just curious with new technology is they expect that some of it’s going to be a bit broken, but the upside scenario to them is so clear and so 10X that they’re willing to make that trade off or that local risk to get things started.

The beliefs Dust lives by

Konstantine Buhler: So you guys have a lot of very strongly-held beliefs, internally and externally. And the good news is you’ve consistently been right about the strongly-held beliefs. You’ve named a few of them. I mean, you talked about the shift from deterministic to stochastic way before it was mainstream. You talked about rasterization and vectorization—I think about that. That can be unpacked if you’d like; it would certainly need unpacking on the show if we go down that rabbit hole. You talked about “no GPUs before PMF,” right? Can you just walk through some of the beliefs that Dust lives by? It can either be philosophical, as a couple of these are, or tactical, like the no GPUs before PMF.

Stanislas Polu: Yeah, the first one is really the continued belief that focusing on product is the right thing to do, because it really feels to me like we are only scratching the surface of what we can do with those models. Right now we are starting from the conversational interface—that’s why you used the Slack analogy. And I really truly—I mean truly believe that that analogy, the Slack analogy, will not sustain over time, because the way we interact with that technology will change. It started with the conversation interface, but it will end in a very different place, in my opinion.

Basically those models are kind of the CPUs of the computer. The APIs and the tokens are really the Bash interface. What we’re doing right now is merely inventing Bash scripts. And we have yet to invent the GUI, we have yet to invent multiprocessing, and we have yet to invent so many things. We are really at the very beginning of what we can do from a product standpoint with that technology, whether it evolves or whether it stays like it is.

Gabriel Hubert: Yeah. One notion that I think is going to be important—and I feel recent news has actually helped confirm it, or is at least an interesting new drop in that bucket—is one of our product mottos: “Augmenting humans, not replacing them.” And it’s not just the naive version of saying we’re not here to get people fired. It’s really that we think there is tremendous upside in giving people who will still have a job in five to ten years’ time the best possible exoskeleton. And it’s a very different kind of company and product conversation to be like, “All right, how many dollars are we going to take away from your OPEX line next year?” versus “This is the number of latent opportunities that you are not able to explore as a business because your people are dragged down in pushing stale slideware around, or not even knowing what dependencies they have on the rest of the company. This is how much friction you’ve imposed on the smart people you’ve spent so much money hiring, because half of their day or part of their week is spent doing things that we should literally not be talking about in 2024.” So that’s one. And the thing that comes back to …

Konstantine Buhler: To interrupt for a second. You’ve been saying that from the beginning, Gabriel. And in the beginning, you didn’t use the word ‘productivity.’ Like, you didn’t want to use the word ‘productivity.’ I wonder if that shifted, and if so, the nuance around why you chose not to.

Gabriel Hubert: I think productivity—there’s two terms that I was hesitant on. Productivity, to me, sometimes feels like an optimization, when really there’s two ways to be productive. There’s doing the same things faster, and there’s doing just better things. And I think, you know, the mix effect of productivity is enshrined in effort versus impact. At the end of the day, your boss is never going to be mad if you spent no time doing the things you were assigned to do, but brought in the biggest deal for the company. Nobody’s actually going to make any comments on that being the bad decision, because I think the more you grow in your career and the more you’re close to the leadership of a company, the more you realize it’s not about the effort, it’s really about the impact.

And the impact comes in sometimes unplanned, completely, like, left-field ways, where it’s like, of course we needed to focus on this, and it’s clear in hindsight, but you need to free up time, space, energy and cognitive space for that.

The other one was enterprise search. I just feel like enterprise search is one that we didn’t want to put on the website, because retrieval of information is obviously a use case that people get very excited about very quickly. But we’re just very convinced that looking for the document is a step that people are not particularly passionate about. Nobody wakes up in the morning and is like, “I’m so happy that I’m going to just get the right document the first time around when I do the search.” People just want to get their job done. And it just so happens that using context from three different documents across seven data silos helps them get it done faster or better.

And so I think the search bit is just—it’s never the job to be done. Nobody really wants to search. They want to complete, they want to prove, they want to test, but the search bit is a step that we think will get abstracted. And going back to Stan’s point, I think that the interfaces and the experiences we have with this technology will sort of really try to forget about what the original data source was quite fast, potentially, once we’ve gone over the trust hurdles that exist today.

The thing that this all comes back to is collaboration—collaboration between human and non-human agents. And I think Projects by Anthropic is an amazing example here. We thought about co-editing last summer—we had an amazing intern from MIT with us last summer who spent their time working on a co-editing interface. How do you chat with an assistant to make something that you’re thinking about better, whether it’s an app or a project or a document or a script? And this is something that the recent release by Anthropic has obviously made very palpable to many more people.

That is, to me, the interface and the interaction that we need to get right, and that’s where the future will be. So we say ‘augmentation’ and we’ll stick to it, because I think it really helps us focus on the interfaces that help humans and non-humans make progress faster. It’s going to be about proposals: how do I get to have a human in the loop with a proposal that’s written just in the right way to decide if we swipe left or swipe right on it? It’s going to be co-editing: how do I have the language of the human in front of the assistant be as easy to interpret and as foolproof as possible for the final project to move into its final form as quickly as possible?

And so you need that interface, that interaction between the agent and the human. And you forget that when you replace too quickly. When you focus on just replacing and removing, you’ve built something that is fire-and-forget, essentially. And you’ll see the gains, you’ll see the dollar gains, but if you’ve automated 100 percent of your customer support tickets, you still need the insights from what people are pissed off about. You still need to understand and have your finger on the pulse of why people are stuck, otherwise you’re slowing down your product development efforts. And product development efforts today live and die by some of the comments that are coming in from support tickets. And so making that problem go away—maybe cheaper, sure, but also more removed and harder to connect to—is not, I think, a super long-term view of how your product and business are going to serve your customers best, because you still need to think about the ultimate interfaces that are going to enable the decision making that makes it better, strategic, and the best option for your customers in the future.

Keeping the human in the loop

Konstantine Buhler: So keeping the human in the loop, always. I mean, human in the loop is one way to say it, but it is human driven. Like, the whole point of all of this technology that we are building is to serve humans better. And as soon as you remove that, you’ve made a terrible mistake, because someone else is not going to do that and they’re going to actually have a better experience with customers and employees and stakeholders, and then they’re going to win.

Gabriel Hubert: Obviously there are scenarios in which you’re going to catch me and be like, this one—we know that humans get it wrong way more, and so we should obviously replace it. And this is a complex and nuanced problem, so I’m sure there are certain areas of it where pure replacement has fully understood value with no negative externalities. But I’d venture that we’re pretty poor at modeling where value is created and how it’s funneled through the parts of our companies today. And you know, economists have been great at showing that when you don’t price negative externalities, well, we end up in a pretty messy situation.

And so this is the question that I pose to leaders who are asking, you know, “What should I automate first?” I’m like, “Well, I don’t know. Which parts of the company do you worry about the most?” And often I just find that CEOs are panicked about what their customers say in support tickets. And so making that problem go away, making that problem less visible, might be great for some OPEX conversations and for your stock price, but could have unforeseen consequences if you haven’t funneled it through in the right places.

But also, I think there’s so much more to do than to shave three percent off your balance sheet. The spectrum of opportunity that you’re giving your team if this technology is in their hands and if they’re able to come up with ideas, is broader than just firing people out of their jobs. And I’m not saying you shouldn’t do that. Like, I think I don’t want Dust to be perceived as, like, naive in this ecosystem where the disruptive nature of this technology is going to take some people’s jobs away because those jobs were currently being done by humans for lack of a better alternative.

I think in certain situations you could see those jobs as having been created, and framed the way they were, because we were waiting for the robots. But I don’t know that that’s what leaders of companies are excited by. I think the upside, the future, the ways in which we need to be resilient and antifragile for what’s to come and for what our competition is going to come up with: that is where energy and support, I feel, should be funneled to support teams.

Second-time founders

Konstantine Buhler: You guys are second-time founders. You started your first company over 10 years ago. You were an early acquisition of Stripe; you guys were there super early on. What have you learned and done differently this time as second-time founders?

Gabriel Hubert: I think really understanding that a few explosive bets are more likely to get you anywhere meaningful than over-optimizing too early on something that is still meaningless in the market. That’s one thing that I think we think about differently. So, like, exploring versus exploiting, and all those frameworks. That’s one.

I think the trust and empowerment that you give to your team is another. I don’t think we were against it; it’s more that we were clueless about how much more empowering you could be. One of the best words from my Stripe years was ’paper trail.’ You’d have two people in a corridor have a conversation, and then one of them would take the time to just write a paper trail in Slack or in a document, and say, “You know what? We just had this exchange, and we’ve moved the needle in this direction.” And it saved N other humans the time and effort to go into a meeting room or figure out that this decision had been made. And it feeds a graph network of trust and respect for your coworkers that is, I think, second to none in how you can then just achieve more as a team. So culturally, you need to push that from the beginning, because people who are earlier in their careers especially will not always feel comfortable with how information should be shared. So I think that’s one where leading by example is important.

Big markets that you really believe in for a long time. We loved technology when we started our first company, like 12, 13 years ago. “This is great. This is amazing. These are QR codes. Everybody’s going to use them.” And it’s like, no, we had to wait for a pandemic to sell QR codes. Okay, I’ll do that next time. And so not just falling in love with the technology, but really fundamentally understanding how big the business could be if it’s successful, and asking that question early and unabashedly, is one thing that I feel is different.

Stanislas Polu: So what we kept is our experience together. I think it’s an unfair advantage to have built a company with a person, because you’ve explored everything. You’ve explored the beauty, the terrible, the joy, the pain, and you know pretty much the entire API in and out. And that enables a much more efficient cofounder interaction and collaboration. I think it’s a really big unfair advantage.

And the biggest one, which is completely different for me and which Gabriel mentioned, is about empowering people. Really, as a founder, it’s not about you. Early on, it’s up to you to build and to create that initial spark, but then, for the sake of the company, you are not the one who has to build. You’re the one who has to create an environment where people are empowered to build those things and explore and create new stuff. And the best value you can give is not necessarily leadership, a word that comes to mind but that I don’t like to use. It’s really guidance: trying to create an environment where everybody has the chance to do what they want, but a guided environment, so that everything works as a whole. That would be the biggest difference, and something that we learned a lot about at Stripe.

Konstantine Buhler: So guys, let’s move to a lightning round. We’ve got a couple questions for you.

Lightning round

Pat Grady: All right, lightning round question number one. Stan, from time to time you share predictions on Twitter about where the world of AI is going. At this moment, what is your top contrarian prediction for where the world of AI is going? And don’t give me this bimodal “little bit of this, little bit of that.” Let’s hear a point of view. What’s your top contrarian prediction for where the world of AI is going?

Stanislas Polu: I think it’s a lightning round, so I have to answer something. It’s going to be tough. I think we’re on the verge of entering a pretty tough period.

Pat Grady: How so?

Stanislas Polu: The excitement will go down. It may take time to get to the next stage of the technology. There’s tremendous value to create, but people won’t see it yet, and it’ll take a long time for it to diffuse through society. So there is a massive amount of value to create, but we might have tough times in front of us.

Pat Grady: All right. Short-term pessimist, long-term optimist. I’ll take it. All right, lightning round question number two, and this is for both of you. Who do you admire most in the world of AI?

Stanislas Polu: Ilya is just incredible. I’ve had the chance to work with him. He’s my favorite person in AI. He’s extremely smart, but he’s not a genius builder, he’s a genius leader. He’s just a visionary, and I think that has been incredible. Karpathy, I know him. I actually don’t know him, but I admire him a lot. And in terms of pure genius in AI, I think it’s Szymon and Jakub at OpenAI. They have crazy last names, so I’ll let people look them up. But Szymon and Jakub are [thumbs up].

Gabriel Hubert: I’m impressed by those who’ve been around for a while and are acting as good resistance and capacitor elements in the system. They’re just providing the friction to remain optimistic, but cautiously so. And to me, one of the first, I can’t remember if it was a tweet or a podcast or an article, but the [inaudible] like, you know, we can make pretty good decisions with a glass of water and a sandwich, and these things require power-station-sized data centers and are not making great decisions on some things. So we feel something is missing.

And just elegantly putting that back into perspective has been interesting to me, because it’s hard not to cave to the hype, I think. And so in some ways pushing for a simple ideal like being open, which I think [inaudible] is doing quite aggressively despite that probably not always being the easiest decision, and also saying, you know, we probably haven’t solved everything all the time, is nice. And from my personal experience, the researchers who have worked for or with him have learned and taken from that quite a bit. And, it’s not very French, but it’s a touch of modesty, a touch of temperance that I’ve appreciated in my discovery of the generative side of artificial intelligence. Like, after 10 years of just doing prediction and classification for fraud and risk and onboarding at Stripe, and healthcare claims management and things like that, it’s nice to feel like there are some people who’ve seen a lot, done a lot, and are just questioning rather than affirming.

Konstantine Buhler: All right, so that brings me to the third and final lightning round question. You chose a Frenchman for your most admired, Gabriel. And Dust is proudly made in France. Paris has certainly been an epicenter for all things AI. Your take on the Parisian ecosystem, and what do you want to say to the French founders listening to this podcast, other than, “I’m sorry it was in English. It’s their fault, not ours”?

Stanislas Polu: Yeah, I think that the French ecosystem is awesome, because compared to where it was 12 or 15 years ago when we started our first company, you know, we have talent, because there’s been a generation of scale-ups that went through the market and trained all of that talent, and most recently that kind of explosion of AI talent as well, which is super exciting. So I think it creates a pool of talent and the right conditions to create incredible companies. Obviously, tackling the US market from France is a challenge, and that’s something to be taken into account, of course.

Gabriel Hubert: Yeah, I think if you have ambition, there’s a lot more to do, as long as you’re not naive, because there are still some realities. Like, you can fight some aspects of narratives, but you can’t fight gravity, or at least you shouldn’t. You should probably work with gravity way more than you should fight it. But there’s a ton more we can do.

And I think we have to behave a little more like tech countries like Israel, mixing ruthless ambition with a recognition of where talent is, how it’s already connected, and how it has high-trust connective tissue, which I think is a great catalyst and accelerant in making great companies happen. But also a recognition of where the markets are, where people are buying, where people are paying, and how quickly people are making decisions about shifting to new technologies, especially in this space.

Stanislas Polu: I think the biggest advice, as a French founder, is this: if you’ve always been in France, you have that feeling that something magical must be happening in the US. Something special. There must be something special about those people. Well, I’ll tell you, I’ve been at Stripe, I’ve been at OpenAI, I’m working with Sequoia. These are all normal humans. They don’t have any magical capabilities. They’re just like us. And so it’s really important to be ambitious and to believe strongly that you can make it, you can do it, whatever it is, from France just as well as from the US.

Pat Grady: Wonderful. That’s a good place to end it. Thank you, gentlemen.

Konstantine Buhler: Thank you, guys.

Gabriel Hubert: Thank you very much.

Stanislas Polu: Thank you so much for having us.