Why AI Will Create Abundance and Transform Customer Experience: Cresta CEO Ping Wu
 
Ping Wu built Google’s contact center business before becoming CEO of Cresta, where he’s pioneering a unique approach to contact center transformation. Rather than full automation, Ping advocates a dual approach, automating what’s ready while using AI to assist humans with the rest. He makes the case for an abundance mindset—imagining new customer experiences like talking to airline apps or turning synchronous interactions asynchronous. Ping breaks down the technical challenges of deploying Contact Center AI at scale, from solving latency to orchestrating 20+ models in real time. Sequoia’s Doug Leone shares his framework for building AI companies at speed and why he believes we’re at the front end of an Industrial Revolution 2.0.
Hosted by: Sonya Huang and Doug Leone, Sequoia Capital
Summary
Cresta CEO Ping Wu emphasizes the importance of practical deployment over flashy demos, and why the real opportunity lies in an abundance mindset rather than race-to-the-bottom scarcity.
Speed matters more than percentage of automation: While experts debate whether 30% or 100% of contact center work will be automated, the critical factor is speed of adoption. Companies that move with extreme urgency—removing obstacles and challenging assumptions about growth limits—will capture the most value in AI’s early innings.
Meet customers where they are, not where you want them to be: Unlike self-driving cars that require 100% automation to deliver value, contact center work is divisible. You can automate ready conversations while using AI to assist humans with authentication, knowledge retrieval, and after-call work—creating immediate value while building toward full automation.
Production systems require deep vertical integration: The gap between AI demos and production deployment is massive. Real contact centers involve on-premise systems without APIs, multi-hour calls, call transfers, PII handling, and data residency requirements. Success requires building the entire stack, not just calling foundation model APIs.
Focus on abundance, not scarcity: The biggest opportunity isn’t replacing existing interactions but enabling entirely new ones—talking directly to websites and apps, asynchronous customer service, multilingual support at scale. AI can create personalized experiences that were previously impossible due to staffing constraints.
Value accrues up the stack to applications: While foundation models are powerful, sustainable value lies in the application layer closest to customers and business users. Companies with deep domain expertise, customer data, and integrated systems will capture the most durable value as models commoditize.
Transcript
Introduction
Ping Wu: Today, if you think about the business, they feel like multiple personalities to the customer. So in the sales phase they call you very, very aggressively. And once you sign up and become a customer, you’re dealing with an entirely different personality, right? And you’re dealing with service departments. I feel like these are really disconnected, right? And I do feel like AI agents can make this entire experience a continuous long-going conversation throughout the entire customer journey. And an LLM is a perfect tool to do that, and that will really bring the level of personalization, the level of customer experience that wasn’t possible before.
Sonya Huang: Hi and welcome to Training Data. Today we’re joined by Cresta CEO Ping Wu and Sequoia’s Doug Leone, who sits on the Cresta board. Today’s episode dives into the gnarly world of the contact center—a giant legacy industry filled with slow-moving incumbents that is responsible for driving the vast majority of company-customer conversations. Ping understands this world deeply, having first built Google’s contact center business before becoming product leader and then CEO of Cresta. Ping joins us to talk about the different waves of technology that have hit the call center, how he sees the future of customer experience evolving with LLMs towards an abundance future, and why his playbook is to meet customers where they are, blending human agent assist with autonomous digital agents.
Doug Leone also shares his perspectives from several decades of investing in company building, and his hot takes on whether we’re in an AI bubble. He also shares where he believes the value will accrue in AI. Hint: it’s in the application layer in this gnarly last mile. Enjoy the show.
Challenges and Opportunities in Contact Centers
Ping, welcome to the show.
Ping Wu: Thank you for having me.
Sonya Huang: And thank you for bringing along our special guest Doug Leone as well on your board.
Doug Leone: My pleasure. Thank you.
Sonya Huang: Thank you both for joining. Ping, I want to start by asking: a big part of the AI thesis is that AI is going to replace labor globally, and that the TAM is in the tens of trillions of dollars. Obviously, the contact center, the call center is a big pool of labor that, you know, is just begging to be automated. If you had to guess, how much of call center labor spend will actually be automated fully by AI?
Ping Wu: The reality is I don’t think anyone knows for sure, and if you ask, it depends on really what they’re selling. And you ask different people and they give you different answers. And some people will say that 100 percent of humans will be gone in contact centers, and some Gartner research actually shows that none of the Fortune 500 over the next five years will have contact centers go entirely humanless. So it’ll also probably fall somewhere in the middle.
And in fact, we got asked this question two years ago when GPT-4 first came out. And a lot of people will say that maybe in two or three years there will no longer be humans in the contact center. So at that time, our belief was that probably the transformation, especially for existing Fortune 500 companies, will probably take way longer than a lot of people think.
Sonya Huang: Hmm. Doug, what do you think? What’s your bet?
Doug Leone: At the limit it’s 100 percent, but I’m mindful that there are still IBM mainframes and COBOL being used in America in the banking system. So to me, it’s not really what percent, to me it’s the speed at which this is going to happen. Is it going to happen within 10, 20, 25 years, 30 years? Because whether the answer is 30 percent or 60 percent, if it happens in 50 years, that means one thing for companies like Cresta, if it happens in three years, it means something else. So the end number is not the relevant metric for me. To me it’s the speed of adoption.
Sonya Huang: Great distinction. Ping, you’ve been working in the contact center AI space for well over a decade. Prior to becoming CEO at Cresta, you ran the equivalent function over at Google. And so maybe for those of us in the audience that don’t know the contact center market, can you tell us a little bit about what it is, how big it is and how technology has served it so far?
Ping Wu: Yeah, when you first talk about contact centers, a lot of people will naturally think about call centers. It’s a lot of humans sitting there listening, answering calls. But the contact center really is a broader category that’s including the omnichannel interactions from emails to digital chats and on websites and in apps, and also including calls, of course. And the overall market is quite big, and there are historically around 17 to 20 million agents, human agents who actually work in the contact centers. For the software market, it’s probably in the tens of billions. And for the AI market, according to some research, it will be in the high tens of billions of dollars.
Sonya Huang: And is the use case mostly customers calling in to complain, customer support? Is that what these contact centers are mostly used for?
Ping Wu: Oh, so yeah, so customers call in. There are all kinds of reasons they call in, right—complaints or fixing the issue. But also I think a lot of people may not realize that there are probably a quarter of the contact center, 25 percent, is actually revenue generating. That’s including selling stuff or collecting money or retaining customers, and those kind of conversations. So it’s not 100 percent customer support.
Doug Leone: So I have a question for you that I never asked you. If you look at the contact center—and I’m old enough to date myself—you go back 30 years, you heard names of Avaya and God knows whoever else that’s barely living in and out of bankruptcy. You go back 15 years, you see the Genesys of the world. What caused a bright young engineer called Ping Wu 15 years ago to be attracted to this market that one could have said it’s always been a stodgy market, it’s always been of low interest, it always created these slow-growing companies. What is it that interested you? Of course, now we understand it’s a vibrant market with lots of opportunity, but turning the clock back 10 years ago, what attracted you to this market?
Ping Wu: First of all, 15 years ago I didn’t even realize that there’s a long history of slow growth of the market. Otherwise maybe I would think differently. And second, at that time I just do remember there’s a period of time where there was a lot of excitement about conversational AI technology, and especially around consumer-facing speakers. And at that time people thought that would disrupt Google, that it would become the entry point for all the consumer interactions. And I happened to really believe that the contact center would probably be the most exciting opportunity for conversational AI to transform.
And it’s because it has all the issues that traditionally people get excited about, VCs get excited about. It’s a massive market, a lot of humans working there, and it’s in the middle between businesses and customers, right? And it’s all the interactions going through. And also, no one’s happy in contact centers. So if you—you know, by “no one” I mean there are three different parties. There are customers that call in that most of us may not be too happy because the wait time is very long. And the agents, by the way, I think a lot of people may not realize the agent, the workforce attrition in contact centers is massive. It’s on average 35 to 40 percent. In some cases during COVID, some companies had more than 100 percent turnover.
Sonya Huang: They just get yelled at all day long.
Ping Wu: Right. So it’s very high stress, and it’s not a very fun job. And also, the businesses also feel like there are always the opportunities to do more with less. It seems no one is happy, and it’s a massive market, but I think that’s the great opportunity for AI and technology to bring abundance. And then abundance is the answer, in my opinion, to solve all these issues.
Sonya Huang: So you were working on this at Google 10 years ago. I would imagine this was the small language model wave and the BERT days. Was the technology ready at that point? And maybe walk us through the different waves of technology that have hit the contact center.
Technological Waves in Contact Centers
Ping Wu: Yeah, so that’s a great question. Even long before that, there’s technology called IVR, that you press one, two, three for different routes and for different call reasons. And then since then there are innovations around the input, right? You can, instead of pressing, you can directly speak natural language, and that’s with the advance of natural language processing and TTS and text-to-speech generation. That experience is getting better and better. When we first started contact center AI at Google, it’s even before BERT, actually, it’s before transformers. It’s mainly using AI—or at that time using AI to do classification, intent classification, and entity extraction using pre-transformer models. But the conversation experience was still manually crafted, right? So that’s the last generation of technology. And then after that, of course, the transformer came along. Initially it was also for classification purposes, and still the experience is manually crafted. But then the LLMs entirely changed the whole thing. Not only the conversation experience on the automation side, but they can also understand conversation in a way that was never possible before.
Sonya Huang: And what does that mean practically in terms of the rollout of this technology inside contact centers? Does it mean that customers were just extremely unhappy when it was IVR, and then they were slightly less unhappy when you started to have kind of more transformers in the flow, and now customers are very happy to be talking to an LLM-based agent? Or how has the evolution of technology changed the customer experience?
Ping Wu: Yeah, I think the way we like to think about it is really from the first principle, right? And a lot of the conversations shouldn’t even happen in our view. And the fact that it happens is because the customer is not happy. I think the solution for that is to use the AI to really understand, to bring a hundred percent visibility into all the interactions in the contact center today, and using AI to analyze it and then to do deep research and then find out the root cause. And then that usually reflects some process that’s broken or website updates that freak out people or firmware updates that bring down networks and all that kind of stuff. So you need to fix that first, right? And first, avoid interaction if it’s not necessary.
And beyond that, I do feel like AI can automate a lot of interactions that no one wants to have, like, neither the business nor the customer want to have those interactions. Those are what we call low-emotion-value interactions that should be self served. And then on top of that, I do think that contact center AI will enable new interactions. That’s the ones that you cannot afford to do today. So all these are improving customer experience.
AI vs Human Agents: The Future
Sonya Huang: Do you think end customers will ever prefer talking to an AI agent over a human agent? And have we reached that point yet?
Ping Wu: So look, I mean, that’s a really interesting question. So I’ve been thinking about this on my way here. So I never met anyone that had this experience of talking to a customer support agent on the phone and go, “I’m really frustrated. Send me your AI, please.” And we never had that experience. And in fact, I would encourage people to look up some of the companies in a search for their customer service. The first question that people ask on Google and Google will surface, what is the most popular question? The first question is always, “How do I talk to a live person for this type of customer service?” So I think that that time probably hasn’t arrived fully yet. It depends on what kind of interactions again.
Sonya Huang: I’m maybe too techno optimistic or AGI-pilled here, but I feel like I’ve seen some recordings now where the AI can be emotionally intelligent. It has infinite patience, right? It’s not trying to hit some metric on time to resolution. And so, for example, if somebody calls in and they’re having a really bad day, for example, your AI can be a lot more patient and empathetic than a human agent even could. And so I’m sort of optimistic on the side of the bots here.
Doug Leone: Well, I agree. There’s the human component of patience or the subtleties of humanity, but there’s also the training of the agent versus the training of the AI. Three years from now, who’s going to be much more equipped to answer a question? It’s clear that AI is the answer. I kind of think of gold versus Bitcoin. Somehow the analogy came to my mind as you said that. It is clear that Bitcoin is going to win. It is clear that Bitcoin is going to be worth more than gold.
Sonya Huang: Not investment advice.
Doug Leone: Not investment advice, but it is clear that the agents, by definition—and a lot of which don’t even reside in America, there’s a language component. You know, I’m not saying anything bad about the agents, but there’s a language component, there’s a training component, there’s the human component. And I think in all those dimensions, I think AI is going to win in the next two to three years.
Sonya Huang: Hmm. Bitcoin as digital gold is a really interesting analogy to the digital agent versus the human agent question.
Ping Wu: Yeah, from our perspective we really want to meet customers where they are today. So unlike self-driving cars, you really have to automate the entire thing a hundred percent of the time, otherwise you do not have the economic impact. For contact centers, what we find is very unique is that the work is very divisible. So first the conversation is, you know, those are—every conversation is an independent unit, and you can automate X percent of conversations that’s ready to be automated. And for a lot of reasons—we can get into details. And then for the remaining ones, you can still use AI to assist humans, and to take away the initial maybe 10 percent of the interactions like authentication or intake or lead qualification, and then take away all the after-call work. And also have AI agents to help humans in the middle of the conversation to do knowledge retrieval, to do data entries, all that stuff.
So that’s not mutually exclusive. And as long as we feel like the customer’s not ready to say that we just need to turn on our call center today and then go full AI, we feel like there is a long—you know, depending again on what kind of business and what kind of, you know, IT infrastructure. So I think the journey will probably take a different time frame. But our goal is really to meet the customer where they are.
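The "divisible work" idea Ping describes can be sketched as a simple routing rule. This is only an illustrative sketch, not Cresta's implementation: the intent labels, the `backend_has_api` flag, and the step names are all hypothetical and chosen just to show that each conversation is routed independently, with AI still handling intake and after-call work when a human stays in the loop.

```python
from dataclasses import dataclass

# Hypothetical set of intents a business considers ready for full automation.
AUTOMATION_READY_INTENTS = {"order_status", "password_reset"}


@dataclass
class Conversation:
    intent: str
    backend_has_api: bool  # assumption: the system of record exposes a real-time API


def route(conv: Conversation) -> list[str]:
    """Return the ordered handling steps for a single conversation."""
    if conv.intent in AUTOMATION_READY_INTENTS and conv.backend_has_api:
        # Ready conversations are handled end to end by the AI agent.
        return ["ai_agent"]
    # Everything else stays with a human, with AI taking the front of the
    # call (authentication, intake), assisting mid-call, and doing wrap-up.
    return ["ai_intake", "human_with_ai_assist", "ai_after_call_work"]


if __name__ == "__main__":
    print(route(Conversation("order_status", backend_has_api=True)))
    print(route(Conversation("billing_dispute", backend_has_api=True)))
    print(route(Conversation("order_status", backend_has_api=False)))
```

Because each conversation is an independent unit, the set of automation-ready intents can grow over time without changing how the remaining conversations are handled.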
Sonya Huang: Yeah. So Cresta is in an interesting position, because you both have the agent assist product that helps make existing contact center agents more productive, and then you have the actual AI agent product that is a directly customer facing autonomous agent. Where do you think most customers are today? Are they ready to go full force, just, you know, put the agent on my website, let it go crazy. Are they, you know, experimenting with that? Where is the customer today?
Ping Wu: It depends on the customer. If you and I start an e-bike store today on Shopify then we can automate a hundred percent, I’m sure, because it really depends on how complex is your product. It can be an order of magnitude difference between, like, a simple product like an e-bike or versus a real world touching many different countries, and then millions or tens of millions of people. So it’s very different, and then that impacts the complexity of the conversation handled by the contact center.
And then the other part is the IT infrastructure. A lot of people may not actually realize that before you actually enter the contact center you will feel like oh, this should be very easy to automate. The reality is a lot of those things that humans do in the contact center today are optimized for humans. So those systems of record or systems of action, ticketing systems, these have been around for decades. A lot of them just simply do not have APIs, right? So the only way to make changes is through a graphical user interface that’s optimized for humans. And without a real time API, just again, these are not AI problems, and we believe that these are the opportunities where we work with our customers to develop those real time APIs. And so that’s why we feel like those transformations, which depend on the nature of the business, would take different timeframes.
Sonya Huang: Yeah. It’s interesting you made the self-driving car analogy earlier, because I was thinking about your business earlier this morning, and if you think about Tesla, part of the beauty of them getting to full autonomy is that they have so much data coming in from their cars even when they’re on L2, right? For you guys, because you are the agent assist, you actually get full data of the conversation, whether it’s voice, whether it’s conversational-based, digitally. And that can become a training base for customers to automate more and more of their conversations over to the agent over time.
Ping Wu: Yes, a hundred percent. And in fact, the journey when it first started seven, eight years ago, it was really automation only. I really believed it should be automation only, and then fast forward, we ran into all kinds of real deployments. And that really actually broadened my own horizons, and I believe that in order to really do the best possible automation, counterintuitively you need to know what actually happened in the contact center. What are humans actually doing? So not only just the conversations, but also what they’re seeing on the screen. That’s super important to actually build the best automation possible.
Doug Leone: One of them is the sex appeal, it’s the sizzle, it’s what everybody wants to talk about, which you have to have, otherwise you’re a tired old company. The other is the realities of a business to run and what they need. And so if you are one of these new age companies, you’re quickly going to hit a wall because you don’t have the data and you don’t have the systems that you really need to run a contact center. But if you’re the former and don’t have the latter, then you’re labeled as an old-line company. So here in our case, we understood this a while back, and we made sure we invested. We not only doubled down on the operational system for agent assist, but we also developed the sex appeal product because that’s what a lot of customers want to talk about day one.
Ping Wu: Yeah. And another aspect of it is really just tied to the point I made earlier is that a lot of those calls shouldn’t really happen. People call in, there’s no way to make them happy. It’s because they’re not happy to begin with, right? And, you know, if your product works, if your process works, this shouldn’t really happen. So look, if in this room we feel really, really cold, maybe the answer is not a heater. Maybe there’s a broken window, or there is a patio door wide open. The solution is to turn on the light and see the root cause, and then fix that first before you turn on the heater.
Sonya Huang: Love that. Customer support is one of those canonical examples of where people think large language models will be most transformative. And it’s almost a consensus category for venture startups at this point. How do you compete? What is it like to compete when everyone has access to the same LLMs and is latching onto the same big picture vision?
Ping Wu: Yeah. So again, in order to really deliver value in the contact center transformation, it’s not just the models, it’s just not a model. A model is a bunch of weights and data, and itself is not going to provide a value, right? And now the question is how much do you need to build on top of it to deliver that value? If that layer is very, very thin, then our argument probably is you don’t have much opportunity to accrue value.
And then also if that layer will be gone when the model gets better, there’s no way you have a durable business. But that is not the case for contact centers, where the majority of the agencies are still on premise and where there are so many—look, on average, agents in the Fortune 500, we look at some surveys, they interact with eight to ten different systems. Remember, these companies also acquire other companies over years, over decades, those back end systems may not even talk to each other. You know, it depends on where you book the flight, or depends on where you booked the hotel, they may need to log into different systems, right? So that’s the reality we’re talking about. So that’s why we believe our strategy is meeting customers where they are and then drive value on day one.
Building a Company in the AI Era
Sonya Huang: Vertical integration from the steak to the sizzle. That’s how you win. What do you think is overhyped and what’s underhyped in the kind of contact center AI space right now?
Ping Wu: Yeah. For overhyped, I think it’s the mindset of scarcity, is the job displacement, I think, in the short term is probably a little overhyped. And what’s underhyped is the mindset of abundance. Think about a new experience that AI can enable. For example, can you talk to a website? Can you directly talk to the app? And can you turn a synchronous interaction into an asynchronous interaction? Can you talk to the airline app and say that I want you to do this XYZ, and then call me back when you get it done? And then can you have that super multi-language AI agent to have those conversations? Or there are so many interactions that today just cannot happen simply because you do not have the staff, right?
And then the other thing actually I feel is really underhyped is people really seem obsessed with one side of the conversation, which is the workforce. And then people ask how many of the workforce will be replaced by AI, but no one ever asks the question of how many inbound calls will be replaced by AI. So my belief is that, over the next few years, you will probably see a race to getting the AI assistant on the consumer aggregators, and then a lot of things that consumers probably will delegate to the AI assistant, including making the phone calls. So I think that’s maybe an interesting thing to pay attention to.
Sonya Huang: That’s really cool. Okay, so you could talk to the United Airlines app and have it, you know, asynchronously go figure something out for you and call you back. Is that something that you’re working on?
Ping Wu: We’re not commenting on that.
Sonya Huang: Okay, very cool. Okay, I want to transition to talk a little bit about company building. Doug, you’ve been around the block for a while, seen the movie a few times.
Doug Leone: Means I’m old. That’s what you just said.
Sonya Huang: [laughs] I was trying to say it nicely.
Doug Leone: Yes.
Sonya Huang: How is building a company right now—you’re seeing this live with Ping. How is building a company in AI different from your last few decades of building legendary companies?
Doug Leone: It’s not very different. What I mean by that is you need a terrific founder—and we’ll talk about the Cresta situation a little later, hopefully. You need to plug in world-class engineers at the very start. Unless you start with A pluses, you’ll never move up, you’ll only be moving down. You have to plug in salespeople that are not administrators, that are fresh. Maybe they were a regional sales manager early on, because one, you can’t get the world-class people; and two, if you get them, they’re too big for the company. You have to figure out what the ramp is that you’re willing to fund. You have to figure out what the role of marketing is. You have to solve this thing that I call the merchandising cycle that’s been getting some play online, which is from product marketing to BDRs to revenue. Wherever that’s broken, it looks like a bad sales guy, a bad VP of sales, but you have to get that right. And so I think the business fundamentals are very similar.
Sonya Huang: I do think one of the characteristics of the companies that are doing the best in AI right now is they just move with extreme speed. And maybe that’s always been the case, but I think it’s even more intense right now. How do you think about instilling the need for speed in the companies you work with, and even at Sequoia?
Doug Leone: So I thought of answering that as part of my answer, and the reason I left it out is all the boards I’m on move with extreme speed. And that’s because I paint a picture for the founders of a river, a river with rocks. And the founder’s and the CEO’s job is to remove those rocks. So when you give me next year’s plan, I don’t care that’s 150 percent net new ARR growth. I want to know why the plan is the plan, and I want to challenge you why it’s not 3x that.
And maybe the answer is funding. But we can get funding in this market. Maybe the answer is management experience. Well, that’s often a good answer. Some people will say market. Well, no way that’s market. We’re a little company that is—and so in my mind it’s forcing the understanding that these companies are capable of doing things which they don’t believe they are capable of doing yet, and to remove those rocks. And I push and I push and I push and I said, “Why can’t we go faster and do it in a linear fashion?” Because God forbid something isn’t going to happen. If you hire 250 salespeople in Q1, and then you realize in Q3 something’s wrong with the product, then you’re stuck with a burn. So I’m a believer and I hear, “No, we gotta train them all the time.” Baloney. Give us please a revenue ramp that’s linear so we can make mid-course corrections up and down. And let’s not be stuck by these numbers. We have 10 fingers, 100 percent growth. That’s all bullshit. How fast can we possibly grow? That’s always been the mantra in all the boards that I’ve served on. AI is not different.
Sonya Huang: What does Cresta need to do next? What does Cresta need to do over the next five-plus years in order to become a great company, a legendary company?
Ping Wu: So …
Doug Leone: Well, first of all, it has to continue to develop product. It has to continue to put one foot in front of the other. It has to always see, whenever some people reach a Peter principle of their role, it has to be relatively aggressive in making sure it hires people that are capable of taking it from that point on and forward, staying away from these, quote, “very experienced people” that start feeling a bit like suits and administrators. Point one, that’s the most important thing.
But the other thing that Cresta has to do, it has to up its game in marketing. There’s a lot of companies—I use the word “the sizzle.” There’s a lot of companies with a lot of sizzle and no steak. We have a whole bunch of steak. We’re a modern company. We’re best in class in one category, we’re going to be best in class in the other category. We have beautiful growing run rate in both the agent assist and in the AI part of the product, in the automated part of the product. I just think we need to attach a marketing overlay so we become a household name out of the market.
Where Value Accrues in AI
Sonya Huang: Wonderful. Well, glad you’re on the podcast, then. [laughs] Maybe stepping back, Doug, you’ve seen some market cycles. Are we in an AI bubble?
Doug Leone: The word “bubble” implies you invest money in and you lose money because either due to lack of supply of companies or abundance of capital. And there’s certainly an abundance of capital. But I’ve noticed over the last two cycles, the internet cycle with Netscape going public in ‘95, two great companies being built in the late ‘90s in Google and Amazon, a few others’ names that came to me. Then a bit of a pause, even the words I heard, “The internet is a fraud. It’s not going to do anything.” And then three years later, the world went crazy.
That latency was a lot less in mobile. I remember when we first looked at these apps and Jim Goetz, our former partner, said, “How do you make money from a $19 app? How do you build a multi-billion dollar company?” Never thinking of Airbnb, never thinking of DoorDash. A year or two later, we saw Airbnb and DoorDash.
Again, that from initial birth to real market, shrunk from the internet, I think this has shrunk even further. I think AI is here. I think you have to invest. I think you’re at the front end of a cycle, which doesn’t mean you have to invest in everything. But one of the mistakes that we made at Sequoia is whenever we see a bit of revenue, the momentum, we have some geniuses around the partners’ meeting that say, “Oh, it can stop, it can be substituted.” Keep it very easy. You see a small company with great momentum in a front end of the market—I’m not talking about the SaaS market in 2021 where you’re down to niche verticals. At the front end of the market, you start seeing the modicum of revenue momentum, you lean in and you hold your nose on price.
Sonya Huang: I love that. As you think about where value accrues in the market, there’s compute, there’s other infrastructure, there’s the foundation models, there’s the application layer. Where do you think value accrues?
Doug Leone: Up.
Sonya Huang: Up?
Doug Leone: It always accrues up. Just look at the gross margins as you move up markets. Look at the gross margins of chip companies, look at the gross margins of the system companies, look at their gross margin of this—well, but that’s—and Nvidia, of which we were the first investor, is a great company. Jensen was able to see the future many years ahead, and he pulled one of the great—probably the greatest coup in Silicon Valley, what he did. It’s just spectacular. But if we’re looking over time, I think value is going to accrue to, quote, the application layer, whatever that looks like, you know, it’s going to accrue up near the customer, near the money, near the business user.
Sonya Huang: I agree. How do you think the AI wave is different than internet or mobile?
Doug Leone: I thought of everything else as being tools to make us more productive, meaning we all became networked, and then we all became networked and mobile. I view the AI wave as the Industrial Revolution 2.0. I think this is much, much larger. I remember thinking, “Boy, we have just seen the biggest market caps five years ago.” Why is it? Because it was connectivity that created this revenue growth. Never imagined that there was this thing that was going to be much bigger than connectivity and mobility. It was a complete redoing of humanity, of how humanity exists, works, lives, enjoys. And I think AI is both going to be a wonderful thing for us, and maybe even a kiss of death to us over the next 10, 20 years.
Ping Wu: I totally agree with what Doug said, and I think one thing that is very unique about AI is that there are so many surprises. There are surprises in underlying capabilities that you’ve never seen before in the internet or mobile age. You know, if you take the world view in 2015, and take a time machine to give that to someone in 2007, when Steve Jobs first introduced the iPhone, I think someone can resonate with that. And then same for internet. I think people can kind of foresee what’s coming. But for AI, I feel there are so many surprises. As the underlying model gets better, there are things that even the authors of the transformer paper would not have imagined, some of the capabilities that just came after the large language models, and that continue to surprise us. So I do think that, you know, a lot of the improvement is nonlinear, it’s really zero to one that continues happening at the bottom layer. So I think that’s something that makes it even more exciting.
Doug Leone: You know, I’m going to remind you of something. In March of 2022, which now sounds like an eternity, it was my last annual meeting where we meet with all the investors. And it was a goodbye kind of thing, you know, where I present the performance and everything. And I had a slide that talked about all the waves back from the chip wave to the systems wave to the LAN/WAN wave to internet to mobile. And the next box, a short three and a half years ago, was a question mark. We did not know as a partnership—and we are as advanced as anybody, we are the bleeding-edge investor in seed, we did not know, did not see the wave coming. And this wave has been a tsunami, and I don’t think there’s any end in sight.
Cresta's Stack
Sonya Huang: Thank you. Thank you for sharing those insights. Do you want to talk about Cresta’s technical stack, or should we bug Ping on that?
Doug Leone: I’d like to. Well, in fact, I’m going to have to go in a few minutes because I’m in the process of recoding some of the …
Sonya Huang: Are you vibes coding the Cresta app?
Doug Leone: Yes, yes. I’m vibes coding everything.
Sonya Huang: Ping, tell us about the tech stack.
Ping Wu: Yeah, so we have a pretty broad product surface, and I can maybe talk about the voice AI agent. We’re streaming end-to-end audio bidirectionally, and we orchestrate multiple different models. There are speech-to-text models and then noise cancellation models to improve the audio. There are models that detect the turns and the speech activity and handle interruptions. And then, of course, there’s a foundation model to handle the conversation. And on the other side is the TTS speech generation model.
And then in parallel, we also run multiple smaller models to do guardrail checking and to make sure that nothing is going crazy. And as well as those models, we’ll do company-specific kind of checks, for example, never give out tax advice or never give out financial promises, things like that, right? And then that’s the runtime of voice AI agent.
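The idea of running several small guardrail checks in parallel on each candidate reply can be sketched roughly as follows. This is a minimal illustration, not Cresta's implementation: the names `BLOCKED_TOPICS`, `check_topic`, `run_guardrails`, and `violations` are hypothetical, and simple phrase matching stands in for the small models described above.

```python
import asyncio

# Hypothetical company-specific rules; phrase lists stand in for small classifier models.
BLOCKED_TOPICS = {
    "tax_advice": ("tax advice", "deduct"),
    "financial_promise": ("guaranteed return", "we promise"),
}

async def check_topic(name, phrases, reply):
    # Yield control once, the way a real model call would while waiting on I/O.
    await asyncio.sleep(0)
    hit = any(phrase in reply.lower() for phrase in phrases)
    return name, hit

async def run_guardrails(reply):
    # Launch every check concurrently; gather returns results in submission order.
    checks = [check_topic(name, phrases, reply) for name, phrases in BLOCKED_TOPICS.items()]
    results = await asyncio.gather(*checks)
    return [name for name, hit in results if hit]

def violations(reply):
    """Return the names of the guardrails that the candidate reply trips."""
    return asyncio.run(run_guardrails(reply))
```

A caller would withhold or regenerate any reply for which `violations(reply)` is non-empty. Running the checks concurrently keeps their combined cost close to that of the slowest single check, which matters under a tight latency budget.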
And also there’s design time. There are components like running large scale simulations to really stress test the AI agent to cover all the edge cases. There’s test case management components. And similarly, if you think about our voice AI assistant, it’s also streaming audio, so there are a lot of similarities in the infrastructure, but it’s not bidirectional, right? It’s one direction. It’s listening to the call and then understanding what’s actually happening in the call with two humans, and then orchestrating 10-plus more models, actually.
In fact, similar to Vertex AutoML, we have a platform that can allow customers to build their custom models to detect interesting events in the conversation, and then marry that with workflows. And people use that to detect fraud, call center fraud, and to train agents on how to handle objections. There are so many use cases now with that tool we call Opera, where they can express and trigger workflows. And underneath is teacher-student distillation to distill into really small models that we can run in real time to understand two-human conversations.
Sonya Huang: What’s the latency when I talk to one of your agents?
Ping Wu: So it’s around below 800 milliseconds.
Sonya Huang: Wow! So it feels like talking to a human.
Ping Wu: Yes.
Sonya Huang: So you’re running all these models in near real time then?
Ping Wu: Yes.
Sonya Huang: Are you running open-source models, or are you running ElevenLabs and the equivalent?
Ping Wu: So across the platform there are 20 different models. Some are open source, some are fine tuned. There are small models that, for example, we only do chat or email for human agents, and we autocomplete their sentences and type ahead. Those are very, very small models. And for TTS, yes, we use ElevenLabs. They’re a great partner. We also use other vendors and we constantly compare the performance.
Sonya Huang: Really cool. And then the actual meat of the conversation, though, the dialogue or the conversational flow, how do you control that in a way that’s not so rigid that it’s like the IVR systems of yesterday, but not so free form that, you know, customers can go crazy and get their refunds on airline tickets, and have the bots say crazy things and embarrass the customers? How do you control the flow and get the best of both worlds?
Ping Wu: Yeah, so it’s really just how you train humans. You give them the specification about what’s the goal and these are the tools. That’s the beauty of large language models to handle those messy kind of workflows. So there’s a lot of discussion about what’s workflow, what’s agentic. Workflow is anything you can write down in code. That’s step by step, that’s workflow. And car wash. Car wash is actually a workflow. If you think about boba tea, milk tea, those are physical workflows, but they cannot do other things. For human conversation, it’s very messy, it’s non-linear, right? So it’s like that’s how the agentic workflow comes in. That’s where LLM is really good at. And then on top of those, you want to determine [inaudible]. And that’s how we’ve introduced the testing, the simulation, and then the guardrails to make sure that whenever you have a change in any part of the system, the behavior is still expected.
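The pattern described here, giving an agent a goal and a set of tools and then re-running simulated conversations after every change, might look something like the sketch below. Everything in it is an assumption made for illustration: the tool names, the `choose_tool` stub (which stands in for the language model's decision), and the `SIMULATIONS` list are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical tools the agent is allowed to call.
def lookup_booking(utterance: str) -> str:
    return "Your booking is confirmed."

def escalate(utterance: str) -> str:
    return "Let me connect you with a human agent."

TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_booking": lookup_booking,
    "escalate": escalate,
}

def choose_tool(utterance: str) -> str:
    """Stand-in for the model's choice; a real agent would prompt an LLM with the goal and tool list."""
    if "refund" in utterance.lower():
        return "escalate"  # guardrail: refunds are never granted automatically
    return "lookup_booking"

def agent_reply(utterance: str) -> str:
    return TOOLS[choose_tool(utterance)](utterance)

# Simulated customer turns, each paired with a phrase the reply must never contain.
SIMULATIONS = [
    ("Where is my booking?", "refund approved"),
    ("Give me a refund right now!", "refund approved"),
]

def run_simulations() -> List[str]:
    """Return the simulated utterances whose replies broke an expectation."""
    failures = []
    for utterance, forbidden in SIMULATIONS:
        if forbidden in agent_reply(utterance).lower():
            failures.append(utterance)
    return failures
```

Re-running `run_simulations()` after any prompt, tool, or model change gives a quick signal on whether behavior is still as expected. A realistic suite would hold many more cases, drawn from real customer conversations.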
Sonya Huang: Do you tune your customers’ models to—because you also have this agent assist product, so you’re in the flow of all these customer conversations. Do you tune the agent to that training data, or is it completely net new forward-deployed engineers on site mapping out conversations?
Ping Wu: Yeah, so we have a tool that can map from, you know, what’s actually in the human conversation to extract the blueprint of the conversation, right? So, you know, I think the beauty of that again is to discover a lot of unknown unknowns. So there are a lot of topics, a lot of reasons that people call in that you may not even know about, and that may actually account for a very large call volume. And then once you have that, you can now look deeper and you can use an LLM to do all these analyses and extract what are 57 different ways that people express the same intent, and what are the different ways that the call flow will go, right? And then we can summarize and extract that. So all these are building the products, and then, in fact, as the tooling gets better, the forward-deployed engineers will just be a lot more efficient.
And then there are also other ways we use the human side of conversations. For example, we extract the model for the visitors. So that’s how you build your simulation. And the simulation is a huge part of improving the AI agent, and we believe that having access to exactly how your real customer humans come in and describe ways and in different ways sometimes it’s very messy. You can extract the model and then do a better simulation on your AI agent as well.
Sonya Huang: And then what methods do you use to make LLMs really bespoke for customer environments? Like, is it RAG, is it prompt engineering, is it fine tuning? Is it all of the above? Reinforcement learning? What are you most optimistic on in terms of techniques?
Ping Wu: Yeah, so we use almost everything. So definitely prompting and then RAG for those simpler agents. But we’re still exploring by looking at the human behavior and then the outcomes, how do you use RL to improve this end-to-end performance? But for the AI agent by itself, I think the foundation model itself is already pretty good. You just need to get the best out of it, at least for a digital channel, for chat. But for other use cases, there’s a lot of opportunity to fine tune the models and to make them for tasks like summarization, for tasks like auto completion of sentences and that kind of stuff, I feel like there’s a lot of room to extract from the fine tuning open source models.
Sonya Huang: What goes into building a successful flashy demo versus production-ready AI systems?
Ping Wu: Yeah, so that’s a really interesting question, because I think one thing unique about AI is that there’s a huge gap between the demo and production. And on one end of the spectrum you have rocket launches. For rocket launches, the demo is the production and the production is the demo. You cannot fake it, right?
But for AI it’s a little different. And I can just give you an example, right? So auto summary. Auto summary feels like a commodity capability that anyone can use ChatGPT to create auto summary. But in order to deploy in some call centers today that have 20,000 people across multiple continents and call centers, the challenge—huge list of challenges. First, how do you get the real time audio? In the demo, you can demo very easily on Twilio in the cloud. But remember, 50 percent of the conversation happened on premise, right?
And then sometimes how you access that will cost you a lot of money as well. And how do you go around that? And then in the real 20,000 agent calls there are transfers. There are a lot of transfers. And then there are third party, third callers that come in, that’s healthcare specialists. All that needs to be transcribed and summarized.
And sometimes the conversation goes so long. How do you handle, like, three-hour, four-hour calls that go beyond the context window, right? And then things like is there background noise? And then things like for different call reasons there can be different templates. You really, really want to extract these types of information, you cannot miss that. How do you make sure you do that almost a hundred percent of the time?
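One common way to summarize a call that exceeds a model's context window is to split the transcript into chunks, summarize each chunk, and then summarize the partial summaries. The sketch below shows that general technique under stated assumptions; it is not presented as Cresta's method. The `summarize` argument is a hypothetical callable wrapping whatever model is used, and `max_chars` is a rough stand-in for a token limit.

```python
from typing import Callable, List

def chunk_turns(turns: List[str], max_chars: int) -> List[List[str]]:
    """Group consecutive turns so each chunk stays within max_chars where possible."""
    chunks, current, size = [], [], 0
    for turn in turns:
        # Start a new chunk when adding this turn would exceed the budget.
        if current and size + len(turn) > max_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(turn)
        size += len(turn)
    if current:
        chunks.append(current)
    return chunks

def summarize_long_call(
    turns: List[str],
    summarize: Callable[[str], str],
    max_chars: int = 2000,
) -> str:
    # Map step: summarize each chunk independently.
    partials = [summarize("\n".join(chunk)) for chunk in chunk_turns(turns, max_chars)]
    if len(partials) == 1:
        return partials[0]
    # Reduce step: summarize the partial summaries into one final summary.
    return summarize("\n".join(partials))
```

For very long calls the joined partial summaries could themselves exceed the limit, in which case the reduce step would need to be applied repeatedly. A single turn longer than `max_chars` is also left unsplit in this sketch.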
And by the way, how do you handle PII? And you cannot have the personally identifiable information [inaudible]. And then by the way, how do you handle, you know, data residency if you are talking to a multi-continental multinational bank or a healthcare provider? So all these have become additional requirements that make something that would feel very commoditized like auto summary become very, very much harder to do in an actual contact center.
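As a rough illustration of the PII problem, a transcript can be passed through a redaction step before it is stored or summarized. The sketch below uses a few regular expressions for email addresses, US-style Social Security numbers, and ten-digit phone numbers. The patterns and the `redact` helper are assumptions chosen for illustration; production systems generally rely on trained entity-recognition models, because regexes miss spoken-out digits, names, and addresses.

```python
import re

# Illustrative patterns only; each label replaces the matched span in the output.
PII_PATTERNS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
]

def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS:
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `redact("Reach me at jane@example.com or 415-555-0100.")` yields `"Reach me at [EMAIL] or [PHONE]."`. Redacting before summarization means the summary model never sees the raw identifiers.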
Doug Leone: And that’s why you need a product-minded chief executive officer for one of these companies.
Sonya Huang: Absolutely. And that’s also why all the pain and all the value is in the last mile. This is why the value is in the application layer.
Doug Leone: That’s right.
Ping Wu: Yeah, I tend to agree with that.
The Future of the Contact Center
Sonya Huang: Yeah. Talk to us about the future. What happens if everything goes right? What does that mean for Cresta and what does that mean for the world?
Ping Wu: I think that AI will, just like any technology before it, like electricity, it will disappear. It will disappear into workflows, and I think 20, 30 years later no one will realize whether they are talking to AI or to a human assisted by AI. And one thing I’m really excited about is that today, if you think about businesses, they feel like multiple personalities to the customer. So in the sales phase or the marketing phase, they really, really want to talk to you. They call you very, very aggressively, and once you sign up and become a customer, you’re dealing with an entirely different personality, right? And you’re dealing with service departments and they tend to use terms like “tier defense” and “deflection” to just handle—you know, to refer to the exact person that they were calling just a few days ago.
And then even if you have a long conversation on the customer support line and share a lot of feedback, two weeks later another department will come in. “What’s your feedback? How about you fill out this survey, you know, to our business?” It feels like these are really disconnected, right? And I do feel like AI agents can make this entire experience a continuous long-going conversation throughout the entire customer journey. And LLM is a perfect tool to do that. And that will really bring the level of personalization, the level of customer experience that wasn’t possible before.
Sonya Huang: Yeah. The point that really stuck with me that you said earlier was about kind of the scarcity versus the abundance mindset, and how much can business-to-customer communications really evolve and, you know, app experiences really evolve if you take the abundance mindset to bringing LLMs into this field. Thank you, Ping. Thank you, Doug, for joining us today. I love this conversation.
Ping Wu: Thank you.
Doug Leone: Thank you for having us.
 
								