Partnering with Skild: The Future of Embodied Intelligence

Most people think robotics is a hardware problem—but Deepak and Abhinav know it’s a software problem.

Published July 9, 2024

Deepak, Stephanie and Abhinav at Sequoia.

Robots performing extreme parkour. Robots manipulating objects, including doors and drawers, with their hands. Robots scaling stairs as high as themselves—forwards, backwards, indoors and outdoors. Robots moving with the fluidity of nature. These advancements have broken the internet over the last few years, and nearly all of them can be traced back to two elite researchers: Deepak Pathak and Abhinav Gupta.

Different Perspectives

Unlike most in the world of robotics, Deepak and Abhinav come from computer vision and deep learning backgrounds. While traditional robotics focused on gathering specific data to train robots for specific tasks in specific environments, Deepak and Abhinav leveraged large-scale data to build a foundation model using their adaptive architecture, based on transformers. The result was unique: a way to unlock intelligence in the embodied, physical world, and a model for robotics that is generalized and robust, and that leads to emergent behavior.

Outlier Founders

Deepak comes from a small town in India. While most promising students moved to bigger cities to prepare for national exams, he opted not to leave, and still got into the Indian Institute of Technology Kanpur. His accomplishment made local headlines. Deepak learned how to program by writing code by hand on paper at home, which he rigorously double-checked before using his limited minutes at the local cafe to run the programs he had written. He excelled at IIT, winning a gold medal in computer science (and later became the youngest winner of the school’s Young Alumnus Award). He went on to pursue a Ph.D. in AI at Berkeley while joining Facebook AI Research (FAIR), co-founded a startup that was later acqui-hired, and then became an assistant professor at the Robotics Institute at CMU, where he has since published cutting-edge research. It goes without saying: Deepak is a force of nature.

Abhinav, too, is a legend in the field: a tenured professor at the Robotics Institute at CMU, and a former founding member and research leader at FAIR Robotics. He and Deepak discussed starting a company for a decade. In early 2023, they saw the acceleration of technical advancements in their field—so many of which they had contributed to—and knew the time was now to scale their work to the next level.

The Opportunity

What exactly was the monumental shift Deepak and Abhinav saw coming? In the years they’d spent pursuing the goal of building general intelligence for robots, the key challenge had been how to build a large-scale model to train a robot when no data at that scale exists. Unlike for large language models, there was no internet of data readily available. So they explored different strategies to learn from what was available: online videos, teleoperation, real-world data, simulations and more. In 2015, they reached their first unlock, scaling robotics data by 1,000x, and in the next few years, they were among the first to experiment with human teleoperation and low-cost robotic teleoperation platforms. In 2017, they proposed the famous curiosity-driven learning algorithm for building agents that can continuously explore and learn on their own. In 2021 and 2022, they broke through again with a promising strategy that used large-scale adaptive SIM2REAL, or virtual-to-real-world training, winning the Best Robotic System Award at the Conference on Robot Learning.

These were the building blocks that led to the vision for Skild: a general-purpose model capable of tackling any task, in any environment, without specific training. If Deepak and Abhinav could accomplish that, they would achieve a breakthrough on the scale of GPT-3—with results that could apply to nearly every field.

Robotics’ GPT-3 Moment

Today, Skild has the potential to build in the real world what OpenAI has built in the digital world. In fact, Deepak and Abhinav’s approach challenges our current notion of AGI—that it can be built by learning from digital knowledge alone. In their vision of AGI, we learn by doing. We build an understanding of how things work by attempting novel tasks in novel environments, and combining the near-instant feedback of what happens with everything we already know.

Imagine that world: one where a foundation model for AI robotics exists that is capable of performing any task, in any environment, on any robot hardware. It would mean a massive expansion of the types of robots we could then build, at a cost orders of magnitude lower than what we see today.

Potential Scale

I’ve now been at Sequoia for nine years, and once in a blue moon, I’m lucky enough to hear a founder’s vision whose ambition and potential scale of impact stop me in my tracks. I first met Deepak and Abhinav on a Thursday afternoon, as they were raising their initial seed round, and we were partners by Tuesday. I remember the awe, wonder, and almost daunting curiosity I felt as I imagined how the world would change if they succeeded. And as we’ve worked together over the past nine months, those feelings have only grown. This team is onto something that could change the world we live in forever.

I’m excited to finally announce our partnership, join the board, and share the team’s latest fundraising milestone: a $300M round valuing the company at $1.5B.