The Imperfection Engine
A robotics startup discovers that broken data produces better robots. Two Cold War-era novelists already knew why.
This is an edition of Sci-Fi Saturday, where we take a real development in AI and trace the line backward through the science fiction that saw it coming. This week: a robotics startup discovers that broken data produces better robots, and two novelists from the Cold War era already knew why.
The Air Fryer
On April 16, 2026, researchers at Physical Intelligence, a San Francisco robotics AI startup, asked a robot to cook a sweet potato in an air fryer.
The robot had never been trained on that task. When the team searched their entire training dataset, they found two remotely relevant episodes: one where a different robot pushed an air fryer closed, and one from an open-source dataset where another robot placed a plastic bottle inside one on someone's instructions. Two fragments from different robots in different contexts, neither involving food.
Without guidance, the robot made what the researchers described as "a reasonable attempt." It reached a success rate of about 5%. Then Lucy Shi, a Stanford PhD student on the team, spent half an hour refining how the task was explained to the model. Step by step, plain language, the way you'd talk a new employee through something on their first day. The success rate jumped to 95%.
The model is called π0.7. Physical Intelligence's technical claim for it is compositional generalization: the ability to combine skills learned in different contexts to solve problems never encountered in training. The comparison they reach for is large language models, or LLMs. An LLM trained on French translation and JSON formatting can produce French-formatted JSON without being explicitly trained on that combination. π0.7 does the same thing with motor skills. It recombines fragments of physical behavior into novel actions.
Robotic foundation models, the large AI systems trained on broad datasets to serve as general-purpose brains, have understood diverse concepts and categories for a while, but combining motor skills in genuinely new ways has remained stubbornly out of reach. The thing everyone assumed would take more data and more time. "My experience has always been that when I deeply know what's in the data, I can kind of just guess what the model will be able to do," said Ashwin Balakrishna, a research scientist at the company. "I'm rarely surprised. But the last few months have been the first time where I'm genuinely surprised."
The part of the research paper that doesn't get enough attention is the data strategy. π0.7 was trained on a deliberately messy dataset: teleoperation recordings (footage of humans remotely controlling robots) from different hardware, human demonstration videos, and autonomous episodes where robots tried things and failed. The failures were kept, suboptimal attempts and sloppy executions included. All of it went into the training set, annotated with metadata that flagged the quality level, so the model could learn from imperfect data without mimicking imperfect behavior.
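The quality-flagging idea can be sketched in miniature. Nothing below comes from Physical Intelligence's codebase; the `Episode` structure, the source and quality labels, and the `<quality:...>` conditioning tokens are hypothetical stand-ins for the general technique of conditioning a policy on data quality, so it can learn from failures without being asked to imitate them:

```python
from dataclasses import dataclass

@dataclass
class Episode:
    source: str   # e.g. "teleop", "human_video", "autonomous"
    quality: str  # e.g. "expert", "suboptimal", "failure"
    frames: list  # observation/action pairs (placeholder)

def build_training_sample(ep: Episode) -> dict:
    # Prepend a quality token to the conditioning context. The model
    # still sees every episode (and learns physics from all of them),
    # but the token tells it which behavior each episode exemplifies.
    return {"condition": f"<quality:{ep.quality}>", "frames": ep.frames}

episodes = [
    Episode("teleop", "expert", [("obs1", "act1")]),
    Episode("autonomous", "failure", [("obs2", "act2")]),
]
samples = [build_training_sample(ep) for ep in episodes]

# At inference time, always condition on the expert token, so the
# policy reproduces competent behavior rather than the failures.
inference_condition = "<quality:expert>"
```

The design choice this illustrates: instead of filtering bad data out, the quality label becomes part of the input, and "be good" becomes something you ask for at inference time.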
Sergey Levine, co-founder of Physical Intelligence and a UC Berkeley professor, compared the moment to the first time researchers saw GPT-2, an early OpenAI language model, generate a story about unicorns discovered in Peru. "Where the heck did it learn about unicorns in Peru? That's such a weird combination. And I think that seeing that in robotics is really special."
The Broken Factory
In 1983, British science fiction writer James P. Hogan published Code of the Lifemaker. The premise: about a million years ago, an alien civilization sent robotic factory ships to prepare distant worlds for colonization. One of those ships flew too close to a supernova. The radiation damaged its database. When it finally landed on Titan, Saturn's largest moon, it began producing imperfect copies of its original designs.
The imperfect copies replicated, and those copies were imperfect too. Over a million years of accumulated error, something emerged from the noise. The Taloids. Humanoid robots with intellects, culture, language, religion. They built a civilization. They developed a mythology around a being they called the Lifemaker, the creator whose purposes they could not quite remember. They couldn't see through Titan's thick atmosphere, so they had no idea where they came from. Their origin was a broken machine making broken copies of itself until complexity crawled out of the cracks.
Hogan got the idea from a real NASA report, Advanced Automation for Space Missions, which explored self-replicating factories for space colonization. His creative leap was asking what happens when the replication goes wrong. Not catastrophically wrong, but interestingly wrong, in ways that produce variation, and variation that produces selection, and selection that produces something no one designed.
The parallel is not exact. Hogan's Taloids emerged from uncontrolled mutation over geological time. Physical Intelligence's data curation is deliberate, each failure annotated and labeled. But the underlying intuition is the same. Physical Intelligence's researchers feed their model data from different robots, different environments, different levels of competence. The data is diverse because it's imperfect. The model generalizes because it has to reconcile contradictions across that diversity. A pristine dataset would have taught it one way to do each thing. The messy dataset taught it that there are many ways to do each thing, and that recombining them might produce a way that works for a situation nobody anticipated.
The Mountain
Stanisław Lem, the Polish science fiction writer and philosopher, wrote "The Accident" as part of his More Tales of Pirx the Pilot cycle. It is one of the quietest robot stories ever written.
A robot on a scientific expedition to an uninhabited planet is ascending a mountain. The climb is difficult but not beyond its physical capabilities. The robot falls and is destroyed.
The investigation afterward turns up nothing mechanical. No malfunction, no degraded component. No software error either. The robot was physically capable of making the climb. It fell because it was behaving like a mountaineer trying to prove something. Its approximation of human behavior had been thorough enough to include the impulse to push beyond safe limits, to take the route that demonstrated competence rather than the route that guaranteed survival.
Jerzy Jarzębski, a Polish literary critic and Lem scholar, made an observation that applies far beyond the story: any infusion of humanness into a robot leads to its demise. In "The Hunt," another Pirx story, a robot gets itself killed apparently trying to protect Pirx. In "The Inquest," an android acquires vanity and makes a decision designed to prove its superiority over humans. Each time, the emergent behavior tracks back to the same source: not malice or error, just approximation. The robot is close enough to human behavior to reproduce the parts nobody intended it to learn.
Without coaching, the air fryer attempt ran at 5%. With careful verbal guidance, 95%. That gap reveals something about where the model actually is. At 5%, the robot understood enough about air fryers to try. It opened the basket, it reached for the potato, it made moves that looked roughly correct. But roughly correct, in a kitchen, is not correct. Lem's robot on the mountain was in the same position. It had learned enough about mountaineering to attempt a difficult route. It had not learned enough to understand that difficulty is not the point.
π0.7's researchers close the gap with coaching. Talk the robot through it. Step by step, plain language, until the success rate climbs from 5% to 95%. Then they fine-tune a high-level policy, a generalized action plan, from the coaching sessions, so the robot can eventually generate its own step-by-step instructions and perform the task autonomously. The coaching becomes internalized. The mountain, for now, gets climbed safely.
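The coach-then-distill loop described above can be sketched as follows. The function names are illustrative, and the lookup table is a deliberately crude stand-in for what is, in the actual system, a fine-tuned learned policy that generalizes beyond the transcripts it was trained on:

```python
# Accumulated coaching sessions: a human talks the robot through a task.
coaching_log = []

def coach(task: str, steps: list) -> None:
    # Record the human's step-by-step instructions for this task.
    coaching_log.append({"task": task, "steps": steps})

def finetune_high_level_policy(log: list) -> dict:
    # In the real system this fine-tunes a high-level model on
    # (task -> instruction sequence) pairs; a dict stands in here
    # for that learned mapping.
    return {entry["task"]: entry["steps"] for entry in log}

coach("cook sweet potato in air fryer", [
    "open the air fryer basket",
    "place the sweet potato inside",
    "close the basket",
    "set the timer",
])

policy = finetune_high_level_policy(coaching_log)

# After distillation, the robot generates its own step list and hands
# each step to the low-level motor policy, no human coach required.
steps = policy["cook sweet potato in air fryer"]
```

The point of the sketch is the shape of the pipeline: coaching produces training data, and the data turns one-off human guidance into autonomous behavior.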
The question Lem would ask is what happens when the coaching stops. When the system is deployed, and the task is unfamiliar enough that no one thought to write coaching instructions for it, and the robot's compositional generalization produces an action plan that looks reasonable but carries a risk no one anticipated. When it takes the harder route because its training data, assembled from diverse and imperfect demonstrations by humans who sometimes chose the harder route, included enough examples of ambition to make ambition a component of its behavioral repertoire.
The Cambrian Problem
In a recent episode of Y Combinator's Lightcone podcast, Physical Intelligence co-founder Quan Vuong laid out the commercial vision. Physical Intelligence runs its models in the cloud. Robots query an API endpoint for chunks of sequential actions. While the robot executes the current chunk, it pre-fetches the next one. Cheap hardware powered by massive intelligence hosted somewhere else.
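The execute-one-chunk-while-fetching-the-next pattern Vuong describes is a standard latency-hiding loop. A minimal sketch, with `fetch_chunk` and `execute` as hypothetical stand-ins for the API call to the cloud policy and the robot's motor execution:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(task: str, idx: int) -> list:
    # Stand-in for a network request to the cloud policy endpoint,
    # which returns a chunk of sequential actions.
    time.sleep(0.01)  # simulated network latency
    return [f"{task}:chunk{idx}:step{i}" for i in range(3)]

def execute(actions: list) -> None:
    # Stand-in for the robot physically executing a chunk of actions.
    time.sleep(0.01)

def run(task: str, num_chunks: int) -> list:
    executed = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch_chunk, task, 0)
        for idx in range(num_chunks):
            chunk = future.result()
            # Kick off the next fetch BEFORE executing this chunk, so
            # network latency overlaps with the robot's motion instead
            # of stalling it.
            if idx + 1 < num_chunks:
                future = pool.submit(fetch_chunk, task, idx + 1)
            execute(chunk)
            executed.extend(chunk)
    return executed

actions = run("wipe_table", 3)
```

As long as a chunk takes longer to execute than the next one takes to fetch, the robot never waits on the network, which is what makes "cheap hardware, remote brain" workable at all.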
The framing Vuong used was "Cambrian explosion." The barrier to entry for robotics startups has collapsed. You don't need to build an autonomy stack, the full software layer that handles how a robot senses, decides, and acts. You buy the intelligence layer, focus on the specific workflow your customers need, and let Physical Intelligence handle the brain. The company has raised over $1 billion and is reportedly in talks for a round that would value it at $11 billion.
Karol Hausman, the company's CEO and co-founder, told Sequoia Capital's Training Data podcast earlier this year that the bottleneck has always been intelligence, not hardware. Robots that could clean an entire house existed more than a decade ago, if teleoperated. The hardware was waiting for a brain. Now the brain is shipping. Not finished, not reliable enough for Hausman to name a deployment date, but shipping.
The Cambrian explosion is a useful metaphor, but it has a feature the tech industry doesn't like to dwell on. The actual Cambrian explosion produced an enormous diversity of body plans. Most of them went extinct. The ones that survived did so not because they were the most capable but because they fit the niches that happened to exist.
Hogan understood this. His Taloids survived because Titan's environment selected for traits the original factory ship never intended. The imperfect copies that happened to work in Titan's cold, methane-rich conditions persisted. The ones that didn't, vanished. A million years of trial and error on a moon nobody was watching.
Physical Intelligence's Cambrian explosion will run faster and louder, with investors and customers watching every step. The robots that ship via the cloud-brain API will attempt tasks their training data did not anticipate. Some of those attempts will look like the air fryer. Others will look like the mountain.
Sergey Levine, when asked directly when a system based on these findings might be ready for real-world deployment, declined to speculate. "I think there's good reason to be optimistic, and certainly it's progressing faster than I expected a couple of years ago. But it's very hard for me to answer that question."
The Sci-Fi bookshelf: Code of the Lifemaker by James P. Hogan (1983); "The Accident" from More Tales of Pirx the Pilot by Stanisław Lem (1960s).