Over the past year, one question has come up again and again: Will robots in the future inevitably become humanoid? Behind this seemingly simple question lie two very different philosophies about how embodied intelligence should develop: what I call *creationism* and *evolutionism*.
By creationism in robotics, I mean the belief that if we can perfect a single, powerful, general-purpose form, the humanoid, it will eventually handle almost all real-world tasks. In this view, one humanoid platform is developed to do everything: moving boxes, cleaning, cooking, caregiving, companionship. The goal is to “create” a universal technological life form.
By evolutionism, I mean the opposite approach: robots should diversify and evolve into different forms, driven by scenarios, costs, and data from large-scale deployment. No embodiment is predefined as the final answer. Robot vacuums, warehouse arms, delivery robots, mobile bases, exoskeletons: each is tested by the market and by reality. Forms that fit real scenarios and economics scale; those that don’t gradually disappear.
As embodied AI moves from demos to deployment, my personal view is that the future will follow the evolutionary path. It better matches how technologies scale, how businesses work, and how our physical world is actually structured.
View 1: One robot for all tasks evolves sequentially; diverse forms evolve in parallel.
Trying to make a single robot form cover all scenarios almost guarantees a slow, sequential path. If a humanoid is expected to clean, cook, provide elder care, fetch items, organize the home, and more, every new domain demands its own motion library, data collection, perception stack, control logic, and safety testing. These are serial additions: each scenario requires a largely separate engineering effort. Skill transfer between very different tasks is limited, so there is no simple “solve X and get Y and Z for free.”
At the same time, humanoids still face difficult challenges in high degree-of-freedom control, robust balance, compliant force control, and perception in cluttered environments. In many everyday tasks, their structural advantages are small or non-existent compared with simpler machines. Pushing one humanoid platform to “do everything” tends to produce a long list of partial capabilities, each added slowly and at high cost.
Reality proceeds differently. Multiple robot forms have already reached large-scale deployment and are improving rapidly in their own domains. Cleaning robots now have a global installed base in the hundreds of millions. Their navigation, path planning, obstacle avoidance, and floor modeling capabilities have been trained and stress-tested by billions of hours of operation in real homes. Delivery robots on campuses and in city districts run high-frequency missions in complex environments, forcing fast progress in localization, routing, and recovery from errors. In warehouses, manipulators and automated guided vehicles (AGVs) perform enormous numbers of pick-and-place and transport operations every day, constantly generating data that sharpens grasping and motion strategies.
This is what parallel evolution looks like: many forms evolving side by side in high-frequency niches, creating an ecosystem of capabilities that grows in a networked, compounding way. The efficiency comes from concurrency and specialization, not from polishing a single “one-size-fits-all” body.
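The serial-versus-parallel argument can be made concrete with a toy wall-clock model. The sketch below is only an illustration: the scenario names come from the text above, but the effort figures, the `transfer` parameter, and both function names are assumptions introduced here. It also assumes that each specialized form has its own team, so the parallel path saves calendar time, not total engineering effort.

```python
# Hypothetical engineering effort, in months, to make each scenario deployable.
# These figures are illustrative assumptions, not measurements.
EFFORT_MONTHS = {
    "cleaning": 18,
    "cooking": 30,
    "elder_care": 36,
    "fetching": 12,
    "home_organizing": 24,
}


def sequential_months(effort, transfer=0.0):
    """Wall-clock months when one platform adds scenarios one after another.

    `transfer` is the assumed fraction of effort saved on every scenario
    after the first through skill reuse (0.0 means nothing carries over).
    """
    if not 0.0 <= transfer < 1.0:
        raise ValueError("transfer must be in [0, 1)")
    months = list(effort.values())
    if not months:
        return 0.0
    return months[0] + sum(m * (1.0 - transfer) for m in months[1:])


def parallel_months(effort):
    """Wall-clock months when each scenario gets its own form and team."""
    return float(max(effort.values(), default=0.0))


if __name__ == "__main__":
    print("one platform, no skill transfer:", sequential_months(EFFORT_MONTHS))
    print("one platform, 20% skill transfer:", sequential_months(EFFORT_MONTHS, 0.2))
    print("specialized forms in parallel:", parallel_months(EFFORT_MONTHS))
```

With these made-up numbers, the single platform needs 120 months of calendar time without skill transfer, while the parallel ecosystem is bounded by its slowest scenario at 36 months. Even a generous transfer assumption narrows the gap only partially, which is the point of the argument above.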
View 2: Forms with shipment scale have the strongest evolutionary momentum.
If multi-form evolution is the right pattern, the next question is: Which forms are best positioned to evolve fastest? In practice, it is the robots that already ship at scale.
Robot vacuums, delivery carts, service platforms, and mobile bases did not win because they looked human, but because they reached millions or tens of millions of units per year. That scale brings two things. First, a deep supply chain: motors, wheels, gearboxes, cameras, lidars, IMUs (inertial measurement units), batteries, and compute platforms become cheaper, more reliable, and more energy-efficient when refined at volume. Second, vast real-world data: these robots operate in homes, restaurants, hotels, campuses, factories, and warehouses, encountering real noise, real dirt, real people, and real edge cases. Their algorithms and mechanical designs mature under continuous, messy feedback.
Once a platform has this kind of base, lightweight extensions become very powerful. Add a small arm to a cleaning robot and it can pick up toys or socks, place items into bins, or handle basic tidying. Add a gripper or drawer-opening module to a delivery base and it moves from “door-to-door” to “door-and-handoff.” Navigation, power, connectivity, and safety have already been solved; new capabilities ride on that infrastructure.
This is a genuine business flywheel: scale lowers cost and raises reliability; better value drives more deployments; more deployments generate more data, which improves performance further. From an evolutionary standpoint, embodied AI is unlikely to spring from an ideal body designed in isolation. It will emerge from upgrading and combining the forms that already have scale, data, and supply chains.
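One way to picture this flywheel is with a standard experience-curve (Wright's law) cost model, in which unit cost falls by a fixed fraction each time cumulative volume doubles. The sketch below is a toy under stated assumptions: the 20% learning rate, the demand `elasticity`, and every number in the usage example are hypothetical, and the data-driven performance gains described above are not modeled at all.

```python
import math


def unit_cost(first_unit_cost, cumulative_units, learning_rate=0.2):
    """Experience-curve cost: each doubling of volume cuts cost by learning_rate."""
    if cumulative_units < 1:
        raise ValueError("cumulative_units must be at least 1")
    if not 0.0 <= learning_rate < 1.0:
        raise ValueError("learning_rate must be in [0, 1)")
    exponent = math.log2(1.0 - learning_rate)  # negative when learning_rate > 0
    return first_unit_cost * cumulative_units ** exponent


def simulate_flywheel(first_unit_cost, yearly_units, years,
                      learning_rate=0.2, elasticity=1.5):
    """Return a list of (year, cumulative_units, unit_cost) tuples.

    Assumed demand response: shipments in a year scale with
    (reference_cost / last_year_cost) ** elasticity, so cheaper units
    ship in larger numbers, which in turn lowers cost again.
    """
    reference_cost = unit_cost(first_unit_cost, yearly_units, learning_rate)
    cumulative = 0.0
    cost = reference_cost
    history = []
    for year in range(1, years + 1):
        shipments = yearly_units * (reference_cost / cost) ** elasticity
        cumulative += shipments
        cost = unit_cost(first_unit_cost, cumulative, learning_rate)
        history.append((year, cumulative, cost))
    return history


if __name__ == "__main__":
    # Hypothetical platform: $5,000 first-unit cost, 100,000 units in year one.
    for year, cumulative, cost in simulate_flywheel(5000.0, 100_000, 5):
        print(f"year {year}: {cumulative:,.0f} units shipped, ${cost:,.2f} per unit")
```

The loop is what matters here, not the specific curve: volume lowers cost, lower cost raises volume, and a form that starts with no shipments never enters the loop.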
View 3: Using humanoids for everything runs into a hard wall—cost.
For humanoids, the critical bottleneck is often not physics, but economics. We can build impressive prototypes; the harder question is whether their cost structure matches the value of the tasks they perform.
Today’s humanoids typically combine dozens of actuated joints, expensive gearboxes and motors, rich sensor suites, large batteries, and high-end compute. Even with optimistic volume assumptions, unit costs will likely stay in the tens of thousands of dollars for some time—more like a high-end car than a home appliance.
If such a robot spends most of its time wiping tables, carrying drinks, folding laundry, or doing other low- to medium-complexity tasks, most of its hardware potential is idle. It is, effectively, like using a rocket for parcel delivery. By contrast, specialized or semi-general-purpose robots that have already scaled tend to be “just enough”: enough sensors, enough degrees of freedom, enough reliability to solve a specific class of problems at a price point that makes sense for households or enterprises. That is why vacuums can succeed at a few hundred dollars, service robots can be used in restaurants, and warehouse arms and AGVs can be justified as capital investments.
In embodied intelligence, cost is a selection pressure. Forms that deliver sufficient value at acceptable cost survive and scale; forms that are overbuilt for their main use cases may be spectacular on stage, but struggle to cross the commercial chasm. If we insist on making humanoids the universal gateway for all tasks, we risk making them economically unviable in many of the workhorse scenarios that dominate the actual world.
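The selection pressure can be expressed as simple payback arithmetic. The sketch below is illustrative only: the function name and every dollar figure are assumptions chosen to match the rough price bands mentioned above (a few hundred dollars for a vacuum, tens of thousands for a humanoid), not data about any real product.

```python
import math


def payback_years(unit_cost, annual_value, annual_upkeep=0.0):
    """Years until a robot's net yearly value covers its purchase price.

    Returns math.inf when upkeep consumes all of the value delivered.
    """
    net_per_year = annual_value - annual_upkeep
    if net_per_year <= 0:
        return math.inf
    return unit_cost / net_per_year


if __name__ == "__main__":
    # Hypothetical figures: both robots mostly perform light household chores.
    vacuum = payback_years(unit_cost=400, annual_value=300, annual_upkeep=50)
    humanoid = payback_years(unit_cost=50_000, annual_value=3_000, annual_upkeep=2_000)
    print(f"robot vacuum pays back in {vacuum:.1f} years")
    print(f"humanoid on the same class of chores pays back in {humanoid:.1f} years")
```

Under these assumed numbers the vacuum pays for itself in 1.6 years and the humanoid in 50. The humanoid's payback only becomes reasonable when it is assigned tasks whose yearly value is far higher, which is the sense in which overbuilt forms struggle in workhorse scenarios.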
View 4: Large-scale evolution will surface common functions, and some may converge toward humanoids.
None of this means humanoids have no role. An evolutionary lens clarifies where they might genuinely matter.
As different robot forms scale, their underlying capabilities increasingly overlap: navigation, SLAM (simultaneous localization and mapping), obstacle avoidance, grasping primitives, visual understanding, balance, compliance, environment modeling. We already see this in practice. The SLAM stack in a floor-cleaning robot and in an indoor AMR (autonomous mobile robot) may share more code and concepts than their shapes suggest. Over time, a reusable library of embodied “skills” will crystallize across many forms.
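A minimal sketch of what such a shared skill library might look like, assuming skills declare the hardware they need and robot forms declare the hardware they have. The class names, skill names, and hardware labels are all hypothetical and chosen for illustration; no existing robotics framework is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    requires: frozenset  # hardware capabilities the skill depends on


@dataclass(frozen=True)
class RobotForm:
    name: str
    hardware: frozenset  # hardware capabilities this body provides


# Hypothetical skill library shared across forms.
SKILLS = [
    Skill("slam_navigation", frozenset({"mobile_base", "range_sensor"})),
    Skill("obstacle_avoidance", frozenset({"mobile_base", "range_sensor"})),
    Skill("grasp_primitive", frozenset({"arm", "camera"})),
]


def usable_skills(form, skills=SKILLS):
    """Names of the skills whose hardware requirements the form satisfies."""
    return [s.name for s in skills if s.requires <= form.hardware]


def shared_skills(a, b, skills=SKILLS):
    """Sorted names of the skills that both forms can run."""
    return sorted(set(usable_skills(a, skills)) & set(usable_skills(b, skills)))


VACUUM = RobotForm("floor_cleaner", frozenset({"mobile_base", "range_sensor"}))
AMR = RobotForm("indoor_amr", frozenset({"mobile_base", "range_sensor", "camera"}))
ARM = RobotForm("warehouse_arm", frozenset({"arm", "camera"}))

if __name__ == "__main__":
    print(shared_skills(VACUUM, AMR))
    print(shared_skills(VACUUM, ARM))
```

In this toy setup the floor cleaner and the indoor AMR share both navigation skills despite their different shapes, while the cleaner and the fixed arm share none. A body that combined a mobile base, an arm, and cameras would inherit all three skills, which is how convergence toward richer forms could follow from the library instead of preceding it.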
In certain high-value scenarios, it may then be natural to package these skills into a more humanoid body, not because humanoids are abstractly superior, but because the world is built for humans. Door handles, tools, stairs, workbenches, cabinets, and corridors all assume a bipedal body with two arms and eyes at roughly human height. Where tasks depend heavily on existing tools and infrastructure, or where robots must work shoulder to shoulder with humans in unmodified spaces, humanoid-like morphology can be a genuine advantage.
If such convergence happens, it should be seen as the result of evolution, not its starting dogma. Today, the robots that truly scale are arms and wheeled bases, not bipedal walkers. The consumer hits are vacuums and simple home devices, not universal household butlers. The market is already indicating which forms currently balance cost, reliability, and fit.
The right question is not “Humanoids: yes or no?” but “In which specific scenarios, after the ecosystem has evolved, does a humanoid end up being the best form?” That answer will emerge from deployment, not by design.
Data-Rich Decisions
My personal view is that the future shape of embodied intelligence will not be decided by whoever draws the most inspiring humanoid sketch. It will be decided by the long, noisy, data-rich process through which millions of robots collide with the real world and either survive or fail.
Rather than trying to build a single omnipotent humanoid robot, we should embrace a diverse ecosystem of robot forms evolving in parallel and let cost, scenarios, and scale do their work. If humanoids eventually emerge as a convergent solution in some domains, it will be because evolution and the market chose them, not because we declared them the answer in advance.
Shaoshan Liu is a member of the ACM U.S. Technology Policy Committee and a member of the U.S. National Academy of Public Administration’s Technology Leadership Panel Advisory Group. His educational background includes a Ph.D. in Computer Engineering from the University of California, Irvine, and a Master of Public Administration (MPA) degree from Harvard Kennedy School.
Creationism and Evolutionism in Embodied Intelligence
© 2025 Copyright held by the owner/author(s).