GEN-0: SoTA 10B+ Foundation Model for Robotics with Harmonic Reasoning
generalistai.com·4h·

For years, foundation models in robotics have relied primarily on vision-language pretraining as the stepping stone towards scaling, which lets us transfer the benefits of semantic generalization from existing large multimodal models. What has been missing is a way to scale large multimodal model training effectively in the domain of robotics itself: to establish scaling laws showing that robot intelligence improves consistently (and predictably) with more compute and data, as has underpinned progress in other domains such as LLMs. This requires an architecture, training procedure, and data engine that push new sensorimotor capabilities, provide behavioral generalization, and grow with the vast and ever…
