When Does Language Matter? Multilingual Instructions Reveal Step-wise Language Sensitivity in Vision-Language-Action Models (opens in new tab) 🖼️CLIP Content type: Academic

arxiv.org··Covered by ai-brief.liziran.com·Open original

Vision-Language-Action (VLA) models have shown strong performance in language-conditioned robotic manipulation, yet their robustness to linguistic variation remains poorly understood. In this work, we present the first systematic multilingual evaluation of VLA models by translating the LIBERO benchmark into ten languages, revealing severe performance degradation under non-English instructions, with success rates dropping by 30-50%. Through fine-...

Read the original article

Sign in to keep reading the full article.

Sign Up Log In

Cited by 1 article

In other languages

Arbor科研增益2.5倍，50环境抵300个

ai-brief.liziran.com·