VisioPath: Vision-Language Enhanced Model Predictive Control for Safe Autonomous Navigation in Mixed Traffic
arxiv.org·1d
ViDove: A Translation Agent System with Multimodal Context and Memory-Augmented Reasoning
arxiv.org·3h
On the Effectiveness of Methods and Metrics for Explainable AI in Remote Sensing Image Scene Classification
arxiv.org·2d
TalkFashion: Intelligent Virtual Try-On Assistant Based on Multimodal Large Language Model
arxiv.org·2d