7B video agent overtakes a 72B model
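Long-video understanding does not require watching every frame: OmniAgent models perception as a reasoning action that the agent decides on autonomously. The 7B agent reaches 50.5% on LVBench, overtaking Qwen2.5-VL-72B, a model 10 times larger, and it also shows positive test-time scaling | AI Paper Briefing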