AI Paper Briefing

20B searcher with externalized state matches the frontier

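Deleting stale observations from a search agent's context to save tokens gives an inverted-U benefit rather than a monotonic one: in a sweep over models from 4B to 284B and three retrievers, a strong retriever paired with a mid-sized model pays off best, while a model that is already strong ends up deleting useful evidence as well and loses accuracy.

Moving the "bookkeeping" out of the policy and into the environment, a 20B searcher reaches an average recall of 0.730: 11.4 points above the next-best open-source search sub-agent, with the largest gains on held-out transfer benchmarks.

Putting figures into a report is easy, but whether they are correct has gone unchecked: TVIR uses 100 expert-curated multimodal deep-research tasks and treats "the factual reliability of visual elements and their alignment with the text" as a separate evaluation dimension.

Teaching a model to infer intent with zero annotations: MindZero uses a planner's ability to explain behavior as a self-supervised reward, relies on heavy inference during training, and is distilled into a single forward pass for deployment, outperforming slow and expensive model-based methods in gridworld and household settings.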
