Kimi K2.5: Visual Agentic Intelligence

Covered by 3 sources including runware.ai, fig.inc

arXiv:2602.02276v1. Abstract: We introduce Kimi K2.5, an open-source multimodal agentic model designed to advance general agentic intelligence. K2.5 emphasizes the joint optimization of text and vision so that the two modalities enhance each other. This includes a series of techniques such as joint text-vision pre-training, zero-vision SFT, and joint text-vision reinforcement learning. Building on this multimodal foundation, K2.5 introduces Agent Swarm, a self-directed parallel...

Covered in 3 articles

runware.ai

The closed-source LLM premium has collapsed

Discussed on Hacker News

fig.inc

Breaking Browser-Use Models Using Domain Randomization

Discussed on Hacker News

yinghonglan.substack.com

Long-Form Video Understanding: Bottlenecks and Design Choices – Part 1

Discussed on Substack