AI Agents

Autoregressive Diffusion World Models for Off-Policy Evaluation of LLM Agents

 💬LLMs  Content type: Academic
arxiv.org·
Less-relevant results

Craig Federighi details Apple’s collaboration with Google for Siri AI in iOS 27

 🔄Transformers

Thoughts on starting new projects with LLM agents

 🎮Reinforcement Learning

datasette-agent-edit 0.1a0

 📐Scaling Laws
simonwillison.net·

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

 🎮Reinforcement Learning  Content type: Academic
arxiv.org·

VESTA: A Fully Automated Scenario Generation and Safety Evaluation Framework for LLM Agents

 💬LLMs  Content type: Academic
arxiv.org·

Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment

 💬LLMs  Content type: Academic
arxiv.org·

Silent Failure in LLM Agent Systems: The Entropy Principle and the Inevitable Disorder of Autonomous Agents

 🎮Reinforcement Learning  Content type: Academic
arxiv.org·

OCELOT: Inference-Leakage Budgets for Privacy-Preserving LLM Agents

 💬LLMs  Content type: Academic
arxiv.org·

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

 ⚙️Model Training  Content type: Academic
arxiv.org·

Exploration Structure in LLM Agents for Multi-File Change Localization

 💬LLMs  Content type: Academic
arxiv.org·

ConMem: Structured Memory-Guided Adaptation in Training-Free Multi-Agent Systems

 💬LLMs  Content type: Academic
arxiv.org·

Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production

 🖥️ML Systems  Content type: Academic
arxiv.org·

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

 💬LLMs  Content type: Academic
arxiv.org·

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

 🖥️ML Systems  Content type: Academic
arxiv.org·

Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness

 💬LLMs  Content type: Academic
arxiv.org·

SecureClaw: Clawing Back Control of LLM Agents

 💬LLMs  Content type: Academic
arxiv.org·

Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems

 💬LLMs  Content type: Academic
arxiv.org·

REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

 💬LLMs  Content type: Academic
arxiv.org·

My Chemical Harness: Evolutionary Molecular Design over Synthetic Pathways with Large Language Model Agents

 ⚙️Model Training  Content type: Academic
arxiv.org·
