🤖 AI Agents - Bingran · Scour

Autoregressive Diffusion World Models for Off-Policy Evaluation of LLM Agents

💬LLMs Academic

Less-relevant results

Craig Federighi details Apple’s collaboration with Google for Siri AI in iOS 27

🔄Transformers

9to5mac.com · Hacker News, r/apple

Thoughts on starting new projects with LLM agents

🎮Reinforcement Learning

eli.thegreenplace.net · Lobsters, Hacker News

datasette-agent-edit 0.1a0

📐Scaling Laws

simonwillison.net

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

🎮Reinforcement Learning Academic

VESTA: A Fully Automated Scenario Generation and Safety Evaluation Framework for LLM Agents

💬LLMs Academic

Can Open-Source LLM Agents Replace Static Application Security Testing Tools? An Empirical Assessment

💬LLMs Academic

Silent Failure in LLM Agent Systems: The Entropy Principle and the Inevitable Disorder of Autonomous Agents

🎮Reinforcement Learning Academic

OCELOT: Inference-Leakage Budgets for Privacy-Preserving LLM Agents

💬LLMs Academic

MemToolAgent overview with a simple restaurant booking scenario where the agent retrieves similar memories, receives feedback on an invalid time format, and generates a reflection to update its memory

⚙️Model Training Academic

Exploration Structure in LLM Agents for Multi-File Change Localization

💬LLMs Academic

ConMem: Structured Memory-Guided Adaptation in Training-Free Multi-Agent Systems

💬LLMs Academic

Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production

🖥️ML Systems Academic

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

💬LLMs Academic

AGENTSERVESIM: A Hardware-aware Simulator for Multi-Turn LLM Agent Serving

🖥️ML Systems Academic

Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness

💬LLMs Academic

SecureClaw: Clawing Back Control of LLM Agents

💬LLMs Academic

Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems

💬LLMs Academic

REFLECT: Intervention-Supported Error Attribution for Silent Failures in LLM Agent Traces

💬LLMs Academic

My Chemical Harness: Evolutionary Molecular Design over Synthetic Pathways with Large Language Model Agents

⚙️Model Training Academic
