🕵️ AI Agents - cwensel · Scour

ADK Arena: Evaluating Agent Development Kits via LLM-as-a-Developer

🔗LLM Workflows Academic

Self-Paced Curriculum Reinforcement Learning for Autonomous Superbike Racing in Simulation

🤖AI Coding Academic

Latent Reasoning Guidance for Parallel Code Translation

🧠LLMs Academic

From Holistic Evaluation to Structured Criteria: Rubrics Across the Evolving LLM Landscape

🔗LLM Workflows Academic

Entropy-Based Evaluation of AI Agents: A Lightweight Framework for Measuring Behavioral Patterns

📚RAG Academic

Representational Similarity and Model Behavior in Multi-Agent Interaction

🔗LLM Workflows Academic

RSC: Decentralized Rigid Formation Flocking for Large-Scale Swarms via Hybrid Predictive Control and Online Reconfiguration

🤖AI Coding Academic

MAVIS: Multi-Agent Video Retrieval via Structured Video Understanding

🔗LLM Workflows Academic

Plan First, Judge Later, Run Better: A DMAIC-Inspired Agentic System for Industrial Anomaly Detection

🔗LLM Workflows Academic

FALSIFYBENCH: Evaluating Inductive Reasoning in LLMs with Rule Discovery Games

🧠LLMs Academic

TianJi-Environ: An Autonomous AI Scientist for Atmospheric Environmental Research

🔗LLM Workflows Academic

SCOUT: Semantic scene COverage via Uncertainty-guided Traversal

🧠LLMs Academic

Strabo: Declarative Specification and Implementation of Agentic Interaction Protocols

🔗LLM Workflows Academic

Cascading Hallucination in Agentic RAG: The CHARM Framework for Detection and Mitigation

📚RAG Academic

Trustworthy Smart Fabs via Professional Proxies: Scaling Safe and Sustainable by Design (SSbD) through Industrial Data Spaces

🎼Data Orchestration Academic

The Saturation Trap and the Subjectivity of Intervention Timing: Why Affect-Based Triggers and LLM Judges Fail to Time Interventions on Autonomous Agents

🧠LLMs Academic

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

🤖AI Coding Academic
