Prompt Engineering

BEACON: Behavioral Entropy Aggregation for Cross-Model Hallucination Detection in Large Language Models

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

Declarative Skills for AI Agents in Knowledge-Grounded Tool-Use Workflows

 🤖AI Agents  Content type: Academic
arxiv.org·

Mutation Without Variation: Convergence Dynamics in LLM-Driven Program Evolution

 🧠LLMs  Content type: Academic
arxiv.org·

LLM-Guided Neural Architecture Search for Robust Co-Design of Physical Neural Networks

 🧠LLMs  Content type: Academic
arxiv.org·

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

Think Fast: Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

Arithmetic Pedagogy for Language Models

 🤖Agentic Engineering  Content type: Academic
arxiv.org · Hacker News

VisualLeakBench: Reproducible Action-Boundary Propagation Failures in Vision-Language Agents

 ⚙️AI Automation  Content type: Academic
arxiv.org·

Quantum-Inspired Trace-Augmented Evidence Selection for Reasoning over Structured Hypothesis Spaces

 ⚛️Quantum Computing  Content type: Academic
arxiv.org·

LLM-Based Code Documentation Generation and Multi-Judge Evaluation

 🧠LLMs  Content type: Academic
arxiv.org·

Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

IMUG-Bench: Benchmarking Unified Multimodal Models on Interleaved Understanding and Generation

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

Dep-LLM: Training-Free Depression Diagnosis via Evidence-Guided Structured Multi-factor with Reliable LLM Reasoning

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

CogManip: Benchmarking Manipulative Behavior in Multi-Turn Interactions with Large Language Model

 🧠LLMs  Content type: Academic
arxiv.org·

The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

 🤖Multi-Agent Systems  Content type: Academic
arxiv.org·

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

 🤖Agentic Engineering  Content type: Academic
arxiv.org·

UrduMMLU: A Massive Multitask Benchmark for Urdu Language Understanding

 🧠LLMs  Content type: Academic
arxiv.org·

Domain-Conditioned Safety in Frontier Computer-Using Agents: A 793-Episode Browser Benchmark, a Coding-Domain Cross-Reference, and a Reproducibility Audit of Recent Red-Teaming

 ⚙️AI Automation  Content type: Academic
arxiv.org·
