Agents are under-elicited: A case study in optimization tasks

"Knowing is not enough; we must apply. Willing is not enough; we must do." - Johann Wolfgang von Goethe

We previously introduced inverse rubric optimization (IRO): tasks where an agent must learn the preferences of a black-box judge under a label budget. These are LLM optimization tasks, in which an agent iteratively optimizes a metric. In this post, we study which general prompt and scaffold methods can improve performance in these LLM optimization settings, by intervening via prompt elicitation and...
