Interactive Proving, Tactic Composition, Proof Automation, Mathlib
Human reward hacking
danmackinlay.name·1d
Effective Code Reviews with Conventional Comments • Paul Slaughter & Adrienne Braganza
youtube.com·2d
Know When to Explore: Difficulty-Aware Certainty as a Guide for LLM Reinforcement Learning
arxiv.org·4d