Program Correctness, Preconditions, Postconditions, Axiomatic Semantics
Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models
arxiv.org·1d