Dependent Types, Proof Assistant, Type-driven Development, Verification
AgentCoMa: A Compositional Benchmark Mixing Commonsense and Mathematical Reasoning in Real-World Scenarios
arxiv.org·9h
Interpretable Early Failure Detection via Machine Learning and Trace Checking-based Monitoring
arxiv.org·2d