Detecting Distillation Data from Reasoning Models
arxiv.org·1d
⚙️ABNF Mining
Is ChatGPT-5 Able to Provide Proofs for Advanced Mathematics?
machinelearningmastery.com·22h
🎯Proof Tactics
Toy Binary Decision Diagrams
philipzucker.com·2d
🧮Algebraic Datatypes
LLM Optimization Notes: Memory, Compute and Inference Techniques
gaurigupta19.github.io·1d
💻Local LLMs
ProofOfThought: LLM-based reasoning using Z3 theorem proving
dev.to·2d
SMT Integration
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
arxiv.org·5h
🎯Performance Proofs
LexiCon: a Benchmark for Planning under Temporal Constraints in Natural Language
arxiv.org·5h
🧮Kolmogorov Complexity
MathArena Apex: Unconquered Final-Answer Problems
matharena.ai·3d
🧮SMT Solvers
AI Fixed Coding, but Not the Bottleneck: Why Lisp, FP Still Matters
github.com·8h
🔗Lisp
Aria: An Agent For Retrieval and Iterative Auto-Formalization via Dependency Graph
arxiv.org·1d
Proof Automation
H1B-KV: Hybrid One-Bit Caches for Memory-Efficient Large Language Model Inference
arxiv.org·5h
💨Cache Optimization
Structured Cognition for Behavioral Intelligence in Large Language Model Agents: Preliminary Study
arxiv.org·5h
🧠Intelligence Compression
A tiny recursive reasoning model achieves 45% on ARC-AGI-1 and 8% on ARC-AGI-2
alexiajm.github.io·13h
🧠Intelligence Compression
The Alignment Auditor: A Bayesian Framework for Verifying and Refining LLM Objectives
arxiv.org·5h
💻Local LLMs
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org·1d
Automated Theorem Proving
Constraint Satisfaction Approaches to Wordle: Novel Heuristics and Cross-Lexicon Validation
arxiv.org·2d
🧮SMT Solvers
Prompting Techniques for Specialised LLMs
dev.to·2d
🔗Constraint Handling
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning
arxiv.org·1d
Incremental Computation
DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization
arxiv.org·1d
🎯Performance Proofs
Thinking on the Fly: Test-Time Reasoning Enhancement via Latent Thought Policy Optimization
arxiv.org·1d
💻Local LLMs