Can AI Replace Teachers?
bloomberg.com·5h
🎭Program Synthesis
Simple HexArch + crate for ledger system (advice pls)
crustyengineer.com·1d
Discuss: r/rust
🔢Binary Formats
Show HN: Comparegpt.io – Trustworthy Mode to reduce LLM hallucinations
news.ycombinator.com·2d
Discuss: Hacker News
🎲Parser Fuzzing
An enough week
blog.mitrichev.ch·2d
🧩Constraint Solvers
There will soon be AI agents working on our behalf
blog.cip.org·1d
Discuss: Hacker News
🎭Program Synthesis
Analyzing text with hsk vocab
reddit.com·1d
🌳Tree Diffing
Tiny AI model outperforms o3-mini and Gemini 2.5 Pro in ARC-AGI benchmark
the-decoder.com·3d
🌱Minimal ML
Small Language Models for Agentic Systems: A Survey of Architectures, Capabilities, and Deployment Trade offs
arxiv.org·5d
💬Smalltalk VMs
The Markovian Thinker
arxiv.org·3d
📡Binary Protocols
Small amount of poisoned data can influence AI models
techzine.eu·2d
✨Effect Inference
Which Heads Matter for Reasoning? RL-Guided KV Cache Compression
arxiv.org·2d
🗺️Region Inference
Two-Stage Voting for Robust and Efficient Suicide Risk Detection on Social Media
arxiv.org·2d
ML Language
I got my little text editor to a first usable state
github.com·23m
💻Terminal UIs
I built a community crowdsourced LLM benchmark leaderboard (Claude Sonnet/Opus, Gemini, Grok, GPT-5, o3)
reddit.com·19h
Discuss: r/webdev
Language Benchmarks
Evaluating Small Vision-Language Models on Distance-Dependent Traffic Perception
arxiv.org·2d
🌱Minimal ML
Realistic Reward Hacking Induces Different and Deeper Misalignment
lesswrong.com·2d
✨Effect Inference
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
arxiv.org·4d
ML Language