Show HN: Debugg – 0-Config AI browser (E2E) tests that review every commit
debugg.ai·14h
🏠Homelab Pentesting
Curing Miracle Steps in LLM Mathematical Reasoning with Rubric Rewards
arxiv.org·1h
🧮Theorem Proving
Tool or Agent? The impact of AI in your code and in your wallet: It all boils down to math again!
blog.codeminer42.com·16h
Proof Automation
🤖 AI as Your QA Pair Buddy
dev.to·11h
Proof Automation
Three Solutions to Nondeterminism in AI
blog.hellas.ai·1d
🎯Performance Proofs
An enough week
blog.mitrichev.ch·9h
🧮Z3 Solver
When AI Remembers Too Much – Persistent Behaviors in Agents’ Memory
unit42.paloaltonetworks.com·7h
🔲Cellular Automata
Adversary TTP Simulation Lab
infosecwriteups.com·2d
🏠Homelab Pentesting
SigmaEval – statistical evaluation for GenAI apps
github.com·1d
📊Feed Optimization
Navigating the Vast AI Security Tools Landscape
optiv.com·8h
🎯Threat Hunting
Organize automated tests without getting eaten by your devs
octomind.dev·15h
❄️Nix Flakes
Seriously Testing LLMs
satisfice.com·4d
Proof Automation
Programmer in Wonderland
binaryigor.com·13h
🔩Systems Programming
Enhanced SoC Design via Adaptive Topology Optimization with Reinforcement Learning
dev.to·2h
🧩RISC-V
Reflection raises $2B to be America’s open frontier AI lab, challenging DeepSeek
techcrunch.com·6h
🚀Indie Hacking
Intent Weaving for AI Coding Agents
autohand.ai·3h
Proof Automation
Abstraction for Abstraction’s Sake: How Developers Talk Themselves Into Complexity
hackernoon.com·22h
🧬Functional Programming
Who watches the watchers? LLM on LLM evaluations
stackoverflow.blog·15h
📏Code Metrics
The Library Method: Understanding @cache
dev.to·4h
Cache Theory
Test Case Generation from Bug Reports via Large Language Models: A Cognitive Layered Evaluation Framework
arxiv.org·2d
🧪Binary Fuzzing