Beyond Benchmarks: Testing Open-Source LLMs in Multi-Agent Workflows
blog.scottlogic.com·1d
Performance Mythology
New study shows genetics play a role in human cognition
pnas.org·9h·
Discuss: Hacker News
🔲Cellular Automata
PPO for LLMs: A Guide for Normal People
cameronrwolfe.substack.com·10h·
Discuss: Substack
🔗Constraint Handling
TypeAgent: Microsoft's Open Source Personal Agent Architecture
github.com·2h·
Discuss: Hacker News
Proof Automation
A Vision for Future Low-Level Languages
antelang.org·3d
🦀Rust Borrowing
Interpretable Next-token Prediction via the Generalized Induction Head
arxiv.org·21h
🧮Kolmogorov Complexity
Implementing Minimax for Game AI: From Basic Algorithm to Optimized Search
dev.to·5h·
Discuss: DEV
🔲Cellular Automata
Building Better Software: Why Workflows Beat Code Every Time • Ben Smith & James Beswick • GOTO 2025
youtube.com·12h
🔄Reproducible Builds
Ken Thompson's "Trusting Trust" compiler backdoor - Now with the actual source code (2023)
micahkepe.com·3d
💻Programming languages
Neural Networks for Chess
github.com·5h·
Discuss: Hacker News
Homebrew CPUs
The Death of Static Prompts: Building ChronoLM
dev.to·1d·
Discuss: DEV
⏱️Interval Parsing
We Programmers Need "Results"
rockyj-blogs.web.app·2d·
Discuss: Hacker News
📜Proof Carrying Code
Bridging Language Gaps with Adaptive RAG: Improving Indonesian Language Question Answering
arxiv.org·21h
🧮Kolmogorov Complexity
The New Calculus of AI-based Coding
blog.joemag.dev·8h
🔄Reproducible Builds
Amortized Active Generation of Pareto Sets
arxiv.org·21h
🔲Cellular Automata
How to organize your Rust tests
blog.logrocket.com·6h·
Discuss: Hacker News
🦀Rust Macros
Don’t Make Assumptions About Assertions: Even with AI you still have to write your unit tests
dev.to·1d·
Discuss: DEV
🔍Concolic Testing
Dynamic Decisions: Making Memory-Efficient AI a Reality with Differentiable Algorithms by Arvind Sundararajan
dev.to·1d·
Discuss: DEV
Incremental Computation
Scalpel: Automotive Deep Learning Framework Testing via Assembling Model Components
arxiv.org·21h
🐛Fuzzing